Categorical variables in a list

Does each observation in your data set have a set of categories, stored as a delimited text list in a single column? Here’s a handy pandas trick I’ve recently been reminded of for handling this situation:

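import pandas as pd

# Replace the text-list column with one 0/1 indicator column per category.
# Keyword arguments (columns=, axis=) are used because recent pandas versions
# no longer accept these as positional arguments.
df_with_dummies = pd.concat(
    [df_without_dummies.drop(columns='categories_list_column_name'),
     df_without_dummies['categories_list_column_name'].str.get_dummies(sep=";")],
    axis=1)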

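In this example, the column where each entry is a list of categories is “categories_list_column_name” in the dataframe df_without_dummies, and the separator within those lists is “;”. The result, df_with_dummies, has that column replaced by one 0/1 indicator column per distinct category.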
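
To make this concrete, here is a minimal sketch on a made-up dataframe. The “id” column and the colour values are assumptions invented purely for illustration; only the column name, the dataframe names, and the “;” separator come from the example above.

```python
import pandas as pd

# Hypothetical toy data: each row stores its categories as one ";"-separated string.
df_without_dummies = pd.DataFrame({
    "id": [1, 2, 3],
    "categories_list_column_name": ["red;blue", "blue", "green;red"],
})

# Build one 0/1 indicator column per distinct category found in the lists.
dummies = df_without_dummies["categories_list_column_name"].str.get_dummies(sep=";")

# Swap the original text column for the indicator columns.
df_with_dummies = pd.concat(
    [df_without_dummies.drop(columns="categories_list_column_name"), dummies],
    axis=1,
)

print(df_with_dummies)
```

Each distinct category becomes its own column, and a row gets a 1 in every column whose category appeared in its list. Splitting the work into a separate dummies step is just a readability choice; the one-expression version does the same thing.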
