Categorical variables in a list
Do you have a set of categories for each observation, but stored as a delimited text list in a single column of your data set? Here’s a handy pandas trick I’ve recently been reminded of for handling this situation:
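import pandas as pd

# Drop the original list column, then append one 0/1 indicator column per category.
# axis and columns are passed by keyword; recent pandas versions no longer accept them positionally.
df_with_dummies = pd.concat(
    [
        df_without_dummies.drop(columns='categories_list_column_name'),
        df_without_dummies['categories_list_column_name'].str.get_dummies(sep=";"),
    ],
    axis=1,
)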
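In this example, the column where each entry is a list of categories is “categories_list_column_name” in the dataframe df_without_dummies. The separator within those lists is “;”. The result, df_with_dummies, keeps every other column and gains one 0/1 indicator column for each distinct category found in the lists.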
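To see what this produces, here is a minimal sketch on a made-up dataframe. The column names ("id", "tags") and the values are hypothetical, chosen only for illustration; swap in your own column name and separator.

```python
import pandas as pd

# Hypothetical toy data: "tags" holds a ";"-separated list of categories per row.
df_without_dummies = pd.DataFrame({
    "id": [1, 2, 3],
    "tags": ["red;blue", "blue", "green;red"],
})

# One indicator column per distinct category found in "tags".
dummies = df_without_dummies["tags"].str.get_dummies(sep=";")

# Replace the list column with the indicator columns.
df_with_dummies = pd.concat(
    [df_without_dummies.drop(columns="tags"), dummies],
    axis=1,
)

print(df_with_dummies)
```

Each row ends up with a 1 in the columns for the categories it listed and a 0 elsewhere. One thing to watch: the split is on the exact separator, so an entry like "red; blue" would yield a separate " blue" category with a leading space. Cleaning up whitespace before calling get_dummies avoids that.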