Sometimes it’s just easier to work with a single-level index in a DataFrame. In this post, I’ll show you a trick to flatten out MultiIndex Pandas columns to create a single index DataFrame.
To start, I am going to create a sample DataFrame:
df = pd.DataFrame(np.random.randint(3,size=(4, 3)), index = ['apples','apples','oranges','oranges'], columns=['A','B','C']) df
data:image/s3,"s3://crabby-images/55a97/55a97478eb675ae44f74eded5c841ebf6309ce14" alt=""
Next, I am going to aggregate the data to create MultiIndex columns:
df_grouped = df.groupby(df.index).agg({'A':['sum','mean'],'B':'sum','C':'sum'}) df_grouped
data:image/s3,"s3://crabby-images/345b7/345b7c05f6bbd2ef767b9018d7c38d1e2601d389" alt=""
Running the .columns
function, we can see that we have a MultiIndex column.
data:image/s3,"s3://crabby-images/a679c/a679c32fe940a8461de9dcd9dc956f90035ed40b" alt=""
Finally, to flatten the MultiIndex columns, we can just concatenate the values in the tuples:
df_grouped.columns = ['_'.join(col) for col in df_grouped.columns.values]
The final result will look like this:
data:image/s3,"s3://crabby-images/ca937/ca937a2b95213b8b688f762c021ffbdfe3da7fad" alt=""
If your columns have a mix of strings and tuples, then you can use the following:
['_'.join(col) if type(col) is tuple else col for col in df.columns.values]