How to Flatten MultiIndex Columns into a Single Index DataFrame in Pandas

Sometimes it’s just easier to work with a single-level index in a DataFrame. In this post, I’ll show you a trick to flatten out MultiIndex Pandas columns to create a single index DataFrame.

To start, I am going to create a sample DataFrame:

df = pd.DataFrame(np.random.randint(3,size=(4, 3)), index = ['apples','apples','oranges','oranges'], columns=['A','B','C'])
df

Next, I am going to aggregate the data to create MultiIndex columns:

df_grouped = df.groupby(df.index).agg({'A':['sum','mean'],'B':'sum','C':'sum'})
df_grouped

Running the .columns function, we can see that we have a MultiIndex column.

Finally, to flatten the MultiIndex columns, we can just concatenate the values in the tuples:

df_grouped.columns = ['_'.join(col) for col in df_grouped.columns.values]

The final result will look like this:

If your columns have a mix of strings and tuples, then you can use the following:

['_'.join(col) if type(col) is tuple else col for col in df.columns.values]

Posted

November 4, 2020

Python

Paul

Tags: