The default print view for a Pandas DataFrame can be limiting for larger datasets and can get in the way of a thorough review of the data.
Pandas Display Options
If you have a DataFrame longer than 60 rows, you may have seen truncated output: pandas prints only the first and last few rows, with a row of ellipses in between.
This compressed view works fine for a quick check of your DataFrame. It falls short, though, when you need to inspect more rows or when longer text gets truncated in a cell. With a few lines of code, we can get closer to a spreadsheet view.
Here are the most common parameters I use to change the default pandas view options:
import pandas as pd

pd.options.display.max_columns = 250  # Changes the number of columns displayed (default is 20)
pd.options.display.max_rows = 250  # Changes the number of rows displayed (default is 60)
pd.options.display.max_colwidth = 250  # Changes the number of characters shown in a cell so that the contents don't get truncated (default is 50)
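One caveat to setting these values is that the number you assign has to be at least as large as your DataFrame, because pandas switches to the truncated view only when the limit is exceeded. So if your DataFrame is 200 rows long, any value of 200 or more will show every row, for example: pd.options.display.max_rows = 200. You can also assign None to remove the limit entirely.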
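If you would rather not hard-code a number or change the setting for the whole notebook, one option is to raise the limit for a single display only. The sketch below uses pd.option_context, which restores the previous value when the block ends; the 200-row DataFrame is made up purely for illustration.

```python
import pandas as pd

# Hypothetical 200-row DataFrame, used only for illustration.
df = pd.DataFrame({"A": range(200), "B": range(200)})

# With the default settings, the text representation is truncated.
truncated = repr(df)

# Temporarily raise the row limit to the DataFrame's own length.
# The previous setting is restored as soon as the block exits.
with pd.option_context("display.max_rows", len(df)):
    full = repr(df)

# Assigning None removes the row limit altogether.
with pd.option_context("display.max_rows", None):
    unlimited = repr(df)

print(len(truncated.splitlines()), "lines when truncated")
print(len(full.splitlines()), "lines when shown in full")
```

In a notebook you would put df (or display(df)) inside the with block instead of calling repr. Be careful with None on very large DataFrames, since rendering every row can slow the notebook down.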
Interactive Tables in Google Colab
Google Colab includes an extension that lets you generate an interactive data table with one line of code. After you run the following magic command, you’ll get an interactive data table the next time you display your DataFrame!
%load_ext google.colab.data_table
Here’s an example:
import pandas as pd
import numpy as np
%load_ext google.colab.data_table
df = pd.DataFrame(np.random.randint(100, size=(1000, 3)), columns=['A', 'B', 'C'])
df
This view adds the ability to:
- Page through all your data
- Add filters
- Sort by clicking on the column names
In addition, there are some options to further customize the interactive data table. This next code snippet allows you to remove the index and change the number of rows displayed per page:
from google.colab import data_table
data_table.DataTable(df, include_index=False, num_rows_per_page=10)
Lastly, to return to the default DataFrame view, you can run this code snippet:
%unload_ext google.colab.data_table
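If you prefer plain Python over magic commands (for example, inside a helper function), the toggle can be sketched as below. This assumes the enable_dataframe_formatter and disable_dataframe_formatter helpers in google.colab.data_table, which are only importable inside Colab. The set_interactive_tables wrapper is a hypothetical name of my own, and it simply reports False when run anywhere else.

```python
def set_interactive_tables(enabled: bool) -> bool:
    """Turn Colab's interactive DataFrame view on or off.

    Returns True if the setting was applied, or False when the code is not
    running in Google Colab (the google.colab package is unavailable).
    """
    try:
        # Only available inside a Google Colab runtime.
        from google.colab import data_table
    except ImportError:
        return False

    if enabled:
        data_table.enable_dataframe_formatter()
    else:
        data_table.disable_dataframe_formatter()
    return True


print("Interactive tables applied:", set_interactive_tables(True))
```

Calling set_interactive_tables(False) afterwards should bring back the default DataFrame view, in the same way as the %unload_ext line.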
To see a live example, check out this notebook created by the Google team.
Export to a CSV
Some Python programmers (like myself) started their data science journey in Excel and can be more comfortable exploring data in a spreadsheet. To get the same experience, simply export all or part of your DataFrame to a CSV and open it in Excel or Google Sheets.
df.to_csv('data.csv')
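Here is a minimal sketch of exporting either the whole DataFrame or just a portion of it. The DataFrame, the file names, and the temporary folder are placeholders for illustration. Passing index=False leaves the row index out of the file, which usually looks cleaner in a spreadsheet.

```python
import os
import tempfile

import pandas as pd

# Hypothetical DataFrame, used only for illustration.
df = pd.DataFrame({"A": range(1000), "B": range(1000), "C": range(1000)})

# Write into a temporary folder so the example does not clutter your files.
out_dir = tempfile.mkdtemp()
full_path = os.path.join(out_dir, "data.csv")
subset_path = os.path.join(out_dir, "data_subset.csv")

# Export the entire DataFrame without the index column.
df.to_csv(full_path, index=False)

# Export only a portion: the first 100 rows of columns A and B.
df.head(100)[["A", "B"]].to_csv(subset_path, index=False)

# Read the subset back to confirm what was written.
print(pd.read_csv(subset_path).shape)
```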
Python Packages
Several packages enhance the experience of working with Pandas DataFrames. I don’t have experience using them but would love to hear your thoughts. Feel free to comment below with your favorite package!
Final Thoughts
Check out more Python tricks in this Colab Notebook or in my recent Python Posts.
Thanks for reading!