Using the built-in Pandas function to export a DataFrame to an HTML table is quick way to export your DataFrame but it does has some limitations. Using a for loop to create your HTML table allows you to add any custom styling or CSS classes for enhanced formatting. In this post, I’ll walk you through how to do that.
Converting a DataFrame to HTML using Pandas .to_html()
The pandas.DataFrame.to_html() allows you in one line of code to convert your DataFrame into an HTML table.
Here is a sample DataFrame:
import pandas as pd import os df = pd.DataFrame({'Fruit':['apples','oranges','pears','avocados'],'Price':[0.50, 1.12,0.85,1.90], 'Weight': [3.2, 5.6, 2.2, 3.1] }) df
To convert this to an HTML table, you can run: df.to_html('df.html',border=0)
Which results in an HTML table that looks like this when viewed in Chrome:
And will produce an HTML output that looks like this:
<table border="0" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>Fruit</th> <th>Price</th> <th>Weight</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>apples</td> <td>0.50</td> <td>3.2</td> </tr> <tr> <th>1</th> <td>oranges</td> <td>1.12</td> <td>5.6</td> </tr> <tr> <th>2</th> <td>pears</td> <td>0.85</td> <td>2.2</td> </tr> <tr> <th>3</th> <td>avocados</td> <td>1.90</td> <td>3.1</td> </tr> </tbody> </table>
Depending on your goals, this output may work great. However, I couldn’t find in the documentation how to add specific CSS classes to table rows or table data.
In the next section, I’ll walk you through how I solved for that.
Create an HTML Table from Scratch
One alternative the Pandas exporting to HTML is to loop through each cell of the DataFrame and build the HTML table yourself. This essentially is a way of creating an HTML document dynamically.
There are two parts to this strategy. The first is to dynamically build the HTML table. The second is to incorporate your table into a larger HTML document.
Loop Through each cell
In order to loop through each cell of a DataFrame dynamically, you can nest a for loop inside another for loop. The first loops through each row and the second loops through each column. Here’s an example.
for i in range(df.shape[0]): for j in range(df.shape[1]): print (i,j)
In this example, you can see how the variable i
will increment once we start a new row. That’s our queue for adding the HTML Table Row wrapper <tr></tr>
. The j
variable will cycle through each column. In my case, that’s my queue for adding classes because I want to align the columns in a specific way depending on their data types.
Next, I am going to use the for loops to create a function. I’ll explain the function in more detail below.
def create_html_table(x): row_data = '' for i in range(x.shape[0]): for j in range(x.shape[1]): if ((i % 2) != 0) & (j == 0): #not an even row and the start of a new row row_data += '\n<tr style="background-color:#f2f2f2"> \n <td class = "text_column">'+str(x.iloc[i,j])+'</td>' elif j == 0: #The first column row_data += '\n<tr> \n <td class = "text_column">'+str(x.iloc[i,j])+'</td>' elif j == 1: #second column row_data += '\n <td class = "number_column">'+str(x.iloc[i,j])+'</td>' elif j == 2: #third column and our last column in my example row_data += ' <td class = "number_column">'+str(x.iloc[i,j])+'</td> \n </tr>' else: row_data += '\n <td class = "number_column">'+str(x.iloc[i,j])+'</td>' return row_data
The row_data
variable on row 3 sets up an empty string to hold the HTML strings created by the for loops.
The for loop contains a series of conditionals and these are going to depend on how you want to format the data. For me, I want to have alternating row colors and align columns depending on their data types. I prefer to have my text data left aligned and my numerical data right aligned. Right aligning numerical data makes it a little easier to read when in a table because larger numbers expand to the left. Each column is represented by the variable j
so that’s what is used to apply my classes.
The nested for loop cycles through each cell of the DataFrame and will return the cell contents in the HTML Table Data wrapper <td></td>
. To pull the actual cell contents of the DataFrame, you’ll need to locate it in the original DataFrame by running x.iloc[i,j]
. The iloc
function is a way to select data based on its index position. The str
is added to in order to concatenate the numerical results to a string.
When j == 0
, that means we have reached the start of a new row. That’s why you’ll see I return the <tr>
wrapper in my conditional. When j ==
2 or, alternatively, when j + 1 == shape[1]
, that means we have reached the end of the data in the row and the conditional will return the closing table row wrapper </tr>
.
Next, you can run the function on your DataFrame. Then concatenate the results back into a larger HTML file. Here is an example of that:
html_table = create_html_table(df) html_file = ''' <!DOCTYPE html> <html> <head> <style> table { border-collapse: collapse; width: 30%; } th, td { padding: 6px; font-family: Helvetica, Arial, Helvetica; font-size: 12px; } .header { color: white; background-color: black; border-bottom:1pt solid black; } .text_column { text-align: left; } .number_column { text-align: right; } .even_row { background-color: #f2f2f2; } </style> </head> <body> <h1>HTML Table From a Pandas DataFrame!</h1> <div style="overflow-x:auto;"> <table> <thead> <tr class = "header"> <th class = "text_column">'''+df.columns[0]+'''</th> <th class = "number_column">'''+df.columns[1]+'''</th> <th class = "number_column">'''+df.columns[2]+'''</th> </tr> </thead> <tbody> ''' +html_table+ ''' </tbody> </table> </div> </div> </body> </html> '''
HTML Output as a File
To write the HTML table as a file, you can run this:
with open('df_from_loop.html', 'a') as f: f.write(html_file)
HTML Output to an Email
If you want to embed the HTML output into an email, you can use the below code. For more information on sending emails with Python, check out this post – Sending an HTML Formatted Email with Attachments through Gmail using Python
import os import smtplib import imghdr from email.message import EmailMessage email = os.environ.get('email_address') password = os.environ.get('gmail_python_app_password') from email.mime.multipart import MIMEMultipart #pip install email-to from email.mime.text import MIMEText from email.mime.image import MIMEImage from email.mime.application import MIMEApplication contacts = ['pauldesalvo@gmail.com'] msg = EmailMessage() msg['Subject'] = 'HTML Table From a Pandas DataFrame!' msg['From'] = email msg['To'] = (", ").join(contacts) msg.set_content('HTML Table From a Pandas DataFrame!') text_part = msg.iter_parts() text_part msg.add_alternative(html_file, subtype='html') with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp: smtp.login(email, password) smtp.send_message(msg)
You can check out the full Jupyter Notebook on GitHub.
Thanks for reading!