In this post, I’ll show you how to connect Python to a remote MySQL database that has SSH encryption.
Applications like SequelPro, MySQL Workbench, etc. are great interfaces for your database. However, the applications will only show query results in a tabular form. With Python, SQL results can be joined to other external data sets and the results can be visualized using the MatPlotLib library.
I’ve added comments that correspond to the SequelPro connection fields. Line 31 contains the string used to write your query.
Instead of putting your credentials directly into your code, I highly recommend making them as environment variables. That way, you can hide your credentials from displaying in your code. Check out this post on Creating Environmental Variables.
import pymysql import paramiko import pandas as pd from paramiko import SSHClient from sshtunnel import SSHTunnelForwarder from os.path import expanduser import os home = expanduser('~') mypkey = paramiko.RSAKey.from_private_key_file('/path_to_key',os.environ.get('rsa_password')) #if your private key is not password protected, then remove everything after the path to your key # comments are the fields in SequelPro MySQL_hostname = '' #MySQL Host sql_username = 'readonly'#Username sql_password = os.environ.get('sql_password') #Password sql_main_database = '' #Database Name sql_port = 3306 ssh_host = '' #SSH Host ssh_user = '' #SSH Username ssh_port = 22 sql_ip = '1.1.1.1.1' with SSHTunnelForwarder( (ssh_host, ssh_port), ssh_username=ssh_user, ssh_pkey=mypkey, remote_bind_address=(MySQL_hostname, sql_port)) as tunnel: conn = pymysql.connect(host='127.0.0.1', user=sql_username, passwd=sql_password, db=sql_main_database, port=tunnel.local_bind_port) query = '''select * from company limit 5;''' data = pd.read_sql_query(query, conn) conn.close()
If successful, you can print(data
) and your query results will be returned as a dataframe!
Make The Connection a Function
Instead of copying and pasting the above code into every project, I save it as a function and import the .py
file into my project. Not only does it keep me from having the code in every project I’m working on, it also keeps my code tidy.
To turn this into a function, you can add def run_query(query_text):
, indent the body and addreturn data
at the end. I also make the actually query a parameter in the function.
def run_query(query_text): import pymysql import paramiko import pandas as pd from paramiko import SSHClient from sshtunnel import SSHTunnelForwarder from os.path import expanduser import os home = expanduser('~') mypkey = paramiko.RSAKey.from_private_key_file('/path_to_key',os.environ.get('rsa_password')) #if your private key is not password protected, then remove everything after the path to your key # comments are the fields in SequelPro MySQL_hostname = '' #MySQL Host sql_username = 'readonly'#Username sql_password = os.environ.get('sql_password') #Password sql_main_database = '' #Database Name sql_port = 3306 ssh_host = '' #SSH Host ssh_user = '' #SSH Username ssh_port = 22 sql_ip = '1.1.1.1.1' with SSHTunnelForwarder( (ssh_host, ssh_port), ssh_username=ssh_user, ssh_pkey=mypkey, remote_bind_address=(MySQL_hostname, sql_port)) as tunnel: conn = pymysql.connect(host='127.0.0.1', user=sql_username, passwd=sql_password, db=sql_main_database, port=tunnel.local_bind_port) query = query_text data = pd.read_sql_query(query, conn) conn.close() return data
Save this file in the same directory as your other projects. Running import db_connection.py
(or whatever you called your file) will import the function. Running df = db_connection.run_query('select * from accounts')
, for example, is all you need to do to generate the dataframe.
Final Notes
To find the path where your private key is, you just type ~/.ssh
into chrome and it will show the directory where it is saved.