In this post, I’ll walk you through pulling Jira issues from the API using JQL with Python Pandas.
Check out the final complete code in this notebook.
Import Libraries
Here are the libraries we will need:
import requests from requests.auth import HTTPBasicAuth import json import pandas as pd import datetime as dt import time import numpy as np import re
Add Credentials
Here is where you would enter in your credentials in JSON format:
creds = { "url":"", "username":"", "api_token":"" }
The url
variable will look like this: something.atlassian.net. Once you are logged into Jira, you’ll see the url in the address bar.
If you don’t have an API token, you can create one by following these directions.
The Request Function
Here are the variables you will need to change
JQL
– Enter in the JQL you would like to use. You can test out your JQL in the Jira search functionalitystart_at
– The JQL will return a number of results and this parameter is set up to work like theLIMIT
function in SQLend_of_stream
– This is set to False and is needed to make the while loop work since we have to keep running the function utnil we get all the issuesissue_lst
– Empty list variable to hold the results with each run since we can only retrieve 100 issues at a timeretries
– variable that holds that number of retries until a threshold is reached. Max retries is set to 3 times as noted on line 7 below. This is just to make sure you don’t keep hitting the API and returning errors
def get_issues(): start_at = 0 end_of_stream = False issue_lst = [] retries = 0 while (not end_of_stream) and (retries < 4): url = f"{creds['url']}rest/api/3/search?startAt={start_at}&maxResults=100" headers = { "Accept": "application/json" } auth = HTTPBasicAuth(creds["username"], creds["api_token"]) query = { #TODO: add custom JQL here 'jql': '' } response = requests.request( "GET", url, headers=headers, params=query, auth=auth ) if response.status_code == 429: #rate limit hit print("Rate limited. Waiting 60 seconds...") retries += 1 time.sleep(60) elif response.status_code == 401: #unauthorized: print("Unathorized. Please update credentials") end_of_stream = True elif response.status_code == 403: #forbidden: print("Forbidden") end_of_stream = True elif response.status_code == 200: if response.json()['issues'] == []: end_of_stream = True else: issue_lst.extend(response.json()['issues']) start_at = response.json()['startAt'] + response.json()['maxResults'] # print(start_at) # print(end_of_stream) total_issues = response.json()['total'] time.sleep(0.2) else: raise Exception(f"Error code {response.status_code}: {response.json()}") end_of_stream = True return issue_lst
Create a Pandas DataFrame
Now that we have a Pandas DataFrame, we can easily convert the results using pd.json_normalize()
.
#Run the function issues = get_issues() #Create a Pandas DataFrame df = pd.json_normalize(issues) df.head()
Getting Custom Field Names
If you use custom field names, then the get request returns a bunch of data fields with customfield_123
. We have to run a separate request to get those:
url = f"{creds['url']}rest/api/3/field" headers = { "Accept": "application/json" } auth = HTTPBasicAuth(creds["username"], creds["api_token"]) response = requests.request( "GET", url, headers=headers, auth=auth )
Clean up column Names and Apply to DataFrame
This next code snippet allows us to convert the custom fields names into a DataFrame so that we can replace the column names with a more readable version.
custom_fields_dict = pd.json_normalize(response.json()).query("id.str.contains('customfield')", engine='python')[['id','name']].set_index('id').to_dict()['name'] custom_fields_dict
Now we replace the DataFrame ones:
new_columns = [] for i in df.columns: new_columns.append(text_replacement(i)) df.columns = new_columns df.head()
Final Thoughts
Check out more Python tricks in this Colab Notebook or in my recent Python Posts.
Thanks for reading!