Automate Smarter. Scale Faster.

August 24, 2023

Why Software Engineers Should Stop Stuffing Everything in MySQL

Aggregating data from multiple sources into a centralized place can be a challenging task when creating reports. In the early stages, many software engineering teams tend to rely on familiar tools, often their application databases. Since the majority of data for tech startups is generated from their apps, it may seem logical to incorporate additional…
July 31, 2023

Navigating SQL Hierarchies: Finding the Ultimate Parent

Untangling the web of parent-child relationships across multiple hierarchical levels can be challenging, yet it’s crucial for insightful data analysis. Frequently, we need to identify the apex of these hierarchies, the ‘ultimate parent’, in order to group data for analysis. However, the unpredictable number of levels within these hierarchies can complicate this task. In this…
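The idea in this teaser, walking up a parent-child chain of unknown depth until a record with no parent is reached, can be sketched in a few lines of Python. This is only an illustration of the concept, not the SQL approach the post itself covers; the `parent_of` mapping, the `ultimate_parent` function, and the sample account names are all hypothetical.

```python
def ultimate_parent(node, parent_of):
    """Follow parent links from `node` until reaching a record with no parent.

    `parent_of` maps each child id to its parent id (None for a top-level
    record). Both names are hypothetical, chosen for this sketch.
    """
    seen = {node}
    current = node
    while parent_of.get(current) is not None:
        current = parent_of[current]
        if current in seen:
            # Guard against circular references in the hierarchy data.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
    return current


# Hypothetical hierarchy: C -> B -> A, and E -> D.
parent_of = {"A": None, "B": "A", "C": "B", "D": None, "E": "D"}

# Group every record under the top of its own hierarchy.
groups = {node: ultimate_parent(node, parent_of) for node in parent_of}
```

Because the loop runs until no parent remains, the number of levels does not need to be known in advance.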
June 30, 2023

Enhancing Data Accuracy: How to Fill Missing Date Gaps in Analysis with Python

Data gaps can occur when data is organized into time intervals but observations are missing for certain intervals. For example, let’s say you are tracking sales of snow shovels by month. Snow shovels are typically only in demand during winter months, so it is likely that there will be months with no sales at all.…
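The snow shovel scenario can be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the method from the post: the `month_range` and `fill_missing_months` helpers and the sales figures are hypothetical, and every key is assumed to be the first day of its month.

```python
from datetime import date


def month_range(start, end):
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def fill_missing_months(sales, fill_value=0):
    """Return a copy of `sales` with an entry for every month in its span.

    `sales` maps first-of-month dates to units sold; months with no
    observation are assigned `fill_value`.
    """
    if not sales:
        return {}
    first, last = min(sales), max(sales)
    return {m: sales.get(m, fill_value) for m in month_range(first, last)}


# Hypothetical data: no row exists for January 2023.
observed = {
    date(2022, 11, 1): 14,
    date(2022, 12, 1): 40,
    date(2023, 2, 1): 25,
}
filled = fill_missing_months(observed)
```

Filling with zero suits counts such as sales; for measurements like prices or temperatures, a different fill value or interpolation may be more appropriate.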
May 6, 2023

Predicting the Future of Business Intelligence: AI-Driven Innovations on the Horizon

Traditional BI approaches have primarily centered around manual report generation, focusing on historical numerical data. This often leaves business teams longing for insights and grappling with the complexities of unstructured text data. However, AI-powered tools are poised to reshape how businesses gather, analyze, and interpret data. In this blog post, I will dive into four…
April 26, 2023

Choosing Your Path in Data Engineering: The Buy vs. Build Dilemma Explained

As an application scales, data volumes and complexity grow, creating the need for scalable data infrastructure. Faced with this challenge, the decision between building a custom solution and purchasing a ready-made service is more than just a technical choice; it’s a strategic dilemma that significantly affects operational agility, cost efficiency, and long-term scalability. In this…
March 25, 2023

Exporting Database Tables to Parquet Files Using Python and Pandas

Managing MySQL databases can often be costly and time-consuming. If you’re working with databases containing static data, an effective alternative is to convert your database tables into individual Parquet files. By storing these files and leveraging Python for direct querying, you’ll maintain your existing querying capabilities and benefit from improved query performance, cost reduction, and…
March 14, 2023

From Data to Impact: 5 Vital Lessons for Startup Data Engineers

Working as a data engineer at a small startup can be an exciting, yet challenging, experience. The dynamic nature of startups requires data engineers to be agile and adapt quickly to ever-changing requirements. In this blog post, I will share five important lessons I’ve learned during my time as a data engineer at a small…
February 17, 2023

How to Quickly and Easily Translate Code to Different Languages with ChatGPT

Data engineers commonly work across multiple data applications that require knowledge of SQL, Python, and Excel, to name a few. However, switching between these languages can be time-consuming, especially when it comes to translating complex Excel formulas to SQL statements, for example. I have been really impressed with how well ChatGPT can…
December 22, 2022

Deploy Your Next Flask App Instantly and for Free Using Replit

Replit is a free tool that makes it easy to write Flask code and deploy it instantly. It handles all of the underlying infrastructure, allowing you to focus on building and refining your app without worrying about setup and maintenance.