Data Analytics Simplified
Imagine that you are the host of “Help! I Wrecked My House!” But instead of navigating through the debris of a DIY home renovation gone awry, you’re diving headfirst into a chaotic world of spreadsheets, rogue data streams, and a jumble of mismatched tools. The mission? To declutter, organize, and automate workflows, laying down the solid groundwork necessary for efficient reporting and data science. This is the life of a data analytics engineer.
Here on this blog, I’ll share insights, tips, tricks, and a robust framework designed to tackle the ever-evolving challenges of data engineering and analytics. Given that every company and initiative comes with its unique set of requirements, and considering the dynamic nature of data, you won’t find any one-size-fits-all guides here. Instead, I aim to share my thought process and problem-solving strategies to help you identify the most effective processes and tools for your projects.
You might find yourself here because you:
No matter your situation, I’m here to equip you with the essential tools for your data analytics toolkit, tailored specifically for the lean tech startup environment. Welcome!
A window function allows you to concisely compare rows in a single table.
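Window functions are usually written in SQL, but the same row-to-row comparison can be sketched in Pandas. This is a rough analogue of `LAG(revenue) OVER (PARTITION BY region ORDER BY month)`; the `sales` table and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: monthly revenue per region
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "month": [1, 2, 1, 2],
    "revenue": [100, 120, 80, 70],
})

# Order rows within each partition, then look back one row per region
sales = sales.sort_values(["region", "month"])
sales["prev_revenue"] = sales.groupby("region")["revenue"].shift(1)

# Compare each row with the previous row in the same region
sales["change"] = sales["revenue"] - sales["prev_revenue"]
print(sales)
```

The first row of each region has no previous row, so its `prev_revenue` and `change` are missing values.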
In this post, I’ll walk through how to convert a Pandas column of seconds into a datetime or a formatted string.
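A minimal sketch of the idea, assuming the column holds whole seconds and that treating them as an offset from the Unix epoch is acceptable. The column names are placeholders, and the post itself may take a different route.

```python
import pandas as pd

# Hypothetical data: durations in seconds
df = pd.DataFrame({"seconds": [0, 3661, 86399]})

# Interpret the values as seconds since 1970-01-01 00:00:00
df["as_datetime"] = pd.to_datetime(df["seconds"], unit="s")

# Format only the time portion as HH:MM:SS
df["as_string"] = df["as_datetime"].dt.strftime("%H:%M:%S")
print(df)
```

Note that the `%H:%M:%S` formatting wraps around for values of 86,400 seconds (one day) or more.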
This is a little Flask web app I made to get recommendations for things to do when traveling.
The Pandas package in Python allows you to generate a list of dates dynamically and then extract their attributes with various datetime functions.
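As a quick illustration, here is one way this could look using `pd.date_range` and the `.dt` accessor. The start date and the attributes chosen are arbitrary examples.

```python
import pandas as pd

# Generate three consecutive days starting from an example date
dates = pd.DataFrame({
    "date": pd.date_range(start="2024-01-01", periods=3, freq="D")
})

# Extract attributes with the .dt accessor
dates["year"] = dates["date"].dt.year
dates["month"] = dates["date"].dt.month
dates["day_name"] = dates["date"].dt.day_name()
print(dates)
```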
This is a little trick I used to append new rows to a Pandas DataFrame. This method is similar to appending a new item to a list.
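One common way to do this (which may or may not be the exact trick from the post) is to assign to `df.loc[len(df)]`. The sketch below assumes the DataFrame has a default integer index; the data is made up.

```python
import pandas as pd

# Hypothetical starting DataFrame with a default 0..n-1 index
df = pd.DataFrame({"city": ["Austin"], "visits": [3]})

# Assigning to the next integer label adds a row, much like list.append()
df.loc[len(df)] = ["Denver", 5]
print(df)
```

This is convenient for a handful of rows. For many rows, collecting them in a list and building the DataFrame once is generally faster.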
A Data Engineer’s primary focus is to help companies scale their reporting capabilities beyond the limitations of spreadsheets. Data Engineers build automated systems that replace manual processes, import data from various sources, and transform it for easy visualization or use in data science models.
Pandas allows almost anything as a column header, and I’ll show you how to get your columns Parquet and database ready.
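A rough sketch of what such a cleanup might look like, assuming the goal is lowercase snake_case string names. The helper `clean_column_name` and the sample headers are hypothetical, not taken from the post.

```python
import re
import pandas as pd

def clean_column_name(name):
    """Hypothetical helper: lowercase, and collapse non-alphanumerics to '_'."""
    name = str(name).strip().lower()
    name = re.sub(r"[^0-9a-z]+", "_", name)
    return name.strip("_")

# Headers with spaces, symbols, and even a non-string label
df = pd.DataFrame({" Order ID ": [1], "Total ($)": [9.99], 2024: [True]})
df.columns = [clean_column_name(c) for c in df.columns]
print(list(df.columns))
```

This simple version does not guard against two headers collapsing to the same name, or against names that start with a digit, which some databases reject.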
Having consistent schemas between two Pandas DataFrames is essential when saving to Parquet and for merging operations.
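Here is a small sketch of one way to line up two DataFrames, assuming one of them (`reference`) defines the schema you want. The names and data are illustrative only.

```python
import pandas as pd

# The DataFrame whose column order and dtypes we treat as the target schema
reference = pd.DataFrame({"id": [1, 2], "amount": [1.5, 2.0]})

# Incoming data: different column order, string values, and an extra column
incoming = pd.DataFrame({"amount": ["3.25"], "id": ["3"], "extra": ["x"]})

# Keep only the reference columns, in reference order, then cast the dtypes
aligned = (
    incoming
    .reindex(columns=reference.columns)
    .astype(reference.dtypes.to_dict())
)
print(aligned.dtypes)
```

If `incoming` is missing a column, `reindex` fills it with missing values, and casting those to an integer dtype will raise an error, so that case needs separate handling.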
Using Pandas and the datetime module, you can dynamically get the last day of the month.
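A minimal sketch using `pd.offsets.MonthEnd`, which is one of several ways to do this. The date is fixed here so the result is reproducible; in practice you would likely start from `date.today()`.

```python
from datetime import date

import pandas as pd

# Fixed example date; swap in date.today() for a dynamic result
today = date(2024, 2, 10)

# MonthEnd(0) rolls forward to the end of the current month,
# and leaves the date unchanged if it is already the last day
last_day = (pd.Timestamp(today) + pd.offsets.MonthEnd(0)).date()
print(last_day)
```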