
Insights, Not Infrastructure: The True Goal of Data Engineering

“No one wants to use software. They just want to catch Pokémon.” This quote from The Staff Engineer’s Path nails a key truth: people don’t care about the tools, just the results. In data engineering, this couldn’t be more relevant.

Business teams don’t want to wrestle with raw data or learn SQL; they want clear, actionable insights to guide decisions. As data engineers, our job is to build the pipelines and leverage technology as the means to an end—delivering real insights and driving value. Whether it’s boosting a marketing campaign, improving user experience, or uncovering customer trends, everything we build should drive real outcomes.

In this post, I’ll break down two key ideas: (1) People Want Insights, Not Raw Data, and (2) Technology Is a Means to an End. By focusing on these, we can build data systems that actually help people achieve their goals.

Technology Is a Means to an End – An Analogy

Building data systems is like designing a public transit network. Riders (end users) don’t care about the engineering details of the buses, trains, or tracks. They just want to get from Point A to Point B quickly and comfortably. In the same way, end users don’t want to know about your Spark jobs or ETL pipelines—they want reliable insights that help them make decisions.

For a transit network, success isn’t measured by the type of buses used or the length of the track; it’s about whether the system works, is reliable, and is cost-effective. The same is true for data systems. Pipelines and tools are just the infrastructure. Their value lies in enabling users to reach their goals without delays or complications.

As data engineers, our role is to ensure the systems we build are as seamless and dependable as a well-designed transit system. That means focusing on efficiency, scalability, and user-centric design. If a transit rider shouldn’t have to worry about how the train is powered, then our users shouldn’t have to think about how their dashboards are populated—they just need to trust that the insights will be there when they need them.

Turning Raw Data into Insights Is Hard

The easiest thing to provide to stakeholders is a table of raw or aggregated data. You’re giving them what they asked for, right? Sure, but now they have a bunch of homework to do. The real question to ask is: What are you trying to answer with this data? How is this data supposed to help with a decision?

Analytics is about taking raw data and turning it into something that answers specific questions. It should highlight anomalies or valuable trends that guide decisions. For example, when working with a customer support team, it’s crucial to identify spikes in tickets and the reasons why users are reaching out. Seeing how ticket volume compares to staffing levels is just as critical for ensuring SLA compliance. Handing stakeholders a raw list of tickets with hundreds of columns for slicing and dicing doesn’t solve the problem. It gives them more work and raises the risk that the important insights get overlooked.
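To make the support example concrete, here is a minimal sketch of what "highlight the spike" could look like in code. It is an illustration under stated assumptions, not how any particular team does it: the function names, the 7-day trailing window, the 2-standard-deviation threshold, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def find_ticket_spikes(daily_tickets, window=7, threshold=2.0):
    """Return the indexes of days whose ticket count sits more than
    `threshold` standard deviations above the trailing `window` days.

    `window` and `threshold` are illustrative defaults, not recommendations.
    """
    spikes = []
    for i in range(window, len(daily_tickets)):
        history = daily_tickets[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        today = daily_tickets[i]
        if spread == 0:
            # Perfectly flat history: treat any increase as a spike.
            is_spike = today > baseline
        else:
            is_spike = (today - baseline) / spread > threshold
        if is_spike:
            spikes.append(i)
    return spikes


def tickets_per_agent(daily_tickets, agents_on_shift):
    """Pair ticket volume with staffing; None marks days with no agents."""
    return [
        tickets / agents if agents else None
        for tickets, agents in zip(daily_tickets, agents_on_shift)
    ]


# Made-up sample data: ten days of ticket counts and a flat staffing level.
daily_tickets = [40, 42, 38, 41, 39, 43, 40, 41, 95, 42]
agents_on_shift = [5] * 10

spike_days = find_ticket_spikes(daily_tickets)
load = tickets_per_agent(daily_tickets, agents_on_shift)
print(spike_days)                   # [8]
print([load[d] for d in spike_days])  # [19.0]
```

The point is not the statistics. A stakeholder who receives "day 8 had more than double the usual tickets, at 19 per agent" has something to act on, while the same numbers buried in a raw export are easy to miss.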

However, providing actionable insights is challenging due to both organizational and technical hurdles:

Organizational Hurdles:
  1. Business stakeholders often struggle to define requirements and articulate clear business questions, leading them to request direct access to primary databases as a fallback.
  2. Stakeholders may lack the technical literacy to understand what data is available or feasible to analyze. Instead of focusing on the business question they’re trying to answer, they get caught up in the technical details of the request.
  3. Misaligned priorities between business and technical teams can lead to delays or a focus on the wrong metrics.

Technical Hurdles:
  1. Where to start? With countless ways to slice and analyze data, knowing where to begin can feel overwhelming. The challenge grows as more data sources are added to a warehouse or lake, creating endless possibilities for joining and analyzing data to uncover trends. 
  2. What’s important constantly changes. At one moment, analyzing how a new feature is performing is the top priority; the next, people lose interest and move on. Meanwhile, a random feature or setting suddenly spikes in usage, but the team only notices after it takes down the system. Reporting often turns into a reactive cat-and-mouse game.
  3. The data set is too narrow. I worked on a virtual events platform where participants had video chats with recruiters. We only knew that a video chat happened, how long it lasted, and the candidate’s rating. Without knowing the content of those chats, we couldn’t analyze recurring themes or address candidate concerns effectively.

Turning raw data into insights requires not just technical skills but a deep understanding of the business context. By addressing these hurdles, data engineers can move beyond delivering raw data and start delivering true value.

How to Avoid Sending Raw Data

  • Ask the right questions. Start by asking the business stakeholder, “What specific business questions are you trying to answer with this data?” Don’t settle for vague answers—peel back the onion to uncover the real problem. Repeatedly asking “Why?” can help get to the root of their needs.
  • Put yourself in their shoes. Before delivering data, ask yourself, “If I received this, would it be useful or would I need additional context?” If the data wouldn’t make sense to you, chances are it won’t make sense to the requester either. Always aim to deliver actionable insights, not just raw numbers.
  • Invest in a flexible reporting layer. Real-time collaboration is key. Review dashboards with colleagues as you build them to verify that they answer the right questions. Quick iterations make it easier for stakeholders to give feedback and refine the output. Lengthy feedback loops risk losing stakeholder interest and lead to decisions being made without data.

Proactive Insights: The Future 

It’s worth noting that this whole game of telephone exists because creating BI reports and getting insights out of data is still a manual process. Some companies are trying to be proactive by scanning your data and surfacing insights automatically, an effort that generative AI is accelerating. The reality is that it’s not there yet, and the technology has to be proven out further before companies will turn their data over to third parties, especially in a post-GDPR world.

In an ideal world, data insights would emerge directly from the data layer, seamlessly integrated and ready when needed. Waiting for business requests or manual report configurations should be a thing of the past. 

Conclusion

Data engineering isn’t about building pipelines for the sake of it; it’s about empowering teams with the insights they need to make impactful decisions. By focusing on delivering clear, actionable insights instead of raw data, and by treating technology as a means to an end, we can bridge the gap between technical complexity and business value.

The future lies in proactive systems where insights are integrated seamlessly into workflows. While we’re not there yet, every step we take toward simplifying the process, understanding business needs, and fostering collaboration brings us closer to that ideal. As data engineers, our success isn’t measured by the tools we use but by the value we deliver.

Thanks for reading!

