Is Your Data Team Stuck Fighting Fires? Maybe They Should Be.

If hiring more data scientists and promoting self-serve reporting aren’t working, there’s an organizational structure that might.

As a data scientist, when you work in a line of business where certain factors are out of your control, it’s easy to get stuck in a position where it feels like you’re always fighting fires. Competitors can release new features, requiring product teams to scramble for data to make an appropriate response. Fraudsters can breach your mitigations, requiring a quick response to prevent the loss of your customers’ money. Even internally, strategy decisions made by executives can require sudden, unplanned bursts of work.

If you’re experiencing things like this in your day-to-day work, you’re not alone. One of the most common complaints I’ve heard from colleagues over the years is that they are always scrambling to keep on top of incoming requests, which leaves them no time for the long-term work that they feel would really move the organization forward. They hope that someone else could be fighting these fires instead of them. Often, self-serve BI tools and reverse ETL are seen as the solution to this problem because they enable employees outside of the data function to do their own reporting. But I think this misses the bigger picture.

The truth is that in an organization of any size, there is going to be unplanned work, and lots of it. While in some instances this might point toward disorganization and reflect underlying problems in leadership, often this isn’t the case. The factors mentioned in the opening paragraph are unavoidable parts of the job, and being able to respond quickly is a competitive advantage if it can be done in an organized way. Who should handle these situations, then? People who enjoy it!

Because most people don’t like the idea of running into a burning building, it can be hard to understand the mindset of those who do. Yet plenty of people pick high-pressure careers that involve literally or figuratively doing exactly this. Similarly, while the average data scientist or analyst might experience severe stress from handling high-priority requests all the time, there are those who thrive on it. Maybe it’s the sense of helping others when they really need it, being close to the action, or the thrill of solving a new set of important problems almost every day that makes them enjoy it. Whatever the reason is, it’s important to find these people and let them do what they like to do.

How do you fit these people into the overall organization, and is the role different from a “regular” data science position? The answer is that it’s not a different role at all, but instead a different way of grouping existing roles together. One approach I’ve seen work well is to have a separate group cover each of the areas you need handled:

  1. The Rapid Response team: a group of data analysts and scientists dedicated to finding and responding to issues as quickly as possible. Short-term solutions produced by this team are OK, as the emphasis is on speed using whatever tools are currently available.
  2. The Preemption team: a team of data scientists and analytics engineers focused on building models to reduce the amount of work that the Rapid Response team needs to do. Emphasis here is on long-term solutions that generalize and that are well built, high-performing and easy to maintain.
  3. The Engineering team: a team whose job is to identify the tooling bottlenecks that impact both the Rapid Response and Preemption teams and fill those gaps. Every time they ship something, the other teams get better.

With this structure, each team can hire people who enjoy that particular craft, and furthermore, each has more control over what it works on. Instead of “fires” taking up all the time of a single, unified data science team, those problems are contained within the Rapid Response team, which is staffed with data scientists and others who enjoy this kind of work. They can prioritize their tasks within their capacity and provide feedback to the Preemption and Engineering teams about where they need more coverage. Meanwhile, these other teams can stay laser-focused on building for the long term to ensure the Rapid Response team doesn’t burn itself out trying to fix too many problems on its own.

Does this mean there is no role for self-serve analytics? Absolutely not. Self-serve BI tools and reverse ETL are still valuable for easing your team’s workload. Just think of them as the “regular water lines” that split off from the water mains feeding the hydrants and run into people’s homes. They are important additions to the work that data teams do, but they are not meant to fully replace the role that experienced data teams play within their organizations for critical decisions and models when things go beyond what basic reporting can handle.

While the example I gave above works well in a medium-sized company, looking back at other teams I’ve worked on, I could also see it working in other situations, such as fast-growing start-ups or high-growth areas of larger corporations. As always, you may need to tweak this model to fit your particular context, but it’s a good starting point if you find yourself overwhelmed and neither hiring more people nor offloading the work onto other organizations is solving the problem for you.

This article was originally published on Towards Data Science.