The Definitive Data Operations Report

The industry’s first and most comprehensive data operations survey is back, with benchmarks on data operations adoption and best practices. In this second-annual Definitive Data Operations Report, we surveyed hundreds of data professionals to understand how they are building data teams, what those teams are focused on, and where the challenges lie.
DataOps isn’t an IT trend—it’s a business trend. As the number of workers who need to use data in their everyday jobs increases, it no longer makes sense to relegate data to the purview of IT only.
This survey was commissioned by Nexla and conducted by independent research firm Pulse Q&A.

Key Data Operations Findings:

85% of respondents say their companies have teams working on ML or AI, up from 70% last year.
73% of respondents say their company has plans to hire in DataOps in the next year.
Data professionals are only spending 14% of their time on analysis. The rest of the time is spent on required but low value-add tasks like data integration, data cleanup, and troubleshooting.
Data engineers spend 18% of their time on troubleshooting. That works out to 9.3 weeks a year!
Data pros are longing for automation in their jobs. We asked data pros what tasks in their current role would benefit from automation:
The majority, 56%, unsurprisingly said that data cleanup would benefit from automation. Analysis was the second-most cited task, at 47%.
Data integration was close behind at 46%, followed by building data pipelines at 41%.

These findings highlight the need and desire for more scalable, automated processes to maximize value from data.
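The "weeks a year" figure above is simple arithmetic on the reported time shares. Here is a minimal sketch of that conversion, assuming a 52-week year (the report does not state its basis, and the helper name `weeks_per_year` is ours). With the rounded 18% share it gives about 9.4 weeks, in line with the roughly 9.3 weeks reported, which was presumably computed from an unrounded share.

```python
def weeks_per_year(share_of_time: float, weeks_in_year: float = 52.0) -> float:
    """Convert a share of working time (0.0 to 1.0) into weeks per year.

    Assumption: a 52-week year; the survey does not say which basis it used.
    """
    if not 0.0 <= share_of_time <= 1.0:
        raise ValueError("share_of_time must be between 0 and 1")
    return share_of_time * weeks_in_year


# Shares reported in the key findings
troubleshooting_weeks = weeks_per_year(0.18)  # data engineers, troubleshooting
analysis_weeks = weeks_per_year(0.14)         # data professionals, analysis

print(f"Troubleshooting: {troubleshooting_weeks:.2f} weeks per year")
print(f"Analysis: {analysis_weeks:.2f} weeks per year")
```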

What is Data Operations?

DataOps is an organization-wide data management practice that controls the flow of data from source to value, with the goal of speeding up the process of deriving value from data. The outcome is scalable, repeatable, and predictable data flows for data engineers, data scientists, and business users. Data Operations is as much about people as it is about tools and processes.
Tactically speaking, Data Operations takes care of the grunt work typically placed on IT or data engineers. This includes integrating with data sources, performing transformations, converting data formats, and writing or delivering data to its required destination. Data Operations also encompasses the monitoring and governance of these data flows while ensuring security.
A Data Operations practice can open data access to more stakeholders within an organization, further increasing capacity for scale. The more “data leverage” you can create in an organization, the more likely you are to be successful.
Ultimately, Data Operations is not just about tools and processes. It represents a greater cultural shift that breaks down the silos between what has traditionally been viewed as the “data backend” that produces usable data and the “data frontend” that derives value from data. Only by enabling more users within their data systems can companies realize the economic benefits of becoming data-driven.

Snapshot of DataOps at the Average Company

Based on the survey responses, we can paint a picture of the average company and the challenges it faces in Data Operations. The average company has only one backend data engineer for every 5 frontend data users, a group that includes analysts, data scientists, and business users. These limited data engineering resources must manage data that is growing by 2.7 TB every day, across an average of 4,300 data sets. Here, data sets refer to database tables, file headers, or APIs. That’s a lot of data for small teams of engineers to manage. On top of all that, the average company is ingesting additional data from 4.4 partners. Inter-company data can present more data challenges given the heterogeneity of data formats and delivery mechanisms.

The Data Operations Team

We found the average data team has a ratio of 1 backend data engineer for every 5 frontend users. There are outliers, of course, with a minimum ratio of 0.5 (two backend engineers for every frontend pro) and a maximum ratio of 29. That’s 29 frontend data users for every backend engineer, a ratio that is unlikely to be sustainable.
The more favorable ratios are found in smaller teams, where there are fewer than 10 frontend users. It appears there is a minimum of 1 to 5 data engineers, even on the smallest teams. But as the number of frontend users grows, the ratios get larger because data engineering does not scale as quickly.
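To check your own team against these benchmarks, the ratio is simply frontend users divided by backend engineers. Here is a minimal sketch, assuming you have simple headcounts for both groups. The function name and the sample headcounts are illustrative, not taken from the survey data.

```python
def frontend_per_backend(frontend_users: int, backend_engineers: int) -> float:
    """Return the number of frontend data users per backend data engineer."""
    if backend_engineers <= 0:
        raise ValueError("backend_engineers must be at least 1")
    if frontend_users < 0:
        raise ValueError("frontend_users cannot be negative")
    return frontend_users / backend_engineers


# Illustrative headcounts that reproduce the ratios quoted above
print(frontend_per_backend(5, 1))   # the survey average: 5 frontend users per engineer
print(frontend_per_backend(1, 2))   # the minimum: two engineers per frontend user
print(frontend_per_backend(29, 1))  # the maximum: 29 frontend users per engineer
```

A result well above 5 suggests your data engineers are supporting more frontend users than the average team in the survey.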

We asked data pros about the size of their data teams. They told us how many “frontend” data users, such as analysts, data scientists, and business users, they worked with and also how many data engineers, or “backend users.”
We then asked data pros if they thought there were enough backend resources and data engineers to support data needs. Half (50%) said “No”: there are not enough backend resources to support the company’s data needs. Not surprisingly, respondents with more people on the frontend team were more likely to believe they didn’t have enough resources.

AI/ML

85% of data pros report that their company is working on artificial intelligence or machine learning. This is up significantly from 2017, when “only” 70% of respondents reported their companies were working on ML or AI.

Hiring in Data Operations

It should come as no surprise that the majority of respondents reported their companies have plans to hire in DataOps in the next 12 months. Of the 73% of respondents who said their companies plan to hire, two-thirds reported they did not think there were enough backend resources. A perceived lack of backend resources seems to be a trigger for DataOps investment, which makes intuitive sense.

Learn more about the state of Data Operations, including how to assess your company’s own Data Operations practice, by downloading the report.