Five Data Management Predictions (Observations) from the Trenches

Welcome 2021!

Mankind wasn’t prepared for 2020! Not from a healthcare perspective, education perspective, or a technology perspective. No business in any industry will say they saw what was coming in 2020.

Things have finally started to turn a corner on most fronts, and as the future begins to look brighter, it becomes clear that technology can create a huge advantage for companies that embrace it. The challenging business environment forced whole industries to leapfrog technology adoption by years. Whether it’s education, food delivery, ecommerce, or remote fitness, innovators are positioned to do more than just recover.

As practitioners in the data ecosystem, we are sharing the key shifts that we believe are coming to the data space in 2021.

1. Convergence of Data Tools

For years, data tool vendors have confused customers with shiny toy syndrome. New kinds of shiny yet incomplete data tools keep landing in the market: integrations, ETL, ELT, iPaaS, Governance, Catalogs, Data Prep, Data Quality, Data Operations, Data Observability, Data Exchange, Data-as-a-Service, Data Compliance, and Data Wrangling, to name a few. Each tool claims to be category-defining, which has confused customers and industry analysts alike. The result is a proliferation of data categories that analysts now track in silos. Data users struggle to understand what they really need, because many of these tools overlap heavily and piecing them together creates even more complexity.

2021 will start us on the path of convergence. Data users want to find data, connect to it, modify it, and then use it in their favorite applications in a secure and governed manner. They want to do this with speed because in 2021, more than at any other time in technology history, Speed = Survival.

2. Data Heterogeneity and the Rise of Data Exchanges

Year after year, data volume continues to grow at an accelerating rate. For data users, however, the zettabytes matter less than the fact that data heterogeneity has been increasing just as fast.

Data Heterogeneity

Data velocity, data formats, and data schemas, combined with the systems that hold the data (S3, API, GCS, FTP, cloud, on-premise, etc.), create a practically unbounded set of variations. This is what we mean by data heterogeneity. So far, the only way to handle these variations has been code and technology. As companies look to leverage data from their ecosystem alongside their internal data, 2021 will see an unprecedented rise in data heterogeneity.
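
To make the combinatorics concrete, here is a minimal sketch that counts the variations produced by just a handful of options per dimension. The system names come from the list above; the format and velocity values are illustrative assumptions, not an inventory of any real environment.

```python
from itertools import product

# Systems are taken from the examples in the text above.
SYSTEMS = ["S3", "API", "GCS", "FTP"]
# Formats and velocities are hypothetical, deliberately short lists.
FORMATS = ["CSV", "JSON", "Parquet"]
VELOCITIES = ["batch", "streaming"]


def source_variations(systems, formats, velocities):
    """Return every (system, format, velocity) combination a pipeline might face."""
    return list(product(systems, formats, velocities))


variations = source_variations(SYSTEMS, FORMATS, VELOCITIES)
print(len(variations))  # 4 * 3 * 2 = 24
```

Even this toy setup yields 24 distinct cases before schema differences are considered. Since every source can carry its own schema, the real number of variations has no practical upper bound.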

Data Exchanges

Data Exchanges are good at acting as the intermediary between disparate systems, but the approach has so far focused on static company-to-company data. 2021 could be the year when the Data Exchange becomes an easy way to move data between systems with security and governance. Expect more and more companies to create exchange solutions for data shared across companies and even between internal teams. If progress is as fast as we hope, we fear that by the end of 2021 cracks will start to show in the ability of exchanges to solve heterogeneity, as the need for dynamic real-time data starts to weigh on these static exchange solutions. That will be a topic to watch in 2022.

3. No-code Data Tool is the Answer

No-code for Data is already here, but in 2021 expect it to accelerate in a big way. Many startups will get funded in this area, and existing companies will try to catch the trend by claiming that they do no-code data. The no-code phenomenon really stems from companies realizing three things:

To scale organizations, you need to scale people.
Waiting on data engineers is a terrible bottleneck for data teams’ productivity.
Most data users are capable of, and prefer, doing certain data tasks themselves with guidance, templates, and controls.

The segment of people who understand data but don’t understand data systems is growing fast. It is a win-win when the simpler tasks that engineers do today can be pre-configured and templatized for the masses. We also expect more Data Engineers to push back against the mundane data engineering tasks that take up the bulk of their time. Instead, they will choose to focus on the harder problems and let data users self-serve their common data needs.

4. No-code Data Tool is NOT the Answer (Well, then what is?)

As much as we would want no-code to be the silver bullet, the challenges of increasing data heterogeneity will make a “no-code only” approach impractical. Companies will still struggle even after stretching no-code approaches to their limits. The real answer ultimately comes from Teamwork and Collaboration.

In 2021, we expect to see the emergence of collaboration tools for data. Collaboration is essential in any enterprise segment, but nowhere more than in Data. Why? Data touches, and will continue to touch, more individuals than any other aspect of the business. Finance, HR, Sales, Product, and Marketing all have their silos, but the need for data cuts across them all.

5. Beginning of the End of Data Lake

What is a Data Lake? Nothing but a place to store and organize massive amounts of data to be provisioned to different data applications. The line between Data Lakes and Data Warehouses is blurring more and more, as warehouses now provide massive storage, the ability to store structured and semi-structured data, and tools to easily provision the data for use.

In 2021, more companies will realize that Data Warehouses provide the scale to meet their needs, and that they don’t really need to go the way of Hadoop-style Data Lakes. This will push Data Lakes to become a niche solution for companies whose data scale justifies the added complexity of running, managing, and governing a Data Lake.

We often overestimate what we can achieve in a year and underestimate what we can do in a decade. This applies here as well. Keep a close watch on the data revolution that is underway. If you blink, you might miss it.

Here’s to the new beginnings – Welcome 2021!
