Let’s Talk DataOps: Adam Leary, VP Applied Machine Learning Group at CBSi

Jarah Euston

Welcome back to Let’s Talk DataOps!

In this series, we interview the leading thinkers in Data Operations to discuss the state of DataOps from their point of view. Learn about what they do, their biggest challenges, and how they are utilizing DataOps to drive their businesses.

In our latest installment, we interviewed Adam Leary, vice president of the new Applied Machine Learning Group at CBS Interactive (CBSi). Adam built out the first data products team and led CBSi through a comprehensive data infrastructure transformation. His team became a centralized data service team managing tracking, pipelines, and data products.

In this interview, he explains how he built a new, “future-proof” data infrastructure at CBSi and discovered humans are the biggest data challenge.

Nexla: You’re the vice president of the Applied Machine Learning Group at CBSi, and were previously the vice president of Data Science. Tell us a bit about what that entails. What are you building?

Adam Leary: We’re becoming a center of excellence for applying machine learning and artificial intelligence solutions on data. We’re essentially building out our own framework to deploy these systems across the company. Our focus is on building consumer-facing applications, providing services for product and engineering teams to help provide a greater experience for end-users.

Fighting Cognitive Bias

N: What is the biggest data challenge CBSi has faced internally so far? How did you resolve it?

AL: The biggest data challenges have been more on the human side than the technology side. You really have to do a lot of work as this field of data changes so quickly. The thinking and approach on how you might use data doesn’t keep up.

One of the bigger challenges is “data-driven decision making.” It sounds great, but when it actually happens, it’s really difficult. We’re humans at the end of the day. If you don’t understand how people are going to react to change, it becomes counterintuitive. People will find reasons to undermine the change. Not that they are purposely trying to, but it’s just part of how humans work. I think a lot about the psychology and how people will react to changes. It’s really hard because it makes you look at yourself and ask “What am I doing that’s not right here?”

To resolve it, we started building [data] teams that are aware of cognitive biases in terms of decision making. We have internal seminars on decision bias, recency bias, and all of the things you have to be aware of when it comes to bias. Especially because data is so important. But if you’re in sales, marketing, or product, it’s also important to be aware of the biases that people have. This will help you be successful in your data work.

If you present tools another way, some people get it right away. Some people are more visual, some need more explanations. It’s a people problem. We know we built this great tool, but why aren’t people using it? Through a lot of observation, we stumbled upon decision bias.

This led to a discussion around data operations. I think one of the things data operations helps you solve is, if you build implicit trust in a data system, you’ll know what the normal operating parameters are. It really helps end users think that they can use it, and trust it. There is a psychological effect when people have trust in the system.

DataOps: The “Nervous System” Around Your Pipeline

N: Media companies are not particularly well known for being early technology adopters. But you’ve created some pretty innovative data infrastructure at CBSi. Can you tell us about it?

AL: Let’s start with how we built it. We took what we had in place, in terms of systems, and determined where the business needs to go with regard to data. As at a lot of non-media companies, data is extremely important not only to running the business, but to building the product. We reached the limit of what we could do with our system, which led me to rethink data infrastructure for the future.

One of the things we wanted to do was get out of batch mode as much as possible. The advantage is that you can process streaming data as it comes in, so the data is fresh for immediate use.

There are still many things that happen outside of our control. But this is where DataOps comes into play. I didn’t really know DataOps was a thing until I started thinking, “Hey, why don’t we have this? Is the data delayed? Why is it delayed?” I looked at it purely from a process point of view. Instead of reinventing the wheel, we started to think about it systematically, as having a nervous system around your pipeline.

This is the thinking around our data infrastructure going forward, not only to support knowledge and change, but to be more turnkey. As a media company, you want to think about all the platforms, devices, and ways people are interacting with content we produce. Obviously, that’s driven a lot of complexity in media. I think that’s why there’s a lot of convergence between media and tech companies.

N: What are the business motivations for streaming capabilities? What did you guys want to do that you couldn’t do with batch?

AL: A lot of the motivation had to do with products like livestream video. We have to support a lot of live events like the GRAMMY Awards or the Super Bowl. In the beginning, problems would happen and then go away, but now we have product development around livestream. With CBSN [a streaming video news channel], it’s not just occasional. As an online 24-hour news channel, we have a lot of livestream.

These complexities emerged when we tried to figure out how to deal with this information—not just for powering the product, but for business production. This led to data streaming. We could get the data in and process it immediately rather than waiting for batch jobs to push it.

N: DataOps is becoming something of a buzzword. How do you define “Data Operations”?

AL: I define it simply as “Data Awareness.” DataOps is an infrastructure that runs more data systems, giving you the ability to understand when or why an issue arises. It also allows you to apply proper operational standards across the board when you implement pipelines or standards, and make sure those standards are equal.

I’m borrowing this term from the manufacturing world, but it ends up being a little bit of “quality control charting.” That’s part of the definition for me, and something we built out early on. We learned what our norms are, what our norms are not, the alerts, but most importantly our normal operating parameters.
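Editor’s note: the “quality control charting” idea can be sketched in a few lines. This is a minimal illustration, not CBSi’s implementation. It assumes a pipeline metric such as hourly row counts, made-up sample numbers, and the common convention of placing control limits three standard deviations from the baseline mean. The function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Derive lower and upper control limits from a baseline of normal readings."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, upper = limits
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical hourly row counts observed while the pipeline was healthy.
baseline = [1000, 1020, 980, 1010, 990, 1005, 995, 1015]
limits = control_limits(baseline)

# New readings: the second one drops far below the normal range.
new_counts = [1002, 640, 1011]
alerts = out_of_control(new_counts, limits)
print(alerts)
```

Here the baseline plays the role of the “normal operating parameters,” and any reading outside the limits is a candidate for an alert. A real deployment would likely recompute the baseline over a rolling window and account for daily or weekly seasonality.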

The Next 10 Years: More Data, More Discovery

N: Predictions: Ten years ago, “the cloud” was born. As you look into your crystal ball, where do you see data taking the media industry in the next 10 years?

AL: I see continued convergence of media and tech in ways you can consume, use, or create content. I think a lot of standard platforms will start to go away and become more a part of people’s living spaces and environments. I don’t know what that looks like, but I think that’s where it can go.

I also think a lot of that depends on the political climate in the U.S. A lot of that determines whether that happens now or later. Especially with net neutrality. If we don’t have net neutrality then we won’t see a lot of change happening within the next 10 years.

In terms of where data is taking the media industry, I think there’s going to be a big push around personalization. There are obviously dangers around such a push, but I hope for more content discovery and savvy consumption around media. By savvy consumption, I mean using data about content not to create filter bubbles but to create rich contexts of experience. We need a richer sense of connection with each other and our world. I think using AI techniques that challenge our cognitive biases is important, as is respecting privacy. Ultimately, I would hope to see more seamless voice, web, and transit experiences with media.

Thanks for tuning in! Do you or anyone you know have a lot to say about DataOps? We’d love to chat! Drop us a note at info@nexla.com

Can’t get enough DataOps? Check out our first interview here.