Thinking about building your own ML monitoring solution?

SUPERWISE

April 19, 2021
12:36 pm

“We already have one!” That’s the first thing most of our customers said when we met to discuss AI assurance solutions. Most AI-savvy organizations today have some form of monitoring. Yet, as they scale their activities, they find themselves at a crossroads: should they invest more in their homegrown solution or turn to a vendor solution? And if they do choose to invest more, for how long will their DIY solution be “good enough”?

In this blog, we explore how far homegrown solutions can take you and what you need to think about when planning to scale your use of machine learning.

DIY tools are (only) a start when monitoring your AI

Data science teams spend months researching and training their best models. The production phase, and the MLOps and monitoring work it requires, sometimes comes only as an afterthought. In this context, many data science and engineering teams develop initial AI monitoring tools in-house. DIY tools may be a decent approach for businesses with a contained use of AI. But when the time comes to expand the use of modeling, homegrown tools fall short of supporting the diversity and complexity of the models and the data used. Here is a shortlist of the lessons we have seen customers learn while scaling their AI.

As organizations grow, the number of models and use cases grows

Guess what? Homegrown solutions don’t scale in sync with the models and require more and more maintenance, tweaks, and attention. This is especially true as organizations adopt AI for various use cases, from marketing to core activities embedded in their product.

Model monitoring is not a one-off task. As organizations adopt new models, they need to create a new monitoring paradigm that caters to the different types of data: structured, text, image, video, and so on. Each of these requires different measures and techniques to analyze the incoming data. In other words, what works for a classification model probably won’t work for a regression or clustering model, and a new set of tools will need to be developed. Even within a specific structured use case, different feature types (numerical, categorical, time-based, etc.) require different KPIs to analyze the health of the process.
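To make the point about feature types concrete, here is a minimal sketch of how the health metrics for a numerical feature differ from those for a categorical one. The function names (`numeric_profile`, `categorical_profile`, `profile_feature`) and the choice of metrics are illustrative assumptions, not a description of any particular product, and the sketch assumes each feature has at least one non-missing value.

```python
from collections import Counter
from statistics import mean, pstdev


def numeric_profile(values):
    """Health metrics for a numerical feature; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "missing_rate": 1 - len(present) / len(values),
        "mean": mean(present),
        "std": pstdev(present),
    }


def categorical_profile(values):
    """Health metrics for a categorical feature; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    top_value, top_count = counts.most_common(1)[0]
    return {
        "missing_rate": 1 - len(present) / len(values),
        "cardinality": len(counts),
        "top_value": top_value,
        "top_share": top_count / len(present),
    }


def profile_feature(values):
    """Pick the profile by inspecting the non-missing values (hypothetical rule)."""
    present = [v for v in values if v is not None]
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    return numeric_profile(values) if is_numeric else categorical_profile(values)
```

Every additional data type (text, images, timestamps) would need its own profile function, which is where the maintenance burden of a homegrown tool tends to accumulate.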

Regardless of the sophistication of the models, monitoring is an ongoing task that requires 25%-40% of a data science team’s time. The inefficiency and frustration that come with heavy investment in homegrown monitoring solutions are among the first reasons that push organizations to turn to vendor solutions, along with the fact that they would much rather have their teams focus on creating models that have an impact on the business.

You don’t know what you don’t know

This is perhaps the most critical point. Organizations that have already engineered a solution that computes specific KPIs for their models find themselves struggling to proactively detect when concept drift happens or when biases start to develop. More often than not, homegrown solutions look only at what is already known and at issues that were already anticipated, so events beyond that scope are noticed too late. This is often the point where organizations realize the limitations of their own solution, however sophisticated they engineered it to be, as it fails to bring value to the whole ML process.

In environments where data is extremely dynamic, assuring the health of models in production is about leveraging expertise and best practices to be proactive: being alerted to issues that affect the health of the models, gaining insights, and diagnosing issues promptly.
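As one hedged illustration of what a proactive check can look like, the sketch below computes the Population Stability Index (PSI) between a reference sample and a recent sample of a feature or model score. PSI is a swapped-in, commonly used technique, not something this post prescribes. It flags shifts in a distribution, which is only one signal and does not by itself prove concept drift. The function name `psi`, the equal-width binning, and the `eps` floor are assumptions made for this example.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a recent sample.

    Bins are equal-width and derived from the reference sample; recent values
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant samples

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(sample), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = list(range(100))
recent = [x + 50 for x in reference]
drift_score = psi(reference, recent)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on the use case and should be treated as an assumption to validate.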

Multiple stakeholders

As mentioned in a previous post, scaling AI poses the question of who owns it when it’s in production: data science teams? data engineering? business analysts? hybrid creatures? Ultimately, as AI use grows, the stakeholders involved also change, regardless of the number of models. Think about the fraud detection and cybersecurity space where analysts are the predominant users of the AI predictions and need to make sure the models are always tuned to a very dynamic data landscape.

For a monitoring solution to be useful, all the stakeholders involved need to derive insights and an understanding of the health of the predictions:

Data science teams need to understand if, when, and how they should retrain the model, and the cases in which the model doesn’t perform well.
Business analysts want to know what drives decisions and to be alerted as soon as there is high uncertainty about the quality of the model’s decisions.
Data engineers need to know about the quality of the data streaming through the system, and whether it has outliers, missing values, or strange data distributions.

To do so, organizations need to create and maintain a view of the ML predictions that everyone involved can access and extract value from, without creating unnecessary noise. Beyond the question of whether there are sufficient resources, there is also a matter of skill set, as stakeholders often have different perspectives that need to be bridged under one enterprise-wide view. Ultimately, the complexity of these tasks is what drives AI practitioners who are scaling their activities to select a best-of-breed solution for assuring their models in production.
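To illustrate just one of the stakeholder views described above, the data engineer’s view, here is a minimal sketch of a data quality check that reports the missing-value rate and flags outliers using the interquartile range (IQR) rule. The function name `data_quality_report`, the `iqr_factor` default of 1.5, and the use of `None` for missing values are assumptions for this example, and it needs at least two non-missing values to compute quartiles.

```python
from statistics import quantiles


def data_quality_report(values, iqr_factor=1.5):
    """Report the missing-value rate and IQR-based outliers for one feature."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartiles of the non-missing values
    spread = q3 - q1
    low, high = q1 - iqr_factor * spread, q3 + iqr_factor * spread
    return {
        "missing_rate": 1 - len(present) / len(values),
        "bounds": (low, high),
        "outliers": [v for v in present if v < low or v > high],
    }


report = data_quality_report([10, 11, 12, 13, 14, 15, None, 100])
```

The data science and business analyst views would need different checks again, such as performance by segment or prediction uncertainty, on top of a shared place to see them.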

The amount of data grows exponentially!

In industries such as Adtech, where models process terabytes of data each day, the velocity of the data makes it hard to obtain a clear picture. Do you have the time and tools necessary to continuously extract, compare, and analyze statistical metrics for your ML process, without impacting your core activities?
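One common way to keep statistics current on high-velocity data without storing every record is a single-pass (streaming) update. The sketch below uses Welford’s online algorithm for the running mean and variance. The class name `RunningStats` is hypothetical, and a production system would also need windowing, persistence, and per-feature bookkeeping that are out of scope here.

```python
import math


class RunningStats:
    """Single-pass mean and population variance (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0

    @property
    def std(self):
        return math.sqrt(self.variance)


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
```

Because each update touches only three numbers, memory use stays constant no matter how many records flow through, which is the property that matters at terabyte scale.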

Scaling your AI? Here’s what you need to ask yourself

Here’s a quick list of considerations to think over as you decide on the best way to assure the health of your models in production. At the end of the day, it boils down to a question of resource management and efficiency: how much time should you invest today in developing a set of tools to monitor your models in production? And what will it cost you tomorrow as you add more and more models and use cases?

How much will a homegrown solution cost?
How efficient will this be in the long run?
Is it really what my team needs to focus on, or is it better to buy and use such a capability?
How can I foster an enterprise-wide understanding of the models’ health?
How can I make my monitoring solution a proactive one?

Don’t play around with your growth

At Superwise, we specialize in accompanying our customers as they transition from homegrown solutions (or even nothing!) to a rich model observability solution that helps them achieve business impact and grow their AI practice. This enables them to focus on what they do best: developing and deploying models that help their business grow.