Stories from the ML trenches

Oren Razon

May 12th, 2021

Your ML story

What led us to create the #MLTalks initiative

Back in February, during our third lockdown, my team and I regrouped to think about our next steps. We are fortunate to meet with dozens of leading DS teams every week to brainstorm and discuss their challenges with scaling ML, and we realized there was a need to give these voices structure and to create a repository of best practices and “stories from the field.”

At Superwise, we see ourselves as a team of engineers and data scientists who bear the scars of putting ML models into the real world and learning from our mistakes. Those scars led us to create a solution that automates the assurance of ML models, helping others scale their use of AI more safely and easily. The ML Talks initiative is a continuation of those efforts. The data science and MLOps community is a vocal one, with a wealth of information out there in the form of blogs, Git repositories, and Slack channels. Still, there is a real need to consolidate the experience from the trenches: the real stories of the women and men who have been awake at 3 AM on a Saturday trying to understand what is really happening with their models.

So far we have interviewed 5 (and counting!) rockstars in the ML world, and have learned something new from every conversation.

Here are some of our key takeaways:

1 – It takes a village

Scaling AI is about making sure that everyone is on board with it. Every one of our interviewees mentioned the need to facilitate adoption by being transparent with downstream users. As Maya Bercovitch, Director of Engineering & Data Science at Anaplan, notes: “we create a glass-box, not a black-box”. Scaling AI means making it accessible to all stakeholders. Matt Shump, VP Data at ChowNow, made the same point in our discussion: “I have not met a sales leader or a marketing leader who’s willing for me to black-box automate a lead scoring model for them. They want to know what’s going on underneath the hood.” From data science and data engineering to operational users, every stakeholder in the organization needs to be aligned on how the models are doing in order to facilitate adoption and ROI.

2 – Visibility is paramount

To avoid delays and errors, teams need to understand how the data fluctuates and how the model behaves. Yet in-house tools, or solutions that are not dedicated to machine learning tasks, often fail to deliver the right results, especially as the number of models grows. As Maxim Khalilov, Head of R&D at Glovo, notes: “The nearest priority in terms of time is the monitoring. Because we don’t have enough visibility into the technical characteristics of the models, but primarily on what happens with the data, how the data flows through our pipeline, and most importantly, how our model behaves, and how it reacts and changes in the data.”

3 – MLOps and automation are at the top of everyone’s mind

When asked what was at the top of her mind, Nufar Gaspar, Head of operational AI, product & strategy at Intel Corporation, answered: “A lot of MLOps, as everyone. […] The ability to have one MLOps across different verticals and different organizations and to ease the access to MLOps for teams without high proficiency in machine learning is key.”

One of the top best practices that Dino Bernicchi, Head of Data Science at Homechoice, notes is: “Develop your own AutoML pipelines and systems to deploy and manage solutions in production. This will allow you to rapidly test and deploy models.”

I hope you enjoy reading these interviews as much as we enjoyed conducting them. I want to thank everyone who participated. We are only just getting started, so please feel free to contact me if you want to take part in ML Talks, recommend a co-worker, or share questions you would like us to investigate!
