Tell us a bit about yourself, your background, where you work, and what you do there.
I am Piotr, and I am the CEO of neptune.ai. My day-to-day, apart from running the company, is focused on the product side of things. Strategy, planning, ideation, getting deep into user needs and use cases. I really like it.
My path to ML started with software engineering. I always liked math and started programming when I was 7. I got into algorithmics and programming competitions in high school and loved competing with the best. That got me into the best CS and maths program in Poland, which, funnily enough, today specializes in machine learning.
I did internships at Facebook and Google and was offered the chance to stay in the Valley. But something about being a FAANG engineer didn’t feel right. I had this spark to do more and build something myself.
So with a few of my friends from the algo days, we started Codilime, a software consultancy, and later a sister company, Deepsense.ai, a machine learning consultancy, where I was the CTO.
How did Neptune come to life? What was bothering you badly enough that a startup was born?
When I came to the ML space from software engineering, I was surprised by the messy experimentation practices, lack of control over model building, and a missing ecosystem of tools to help people deliver models confidently. It was a stark contrast to the software development ecosystem, where you have mature tools for DevOps, observability, or orchestration to execute efficiently in production.
And then, one day, some ML engineers from Deepsense.ai came to me and showed me a tool for tracking experiments that they had built during a Kaggle competition, the 2016 Right Whale Recognition competition. (Fun fact: not only did we win, but our winning score of 0.596 has never been reproduced or exceeded except by us. We’ve since hit 0.7, if you’re curious.) I felt this could be big. I asked around, and everyone was struggling with managing experiments, so I decided to spin it off as a VC-funded product company; the rest is history. Neptune is an ML metadata store that helps ML engineers and data scientists log, organize, compare, register, and share models and experiments in a single place.
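For readers new to the idea, the core of experiment tracking can be sketched in a few lines of Python. This is a hypothetical toy, not Neptune’s actual API: every run’s hyperparameters and metric history go into one store, so runs can be organized and compared later instead of living in scattered notebooks.

```python
from dataclasses import dataclass, field

@dataclass
class Run:
    """A single experiment run: hyperparameters plus logged metric history."""
    name: str
    params: dict
    metrics: dict = field(default_factory=dict)

    def log(self, metric: str, value: float) -> None:
        # Keep the full history, not just the latest value,
        # so runs can be compared step by step.
        self.metrics.setdefault(metric, []).append(value)

class ExperimentStore:
    """Toy metadata store: keeps every run in one place for comparison."""
    def __init__(self):
        self.runs = []

    def new_run(self, name: str, **params) -> Run:
        run = Run(name, params)
        self.runs.append(run)
        return run

    def best(self, metric: str, maximize: bool = True) -> Run:
        # Compare runs by the last logged value of a metric.
        key = lambda r: r.metrics[metric][-1]
        return max(self.runs, key=key) if maximize else min(self.runs, key=key)

store = ExperimentStore()
baseline = store.new_run("baseline", lr=0.1)
baseline.log("accuracy", 0.81)
tuned = store.new_run("tuned", lr=0.01)
tuned.log("accuracy", 0.87)
print(store.best("accuracy").name)  # tuned
```

A real metadata store adds persistence, a web UI, and collaboration on top, but the essence is the same: log everything about a run once, in one place, and let the tool do the organizing and comparing.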
What’s your take on the citizen data scientist trend? How do you see these personas using experiment tracking/model registry?
We want to stay focused on helping “someone do something.” In our case, specifically, we see that ML engineers and data scientists get a lot of value from Neptune when they are developing many models and experimenting a lot, and they need to keep track of this iterative process. So the main personas we support are data scientists and ML engineers working in production ML teams, and citizen data scientists are not folks we are helping very much.
However, citizen data scientists are an important group in the ecosystem. The best way to think about this is the same way we think about web development. Yes, we have WordPress or Shopify in web development. But when you need something custom, you go build the website yourself with React or another dev framework. At Neptune, for example, our main website runs on WordPress – quick and easy! But our core technology, the Neptune servers, web UI, and Python client, is built by really experienced backend, frontend, and DevOps engineers. Each tool has its place and serves important users.
There are a ton of “citizen web developers” solving their needs with off-the-shelf tools. ML will probably be similar, and both hands-on data scientists and citizen data scientists will have their place, very often in the same company.
You’ve probably seen tons of MLOps stacks over the last few years – What is your advice to practitioners/companies putting together their stack?
From our research into how reasonable-scale teams set up their stacks, here are the most common patterns we’ve seen:
- Engage quickly with the DevOps/delivery team and think about ML-fueled software products, not ML products.
- Don’t automate what you are doing for the first time -> be pragmatic about what you need right now vs. what you may need in the future.
- When you know you need something -> leverage tools wherever you can/need to -> it’s better to pay for a tool than to build and maintain one yourself, if you can, of course.
Every MLOps stack is different, so don’t try to force that fancy whitepaper onto your problem. You are not Google, and you don’t need everything they have (they very much do). Recently, we interviewed a few ML practitioners about setting up MLOps. There’s lots of good stuff in there, but there was one thought I just had to share with you:
“My number 1 tip is that MLOps is not a tool. It is not a product. It describes attempts to automate and simplify the process of building AI-related products and services. Therefore, spend time defining your process, then find tools and techniques that fit that process. For example, the process in a bank is wildly different from that of a tech startup. So the resulting MLOps practices and stacks end up being very different too.”
– Phil Winder, CEO at Winder Research.
So before everything, be pragmatic and think about your use case, your workflow, your needs. Not “industry best practices.”
In his pivotal blog post, Jacopo Tagliabue, Head of AI at Coveo, suggests a mindset shift that we think is crucial (especially early in your MLOps journey): “..to be ML productive at a reasonable scale, you should invest your time in your core problems (whatever that might be) and buy everything else.” He suggests that a focus on automation and pragmatically leveraging tools wherever possible is key to doing things efficiently in MLOps.
One more place with great insights about setting up your MLOps is one of the MLOps community meetups with Andy McMahon, titled “Just Build It! Tips for Making ML Engineering and MLOps Real”. Andy talks about:
- Where to start when you want to operationalize your ML models?
- What comes first – process or tooling?
- How to build and organize an ML team?
- …and much more
What’s your take on the debate over MLOps tooling fragmentation?
First of all, I think we should treat MLOps as an extension of the DevOps ecosystem, not something separate from it. Even today, it seems to be treated separately somehow. So we should be building tools that complement and extend what is already out there and used in production workflows.
While most companies in the MLOps space try to go wider and become platforms that solve all the problems of machine learning teams, Neptune’s strategy is to go deeper and become the best-in-class tool for model metadata storage and management.
“In a more mature software development space, there are almost no end-to-end platforms. So why should machine learning, which is even more complex, be any different? We believe that by focusing on providing the best developer experience for experiment tracking and model registry, we can become the foundation of any MLOps tool stack.”
That said, experiment tracking and model registry are not part of the DevOps space. They serve a special purpose that traditional software delivery doesn’t need: managing training/evaluation/testing history, model lineage, experiment debugging, comparison, and optimization.
Yeah, I believe there will be standalone components that you can plug into your deep learning frameworks and MLOps stacks. For example, let’s take data warehouses – do they come with built-in BI/visualization components? No, we have standalone platforms because the data visualization problem is enormous and requires the product team to focus on it. And some teams don’t even need any BI/visualization.
Model metadata management is similar. You should be able to plug it into your MLOps (DevOps) stack. I think it should be a separate component that integrates rather than a part of a platform.
When you know you need solid experiment-tracking capabilities, you should be able to look for a best-in-class point solution and add it to your stack.
It happened many times in software, and I believe it will also happen in ML. We’ll have companies providing point solutions with great developer experience. It won’t make much sense to build it yourself unless you have a custom problem. Look at Stripe (payments), Algolia (search and recommendations), and Auth0 (authentication and authorization).
Even in ML today, imagine how weird it would be if every team built its own model training framework like PyTorch. Why is experiment tracking, orchestration, or model monitoring any different?
I don’t think it is.
And so, I think we’ll see more specialization around those core MLOps components. Perhaps at some point, adjacent categories will merge into one, just as we are witnessing experiment tracking and model registry merge into one metadata storage and management category.
What aspect of the industry, either an opportunity or a risk, isn’t getting enough attention?
Human-in-the-loop MLOps.
We saw fashion designers sign up for our product.
WHAT? It was a big surprise, so we took a look.
It turns out that subject matter experts (SMEs) are involved in evaluation and testing a lot.
Especially teams at AI-first product companies want their SMEs in the loop of model development.
It’s a good thing.
Not everything can be tested or evaluated with a metric like AUC or R2. Sometimes, people just have to look at the outputs to see if things actually improved, not just whether the metrics got better.
It turns out that this human-in-the-loop MLOps system is actually quite common among our users:
- Greensteam, in shipping: SMEs audit the results of new models
- Respo Vision, in sports analytics: data scientists look at various metrics and visual outputs to evaluate pipeline performance
- Continuum Industries, in industrial optimization: devs look at the results over the entire test suite before approving PRs
So I’d add humans into the MLOps feedback loops.
That seems to be the reality today and probably even more so going into the future when more ML-fueled products are built.
But even more generally, testing things manually before you automate is absolutely normal. And even when you are testing things automatically, you should do QA testing or user research to find the things you didn’t know you didn’t know. This is where manual testing and evaluation by SMEs is crucial.
What career advice would future you give to your past self?
Learn how to hire A-players, and spend a decent amount of time on it. A good starting point would be reading the book “Who.”
I’ve been running my companies for the last 12 years, and of course, I’ve made a lot of decisions, some brilliant, some stupid. But there is no single brilliant decision that, in the long run, would be even comparable to the value impact of bringing A-players to the company.
Another good thing about A-players is that they are often very easy to manage (or rather, to work with), and you don’t need to be an expert in people management.
One more piece of advice: once you have A-players with you, do not hide problems; share them. A-players love challenges. They will not be scared; they will be excited!