I am the Director of Engineering & Data Science at Anaplan. My team’s mission is to democratize data science for Anaplan’s customers who are not data scientists.
We are responsible for the ML algorithmic core of Anaplan’s products. We develop ML-based engines in the fields of classification, segmentation, and time series forecasting. Our customers leverage these ML-backed engines to optimize different business needs without relying on a data scientist. Prior to my current role, I led the data algorithms team at Mintigo (which was acquired by Anaplan in 2019). I hold an M.Sc. in Information System Engineering from Ben Gurion University of the Negev.
Please share your main operational challenges and best practices for scaling the use of AI.
Anaplan sees a great opportunity in AI, so I have strong support and our work receives the managerial attention it requires, which allows my team to reach its full potential.
The main challenges I encounter are around adoption by business users.
Customers are usually transitioning away from more traditional methods of forecasting or pipeline prioritization. They are comfortable with these methods even if they aren’t the most helpful, so a transition to AI can cause hesitancy. But ultimately, I think a lot of customers realize the potential, and that makes the transition easier.
We are also working to generalize our ML-based solutions to fit a large variety of use cases across different industries, such as food suppliers, car manufacturers, clothing firms, and medical services centers. We’ve developed several concepts to do this, including:
– Auto Analyst: we are developing automated processes that analyze the data, identify its main characteristics, and use transformations, data manipulation techniques, hyperparameter tuning, and AutoML to fit the best model for each specific data set.
– Explainability: we are researching and developing tools that help customers understand the outcome and trust it.
– Levels of user control: we offer a range of options, from a fully automated process to the ability to tune some parameters or select the algorithm the system will use.
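As an illustration only (not Anaplan's implementation), the Auto Analyst idea of automatically fitting the best model for each data set can be sketched as a small search over candidate forecasters and their hyperparameters, scored on a holdout window. The function names, the three candidate models, and the parameter grids below are all assumptions chosen for the example.

```python
from statistics import mean


def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def moving_average_forecast(history, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    return [mean(history[-window:])] * horizon


def exp_smoothing_forecast(history, horizon, alpha=0.5):
    """Simple exponential smoothing; the final level is the forecast."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


# Hypothetical search space: (name, function, hyperparameters).
CANDIDATES = [("naive", naive_forecast, {})]
CANDIDATES += [("moving_average", moving_average_forecast, {"window": w})
               for w in (2, 3, 5)]
CANDIDATES += [("exp_smoothing", exp_smoothing_forecast, {"alpha": a})
               for a in (0.2, 0.5, 0.8)]


def select_model(series, holdout=3):
    """Return (error, name, params) of the candidate with the lowest holdout MAE."""
    if len(series) <= holdout:
        raise ValueError("series must be longer than the holdout window")
    train, test = series[:-holdout], series[-holdout:]
    scored = []
    for name, fn, params in CANDIDATES:
        error = mae(test, fn(train, len(test), **params))
        scored.append((error, name, params))
    # Ties are resolved in favor of the earlier (simpler) candidate.
    return min(scored, key=lambda item: item[0])
```

Calling `select_model(monthly_sales)` on a list of numbers returns the winning candidate and its holdout error. A real system would add data profiling, transformations, and a far richer model family; the sketch only shows the selection loop.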
Do you see differences in the importance of explainability across the verticals that you serve?
We see the need for explainability across all verticals. Our platform takes an action that was done manually and replaces it with an AI-based automated solution.
We want our customers to trust the system, and this will only be achieved if they feel they understand the process. We create a “glass box”, not a “black box”.
We want our platform to be as clear as possible. By implementing explainability, our customers can better understand how to improve their forecast, and sometimes even improve their demand in the market.
In your customers’ organizations, who is in charge of ensuring the health of models in production?
At Anaplan, we see it as our responsibility to ensure the health and integrity of our customers’ models. We allow our customers to outsource the technical depth needed to support their AI-based systems. Every day, different customers train models, create forecasts, and score records. We are creating dashboards and alerting systems that allow us to monitor and identify cases where performance is not as expected. These systems also alert the customer whenever there is a need for their input or action. The system’s performance is also monitored over time to identify degradation or changes in behavior.
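To make the idea of monitoring for degradation concrete, here is a minimal sketch, assuming a chronological list of per-period forecast errors is available. It compares a recent window against a baseline window and returns an alert when the recent error is noticeably worse. The function name, window sizes, and tolerance are hypothetical defaults, not values from Anaplan's system.

```python
from statistics import mean


def detect_degradation(errors, baseline_size=30, recent_size=7, tolerance=1.25):
    """Return an alert dict if recent errors exceed the baseline, else None.

    `errors` is a chronological list of non-negative error values
    (for example, daily mean absolute error).
    """
    if len(errors) < baseline_size + recent_size:
        return None  # Not enough history to compare two windows.
    baseline = mean(errors[:baseline_size])
    recent = mean(errors[-recent_size:])
    if recent > baseline * tolerance:
        ratio = recent / baseline if baseline > 0 else float("inf")
        return {
            "baseline_error": baseline,
            "recent_error": recent,
            "ratio": ratio,
        }
    return None
```

A fixed baseline window keeps the example short; a production monitor would more likely use a rolling baseline and account for seasonality before raising an alert.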
What’s your view on the “Full Stack Data Scientist”?
I believe that a product’s algorithmic core should be developed by a data scientist who has spent time researching and defining the optimal solution to the challenge at hand. In my view, a DS team should focus on the algorithmic core of a product but must be surrounded by great backend, frontend, infra, and DevOps teams that develop the other parts of the system, so that each team is assigned the tasks it is best suited for.
Having full control over the algorithmic core is very important. It gives us the ability to be agile, react quickly where needed, and be independent in the areas where we bring a clear advantage.
This also requires us to hire people who love to combine research and development and enjoy both. I think that data scientists nowadays are talented people who can also code at the highest level if they are trained appropriately and have the required technical support where needed.
What are the main improvements and investment areas planned for your AI/ML activities in the next two years?
Having the right people on your team is critical for success, and Anaplan is invested in hiring globally. I plan to focus on building out a diverse team of people who come from different backgrounds and have different experience and expertise.
We have many new and exciting ideas for AI-based solutions that can help Anaplan’s customers optimize their business processes. We are also developing the innovative concept of Data Science QA: creating sophisticated test suites that cover a large variety of use cases, including datasets and edge-case simulations, to better estimate and track the system’s performance and limitations.
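A Data Science QA suite of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: `forecast` is a hypothetical moving-average stand-in for the system under test, and the four edge cases (constant series, single data point, extreme outlier, empty input) are examples rather than the actual test catalog.

```python
import math
from statistics import mean


def forecast(history, horizon, window=3):
    """Hypothetical stand-in for the system under test: a moving average."""
    if not history:
        raise ValueError("history must not be empty")
    return [mean(history[-window:])] * horizon


def check_constant_series():
    """A flat series should produce a flat forecast at the same level."""
    return forecast([5.0] * 12, 3) == [5.0, 5.0, 5.0]


def check_single_point():
    """A one-point history should still yield a forecast."""
    return forecast([7.0], 2) == [7.0, 7.0]


def check_finite_output_with_outlier():
    """An extreme outlier must not produce NaN or infinite values."""
    series = [10.0] * 10 + [1e9, 10.0]
    return all(math.isfinite(v) for v in forecast(series, 4))


def check_empty_history_rejected():
    """Empty input should fail loudly rather than return a forecast."""
    try:
        forecast([], 1)
    except ValueError:
        return True
    return False


CHECKS = [
    check_constant_series,
    check_single_point,
    check_finite_output_with_outlier,
    check_empty_history_rejected,
]


def run_suite():
    """Run every check and report pass/fail by check name."""
    return {check.__name__: check() for check in CHECKS}
```

Running `run_suite()` returns a dictionary of pass/fail results that can be tracked over time, which is one simple way to see whether a change to the algorithmic core alters behavior on known edge cases.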
We keep hearing about more and more “AI fails”. What’s keeping you up at night?
Usually, I sleep very well 😉 But of course, there are cases where a customer encounters strange behavior that requires an in-depth analysis. Locating the root cause of such behavior in a complex, cutting-edge AI-based system can be a challenge. This is simply another reason for us to continue developing explainability concepts, which are great not only for customer adoption but also for allowing us to provide our customers with better and faster support.
Do you remember the first model that you productized and do you know if it’s still in production?
Yes, I remember! It was a bidding optimization model for online real-time advertising auctions. I was a junior data scientist straight out of university. I remember it was for one of the company’s biggest customers, so I was really anxious, but luckily it all went well. It was in use for several years until it was replaced by an improved model.
Assuming you have all the resources in the world, what is the model that you would want to create?
I would like to forecast the changes created by the COVID-19 pandemic: how it will affect the world in the next few years in many different aspects such as the global economy, health systems, the education system, and our workplace environment.