MLflow & Superwise integration

Superwise team

May 12th, 2022

Part I: Connecting Superwise with MLflow

MLflow and Superwise are two powerful MLOps platforms that help manage the training, monitoring, and logging of ML models. The two systems have different strengths: MLflow Experiments is mostly geared toward tracking metrics, while Superwise offers a deeper, more comprehensive analysis of your models and data. This post describes the concepts behind integrating MLflow with Superwise. Check out our documentation on how to integrate and our end-to-end tutorial notebook.


Machine learning development can become disorganized very quickly, and to address this, different tools for each phase of the machine learning lifecycle have been, and still are being, developed. MLflow is one of the popular libraries that help organize the machine learning development process. It consists of 3 main components: Tracking, Projects, and Models. This post focuses only on the Tracking component.

MLflow Components

MLflow Tracking provides both an API and a UI that help visualize metadata related to training sessions. For each experiment, you can log and store hyperparameters, metrics, artifacts, the source code used to build the model, and the trained model itself. With a few lines of code, you can compare the output of multiple training sessions and quickly understand which model performed best.
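As an illustration, a minimal Tracking call might look like the sketch below (the experiment name and logged values are illustrative and not taken from the demo notebook):

# A minimal MLflow Tracking sketch; the experiment name and values are illustrative
import mlflow

mlflow.set_experiment("diamond-price-experiments")

with mlflow.start_run():
    mlflow.log_param("n_estimators", 100)           # a hyperparameter of the training session
    mlflow.log_metric("rmse", 550.0)                # an evaluation metric
    mlflow.set_tag("model_type", "random_forest")   # free-form metadata about the run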

Integrate Superwise with MLflow

Superwise offers a simple way to integrate with MLflow. To establish the integration, you just need to align 3 types of parameters between the two systems:

  1. Match the names of the models.
  2. Set MLflow tags to match versions with Superwise.
  3. Use MLflow to track the experiments:
    1. Loss and parameters continue to be collected automatically by MLflow’s autologging.
    2. Metrics calculated by Superwise are sent to MLflow as custom metrics.
Integrate Superwise with MLflow

Model name

To reference the models identically on the two platforms, make sure the model names match. To do so, define a global name for the model, then pass it to both Superwise and MLflow, as this snippet shows:

import mlflow
from superwise.models.model import Model

# Setting up global names
model_name = "Diamond Model"

# Using the model name in MLflow’s experiment
mlflow.set_experiment(f"/Users/{databricks_username}/{model_name}")

# Using the same model name to create a Superwise model
sw_model = Model(
    name=model_name,
    description="..."
)
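For the model to actually appear in Superwise, it also needs to be registered through the Superwise client. Below is a hedged sketch, assuming the SDK’s Superwise client and its model.create call; the credentials are placeholders:

from superwise import Superwise

# Placeholder credentials; replace with your own client ID and secret
sw = Superwise(client_id="<CLIENT_ID>", secret="<SECRET>")

# Register the model so it appears in Superwise under the same name used in MLflow
sw_model = sw.model.create(sw_model)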

The models on both platforms will have the same name, making them easily identifiable, as the screenshots show:

Model in MLflow
Model in Superwise

Model version

Similar to setting model_name on both platforms, take the version name used for the Superwise model version and pass it as a tag to MLflow when starting the run:

superwise_version_name = "version_1"

tags = {
    "Superwise_model": model_name,
    "Superwise_version": superwise_version_name,
}
mlflow_run = mlflow.start_run(tags=tags)

Custom metrics

During the experiment run, MLflow automatically collects some metrics from the artifacts and logs generated by the flavor you are using. You can also add custom metrics to the run using the log_metrics function.

# Calculating the weighted average of feature drift from Superwise
# (`features` and `results_df` are assumed to hold feature importances and
# per-feature drift values retrieved from Superwise)
input_drift_value = 0.0
for feature in features['name'].tolist():
    importance = features.set_index('name').loc[feature]['feature_importance']
    input_drift = results_df[results_df['entity_name'] == feature]['value'].mean()
    input_drift_value += importance * input_drift

input_drift_value /= features['feature_importance'].sum()

# Logging the calculated metric to MLflow
mlflow.log_metrics({"input_drift": input_drift_value})
MLflow experiment tracking dashboard

In your MLflow experiment tracking dashboard, the tags and metrics will appear, allowing you to identify the models and versions and to benefit from new types of metrics.
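The same comparison can also be done programmatically. Here is a minimal sketch, reusing the experiment path and variables from the snippets above:

import mlflow

# Fetch the runs of the experiment and compare the Superwise-derived metric across versions
experiment = mlflow.get_experiment_by_name(f"/Users/{databricks_username}/{model_name}")
runs = mlflow.search_runs(
    experiment_ids=[experiment.experiment_id],
    filter_string=f"tags.Superwise_model = '{model_name}'",
)
print(runs[["tags.Superwise_version", "metrics.input_drift"]])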

Part II: Detect data corruption in an ML experiment

Let’s put the concepts from Part I into practice and use the integration to generate more insightful experiment tracking. In our demo notebook, we build a simple ML pipeline to predict the price of diamonds based on a set of numeric and categorical features.
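As a rough sketch of what such a pipeline might look like (this is not the notebook’s exact code; the estimator, encoding, and tag values below are illustrative):

import mlflow
import pandas as pd
import seaborn as sns
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load the classic diamonds dataset: numeric and categorical features, "price" as target
diamonds = sns.load_dataset("diamonds")
X = pd.get_dummies(diamonds.drop(columns=["price"]))  # one-hot encode cut/color/clarity
y = diamonds["price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Autologging captures hyperparameters and training metrics;
# the tags tie the MLflow run to the Superwise model and version
mlflow.sklearn.autolog()
with mlflow.start_run(tags={"Superwise_model": "Diamond Model",
                            "Superwise_version": "version_1"}):
    regressor = RandomForestRegressor(n_estimators=100)
    regressor.fit(X_train, y_train)
    mlflow.log_metric("test_r2", regressor.score(X_test, y_test))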

The notebook tells the story of a machine learning experiment that went bad due to data corruption. 

An ML team was working on the diamonds dataset and trained their first model. They logged the training data and the predictions to Superwise and used MLflow to track the experiments.  


After a while, the team wanted to train another model, only this time some of the features got corrupted accidentally 😱. Thankfully, the team was using Superwise together with MLflow. They synced the input_drift metric, and on the experiment page in MLflow they could easily see that the elevated input drift was related to the reduced accuracy of the newly trained model.

Using this information, they ran a more comprehensive analysis of each of the features and found the culprit immediately. The feature drift for “carat” showed significantly higher numbers than the rest of the features.

Elevation in the input drift

Conclusion

In order to perform a deep and comprehensive analysis of the training process, it’s important to log and analyze not only loss metrics but also the data itself. Superwise is optimized for storing training and production data and offers continuous analysis of crucial metrics like drift and feature importance. With a few method calls using our client, it’s easy to enrich MLflow with Superwise metrics and better understand differences in model performance.

Other integrations you might like:

  1. Sagify and Superwise integration
  2. Datadog and Superwise integration
  3. New Relic and Superwise integration

Ready to get started with Superwise?

Head over to the Superwise platform and get started with easy, customizable, scalable, and secure model observability for free with our community edition.


