MLOps - spinningideas/resources GitHub Wiki

Machine Learning Operations (MLOps)

MLOps covers all the engineering pieces that come together to train, deploy, and run AI models.

MLOps is “a practice for collaboration and communication between data scientists and operations professionals to help manage production ML (or deep learning) lifecycle. Similar to the DevOps or DataOps approaches, MLOps looks to increase automation and improve the quality of production ML while also focusing on business and regulatory requirements.”

Providers

DVC (Data Version Control)

DVC is built to make ML models shareable and reproducible. It is designed to handle large files, data sets, machine learning models, and metrics as well as code.
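DVC's core idea is to commit only a small, Git-friendly pointer file while the large data file itself is stored remotely, keyed by its content hash. A minimal sketch of that idea in plain Python (the file names and JSON layout here are illustrative, not DVC's actual `.dvc` metafile format):

```python
import hashlib
import json
from pathlib import Path

def write_pointer_file(data_path: str, pointer_path: str) -> str:
    """Hash a data file and record the hash in a small pointer file,
    mirroring the idea behind DVC's .dvc metafiles."""
    digest = hashlib.md5(Path(data_path).read_bytes()).hexdigest()
    Path(pointer_path).write_text(
        json.dumps({"path": data_path, "md5": digest}, indent=2)
    )
    return digest

# The large file is pushed to remote storage keyed by its hash;
# only the tiny pointer file is committed to Git.
Path("data.csv").write_text("id,label\n1,cat\n2,dog\n")
digest = write_pointer_file("data.csv", "data.csv.meta")
print(digest)
```

Checking out a different Git revision restores a different pointer file, and the matching data can then be pulled from storage by hash, which is what makes data sets reproducible alongside code.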

MLEM

MLEM helps you productize your models after the experimentation phase.

MLEM is a tool that helps you deploy your ML models. It is a Python library plus a command-line tool.

MLEM can package an ML model into a Docker image or a Python package, and deploy it to, for example, Heroku.

MLEM saves all model metadata to a human-readable text file: the Python environment, model methods, the model's input and output data schemas, and more.
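The idea can be sketched in plain Python: persist the model binary and write a small, human-readable metadata sidecar next to it. The file names and metadata keys below are illustrative assumptions, not MLEM's actual file format, which MLEM generates automatically by inspecting the model:

```python
import json
import pickle

# A toy "model": predicts the length of an input string.
def model(text: str) -> int:
    return len(text)

# Persist the model binary...
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# ...and record, in a human-readable sidecar, what a deployment
# would need to know: environment, callable methods, and schemas.
metadata = {
    "requirements": ["python>=3.8"],
    "methods": {"predict": {"input": "str", "output": "int"}},
}
with open("model.meta.json", "w") as f:
    json.dump(metadata, f, indent=2)
```

Because the sidecar is plain text, it diffs cleanly in Git, which is what lets the repository act as the source of truth for models.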

MLEM helps you turn your Git repository into a Model Registry with features like ML model lifecycle management.

The MLEM philosophy is that MLOps tools should follow the Unix approach: each tool solves a single problem, and solves it well. MLEM was designed to work hand in hand with Git - it saves all model metadata to human-readable text files, so Git becomes the source of truth for ML models. Model weight files can be stored in cloud storage using a Data Version Control tool such as DVC, independently of MLEM.

To deploy a model you need to know a lot about it: the Python environment (which packages to install with pip), its methods (which method the service should call), and its input/output data schema (how to check that incoming data is valid). You could specify all of this yourself before deploying, but since a MLEM model already has it written down, MLEM's deployment step simply uses that information to deploy the model or build a Docker image.
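A rough sketch of how recorded schema information can drive request validation in a serving layer (the `schema` layout and the `validate`/`predict` helpers are hypothetical illustrations, not MLEM's API):

```python
# Schema as a deployment tool might read it from saved model metadata.
schema = {"input": {"text": "str"}, "output": "int"}

def validate(payload: dict, schema: dict) -> None:
    """Reject requests whose fields or types don't match the recorded schema."""
    types = {"str": str, "int": int, "float": float}
    for field, type_name in schema["input"].items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], types[type_name]):
            raise TypeError(f"{field} must be {type_name}")

def predict(payload: dict) -> int:
    """A toy endpoint: validate against the schema, then run the model."""
    validate(payload, schema)
    return len(payload["text"])

print(predict({"text": "hello"}))  # prints 5
```

The point is that the server never has to guess what the model expects: the checks are derived from metadata captured when the model was saved.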

Of course, MLEM doesn't reinvent the wheel with deployment. It integrates with tools that can do that (e.g. Heroku), exports models in a servable format (e.g. a Docker image), and provides machinery to make deployment and export easy.

Check out the project: https://github.com/iterative/mlem and the website: https://mlem.ai/

Approaches