Possible options to store and serve ML models #1

vemonet · 2020-07-06T08:09:53Z

DVC and CML (Continuous Machine Learning)

Build models using GitHub Actions or GitLab CI: https://cml.dev/

Version models with https://dvc.org/
See https://determined.ai/blog/building-an-enterprise-deep-learning-platform-2/

DVC is significantly more lightweight than Pachyderm, running locally and adding versioning on top of your local storage solution. DVC simply integrates into existing Git repositories to track the version of data that was used to run experiments. ML teams can also define and execute transformation pipelines with DVC; however, the biggest drawback of DVC is that those transformations run locally and are not automatically scaled to a cluster. Notably, DVC does not handle the storage of data, simply the versioning.
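To make the "versioning, not storage" point concrete, here is a minimal sketch of the idea in plain Python. It is not DVC's implementation or API: the function names (`write_pointer`, `is_unchanged`) and the `.ptr.json` layout are hypothetical, chosen only to show how a small, Git-friendly pointer file can pin the exact version of a large data file that stays in its existing storage.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small pointer file next to the data (hypothetical layout).

    Only this pointer would be committed to Git; the data itself stays
    wherever it is already stored.
    """
    pointer = {
        "path": data_path.name,
        "sha256": file_sha256(data_path),
        "size": data_path.stat().st_size,
    }
    pointer_path = data_path.with_name(data_path.name + ".ptr.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path


def is_unchanged(pointer_path: Path) -> bool:
    """Check whether the data still matches the version recorded in the pointer."""
    pointer = json.loads(pointer_path.read_text())
    data_path = pointer_path.with_name(pointer["path"])
    return file_sha256(data_path) == pointer["sha256"]
```

Committing the pointer alongside the experiment code records which data version was used, while `is_unchanged` detects when the local file has drifted from that version.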

See also: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9264-how-to-build-efficient-ml-pipelines-from-the-startup-perspective.pdf

Explore Kubeflow?

OpenML to share models

Concepts: https://openml.github.io/OpenML/#concepts

See how to publish a dataset:
https://openml.github.io/openml-python/master/examples/30_extended/datasets_tutorial.html#sphx-glr-examples-30-extended-datasets-tutorial-py

Should we also publish tasks?

ScaNN library from Google

https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html
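For context, the problem ScaNN targets is vector similarity search: given a query vector, find the database vectors with the largest inner product. The sketch below is the exact brute-force baseline that approximate libraries aim to speed up. It does not use or imitate ScaNN's API, and the function names are illustrative assumptions.

```python
import heapq
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_inner_product(
    query: Sequence[float],
    database: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (index, score) for the k database vectors with the highest
    inner product with the query, best first.

    This scans every vector, so the cost grows linearly with the database
    size; approximate methods trade a little accuracy to avoid that scan.
    """
    scored = ((dot(query, vec), idx) for idx, vec in enumerate(database))
    best = heapq.nlargest(k, scored)
    return [(idx, score) for score, idx in best]
```

A baseline like this is also useful for measuring the recall of an approximate index on a small sample.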

Clipper AI API

Clipper AI: Serve ML models (TensorFlow, PyTorch, scikit-learn...) through an HTTP REST API (no built-in OpenAPI support)
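As a rough picture of what "serving a model through an HTTP REST API" involves, here is a minimal sketch using only the Python standard library. It is not Clipper's API: the `/predict` path, the `{"input": [...]}` payload shape, and the placeholder `predict` function are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Tuple


def predict(features: List[float]) -> float:
    """Placeholder for a real model: returns the mean of the inputs."""
    return sum(features) / len(features)


def handle_predict(raw: bytes) -> Tuple[int, dict]:
    """Turn a raw JSON request body into an (HTTP status, response dict) pair."""
    try:
        payload = json.loads(raw)
        features = [float(x) for x in payload["input"]]
        return 200, {"output": predict(features)}
    except (ValueError, KeyError, TypeError, ZeroDivisionError):
        return 400, {"error": "body must be JSON with a non-empty 'input' list"}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":
            self.send_error(404, "Unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_predict(self.rfile.read(length))
        body = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Silence the default per-request logging to stderr.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server(8000).serve_forever()` would expose the placeholder model at `POST /predict`. Keeping the request logic in `handle_predict` separate from the HTTP handler makes it testable without opening a socket.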

Pachyderm

Build, train, and deploy your data science workloads on whatever Kubernetes deployment you call home.

https://www.pachyderm.com/getting-started/

Machine Learning model databases

ModelDB

Open Source ML Model Versioning, Metadata, and Experiment Management

https://github.com/VertaAI/modeldb

Video presentation: https://databricks.com/fr/session/modeldb-a-system-to-manage-machine-learning-models

MLDB

https://mldb.ai/

MLDB (Machine Learning Database) is an open-source database designed for machine learning. You can install it wherever you want and send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs.

vemonet · 2023-01-10T11:26:29Z

As of release v0.1.0, DVC and MLEM are used to save and load models.

vemonet added the labels enhancement (New feature or request) and question (Further information is requested) Jul 6, 2020

vemonet self-assigned this Jul 6, 2020

vemonet changed the title ~~Serve ML models using Clipper AI~~ Possible options to store and serve ML models Sep 22, 2020

This was referenced Sep 23, 2020

Look into CML to build models and DVC to version them #2

Closed

Look into scann library #3

Closed
