Possible options to store and serve ML models #1

Open
vemonet opened this issue Jul 6, 2020 · 1 comment

Comments

@vemonet
Member

vemonet commented Jul 6, 2020

DVC and CML (Continuous Machine Learning)

Build models using GitHub Actions or GitLab CI: https://cml.dev/

Version models with https://dvc.org/
See https://determined.ai/blog/building-an-enterprise-deep-learning-platform-2/

DVC is significantly more lightweight than Pachyderm, running locally and adding versioning on top of your local storage solution. DVC simply integrates into existing Git repositories to track the version of data that was used to run experiments. ML teams can also define and execute transformation pipelines with DVC; however, the biggest drawback of DVC is that those transformations run locally and are not automatically scaled to a cluster. Notably, DVC does not handle the storage of data, simply the versioning.
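To make the "versioning, not storage" point concrete, here is a minimal sketch of the general idea: hash the data file, keep the bytes in a hash-addressed cache, and write a small pointer file that could be committed to Git. This is not DVC's implementation or file format. The function names (`track`, `checkout`), the `.pointer.json` suffix, the cache layout, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a hash-addressed cache and write a pointer file.

    Only the small pointer file would go into Git; the data itself stays in
    whatever storage backs cache_dir (hypothetical layout, not DVC's).
    """
    checksum = file_sha256(data_file)
    cached = cache_dir / checksum[:2] / checksum[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copy2(data_file, cached)
    pointer = data_file.parent / (data_file.name + ".pointer.json")
    pointer.write_text(
        json.dumps({"path": data_file.name, "sha256": checksum}, indent=2)
    )
    return pointer


def checkout(pointer: Path, cache_dir: Path) -> Path:
    """Restore the data version recorded in a pointer file from the cache."""
    meta = json.loads(pointer.read_text())
    cached = cache_dir / meta["sha256"][:2] / meta["sha256"][2:]
    target = pointer.parent / meta["path"]
    shutil.copy2(cached, target)
    return target
```

Checking out an older Git commit would bring back an older pointer file, and `checkout` would then restore the matching data from the cache.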

See also: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9264-how-to-build-efficient-ml-pipelines-from-the-startup-perspective.pdf

Explore Kubeflow?

OpenML to share models

Concepts: https://openml.github.io/OpenML/#concepts

See how to publish a dataset:
https://openml.github.io/openml-python/master/examples/30_extended/datasets_tutorial.html#sphx-glr-examples-30-extended-datasets-tutorial-py

Should we also publish tasks?

ScaNN library from Google

https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html

Clipper AI API

Clipper AI: serve ML models (TensorFlow, PyTorch, scikit-learn...) through an HTTP REST API (no built-in OpenAPI support)
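For reference, the general pattern of exposing a model behind an HTTP REST endpoint can be sketched with the Python standard library alone. This is not Clipper's API: the `/predict` path, the `{"input": ...}` / `{"output": ...}` JSON shapes, and the helper names `make_server` and `query` are assumptions chosen for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import ProxyHandler, Request, build_opener


def make_server(predict, host="127.0.0.1", port=0):
    """Build an HTTP server whose POST /predict endpoint calls predict(inputs)."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/predict":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            body = json.dumps({"output": predict(payload["input"])}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the example quiet

    # port=0 lets the OS pick a free port; read it from server.server_address
    return ThreadingHTTPServer((host, port), Handler)


def query(url, inputs):
    """POST inputs as JSON and return the decoded prediction."""
    request = Request(
        url,
        data=json.dumps({"input": inputs}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Ignore any system proxy settings for this local call
    opener = build_opener(ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["output"]


if __name__ == "__main__":
    # Stand-in "model": the mean of the input vector
    server = make_server(lambda xs: sum(xs) / len(xs))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(query(f"http://127.0.0.1:{port}/predict", [1.0, 2.0, 3.0]))
    server.shutdown()
    server.server_close()
```

A real serving layer would add input validation, batching, and model versioning on top of this; the sketch only shows the request/response round trip.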

Pachyderm

Build, train, and deploy your data science workloads on whatever Kubernetes deployment you call home.

https://www.pachyderm.com/getting-started/

Machine Learning model databases

ModelDB

Open Source ML Model Versioning, Metadata, and Experiment Management

https://github.com/VertaAI/modeldb

Video presentation: https://databricks.com/fr/session/modeldb-a-system-to-manage-machine-learning-models

MLDB

https://mldb.ai/

MLDB (Machine Learning Database) is an open-source database designed for machine learning. You can install it wherever you want and send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs.

vemonet added the "enhancement" (New feature or request) and "question" (Further information is requested) labels on Jul 6, 2020
vemonet self-assigned this on Jul 6, 2020
vemonet changed the title from "Serve ML models using Clipper AI" to "Possible options to store and serve ML models" on Sep 22, 2020
@vemonet
Member Author

vemonet commented Jan 10, 2023

Starting from release v0.1.0, DVC and MLEM are used to save and load models.
