DVC is significantly more lightweight than Pachyderm, running locally and adding versioning on top of your local storage solution. DVC simply integrates into existing Git repositories to track the version of data that was used to run experiments. ML teams can also define and execute transformation pipelines with DVC; however, the biggest drawback of DVC is that those transformations run locally and are not automatically scaled to a cluster. Notably, DVC does not handle the storage of data, simply the versioning.
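To make the "versioning, not storage" point concrete, here is a minimal sketch of the general idea: the data file stays wherever it already lives, and only a small pointer file holding its content hash would be committed to Git. This is not DVC's implementation or file format; the `.pointer.json` suffix and the helper names `write_pointer` and `is_same_version` are hypothetical and only serve the illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small pointer file next to the data (hypothetical format).

    Only this pointer would be tracked by Git; the data itself is untouched.
    """
    pointer = {
        "path": data_path.name,
        "md5": file_md5(data_path),
        "size": data_path.stat().st_size,
    }
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path


def is_same_version(data_path: Path, pointer_path: Path) -> bool:
    """Check whether the data on disk still matches the recorded hash."""
    pointer = json.loads(pointer_path.read_text())
    return pointer["md5"] == file_md5(data_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "train.csv"
        data.write_bytes(b"a,b\n1,2\n")
        pointer = write_pointer(data)
        print(pointer.read_text())
        print("unchanged:", is_same_version(data, pointer))
```

Checking out an older commit would restore an older pointer, which identifies which version of the data an experiment used. Fetching the matching bytes is a separate concern that this sketch does not cover.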
DVC and CML (Continuous Machine Learning)
Build models using GitHub Actions or GitLab CI: https://cml.dev/
Version models with https://dvc.org/
See https://determined.ai/blog/building-an-enterprise-deep-learning-platform-2/
See also: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9264-how-to-build-efficient-ml-pipelines-from-the-startup-perspective.pdf
Explore Kubeflow?
OpenML to share models
Concepts: https://openml.github.io/OpenML/#concepts
See how to publish a dataset:
https://openml.github.io/openml-python/master/examples/30_extended/datasets_tutorial.html#sphx-glr-examples-30-extended-datasets-tutorial-py
Should we also publish tasks?
ScaNN library from Google
https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html
Clipper AI API
Clipper AI: Serve ML models (TensorFlow, PyTorch, scikit-learn...) through an HTTP REST API (no built-in OpenAPI support)
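A rough sketch of what calling such a REST endpoint could look like from a client, using only the standard library. The `/<app_name>/predict` path, the `{"input": ...}` payload shape, and port 1337 are assumptions based on Clipper's quickstart as I remember it and should be checked against the Clipper documentation; `build_predict_request` and the `hello-world` application name are hypothetical. The request is only built here, not sent.

```python
import json
from typing import Sequence
from urllib.request import Request


def build_predict_request(host: str, app_name: str, features: Sequence[float]) -> Request:
    """Build (but do not send) a JSON POST request for a prediction.

    Assumed URL layout: http://<host>/<app_name>/predict
    Assumed payload:    {"input": [...]}
    """
    url = f"http://{host}/{app_name}/predict"
    body = json.dumps({"input": list(features)}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_predict_request("localhost:1337", "hello-world", [0.1, 0.2, 0.3])
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` once a server is running. Since there is no built-in OpenAPI description, the payload contract has to be documented by hand.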
Pachyderm
Build, train, and deploy your data science workloads on whatever Kubernetes deployment you call home.
https://www.pachyderm.com/getting-started/
Machine Learning model databases
ModelDB
Open Source ML Model Versioning, Metadata, and Experiment Management
https://github.com/VertaAI/modeldb
Video presentation: https://databricks.com/fr/session/modeldb-a-system-to-manage-machine-learning-models
MLDB
https://mldb.ai/
MLDB (Machine Learning Database) is an open-source database designed for machine learning. You can install it wherever you want and send it commands over a RESTful API to store data, explore it using SQL, then train machine learning models and expose them as APIs.
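As an illustration of the "explore it using SQL over REST" part, the sketch below builds the URL for a SQL query. It assumes MLDB exposes a `/v1/query` route that takes the SQL text in a `q` parameter, and that the server listens on `localhost:8080`; both should be verified against the MLDB documentation. The helper `build_query_url` and the dataset name `my_dataset` are hypothetical.

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, sql: str) -> str:
    """Return the URL for a SQL query against an assumed /v1/query route."""
    # urlencode escapes spaces and punctuation so the SQL survives as a query string.
    return f"{base_url.rstrip('/')}/v1/query?" + urlencode({"q": sql})


if __name__ == "__main__":
    print(build_query_url("http://localhost:8080", "SELECT * FROM my_dataset LIMIT 5"))
```

A plain GET on the resulting URL would then return the query result, so any HTTP client can be used to explore the data.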