A Distribution assembles APIs and Providers into a consistent whole for the end application developer. You can mix and match providers: some can be backed by local code and others can be remote. As a hobbyist, for example, you can serve a small model locally but choose a cloud provider for a large model. Either way, the higher-level APIs your app works with do not need to change. You can even imagine moving across the server / mobile-device boundary while still using the same uniform set of APIs for developing Generative AI applications.

| Distribution | Llama Stack Docker | Start This Distribution | Inference | Agents | Memory | Safety | Telemetry |
|---|---|---|---|---|---|---|---|
| Meta Reference | llamastack/distribution-meta-reference-gpu | Guide | meta-reference | meta-reference | meta-reference; remote::pgvector; remote::chromadb | meta-reference | meta-reference |
| Meta Reference Quantized | llamastack/distribution-meta-reference-quantized-gpu | Guide | meta-reference-quantized | meta-reference | meta-reference; remote::pgvector; remote::chromadb | meta-reference | meta-reference |
| Ollama | llamastack/distribution-ollama | Guide | remote::ollama | meta-reference | remote::pgvector; remote::chromadb | remote::ollama | meta-reference |
| TGI | llamastack/distribution-tgi | Guide | remote::tgi | meta-reference | meta-reference; remote::pgvector; remote::chromadb | meta-reference | meta-reference |
| Together | llamastack/distribution-together | Guide | remote::together | meta-reference | remote::weaviate | meta-reference | meta-reference |
| Fireworks | llamastack/distribution-fireworks | Guide | remote::fireworks | meta-reference | remote::weaviate | meta-reference | meta-reference |
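
To make the mix-and-match idea concrete, here is a minimal Python sketch of the pattern. It is not the Llama Stack API: `InferenceProvider`, `LocalInference`, `RemoteInference`, `Distribution`, `run_app`, and the `example.invalid` endpoint are all hypothetical names chosen for illustration, and the providers return canned strings instead of calling a model. The point it tries to show is that application code depends only on the API, so swapping the distribution swaps the provider without touching the app.

```python
from dataclasses import dataclass
from typing import Protocol


class InferenceProvider(Protocol):
    """Hypothetical stand-in for an Inference API that every provider implements."""

    def complete(self, prompt: str) -> str: ...


class LocalInference:
    """Illustrative provider backed by local code (no real model is loaded)."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class RemoteInference:
    """Illustrative remote provider; a real one would make a network call."""

    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint

    def complete(self, prompt: str) -> str:
        return f"[remote:{self.endpoint}] {prompt}"


@dataclass
class Distribution:
    """A named bundle that binds an API to one concrete provider."""

    name: str
    inference: InferenceProvider


def run_app(dist: Distribution, prompt: str) -> str:
    # Application code sees only the API, never the provider behind it.
    return dist.inference.complete(prompt)


# Two hypothetical distributions: one local, one remote.
local_dist = Distribution(name="local-sketch", inference=LocalInference())
remote_dist = Distribution(
    name="remote-sketch",
    inference=RemoteInference(endpoint="https://example.invalid"),
)
```

Calling `run_app(local_dist, "hello")` and `run_app(remote_dist, "hello")` goes through the same application code path; only the provider bound inside the distribution differs, which mirrors how each row of the table pairs the same set of APIs with different providers.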