2. Services
Various services that are integrated with Harbor. The link in the service name will lead you to a dedicated page in Harbor's wiki with details on getting started with the service.
This section covers services that provide an interface for interacting with language models.
- Open WebUI - Widely adopted and feature-rich web interface for interacting with LLMs. Supports OpenAI-compatible and Ollama backends, multi-user and multi-model chats, custom prompts, TTS, Web RAG, RAG, and much more.
- ComfyUI - The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
- LibreChat - Open-source ChatGPT UI alternative supporting multiple AI providers (Anthropic, AWS, OpenAI, Azure, Groq, Mistral, Google) with features like model switching, message search, and multi-user support. Includes integration with DALL-E 3 and various APIs.
- HuggingFace ChatUI - A chat interface using open-source models, e.g. OpenAssistant or Llama. It is a SvelteKit app and powers the HuggingChat app on hf.co/chat.
- Lobe Chat - An open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), knowledge base (file upload / knowledge management / RAG), multi-modal features (vision / TTS), and a plugin system.
- hollama - A minimal web UI for talking to Ollama servers.
- parllama - TUI for Ollama.
- BionicGPT - On-premise LLM web UI with support for OpenAI-compatible backends.
- AnythingLLM - The all-in-one desktop and Docker AI application with built-in RAG, AI agents, and more.
- Chat Nio - Comprehensive LLM web interface with a built-in marketplace.
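Most of the frontends above attach to a backend through an Ollama or OpenAI-compatible endpoint. As a minimal sketch, using Open WebUI's documented `OLLAMA_BASE_URL` / `OPENAI_API_BASE_URL` settings (the hosts, ports, and key value below are assumptions about a typical local setup, not Harbor defaults):

```shell
# Assumed local endpoints - adjust host/port to match your deployment.
export OLLAMA_BASE_URL="http://localhost:11434"          # Ollama's default port
export OPENAI_API_BASE_URL="http://localhost:8000/v1"    # e.g. a vLLM or TabbyAPI server
export OPENAI_API_KEY="sk-local-dummy"                   # many local backends accept any key
```

Harbor wires this kind of configuration up for you when services are started together; the fragment is only meant to show what is being connected under the hood.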
This section covers services that provide LLM inference capabilities.
- Ollama - Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
- llama.cpp - LLM inference in C/C++.
- vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs.
- TabbyAPI - An OpenAI-compatible exllamav2 API that's both lightweight and fast.
- Aphrodite Engine - Large-scale LLM inference engine.
- mistral.rs - Blazingly fast LLM inference.
- openedai-speech - An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or Piper TTS as the backend.
- Parler - Inference and training library for high-quality TTS models.
- text-generation-inference - Inference engine from HuggingFace.
- AirLLM - 70B model inference on a single 4GB GPU (very slow, though).
- SGLang - A fast serving framework for large language models and vision language models.
- ktransformers - A flexible framework for experiencing cutting-edge LLM inference optimizations.
- Whisper - An OpenAI API-compatible transcription server which uses faster-whisper as its backend.
- Nexa SDK - A comprehensive toolkit for supporting ONNX and GGML models.
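Several of the backends above (Ollama, vLLM, TabbyAPI, and others) expose the same OpenAI-style `/chat/completions` endpoint, which is what lets the frontends swap between them freely. A small sketch of that shared request shape, assuming a local Ollama instance on its default port and a hypothetical model name:

```python
import json

def chat_request(model: str, prompt: str,
                 base_url: str = "http://localhost:11434/v1"):
    """Build the URL and JSON body for an OpenAI-compatible
    /chat/completions call. The body shape is identical across the
    backends; only base_url and model change per server."""
    url = f"{base_url}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return url, json.dumps(body)

# Switching from Ollama to e.g. a vLLM server is just a different base_url.
url, body = chat_request("llama3.2", "Hello!")
print(url)
```

The same two values (base URL and model name) are essentially all a frontend needs to know about any of these inference servers.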
Additional services that can be integrated with various Frontends and Backends to enable more features.