
2. Services

av edited this page Nov 24, 2024 · 34 revisions

Various services that are integrated with Harbor. The link in the service name will lead you to a dedicated page in Harbor's wiki with details on getting started with the service.

Frontends

This section covers services that can provide you with an interface for interacting with the language models.

  • Open WebUI
    Widely adopted and feature-rich web interface for interacting with LLMs. Supports OpenAI-compatible and Ollama backends, multiple users, multi-model chats, custom prompts, TTS, Web RAG, RAG, and much more.

  • ComfyUI
    The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

  • LibreChat
    Open-source ChatGPT UI alternative supporting multiple AI providers (Anthropic, AWS, OpenAI, Azure, Groq, Mistral, Google) with features like model switching, message search, and multi-user support. Includes integration with DALL-E-3 and various APIs.

  • HuggingFace ChatUI
    A chat interface using open-source models, e.g. OpenAssistant or Llama. It is a SvelteKit app and powers the HuggingChat app on hf.co/chat.

  • Lobe Chat
    An open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), knowledge base features (file upload / knowledge management / RAG), multi-modality (vision / TTS), and a plugin system.

  • hollama
    A minimal web UI for talking to Ollama servers.

  • parllama
    A terminal UI (TUI) for Ollama.

  • BionicGPT
    On-premise LLM web UI with support for OpenAI-compatible backends.

  • AnythingLLM
    The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.

  • Chat Nio
    Comprehensive LLM web interface with a built-in marketplace.

Backends

This section covers services that provide LLM inference capabilities.

  • Ollama
    Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

  • llama.cpp
    LLM inference in C/C++

  • vLLM
    A high-throughput and memory-efficient inference and serving engine for LLMs

  • TabbyAPI
    An OpenAI-compatible exllamav2 API that's both lightweight and fast.

  • Aphrodite Engine
    Large-scale LLM inference engine

  • mistral.rs
    Blazingly fast LLM inference.

  • openedai-speech
    An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or Piper TTS as the backend.

  • Parler
    Inference and training library for high-quality TTS models.

  • text-generation-inference
    Inference engine from HuggingFace.

  • lmdeploy
    A toolkit for compressing, deploying, and serving LLMs.

  • AirLLM
    70B model inference on a single 4 GB GPU (very slow, though).

  • SGLang
    SGLang is a fast serving framework for large language models and vision language models.

  • ktransformers
    A flexible framework for experiencing cutting-edge LLM inference optimizations.

  • Whisper
    An OpenAI API-compatible transcription server which uses faster-whisper as its backend.

  • Nexa SDK
    A comprehensive toolkit for supporting ONNX and GGML models.
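A common thread across the backends above (Ollama, llama.cpp, vLLM, TabbyAPI, and others) is an OpenAI-compatible HTTP API, which is what lets the frontends in the previous section target any of them interchangeably. The sketch below shows the shared request and response shape for the `/v1/chat/completions` endpoint; the helper names are illustrative, and the exact base URL and port depend on the service and your Harbor configuration.

```python
def build_chat_request(model: str, prompt: str) -> dict:
    """Build a payload for an OpenAI-compatible /v1/chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def extract_reply(response: dict) -> str:
    """Pull the assistant's message out of an OpenAI-style response body."""
    return response["choices"][0]["message"]["content"]
```

In practice you would POST the payload as JSON to the backend's base URL plus `/v1/chat/completions` (for a default Ollama install that base URL is `http://localhost:11434`) and feed the decoded JSON body to the extraction helper.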

Satellite services

Additional services that can be integrated with various Frontends and Backends to enable more features.
