
2. Services

av edited this page Nov 24, 2024 · 34 revisions

Various services that are integrated with Harbor. The link in the service name will lead you to a dedicated page in Harbor's wiki with details on getting started with the service.

Frontends

This section covers services that can provide you with an interface for interacting with the language models.

  • Open WebUI
    Widely adopted and feature-rich web interface for interacting with LLMs. Supports OpenAI-compatible and Ollama backends, multiple users, multi-model chats, custom prompts, TTS, Web RAG, RAG, and much more.

  • ComfyUI
    The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

  • LibreChat
    Open-source ChatGPT UI alternative supporting multiple AI providers (Anthropic, AWS, OpenAI, Azure, Groq, Mistral, Google) with features like model switching, message search, and multi-user support. Includes integration with DALL-E-3 and various APIs.

  • HuggingFace ChatUI
    A chat interface using open-source models, e.g. OpenAssistant or Llama. It is a SvelteKit app and powers the HuggingChat app on hf.co/chat.

  • Lobe Chat
    An open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), knowledge base features (file upload / knowledge management / RAG), multi-modality (vision / TTS), and a plugin system.

  • hollama
    A minimal web UI for talking to Ollama servers.

  • parllama
    A terminal UI (TUI) for Ollama.

  • BionicGPT
    On-premise LLM web UI with support for OpenAI-compatible backends.

  • AnythingLLM
    The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.

  • Chat Nio
    Comprehensive LLM web interface with a built-in marketplace.

Backends

This section covers services that provide LLM inference capabilities.

  • Ollama
    Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

  • llama.cpp
    LLM inference in C/C++

  • vLLM
    A high-throughput and memory-efficient inference and serving engine for LLMs

  • TabbyAPI
    An OpenAI-compatible exllamav2 API that's both lightweight and fast.

  • Aphrodite Engine
    Large-scale LLM inference engine

  • mistral.rs
    Blazingly fast LLM inference.

  • openedai-speech
    An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or Piper TTS as the backend.

  • Parler
    Inference and training library for high-quality TTS models.

  • text-generation-inference
    Inference engine from HuggingFace.

  • lmdeploy
    A toolkit for compressing, deploying, and serving LLMs.

  • AirLLM
    70B model inference on a single 4 GB GPU (very slow, though).

  • SGLang
    SGLang is a fast serving framework for large language models and vision language models.

  • ktransformers
    A flexible framework for experiencing cutting-edge LLM inference optimizations.

  • Whisper
    An OpenAI API-compatible transcription server which uses faster-whisper as its backend.

  • Nexa SDK
    A comprehensive toolkit for supporting ONNX and GGML models.
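A common thread across the backends above (Ollama, llama.cpp, vLLM, TabbyAPI, and others) is an OpenAI-compatible HTTP API, which is what lets the frontends in the previous section target any of them interchangeably. The sketch below shows the shared request and response shape for the `/v1/chat/completions` endpoint; the helper names are illustrative, and the exact base URL and port depend on the service and your Harbor configuration.

```python
def build_chat_request(model: str, prompt: str) -> dict:
    """Build a payload for an OpenAI-compatible /v1/chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def extract_reply(response: dict) -> str:
    """Pull the assistant's message out of an OpenAI-style response body."""
    return response["choices"][0]["message"]["content"]
```

In practice you would POST the payload as JSON to the backend's base URL plus `/v1/chat/completions` (for a default Ollama install that base URL is `http://localhost:11434`) and feed the decoded JSON body to the extraction helper.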

Satellite services

Additional services that can be integrated with various Frontends and Backends to enable more features.
