Local Document Retriever with MixedBread Embeddings

A document retrieval system that uses ChromaDB and MixedBread embeddings for semantic search. The project targets local deployment with the mixedbread-ai/mxbai-embed-large-v1 model.

Features

  • Local embedding generation using mixedbread-ai/mxbai-embed-large-v1
  • Document preprocessing with markitdown
  • Efficient document storage and retrieval using ChromaDB
  • Metadata filtering support
  • Fully offline capable
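
To make "semantic search with metadata filtering" concrete, here is a minimal plain-Python sketch of the idea: filter records by metadata, then rank the survivors by cosine similarity to the query vector. It is an illustration only, not this project's code and not the ChromaDB API; the names `cosine_similarity`, `search`, `where`, and the toy three-dimensional vectors are all hypothetical. In the real system the vectors would come from the embedding model and ChromaDB would do the storage and ranking.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: List[float],
           records: List[dict],
           where: Optional[Dict[str, str]] = None,
           top_k: int = 2) -> List[str]:
    """Return ids of the top_k records most similar to query_vec.

    `where` is a hypothetical exact-match metadata filter: a record is kept
    only if every key/value pair in `where` matches its metadata.
    """
    where = where or {}
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query_vec, r["embedding"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:top_k]]


# Toy data: hand-made vectors standing in for real embeddings.
records = [
    {"id": "doc1", "embedding": [1.0, 0.0, 0.0], "metadata": {"type": "pdf"}},
    {"id": "doc2", "embedding": [0.9, 0.1, 0.0], "metadata": {"type": "docx"}},
    {"id": "doc3", "embedding": [0.0, 1.0, 0.0], "metadata": {"type": "pdf"}},
]
```

Calling `search([1.0, 0.0, 0.0], records)` ranks across all records, while passing `where={"type": "pdf"}` restricts the ranking to PDF records before similarity is computed.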

Prerequisites

  • Python 3.8+
  • Local copy of mixedbread-ai/mxbai-embed-large-v1 model
  • Sufficient storage for document embeddings
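
As a rough guide to what "sufficient storage" means, the sketch below estimates the raw size of the embedding vectors alone. It assumes 1024-dimensional vectors (the output size of mxbai-embed-large-v1) stored as 32-bit floats; it ignores ChromaDB's index overhead, metadata, and the stored document text, so treat the result as a lower bound. The function names are hypothetical.

```python
EMBEDDING_DIM = 1024      # assumed output size of mxbai-embed-large-v1
BYTES_PER_FLOAT32 = 4     # assumes vectors are stored as 32-bit floats


def estimate_embedding_bytes(num_chunks: int,
                             dim: int = EMBEDDING_DIM,
                             bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw bytes needed for num_chunks vectors, excluding any index overhead."""
    return num_chunks * dim * bytes_per_value


def format_mib(num_bytes: int) -> str:
    """Render a byte count in mebibytes with one decimal place."""
    return f"{num_bytes / (1024 ** 2):.1f} MiB"
```

For example, `format_mib(estimate_embedding_bytes(100_000))` gives the raw vector size for 100,000 document chunks under these assumptions.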

Installation

  1. Clone this repository:
git clone <repository-url>
cd document-retriever