docs: Update docs #32

Merged
merged 13 commits into from
Dec 1, 2024
128 changes: 32 additions & 96 deletions README.md
@@ -1,6 +1,6 @@
<div align="center">

# Vicinity: The Lightweight Vector Store
# Vicinity: Lightweight Nearest Neighbors

</div>

@@ -17,18 +17,20 @@
</a>
<a href="https://github.com/MinishLab/vicinity/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-green" alt="License - MIT"></a>
</h2>

[Quickstart](#quickstart) •
[Main Features](#main-features) •
[Supported Backends](#supported-backends)

</div>


## Table of contents
Vicinity is a lightweight, low-dependency vector store. It provides a simple and intuitive interface for nearest neighbor search, with support for different backends and evaluation.

- [Quickstart](#quickstart)
- [Main Features](#main-features)
- [Supported Backends](#supported-backends)
- [Backend Parameters](#backend-parameters)
- [Usage](#usage)
There are many nearest neighbors packages and methods out there. However, we found it difficult to compare them. Every package has its own interface, quirks, and limitations, and learning a new package can be time-consuming. In addition to that, how do you effectively evaluate different packages? How do you know which one is the best for your use case?

Vicinity is the lightest-weight vector store. Just put in some vectors, calculate query vectors, and off you go. It provides a simple and intuitive API for nearest neighbor search, with support for different backends.

This is where Vicinity comes in. Instead of learning a new interface for each new package or backend, Vicinity provides a unified interface for all backends. This allows you to easily experiment with different indexing methods and distance metrics and choose the best one for your use case. Vicinity also provides a simple way to evaluate the performance of different backends, allowing you to measure the queries per second and recall.

## Quickstart

@@ -42,34 +44,49 @@ The following code snippet demonstrates how to use Vicinity for nearest neighbor
```python
import numpy as np
from vicinity import Vicinity
from vicinity.datatypes import Backend
from vicinity.datatypes import Backend, Metric

# Create some dummy data
items = ["triforce", "master sword", "hylian shield", "boomerang", "hookshot"]
vectors = np.random.rand(len(items), 128)

# Initialize the Vicinity instance (using the basic backend)
vicinity = Vicinity.from_vectors_and_items(vectors=vectors, items=items, backend_type=Backend.BASIC)
# Initialize the Vicinity instance (using the basic backend and cosine metric)
vicinity = Vicinity.from_vectors_and_items(vectors=vectors, items=items, backend_type=Backend.BASIC, metric=Metric.COSINE)

# Query for nearest neighbors with a top-k search
# Create a query vector
query_vector = np.random.rand(128)

# Query for nearest neighbors with a top-k search
results = vicinity.query([query_vector], k=3)

# Query for nearest neighbors with a threshold search
results = vicinity.query_threshold([query_vector], threshold=0.9)
```
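
For readers unfamiliar with the two query modes, the following is a minimal pure-Python sketch of brute-force top-k and threshold search under cosine distance. It is an illustration of the concept only, not Vicinity's implementation: the helper names `cosine_distance`, `top_k`, and `within_threshold` are hypothetical, and it assumes the threshold is a maximum cosine distance, which the README does not state.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, vectors, items, k):
    """Return the k (item, distance) pairs closest to the query (hypothetical helper)."""
    scored = [(item, cosine_distance(query, vec)) for item, vec in zip(items, vectors)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


def within_threshold(query, vectors, items, threshold):
    """Return every (item, distance) pair with distance <= threshold (assumed semantics)."""
    ranked = top_k(query, vectors, items, len(items))
    return [pair for pair in ranked if pair[1] <= threshold]


# Tiny 2-dimensional example so the distances are easy to check by hand
items = ["triforce", "master sword", "hylian shield"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.1]

nearest = top_k(query, vectors, items, k=2)
close = within_threshold(query, vectors, items, threshold=0.5)
```

A top-k search always returns exactly `k` results, whereas a threshold search returns a variable number of results depending on how many items fall inside the cutoff.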

# Save the vector store
Saving and loading a vector store:
```python
vicinity.save('my_vector_store')

# Load the vector store
vicinity = Vicinity.load('my_vector_store')
```

Evaluating a backend:
```python
# Use the first 1000 vectors as query vectors
query_vectors = vectors[:1000]

# Evaluate the Vicinity instance by measuring the queries per second and recall
qps, recall = vicinity.evaluate(
    full_vectors=vectors,
    query_vectors=query_vectors,
)
```
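
As a rough illustration of the two numbers returned above, the sketch below computes recall and queries per second by hand. It assumes recall means the average fraction of exact nearest neighbors that also appear in a backend's results, and that queries per second is the query count divided by wall-clock time. The README does not confirm how Vicinity computes either value, and `recall_at_k` and `queries_per_second` are hypothetical names.

```python
import time


def recall_at_k(approx_results, exact_results):
    """Average fraction of exact neighbors that also appear in the approximate results."""
    per_query = [
        len(set(approx) & set(exact)) / len(exact)
        for approx, exact in zip(approx_results, exact_results)
    ]
    return sum(per_query) / len(per_query)


def queries_per_second(query_fn, queries):
    """Run query_fn once per query and return the measured throughput."""
    start = time.perf_counter()
    for query in queries:
        query_fn(query)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")


# Exact neighbors (e.g. from a brute-force search) versus approximate neighbors
exact = [["a", "b", "c"], ["d", "e", "f"]]
approx = [["a", "b", "x"], ["d", "e", "f"]]

recall = recall_at_k(approx, exact)

# Any callable can stand in for a backend's query function here
qps = queries_per_second(lambda q: sorted(q), exact)
```

In this example the first query recovers two of three exact neighbors and the second recovers all three, so the averaged recall is 5/6.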

## Main Features
Vicinity provides the following features:
- Lightweight: Minimal dependencies and fast performance.
- Flexible Backend Support: Use different backends for vector storage and search.
- Serialization: Save and load vector stores for persistence.
- Evaluation: Easily evaluate the performance of different backends.
- Easy to Use: Simple and intuitive API.

## Supported Backends
@@ -109,87 +126,6 @@ NOTE: the ANN backends do not support dynamic deletion. To delete items, you nee



## Usage

<details>
<summary> Creating a Vector Store
</summary>
<br>

You can create a Vicinity instance by providing items and their corresponding vectors:


```python
from vicinity import Vicinity
import numpy as np

items = ["triforce", "master sword", "hylian shield", "boomerang", "hookshot"]
vectors = np.random.rand(len(items), 128)

vicinity = Vicinity.from_vectors_and_items(vectors=vectors, items=items)
```

</details>

<details>
<summary> Querying
</summary>
<br>

Find the k nearest neighbors for a given vector:

```python
query_vector = np.random.rand(128)
results = vicinity.query([query_vector], k=3)
```

Find all neighbors within a given threshold:

```python
query_vector = np.random.rand(128)
results = vicinity.query_threshold([query_vector], threshold=0.9)
```
</details>

<details>

<summary> Inserting and Deleting Items
</summary>
<br>

Insert new items:

```python
new_items = ["ocarina", "bow"]
new_vectors = np.random.rand(2, 128)
vicinity.insert(new_items, new_vectors)
```

Delete items:

```python
vicinity.delete(["hookshot"])
```
</details>

<details>
<summary> Saving and Loading
</summary>
<br>

Save the vector store:

```python
vicinity.save('my_vector_store')
```

Load the vector store:

```python
vicinity = Vicinity.load('my_vector_store')
```
</details>

## License

MIT