Adds section on product quantization for docs #6926

Merged
merged 38 commits into from
Apr 16, 2024
4f1bd63
Adds section on product quantization for docs
jmazanec15 Apr 9, 2024
0370b85
Update knn-vector-quantization.md
vagimeli Apr 10, 2024
548805b
Update knn-vector-quantization.md
vagimeli Apr 10, 2024
167cb96
Update knn-vector-quantization.md
vagimeli Apr 10, 2024
be1c836
Update _search-plugins/knn/knn-vector-quantization.md
jmazanec15 Apr 10, 2024
255a25b
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
08194cb
Update _search-plugins/knn/knn-index.md
vagimeli Apr 12, 2024
0a0e105
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
83503a8
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
1e413bc
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
050064e
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
f2f42ee
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
10c35eb
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
22508e2
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
6d8e9d1
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
bda3d4f
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 12, 2024
7e4e956
Update knn-index.md
vagimeli Apr 12, 2024
3cf22dd
Update knn-vector-quantization.md
vagimeli Apr 15, 2024
28acaac
Merge branch 'main' into knn-pq-improved-docs
vagimeli Apr 15, 2024
5231b87
Update _search-plugins/knn/knn-index.md
vagimeli Apr 16, 2024
be1e447
Update _search-plugins/knn/knn-index.md
vagimeli Apr 16, 2024
c78bc75
Update _search-plugins/knn/knn-index.md
vagimeli Apr 16, 2024
036db27
Update _search-plugins/knn/knn-index.md
vagimeli Apr 16, 2024
08a9857
Update _search-plugins/knn/knn-index.md
vagimeli Apr 16, 2024
55bf87c
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
38aca8d
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
5591d88
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
2b51288
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
2c389a3
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
5b0f258
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
d054f3f
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
9cc412f
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
ff0bebc
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
5b68721
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
b15ac62
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
072f1e6
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
ead048e
Update _search-plugins/knn/knn-vector-quantization.md
vagimeli Apr 16, 2024
2721679
Update knn-vector-quantization.md
vagimeli Apr 16, 2024
4 changes: 2 additions & 2 deletions _search-plugins/knn/knn-index.md
@@ -204,7 +204,7 @@ Encoder name | Requires training | Description
:--- | :--- | :---
`flat` (Default) | false | Encode vectors as floating-point arrays. This encoding does not reduce memory footprint.
`pq` | true | An abbreviation for _product quantization_, it is a lossy compression technique that uses clustering to encode a vector into a fixed size of bytes, with the goal of minimizing the drop in k-NN search accuracy. At a high level, vectors are broken up into `m` subvectors, and then each subvector is represented by a `code_size` code obtained from a code book produced during training. For more information about product quantization, see [this blog post](https://medium.com/dotstar/understanding-faiss-part-2-79d90b1e5388).
`sq` | false | An abbreviation for _scalar quantization_. Starting with k-NN plugin version 2.13, you can use the `sq` encoder to quantize 32-bit floating-point vectors into 16-bit floats. In version 2.13, the built-in `sq` encoder is the SQFP16 Faiss encoder. The encoder reduces memory footprint with a minimal loss of precision and improves performance by using SIMD optimization (using AVX2 on x86 architecture or Neon on ARM64 architecture). For more information, see [Faiss scalar quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization#faiss-scalar-quantization).
`sq` | false | An abbreviation for _scalar quantization_. Starting with k-NN plugin version 2.13, you can use the `sq` encoder to quantize 32-bit floating-point vectors into 16-bit floats. In version 2.13, the built-in `sq` encoder is the SQFP16 Faiss encoder. The encoder reduces memory footprint with a minimal loss of precision and improves performance by using SIMD optimization (using AVX2 on x86 architecture or Neon on ARM64 architecture). For more information, see [Faiss scalar quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization#faiss-16-bit-scalar-quantization).

#### PQ parameters

@@ -322,7 +322,7 @@ If you want to use less memory and index faster than HNSW, while maintaining sim

If memory is a concern, consider adding a PQ encoder to your HNSW or IVF index. Because PQ is a lossy encoding, query quality will drop.

You can reduce the memory footprint by a factor of 2, with a minimal loss in search quality, by using the [`fp_16` encoder]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#faiss-scalar-quantization). If your vector dimensions are within the [-128, 127] byte range, we recommend using the [byte quantizer]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#lucene-byte-vector) in order to reduce the memory footprint by a factor of 4. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/).
You can reduce the memory footprint by a factor of 2, with a minimal loss in search quality, by using the [`fp_16` encoder]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#faiss-16-bit-scalar-quantization). If your vector dimensions are within the [-128, 127] byte range, we recommend using the [byte quantizer]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#lucene-byte-vector) in order to reduce the memory footprint by a factor of 4. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/).

### Memory estimation

123 changes: 108 additions & 15 deletions _search-plugins/knn/knn-vector-quantization.md
@@ -10,22 +10,42 @@

# k-NN vector quantization

By default, the k-NN plugin supports the indexing and querying of vectors of type `float`, where each dimension of the vector occupies 4 bytes of memory. For use cases that require ingestion on a large scale, keeping `float` vectors can be expensive because OpenSearch needs to construct, load, save, and search graphs (for native `nmslib` and `faiss` engines). To reduce the memory footprint, you can use vector quantization.
By default, the k-NN plugin supports the indexing and querying of vectors of type `float`, where each dimension of the

Review comment (Contributor): Please fix the line break formatting of lines 13--16.

Reply (Member Author): I made the line breaks so that editing would be easier, and it doesn't impact rendering (i.e., it wouldn't be one line that runs off the screen). Is this incorrect to do?

Reply (Contributor, @vagimeli, Apr 12, 2024): Yes, it's incorrect to enter line breaks. The site and OpenSearch Project doc team follow a specific formatting guide. I'll handle formatting the doc before moving it into editorial. https://github.com/opensearch-project/documentation-website/blob/main/FORMATTING_GUIDE.md

vector occupies 4 bytes of memory. For use cases that require ingestion on a large scale, keeping `float` vectors can be
expensive because OpenSearch needs to construct, load, save, and search graphs (for native `nmslib` and `faiss` engines
). To reduce the memory footprint, you can use vector quantization.

In OpenSearch, there are many varieties of quantization supported. In general, the level of quantization
will provide a tradeoff between the accuracy of the nearest neighbor search and the size of the memory footprint the

Check failure on line 19, column 16 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.SubstitutionsError] Use 'trade-off' instead of 'tradeoff'. Severity: ERROR.
vector search system will consume. The supported types include: Byte vectors, 16-bit scalar quantization, and
Product Quantization (PQ).

## Lucene byte vector

Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine in order to reduce the amount of required memory. This requires quantizing the vectors outside of OpenSearch before ingesting them into an OpenSearch index. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).
Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine in order to reduce the amount
of required memory. This requires quantizing the vectors outside of OpenSearch before ingesting them into an OpenSearch
index. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).

## Faiss scalar quantization
## Faiss 16-bit scalar quantization

Starting with version 2.13, the k-NN plugin supports performing scalar quantization for the Faiss engine within OpenSearch. Within the Faiss engine, a scalar quantizer (SQfp16) performs the conversion between 32-bit and 16-bit vectors. At ingestion time, when you upload 32-bit floating-point vectors to OpenSearch, SQfp16 quantizes them into 16-bit floating-point vectors and stores the quantized vectors in a k-NN index. At search time, SQfp16 decodes the vector values back into 32-bit floating-point values for distance computation. The SQfp16 quantization can decrease the memory footprint by a factor of 2. Additionally, it leads to a minimal loss in recall when differences between vector values are large compared to the error introduced by eliminating their two least significant bits. When used with [SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), SQfp16 quantization can also significantly reduce search latencies and improve indexing throughput.

SIMD optimization is not supported on Windows. Using Faiss scalar quantization on Windows can lead to a significant drop in performance, including decreased indexing throughput and increased search latencies.
Starting with version 2.13, the k-NN plugin supports performing scalar quantization for the Faiss engine within
OpenSearch. Within the Faiss engine, a scalar quantizer (SQfp16) performs the conversion between 32-bit and 16-bit
vectors. At ingestion time, when you upload 32-bit floating-point vectors to OpenSearch, SQfp16 quantizes them into
16-bit floating-point vectors and stores the quantized vectors in a k-NN index. At search time, SQfp16 decodes the
vector values back into 32-bit floating-point values for distance computation. The SQfp16 quantization can decrease the
memory footprint by a factor of 2. Additionally, it leads to a minimal loss in recall when differences between vector
values are large compared to the error introduced by eliminating their two least significant bits. When used with
[SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), SQfp16 quantization can also significantly reduce search latencies and improve indexing
throughput.

SIMD optimization is not supported on Windows. Using Faiss scalar quantization on Windows can lead to a significant drop
in performance, including decreased indexing throughput and increased search latencies.
{: .warning}

### Using Faiss scalar quantization

To use Faiss scalar quantization, set the k-NN vector field's `method.parameters.encoder.name` to `sq` when creating a k-NN index:
To use Faiss scalar quantization, set the k-NN vector field's `method.parameters.encoder.name` to `sq` when creating a
k-NN index:

```json
PUT /test-index
@@ -60,14 +80,22 @@
```
{% include copy-curl.html %}

Optionally, you can specify the parameters in `method.parameters.encoder`. For more information about `encoder` object parameters, see [SQ parameters]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#sq-parameters).
Optionally, you can specify the parameters in `method.parameters.encoder`. For more information about `encoder` object
parameters, see [SQ parameters]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#sq-parameters).

The `fp16` encoder converts 32-bit vectors into their 16-bit counterparts. For this encoder type, the vector values must be in the [-65504.0, 65504.0] range. To define how to handle out-of-range values, the preceding request specifies the `clip` parameter. By default, this parameter is `false`, and any vectors containing out-of-range values are rejected. When `clip` is set to `true` (as in the preceding request), out-of-range vector values are rounded up or down so that they are in the supported range. For example, if the original 32-bit vector is `[65510.82, -65504.1]`, the vector will be indexed as a 16-bit vector `[65504.0, -65504.0]`.
The `fp16` encoder converts 32-bit vectors into their 16-bit counterparts. For this encoder type, the vector values must
be in the [-65504.0, 65504.0] range. To define how to handle out-of-range values, the preceding request specifies the
`clip` parameter. By default, this parameter is `false`, and any vectors containing out-of-range values are rejected.
When `clip` is set to `true` (as in the preceding request), out-of-range vector values are rounded up or down so that
they are in the supported range. For example, if the original 32-bit vector is `[65510.82, -65504.1]`, the vector will
be indexed as a 16-bit vector `[65504.0, -65504.0]`.
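
The `clip` behavior described above can be illustrated with a minimal Python sketch. It only mimics the documented arithmetic and is not the plugin's implementation; the function name `clip_to_fp16_range` and the use of `ValueError` to model a rejected request are assumptions made for illustration.

```python
# Illustrative sketch only: models the documented `clip` behavior,
# not the k-NN plugin's actual code.
FP16_MIN, FP16_MAX = -65504.0, 65504.0  # supported range stated for the fp16 encoder


def clip_to_fp16_range(vector, clip=False):
    """Return the vector with every value limited to the fp16 range.

    With clip=False (the documented default), an out-of-range value means
    the vector is rejected, which is modeled here by raising ValueError.
    """
    result = []
    for value in vector:
        if FP16_MIN <= value <= FP16_MAX:
            result.append(value)
        elif clip:
            # Round the out-of-range value to the nearest supported bound.
            result.append(max(FP16_MIN, min(FP16_MAX, value)))
        else:
            raise ValueError(f"value {value} is outside [{FP16_MIN}, {FP16_MAX}]")
    return result


print(clip_to_fp16_range([65510.82, -65504.1], clip=True))  # [65504.0, -65504.0]
```

Calling the same function with `clip=False` raises an error for this input, which corresponds to the default rejection behavior.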

We recommend setting `clip` to `true` only if very few elements lie outside of the supported range. Rounding the values may cause a drop in recall.
We recommend setting `clip` to `true` only if very few elements lie outside of the supported range. Rounding the values
may cause a drop in recall.
{: .note}

The following example method definition specifies the Faiss SQfp16 encoder, which rejects any indexing request that contains out-of-range vector values (because the `clip` parameter is `false` by default):
The following example method definition specifies the Faiss SQfp16 encoder, which rejects any indexing request that
contains out-of-range vector values (because the `clip` parameter is `false` by default):

```json
PUT /test-index
@@ -133,15 +161,17 @@
```
{% include copy-curl.html %}

## Memory estimation
### Memory estimation

In the best-case scenario, 16-bit vectors produced by the Faiss SQfp16 quantizer require 50% of the memory that 32-bit vectors require.
In the best-case scenario, 16-bit vectors produced by the Faiss SQfp16 quantizer require 50% of the memory that 32-bit
vectors require.

#### HNSW memory estimation

The memory required for HNSW is estimated to be `1.1 * (2 * dimension + 8 * M)` bytes/vector.

As an example, assume that you have 1 million vectors with a dimension of 256 and M of 16. The memory requirement can be estimated as follows:
As an example, assume that you have 1 million vectors with a dimension of 256 and M of 16. The memory requirement can be
estimated as follows:

```bash
1.1 * (2 * 256 + 8 * 16) * 1,000,000 ~= 0.656 GB
@@ -151,9 +181,72 @@

The memory required for IVF is estimated to be `1.1 * (((2 * dimension) * num_vectors) + (4 * nlist * dimension))` bytes.

As an example, assume that you have 1 million vectors with a dimension of 256 and `nlist` of 128. The memory requirement can be estimated as follows:
As an example, assume that you have 1 million vectors with a dimension of 256 and `nlist` of 128. The memory requirement
can be estimated as follows:

```bash
1.1 * (((2 * 256) * 1,000,000) + (4 * 128 * 256)) ~= 0.525 GB
```
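
The two SQfp16 estimates above can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch assumes that the "GB" figures in the examples are computed with 1 GB = 2^30 bytes, which is what makes the quoted results come out.

```python
# Sketch of the SQfp16 memory formulas quoted above; names are illustrative.
GIB = 1024 ** 3  # assumption: the examples' "GB" means 2**30 bytes


def hnsw_fp16_bytes(dimension, m, num_vectors):
    # 1.1 * (2 * dimension + 8 * M) bytes per vector, times the vector count.
    return 1.1 * (2 * dimension + 8 * m) * num_vectors


def ivf_fp16_bytes(dimension, nlist, num_vectors):
    # 2 bytes per dimension per vector, plus 4-byte floats for nlist centroids.
    return 1.1 * ((2 * dimension) * num_vectors + 4 * nlist * dimension)


print(round(hnsw_fp16_bytes(256, 16, 1_000_000) / GIB, 3))  # 0.656
print(round(ivf_fp16_bytes(256, 128, 1_000_000) / GIB, 3))  # 0.525
```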

## Faiss product quantization

Product quantization is a technique that allows users to represent a vector in a configurable amount of bits. In
general, it can be used to achieve a higher level of compression compared to byte and scalar quantization. Product
quantization works by breaking up vectors into _m_ subvectors, and encoding each subvector with _code_size_ bits. Thus,

Check failure on line 195, column 64 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.SpacingWords] There should be once space between words in 'and encoding'. Severity: ERROR.
the total amount of memory for the vector ends up being `m*code_size` bits, plus overhead. For more details about the
parameters of product quantization, see
[PQ parameters]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#pq-parameters). Product quantization is only
supported for the _Faiss_ engine and can be used with either the _HNSW_ or the _IVF_ ANN algorithms.
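
To make the encoding step concrete, here is a toy Python sketch of how a vector could be mapped to one code per subvector once codebooks exist. It is not how Faiss or the k-NN plugin implements product quantization; the function `pq_encode` and the hand-written codebooks are hypothetical, and real codebooks would come from the training step.

```python
import math


def pq_encode(vector, codebooks):
    """Encode a vector as one centroid index per subvector.

    codebooks[i] holds the centroids for subvector i. With 2**code_size
    centroids per codebook, each index fits in code_size bits.
    """
    m = len(codebooks)
    if len(vector) % m != 0:
        raise ValueError("dimension must be divisible by m")
    sub_dim = len(vector) // m
    codes = []
    for i, centroids in enumerate(codebooks):
        sub = vector[i * sub_dim:(i + 1) * sub_dim]
        # Choose the centroid closest to this subvector (Euclidean distance).
        nearest = min(range(len(centroids)), key=lambda c: math.dist(sub, centroids[c]))
        codes.append(nearest)
    return codes


# Toy codebooks: m = 2 subvectors, code_size = 1 bit (2 centroids each).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
print(pq_encode([0.9, 1.2, 0.8, 0.1], codebooks))  # [1, 1]
```

In this toy case, a 4-dimensional `float` vector (16 bytes) is reduced to two 1-bit codes, which is the `m*code_size` figure before overhead.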

### Using Faiss product quantization

In order to minimize the loss in accuracy, product quantization requires a _training_ step that builds a model based on
the distribution of the data that will be searched over.

Under the hood, the product quantizer is trained by running k-Means clustering on a set of training vectors for each
sub-vector space and extracts the centroids to be used for the encoding. The training vectors can either be a subset
of the vectors to be ingested, or vectors that have the same distribution and dimension as the vectors to be ingested.
In OpenSearch, the training vectors need to be present in an index. In general, the amount of training data will depend
on which ANN algorithm will be used and how much data will go into the index. For IVF-based indices, a good number of

Check failure on line 210, column 93 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.SubstitutionsError] Use 'indexes' instead of 'indices'. Severity: ERROR.
training vectors to use is `max(1000*nlist, 2^code_size * 1000)`. For HNSW-based indices, a good number is

Check failure on line 211, column 82 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.SubstitutionsError] Use 'indexes' instead of 'indices'. Severity: ERROR.
`2^code_size*1000` training vectors. See [Faiss's documentation](https://github.com/facebookresearch/faiss/wiki/FAQ#how-many-training-points-do-i-need-for-k-means)
for more details on how these numbers are arrived at.
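
The training-set guidance in this paragraph can be written as a small Python helper. This is a sketch of the two rules of thumb only; the function name and the convention of passing `nlist=None` for HNSW are assumptions made for illustration.

```python
def recommended_training_vectors(code_size, nlist=None):
    """Rule-of-thumb training set size from the paragraph above.

    Pass nlist for an IVF-based index; leave it as None for HNSW.
    """
    pq_points = (2 ** code_size) * 1000
    if nlist is None:
        return pq_points  # HNSW: 2^code_size * 1000
    return max(1000 * nlist, pq_points)  # IVF: max(1000*nlist, 2^code_size * 1000)


print(recommended_training_vectors(8))             # 256000
print(recommended_training_vectors(8, nlist=512))  # 512000
```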

For product quantization, the two parameters that need to be selected are _m_ and _code_size_. _m_ determines how many
sub-vectors the vectors should be broken up into to encode separately - consequently, the _dimension_ needs to be
divisible by _m_. _code_size_ determines how many bits each sub-vector will be encoded with. In general, a good place to
start is setting `code_size = 8` and then tuning _m_ to get the desired tradeoff between memory footprint and recall.
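
As a rough aid for tuning _m_, the following Python sketch prints the encoded size and the nominal compression ratio relative to 32-bit floats for a few values of _m_. It ignores the index overhead, and the function name `pq_code_bits` is hypothetical.

```python
def pq_code_bits(dimension, m, code_size=8):
    """Bits used to encode one vector (m * code_size), excluding overhead."""
    if dimension % m != 0:
        raise ValueError("dimension must be divisible by m")
    return m * code_size


dimension = 256
raw_bits = dimension * 32  # each float dimension occupies 4 bytes
for m in (8, 16, 32, 64):
    bits = pq_code_bits(dimension, m)
    print(f"m={m}: {bits // 8} bytes per vector, {raw_bits / bits:.0f}x smaller than float")
```

Larger values of _m_ keep more information per vector at the cost of a smaller compression ratio, which is the trade being tuned.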

Check failure on line 218, column 73 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.SubstitutionsError] Use 'trade-off' instead of 'tradeoff'. Severity: ERROR.

For an example of setting up an index with product quantization, see [this tutorial]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).

### Memory Estimation

Check failure on line 222, column 5 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.HeadingCapitalization] 'Memory Estimation' is a heading and should be in sentence case. Severity: ERROR.

While product quantization is meant to represent individual vectors with `m*code_size` bits, in reality the indices

Check failure on line 224, column 109 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.SubstitutionsError] Use 'indexes' instead of 'indices'. Severity: ERROR.
take up more space than this. This is mainly due to the overhead of storing certain code tables and auxilary data

Check failure on line 225, column 101 in _search-plugins/knn/knn-vector-quantization.md (GitHub Actions / style-job): [vale] reported by reviewdog: [OpenSearch.Spelling] Error: auxilary. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Severity: ERROR.
structures.

Some of the memory formulas depend on the number of segments present. Typically, this is not known beforehand but a good
default value is 300.
{: .note}

#### HNSW memory estimation

The memory required for HNSW with PQ is estimated to be `1.1*(((pq_code_size / 8) * pq_m + 24 + 8 * hnsw_m) * num_vectors + num_segments * (2^pq_code_size * 4 * d))` bytes, where `d` is the vector dimension.

As an example, assume that you have 1 million vectors with a dimension of 256, `hnsw_m` of 16, `pq_m` of 32,
`pq_code_size` of 8 and 100 segments. The memory requirement can be estimated as follows:

```bash
1.1*((8 / 8 * 32 + 24 + 8 * 16) * 1000000 + 100 * (2^8 * 4 * 256)) ~= 0.215 GB
```
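
The HNSW with PQ estimate can be checked with a short Python sketch. The function name is hypothetical, `dimension` stands for `d` in the formula, and the conversion assumes that "GB" in the example means 2^30 bytes.

```python
def hnsw_pq_bytes(dimension, hnsw_m, pq_m, pq_code_size, num_vectors, num_segments):
    """Estimated memory in bytes for HNSW with PQ, per the formula above."""
    per_vector = (pq_code_size / 8) * pq_m + 24 + 8 * hnsw_m
    per_segment = (2 ** pq_code_size) * 4 * dimension  # PQ code table per segment
    return 1.1 * (per_vector * num_vectors + num_segments * per_segment)


estimate = hnsw_pq_bytes(
    dimension=256, hnsw_m=16, pq_m=32, pq_code_size=8,
    num_vectors=1_000_000, num_segments=100,
)
print(round(estimate / 1024 ** 3, 3))  # 0.215
```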

#### IVF memory estimation

The memory required for IVF with PQ is estimated to be `1.1*(((pq_code_size / 8) * pq_m + 24) * num_vectors + num_segments * (2^pq_code_size * 4 * d + 4 * ivf_nlist * d))` bytes, where `d` is the vector dimension.

As an example, assume that you have 1 million vectors with a dimension of 256, `ivf_nlist` of 512, `pq_m` of 64,
`pq_code_size` of 8 and 100 segments. The memory requirement can be estimated as follows:

```bash
1.1*((8 / 8 * 64 + 24) * 1000000 + 100 * (2^8 * 4 * 256 + 4 * 512 * 256)) ~= 0.171 GB
```
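
The IVF with PQ estimate can be broken down in the same way. The Python sketch below uses the values substituted into the preceding calculation (note that it uses `pq_m = 64`) and separates the per-vector and per-segment contributions. The function name is hypothetical, and "GB" is again assumed to mean 2^30 bytes.

```python
def ivf_pq_bytes(dimension, ivf_nlist, pq_m, pq_code_size, num_vectors, num_segments):
    """Return (vector_bytes, segment_bytes, total_bytes) for IVF with PQ."""
    per_vector = (pq_code_size / 8) * pq_m + 24
    # Each segment stores the PQ code table and the IVF centroids as 4-byte floats.
    per_segment = (2 ** pq_code_size) * 4 * dimension + 4 * ivf_nlist * dimension
    vector_bytes = per_vector * num_vectors
    segment_bytes = per_segment * num_segments
    return vector_bytes, segment_bytes, 1.1 * (vector_bytes + segment_bytes)


vectors, segments, total = ivf_pq_bytes(
    dimension=256, ivf_nlist=512, pq_m=64, pq_code_size=8,
    num_vectors=1_000_000, num_segments=100,
)
print(round(total / 1024 ** 3, 3))  # 0.171
print(f"per-segment share: {segments / (vectors + segments):.0%}")
```

With these inputs, the per-segment structures account for a large share of the total, so the segment count matters as much as the PQ parameters when estimating memory.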