In k-NN benchmarking tests, the use of byte rather than float vectors resulted in a significant reduction in storage and memory usage as well as improved indexing throughput and reduced query latency. Additionally, precision on recall was not greatly affected
However, when we enable this in the mapping, we get an illegal_argument_exception like this:
"error": {
  "type": "mapper_parsing_exception",
  "reason": "failed to parse field [_fulltext_vectorized.knn] of type [knn_vector] in document with id '4'. Preview of field's value: '-0.0010178537'",
  "caused_by": {
    "type": "illegal_argument_exception",
    "reason": "[data_type] field was set as [byte] in index mapping. But, KNN vector values are floats instead of byte integers"
  }
}
How can one reproduce the bug?
Define a dense vector field like this as vector output field of the Neural Search plugin:
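The mapping from the original report is not preserved here. As a rough illustration only, a byte-typed knn_vector mapping for the field named in the error might look like the sketch below. The dimension, the space type, the object layout of `_fulltext_vectorized`, and the name `index_body` are all assumptions, not values from the report.

```python
import json

# Hypothetical index body. Only the field path (_fulltext_vectorized.knn),
# the knn_vector type, and data_type "byte" come from the report; the
# dimension and method settings are placeholders.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "_fulltext_vectorized": {
                "properties": {
                    "knn": {
                        "type": "knn_vector",
                        "dimension": 384,  # assumed; must match the model output size
                        "data_type": "byte",  # byte vectors use the Lucene engine
                        "method": {
                            "name": "hnsw",
                            "space_type": "l2",
                            "engine": "lucene",
                        },
                    }
                }
            }
        }
    },
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```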
What is the expected behavior?
The Neural Search plugin should vectorize with bytes instead of floats when byte is used as data_type in the mapping. Alternatively, allow us to configure this as a property in the ingest pipeline.
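To make the request concrete, here is a minimal sketch of the kind of conversion being asked for. It assumes a simple linear scaling of values in [-max_abs, max_abs] onto the byte range. The helper `quantize_to_bytes` and its `max_abs` parameter are hypothetical and are not part of the Neural Search plugin.

```python
from typing import List


def quantize_to_bytes(vector: List[float], max_abs: float = 1.0) -> List[int]:
    """Hypothetical helper: scale floats linearly into integers in [-128, 127]."""
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    quantized = []
    for value in vector:
        # Map [-max_abs, max_abs] onto [-127, 127], then round to an integer.
        scaled = round(value / max_abs * 127)
        # Clamp anything outside the assumed range to the byte limits.
        quantized.append(max(-128, min(127, scaled)))
    return quantized


# The first value is the one shown in the error preview above.
print(quantize_to_bytes([-0.0010178537, 0.25, -1.0, 1.0]))
```

With this naive scaling, small values such as the one in the error preview collapse to 0, so a real implementation would probably need a scaling scheme suited to the model's output distribution.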
What is your host/environment?
Running OpenSearch 2.10.0 in Docker with the latest version of Ubuntu.
Do you have any screenshots?
n/a
Do you have any additional context?
n/a
@juntezhang, I think this behavior is expected. As I understand it, the neural search processor takes the vector data from the model and ingests it as is. If the values returned by the model are floats, it will throw an error, because the byte range is only -128 to 127.
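A rough sketch of the check described here, assuming that a byte vector is accepted only when every value is a whole number within the byte range. The function `is_valid_byte_vector` is hypothetical and only mirrors the error message; it is not the k-NN plugin's actual code.

```python
from typing import Sequence

BYTE_MIN, BYTE_MAX = -128, 127


def is_valid_byte_vector(values: Sequence[float]) -> bool:
    """Hypothetical check: True only if every value is a whole number in [-128, 127]."""
    for value in values:
        # Reject fractional values (and NaN or infinity), like the float in the report.
        if not float(value).is_integer():
            return False
        # Reject whole numbers that do not fit in a signed byte.
        if not BYTE_MIN <= value <= BYTE_MAX:
            return False
    return True


print(is_valid_byte_vector([1, -128, 127]))
print(is_valid_byte_vector([-0.0010178537]))
```

Under this assumption, raw float embeddings from a model fail the check unless they are converted to byte integers before ingestion.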
What is the bug?
Since OpenSearch 2.9.0, we are able to set the data_type for knn_vector types to use the Lucene byte vector. See the explanation here (quoted at the top of this issue): https://opensearch.org/docs/latest/field-types/supported-field-types/knn-vector/#lucene-byte-vector