Add vector search as new operation type to search #423

Merged
merged 1 commit into from
Dec 13, 2023

VijayanB (Member) commented Dec 12, 2023

Description

For vector search, we have to compare the hits in the actual response with the expected neighbors to estimate search recall. However, there is currently no provision to return recall the way throughput is returned. Hence, we publish recall metrics such as recall@k and recall@r as part of the meta object. We also time the recall calculation and publish the duration, in ms, to the meta object.
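To make the idea concrete, here is a minimal sketch of computing recall and timing the calculation. It is an illustration only, not the code in this PR: the function names `calculate_recall` and `build_meta` and the dictionary keys `recall@k` and `recall_time_ms` are assumptions chosen for the example.

```python
import time


def calculate_recall(predictions, neighbors, k):
    """Fraction of the top-k expected neighbors found among the predictions.

    Hypothetical helper for illustration; not the function added by this PR.
    """
    truth = set(neighbors[:k])
    if not truth:
        return 0.0
    return len(truth & set(predictions)) / len(truth)


def build_meta(predictions, neighbors, k):
    """Time the recall calculation and return both values in a meta dict.

    The key names below are assumptions made for this sketch.
    """
    start = time.perf_counter()
    recall = calculate_recall(predictions, neighbors, k)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {"recall@k": recall, "recall_time_ms": elapsed_ms}


# Document ids returned by the search vs. the expected nearest neighbors.
meta = build_meta(predictions=["1", "7", "3"], neighbors=["1", "2", "3", "4"], k=3)
print(meta)
```

In this example two of the three expected neighbors ("1" and "3") are present in the response, so the recall value is 2/3.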

Added tests to verify results are as expected.

Issues Resolved

Part of #103

Testing

  • New functionality includes testing

make test 

.venv/lib/python3.8/site-packages/pkg_resources/__init__.py:2868
.venv/lib/python3.8/site-packages/pkg_resources/__init__.py:2868
  /Users/balasvij/PycharmProjects/knn/opensearch-benchmark/.venv/lib/python3.8/site-packages/pkg_resources/__init__.py:2868: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================= 1191 passed, 5 skipped, 3 warnings in 16.93s =================


---
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).

@VijayanB VijayanB changed the title Add vector search as new operation type to existing search Add vector search as new operation type to search Dec 12, 2023
@VijayanB VijayanB force-pushed the add-vector-search branch 2 times, most recently from 3ad8c71 to 255f245 Compare December 12, 2023 22:55
VijayanB (Member Author) commented Dec 12, 2023

@jmazanec15 @navneet1v I removed recall@r and kept only recall@k, where we compare predictions with the top k neighbors from the neighbors list.

@navneet1v

> @jmazanec15 @navneet1v I removed recall@r and kept only recall@k, where we compare predictions with the top k neighbors from the neighbors list.

Can we always calculate recall@1?

VijayanB (Member Author)

> > @jmazanec15 @navneet1v I removed recall@r and kept only recall@k, where we compare predictions with the top k neighbors from the neighbors list.
>
> Can we always calculate recall@1?

Added recall@1 and updated the test case.
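For readers following the thread, a rough sketch of how recall@k and recall@1 relate, assuming recall@1 is simply the same calculation restricted to the single nearest expected neighbor. The names `recall_at` and `recall_metrics` are hypothetical and only illustrate the discussion; they are not taken from the PR.

```python
def recall_at(predictions, neighbors, top):
    """Share of the first `top` expected neighbors that appear in predictions.

    Hypothetical helper written for this example.
    """
    truth = set(neighbors[:top])
    if not truth:
        return 0.0
    return len(truth & set(predictions)) / len(truth)


def recall_metrics(predictions, neighbors, k):
    """Always report recall@1 next to recall@k (key names are assumptions)."""
    return {
        "recall@k": recall_at(predictions, neighbors, k),
        "recall@1": recall_at(predictions, neighbors, 1),
    }


# The nearest expected neighbor ("7") is missing from the response,
# while the second and third expected neighbors are present.
metrics = recall_metrics(predictions=["5", "2", "9"], neighbors=["7", "5", "2"], k=3)
print(metrics)
```

Here recall@k is 2/3 while recall@1 is 0.0, which shows why reporting recall@1 separately can be useful: it isolates whether the single closest neighbor was retrieved.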

IanHoang (Collaborator) left a comment

Left a couple of comments.

For vector search, we have to compare the hits in the actual response
with the expected neighbors to estimate search recall. However,
there is no provision to return recall the way throughput is returned.
Hence, we publish recall metrics such as recall@k and
recall@r as part of the meta object. We also time the recall calculation
and publish the duration, in ms, to the meta object.

Added tests to verify results are as expected.

Signed-off-by: Vijayan Balasubramanian <[email protected]>
@IanHoang IanHoang merged commit ad81ccf into opensearch-project:main Dec 13, 2023
8 checks passed

4 participants