Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectorstore: use a retriever query for hybrid search #2666

Merged

Conversation

miguelgrinberg
Copy link
Contributor

Fixes #2651

Copy link

A documentation preview will be available soon.

Request a new doc build by commenting
  • Rebuild this PR: run docs-build
  • Rebuild this PR and all Elastic docs: run docs-build rebuild

run docs-build is much faster than run docs-build rebuild. A rebuild should only be needed in rare situations.

If your PR continues to fail for an unknown reason, the doc build pipeline may be broken. Elastic employees can check the pipeline status here.

@miguelgrinberg miguelgrinberg force-pushed the use-retrievers-hybrid-search branch 2 times, most recently from 711892a to 1ba167a Compare October 1, 2024 09:09
Copy link
Member

@pquentin pquentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this looks good, can you please explain why we can no longer disable RRF with AsyncDenseVectorStrategy(..., rrf=False)? Should we change the annotation of that rrf parameter from Union[bool, Dict[str, Any]] to Dict[str, Any]?

@miguelgrinberg
Copy link
Contributor Author

I did not consider the case of rrf = False, I just assumed that rrf = False means that hybrid search is off, but now I realize that we have two booleans.

I'm not sure what would be the expected response when having hybrid search enabled and RRF disabled. Does that even make sense? I guess I need to try it out to see what response I get.

}

if self.rrf is False:
query_body = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With rrf=False we have to use a standard query, since the retriever query can only combine results from multiple searches with RRF.

@miguelgrinberg
Copy link
Contributor Author

miguelgrinberg commented Oct 4, 2024

@pquentin Okay, new revision of this fix. When the user asks for hybrid=True and rrf=False we use a regular query with knn and query sections in the body. Only when rrf=True or rrf=dict(<rrf-options>) we use the new retriever query.

Copy link
Member

@pquentin pquentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM.

@miguelgrinberg miguelgrinberg merged commit e22de7e into elastic:main Oct 14, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Vectorstore Helper: Hybrid does not work in Elasticsearch Serverless
3 participants