Commit

Uniformize rerank -> re-rank
leemthompo committed Sep 19, 2024
1 parent ff7106b commit e815f94
Showing 2 changed files with 26 additions and 26 deletions.
44 changes: 22 additions & 22 deletions docs/reference/reranking/semantic-reranking.asciidoc
@@ -1,51 +1,51 @@
[[semantic-reranking]]
-== Semantic reranking
+== Semantic re-ranking

preview::[]

[TIP]
====
-This overview focuses more on the high-level concepts and use cases for semantic reranking. For full implementation details on how to set up and use semantic reranking in {es}, see the <<text-similarity-reranker-retriever,reference documentation>> in the Search API docs.
+This overview focuses more on the high-level concepts and use cases for semantic re-ranking. For full implementation details on how to set up and use semantic re-ranking in {es}, see the <<text-similarity-reranker-retriever,reference documentation>> in the Search API docs.
====

-Rerankers improve the relevance of results from earlier-stage retrieval mechanisms.
-_Semantic_ rerankers use machine learning models to reorder search results based on their semantic similarity to a query.
+Re-rankers improve the relevance of results from earlier-stage retrieval mechanisms.
+_Semantic_ re-rankers use machine learning models to reorder search results based on their semantic similarity to a query.

-Semantic reranking requires relatively large and complex machine learning models and operates in real-time in response to queries.
+Semantic re-ranking requires relatively large and complex machine learning models and operates in real-time in response to queries.
This technique makes sense on a small _top-k_ result set, as one of the final steps in a pipeline.
This is a powerful technique for improving search relevance that works equally well with keyword, semantic, or hybrid retrieval algorithms.

-The next sections provide more details on the benefits, use cases, and model types used for semantic reranking.
-The final sections include a practical, high-level overview of how to implement <<semantic-reranking-in-es,semantic reranking in {es}>> and links to the full reference documentation.
+The next sections provide more details on the benefits, use cases, and model types used for semantic re-ranking.
+The final sections include a practical, high-level overview of how to implement <<semantic-reranking-in-es,semantic re-ranking in {es}>> and links to the full reference documentation.

[discrete]
[[semantic-reranking-use-cases]]
=== Use cases

-Semantic reranking enables a variety of use cases:
+Semantic re-ranking enables a variety of use cases:

-* *Lexical (BM25) retrieval results reranking*
+* *Lexical (BM25) retrieval results re-ranking*
** Out-of-the-box semantic search by adding a simple API call to any lexical/BM25 retrieval pipeline.
** Adds semantic search capabilities on top of existing indices without reindexing, perfect for quick improvements.
** Ideal for environments with complex existing indices.

-* *Semantic retrieval results reranking*
+* *Semantic retrieval results re-ranking*
** Improves results from semantic retrievers using ELSER sparse vector embeddings or dense vector embeddings by using more powerful models.
** Adds a refinement layer on top of hybrid retrieval with <<rrf, reciprocal rank fusion (RRF)>>.

* *General applications*
** Supports automatic and transparent chunking, eliminating the need for pre-chunking at index time.
** Provides explicit control over document relevance in retrieval-augmented generation (RAG) use cases or other scenarios involving large language model (LLM) inputs.
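As a concrete illustration of the first use case above, here is a minimal sketch of layering semantic re-ranking on top of an existing lexical/BM25 query with a `text_similarity_reranker` retriever. The index name `my-index`, the `text` field, and the `my-rerank-model` inference endpoint are illustrative placeholders, not values defined by this commit:

[source,js]
----
GET my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          // the existing BM25 query, unchanged
          "query": { "match": { "text": "how do solar panels work?" } }
        }
      },
      "field": "text",                     // document field sent to the re-ranking model
      "inference_id": "my-rerank-model",   // hypothetical rerank inference endpoint
      "inference_text": "how do solar panels work?",
      "rank_window_size": 100              // re-rank only the top 100 first-stage hits
    }
  }
}
----

The same wrapper applies to the second use case: swap the inner `standard` retriever for an `rrf` retriever to add a refinement layer on top of hybrid retrieval.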

-Now that we've outlined the value of semantic reranking, we'll explore the specific models that power this process and how they differ.
+Now that we've outlined the value of semantic re-ranking, we'll explore the specific models that power this process and how they differ.

[discrete]
[[semantic-reranking-models]]
=== Cross-encoder and bi-encoder models

-At a high level, two model types are used for semantic reranking: cross-encoders and bi-encoders.
+At a high level, two model types are used for semantic re-ranking: cross-encoders and bi-encoders.

-NOTE: In this version, {es} *only supports cross-encoders* for semantic reranking.
+NOTE: In this version, {es} *only supports cross-encoders* for semantic re-ranking.

* A *cross-encoder model* can be thought of as a more powerful, all-in-one solution, because it generates query-aware document representations.
It takes the query and document texts as a single, concatenated input.
@@ -62,7 +62,7 @@ If you're interested in a more detailed analysis of the practical differences be
.Comparisons between cross-encoder and bi-encoder
[%collapsible]
==============
-The following is a non-exhaustive list of considerations when choosing between cross-encoders and bi-encoders for semantic reranking:
+The following is a non-exhaustive list of considerations when choosing between cross-encoders and bi-encoders for semantic re-ranking:
* Because a cross-encoder model simultaneously processes both query and document texts, it can better infer their relevance, making it more effective as a reranker than a bi-encoder.
* Cross-encoder models are generally larger and more computationally intensive, resulting in higher latencies and increased computational costs.
@@ -74,28 +74,28 @@ For example, their ability to take word order into account can improve on dense
This enables you to maintain high relevance in result sets by setting a minimum score threshold for all queries.
For example, this is important when using results in a RAG workflow or if you're otherwise feeding results to LLMs.
Note that similarity scores from bi-encoders/embedding similarities are _query-dependent_, meaning you cannot set universal cut-offs.
-* Bi-encoders rerank using embeddings. You can improve your reranking latency by creating embeddings at ingest-time. These embeddings can be stored for reranking without being indexed for retrieval, reducing your memory footprint.
+* Bi-encoders rerank using embeddings. You can improve your re-ranking latency by creating embeddings at ingest-time. These embeddings can be stored for re-ranking without being indexed for retrieval, reducing your memory footprint.
==============
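Because the comparison above notes that cross-encoder scores support a universal cut-off (unlike query-dependent bi-encoder similarities), a single threshold can be applied across all queries. As a hedged sketch only, the `text_similarity_reranker` retriever accepts a `min_score` parameter for this; the `0.5` value and endpoint name below are illustrative, not recommendations:

[source,js]
----
GET my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": { "match": { "text": "total solar eclipse" } }
        }
      },
      "field": "text",
      "inference_id": "my-rerank-model",   // hypothetical endpoint
      "inference_text": "total solar eclipse",
      "min_score": 0.5                     // drop re-ranked hits below this universal threshold
    }
  }
}
----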

[discrete]
[[semantic-reranking-in-es]]
-=== Semantic reranking in {es}
+=== Semantic re-ranking in {es}

-In {es}, semantic rerankers are implemented using the {es} <<inference-apis,Inference API>> and a <<retriever,retriever>>.
+In {es}, semantic re-rankers are implemented using the {es} <<inference-apis,Inference API>> and a <<retriever,retriever>>.

-To use semantic reranking in {es}, you need to:
+To use semantic re-ranking in {es}, you need to:

-. *Choose a reranking model*.
+. *Choose a re-ranking model*.
Currently you can:

** Integrate directly with the <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type
** Integrate directly with the <<infer-service-google-vertex-ai,Google Vertex AI inference endpoint>> using the `rerank` task type
-** Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es} for semantic reranking.
+** Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es} for semantic re-ranking.
*** Then set up an <<inference-example-eland,{es} service inference endpoint>> with the `rerank` task type
. *Create a `rerank` task using the <<put-inference-api,{es} Inference API>>*.
-The Inference API creates an inference endpoint and configures your chosen machine learning model to perform the reranking task.
+The Inference API creates an inference endpoint and configures your chosen machine learning model to perform the re-ranking task.
. *Define a `text_similarity_reranker` retriever in your search request*.
-The retriever syntax makes it simple to configure both the retrieval and reranking of search results in a single API call.
+The retriever syntax makes it simple to configure both the retrieval and re-ranking of search results in a single API call.
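As a sketch of steps 1 and 2 above, the request below creates a `rerank` inference endpoint backed by the Cohere service. The endpoint name `my-rerank-model` is an illustrative placeholder, the API key is redacted, and the `model_id` shown is one possible Cohere rerank model:

[source,js]
----
PUT _inference/rerank/my-rerank-model
{
  "service": "cohere",
  "service_settings": {
    "api_key": "<COHERE-API-KEY>",        // placeholder, supply your own key
    "model_id": "rerank-english-v3.0"     // one possible Cohere rerank model
  }
}
----

Step 3 is covered by the collapsed example search request that follows.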

.*Example search request* with semantic reranker
[%collapsible]
8 changes: 4 additions & 4 deletions docs/reference/search/retriever.asciidoc
@@ -325,7 +325,7 @@ The `text_similarity_reranker` retriever uses an NLP model to improve search res

[TIP]
====
-Refer to <<semantic-reranking>> for a high-level overview of semantic reranking.
+Refer to <<semantic-reranking>> for a high-level overview of semantic re-ranking.
====

===== Prerequisites
@@ -387,7 +387,7 @@ A text similarity re-ranker retriever is a compound retriever. Child retrievers
[[text-similarity-reranker-retriever-example-cohere]]
==== Example: Cohere Rerank

-This example enables out-of-the-box semantic search by reranking top documents using the Cohere Rerank API. This approach eliminates the need to generate and store embeddings for all indexed documents.
+This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API. This approach eliminates the need to generate and store embeddings for all indexed documents.
This requires a <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type.

[source,js]
@@ -418,7 +418,7 @@ GET /index/_search

[discrete]
[[text-similarity-reranker-retriever-example-eland]]
-==== Example: Semantic reranking with a Hugging Face model
+==== Example: Semantic re-ranking with a Hugging Face model

The following example uses the `cross-encoder/ms-marco-MiniLM-L-6-v2` model from Hugging Face to rerank search results based on semantic similarity.
The model must be uploaded to {es} using https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch[Eland].
@@ -428,7 +428,7 @@ The model must be uploaded to {es} using https://www.elastic.co/guide/en/elastic
Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es}.
====

-Follow these steps to load the model and create a semantic reranker.
+Follow these steps to load the model and create a semantic re-ranker.

. Install Eland using `pip`
+
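The remaining steps are elided from this view. As a hedged sketch only: after importing the model with Eland's `eland_import_hub_model` script (using the `text_similarity` task type, per the prerequisites above), an inference endpoint can be created for it. The endpoint name is illustrative, and the `model_id` assumes Eland's usual naming convention for Hugging Face imports:

[source,js]
----
PUT _inference/rerank/my-msmarco-minilm-model
{
  "service": "elasticsearch",  // serve the locally uploaded model
  "service_settings": {
    "model_id": "cross-encoder__ms-marco-minilm-l-6-v2",  // assumed Eland-generated id
    "num_allocations": 1,
    "num_threads": 1
  }
}
----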
