diff --git a/docs/reference/reranking/semantic-reranking.asciidoc b/docs/reference/reranking/semantic-reranking.asciidoc
index 200a5f673dc0c..4ebe90e44708e 100644
--- a/docs/reference/reranking/semantic-reranking.asciidoc
+++ b/docs/reference/reranking/semantic-reranking.asciidoc
@@ -1,35 +1,35 @@
 [[semantic-reranking]]
-== Semantic reranking
+== Semantic re-ranking

 preview::[]

 [TIP]
 ====
-This overview focuses more on the high-level concepts and use cases for semantic reranking. For full implementation details on how to set up and use semantic reranking in {es}, see the <> in the Search API docs.
+This overview focuses on the high-level concepts and use cases for semantic re-ranking. For full implementation details on how to set up and use semantic re-ranking in {es}, see the <> in the Search API docs.
 ====

-Rerankers improve the relevance of results from earlier-stage retrieval mechanisms.
-_Semantic_ rerankers use machine learning models to reorder search results based on their semantic similarity to a query.
+Re-rankers improve the relevance of results from earlier-stage retrieval mechanisms.
+_Semantic_ re-rankers use machine learning models to reorder search results based on their semantic similarity to a query.

-Semantic reranking requires relatively large and complex machine learning models and operates in real-time in response to queries.
+Semantic re-ranking requires relatively large and complex machine learning models and operates in real-time in response to queries.
-This technique makes sense on a small _top-k_ result set, as one the of the final steps in a pipeline.
+This technique makes sense on a small _top-k_ result set, as one of the final steps in a pipeline.
 This is a powerful technique for improving search relevance that works equally well with keyword, semantic, or hybrid retrieval algorithms.

-The next sections provide more details on the benefits, use cases, and model types used for semantic reranking.
-The final sections include a practical, high-level overview of how to implement <> and links to the full reference documentation.
+The next sections provide more details on the benefits, use cases, and model types used for semantic re-ranking.
+The final sections include a practical, high-level overview of how to implement <> and links to the full reference documentation.

 [discrete]
 [[semantic-reranking-use-cases]]
 === Use cases

-Semantic reranking enables a variety of use cases:
+Semantic re-ranking enables a variety of use cases:

-* *Lexical (BM25) retrieval results reranking*
+* *Lexical (BM25) retrieval results re-ranking*
 ** Out-of-the-box semantic search by adding a simple API call to any lexical/BM25 retrieval pipeline.
 ** Adds semantic search capabilities on top of existing indices without reindexing, perfect for quick improvements.
 ** Ideal for environments with complex existing indices.

-* *Semantic retrieval results reranking*
+* *Semantic retrieval results re-ranking*
 ** Improves results from semantic retrievers using ELSER sparse vector embeddings or dense vector embeddings by using more powerful models.
 ** Adds a refinement layer on top of hybrid retrieval with <>.
@@ -37,15 +37,15 @@ Semantic reranking enables a variety of use cases:
 ** Supports automatic and transparent chunking, eliminating the need for pre-chunking at index time.
-** Provides explicit control over document relevance in retrieval-augmented generation (RAG) uses cases or other scenarios involving language model (LLM) inputs.
+** Provides explicit control over document relevance in retrieval-augmented generation (RAG) use cases or other scenarios involving large language model (LLM) inputs.

-Now that we've outlined the value of semantic reranking, we'll explore the specific models that power this process and how they differ.
+Now that we've outlined the value of semantic re-ranking, we'll explore the specific models that power this process and how they differ.

 [discrete]
 [[semantic-reranking-models]]
 === Cross-encoder and bi-encoder models

-At a high level, two model types are used for semantic reranking: cross-encoders and bi-encoders.
+At a high level, two model types are used for semantic re-ranking: cross-encoders and bi-encoders.
-NOTE: In this version, {es} *only supports cross-encoders* for semantic reranking.
+NOTE: In this version, {es} *only supports cross-encoders* for semantic re-ranking.

 * A *cross-encoder model* can be thought of as a more powerful, all-in-one solution, because it generates query-aware document representations.
 It takes the query and document texts as a single, concatenated input.
@@ -62,7 +62,7 @@ If you're interested in a more detailed analysis of the practical differences be
 .Comparisons between cross-encoder and bi-encoder
 [%collapsible]
 ==============
-The following is a non-exhaustive list of considerations when choosing between cross-encoders and bi-encoders for semantic reranking:
+The following is a non-exhaustive list of considerations when choosing between cross-encoders and bi-encoders for semantic re-ranking:

 * Because a cross-encoder model simultaneously processes both query and document texts, it can better infer their relevance, making it more effective as a reranker than a bi-encoder.
 * Cross-encoder models are generally larger and more computationally intensive, resulting in higher latencies and increased computational costs.
@@ -74,28 +74,28 @@ For example, their ability to take word order into account can improve on dense
 This enables you to maintain high relevance in result sets, by setting a minimum score threshold for all queries.
 For example, this is important when using results in a RAG workflow or if you're otherwise feeding results to LLMs.
 Note that similarity scores from bi-encoders/embedding similarities are _query-dependent_, meaning you cannot set universal cut-offs.
-* Bi-encoders rerank using embeddings. You can improve your reranking latency by creating embeddings at ingest-time. These embeddings can be stored for reranking without being indexed for retrieval, reducing your memory footprint.
+* Bi-encoders rerank using embeddings. You can improve your re-ranking latency by creating embeddings at ingest-time.
+These embeddings can be stored for re-ranking without being indexed for retrieval, reducing your memory footprint.
 ==============

 [discrete]
 [[semantic-reranking-in-es]]
-=== Semantic reranking in {es}
+=== Semantic re-ranking in {es}

-In {es}, semantic rerankers are implemented using the {es} <> and a <>.
+In {es}, semantic re-rankers are implemented using the {es} <> and a <>.

-To use semantic reranking in {es}, you need to:
+To use semantic re-ranking in {es}, you need to:

-. *Choose a reranking model*.
+. *Choose a re-ranking model*.
 Currently you can:

 ** Integrate directly with the <> using the `rerank` task type
 ** Integrate directly with the <> using the `rerank` task type
-** Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es} for semantic reranking.
+** Upload a model to {es} from Hugging Face with {eland-docs}/machine-learning.html#ml-nlp-pytorch[Eland]. You'll need to use the `text_similarity` NLP task type when loading the model using Eland. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es} for semantic re-ranking.
 *** Then set up an <> with the `rerank` task type

 . *Create a `rerank` task using the <>*.
-The Inference API creates an inference endpoint and configures your chosen machine learning model to perform the reranking task.
+The Inference API creates an inference endpoint and configures your chosen machine learning model to perform the re-ranking task.

 . *Define a `text_similarity_reranker` retriever in your search request*.
-The retriever syntax makes it simple to configure both the retrieval and reranking of search results in a single API call.
+The retriever syntax makes it simple to configure both the retrieval and re-ranking of search results in a single API call.

 .*Example search request* with semantic reranker
 [%collapsible]
diff --git a/docs/reference/search/retriever.asciidoc b/docs/reference/search/retriever.asciidoc
index 58cc8ce9ef459..6d3a1a36ad407 100644
--- a/docs/reference/search/retriever.asciidoc
+++ b/docs/reference/search/retriever.asciidoc
@@ -325,7 +325,7 @@ The `text_similarity_reranker` retriever uses an NLP model to improve search res

 [TIP]
 ====
-Refer to <> for a high level overview of semantic reranking.
+Refer to <> for a high-level overview of semantic re-ranking.
 ====

 ===== Prerequisites
@@ -387,7 +387,7 @@ A text similarity re-ranker retriever is a compound retriever. Child retrievers

 [[text-similarity-reranker-retriever-example-cohere]]
 ==== Example: Cohere Rerank

-This example enables out-of-the-box semantic search by reranking top documents using the Cohere Rerank API. This approach eliminate the need to generate and store embeddings for all indexed documents.
+This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API. This approach eliminates the need to generate and store embeddings for all indexed documents.
 This requires a <> using the `rerank` task type.

 [source,js]
@@ -418,7 +418,7 @@ GET /index/_search

 [discrete]
 [[text-similarity-reranker-retriever-example-eland]]
-==== Example: Semantic reranking with a Hugging Face model
+==== Example: Semantic re-ranking with a Hugging Face model

 The following example uses the `cross-encoder/ms-marco-MiniLM-L-6-v2` model from Hugging Face to rerank search results based on semantic similarity.
 The model must be uploaded to {es} using https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch[Eland].
@@ -428,7 +428,7 @@ The model must be uploaded to {es} using https://www.elastic.co/guide/en/elastic
 Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third party text similarity models supported by {es}.
 ====

-Follow these steps to load the model and create a semantic reranker.
+Follow these steps to load the model and create a semantic re-ranker.

 . Install Eland using `pip`
 +
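Reviewer note: the `[source,js]` snippets referenced in the hunks above are truncated in this excerpt, so for context, a sketch of the two-step setup the docs describe is shown below. The endpoint name `my-rerank-model`, the index `my-index`, the field `text`, and the Cohere `model_id` are illustrative placeholders, not values taken from this PR.

[source,js]
----
PUT _inference/rerank/my-rerank-model
{
  "service": "cohere",
  "service_settings": {
    "api_key": "<API-KEY>",
    "model_id": "rerank-english-v3.0"
  }
}
----

The resulting `rerank` endpoint can then be referenced by `inference_id` from a `text_similarity_reranker` retriever, which re-ranks the top `rank_window_size` hits from the child retriever:

[source,js]
----
GET my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": { "match": { "text": "How often does the moon hide the sun?" } }
        }
      },
      "field": "text",
      "inference_id": "my-rerank-model",
      "inference_text": "How often does the moon hide the sun?",
      "rank_window_size": 100
    }
  }
}
----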