Add doc for neural-sparse-query-two-phase-processor. #7306

Merged
merged 13 commits into from
Jun 14, 2024
34 changes: 34 additions & 0 deletions _search-plugins/neural-sparse-search.md
@@ -30,6 +30,7 @@
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-sparse-search).
1. [Create and enable two-phase processor (Optional)](#step-5-create-and-enable-two-phase-processor-optional).

## Step 1: Create an ingest pipeline

@@ -261,6 +262,39 @@
}
}
```
## Step 5: Create and enable two-phase processor (Optional)

This step is optional but strongly recommended, as it significantly improves the performance of neural sparse queries with almost no side effects.

Collaborator @Naarcha-AWS, Jun 13, 2024:

@conggguan: Would there be a reason for someone not to create and enable a two-phase processor?

Collaborator:

Going to remove this sentence for now.


The `neural_sparse_two_phase_processor` is a new feature introduced in OpenSearch 2.15. It can speed up neural sparse queries with negligible accuracy loss.

You can quickly create a search pipeline using the following example request. For more information about the parameter settings and how this pipeline works, see [Neural sparse query two-phase processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-sparse-query-two-phase-processor/).

```json
PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "This processor is making two-phase processor."
      }
    }
  ]
}
```
{% include copy-curl.html %}

Then choose the index you want to use and set its `index.search.default_pipeline` setting to the pipeline name. In the following request, replace `index-name` in the URL with the name of your index:

```json
PUT /index-name/_settings
{
  "index.search.default_pipeline": "two_phase_search_pipeline"
}
```
{% include copy-curl.html %}


Member:

Can we provide an example API call, as in the "Setting a default model on an index or field" section? Using all default values is fine. This would help users a lot.

Member:

We should also mention that setting this processor is strongly recommended.

Contributor Author:

Added an example API request for setting a default two-phase pipeline.
Added an explanation of why we recommend this pipeline.


## Setting a default model on an index or field

@@ -0,0 +1,150 @@
---
layout: default
title: Neural sparse query two-phase processor
nav_order: 13
parent: Search processors
grand_parent: Search pipelines
---

# Neural sparse query two-phase processor
Introduced 2.15
{: .label .label-purple }

The `neural_sparse_two_phase_processor` search processor is designed to provide faster search pipelines for [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). It accelerates the neural sparse query by breaking down the original method of scoring all documents using all tokens into two steps:

1. High-weight tokens score the documents, and the top-scoring documents are selected.
2. Low-weight tokens are used to rescore the top documents.
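
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's implementation: the function names are hypothetical, documents and queries are modeled as plain token-to-weight dictionaries scored with a dot product, tokens at or above the threshold are treated as high-weight, and the second-phase window is assumed to be the query size multiplied by the expansion rate, capped at the maximum window size. The parameter names mirror the processor's request fields.

```python
from typing import Dict, List, Tuple

SparseVector = Dict[str, float]


def split_tokens(query: SparseVector, prune_ratio: float) -> Tuple[SparseVector, SparseVector]:
    """Split query tokens into high- and low-weight groups.

    Assumption: the threshold is the largest token weight times prune_ratio,
    and tokens at or above the threshold count as high-weight.
    """
    threshold = max(query.values()) * prune_ratio
    high = {t: w for t, w in query.items() if w >= threshold}
    low = {t: w for t, w in query.items() if w < threshold}
    return high, low


def score(doc: SparseVector, tokens: SparseVector) -> float:
    """Dot product between a document and a set of query tokens."""
    return sum(w * doc.get(t, 0.0) for t, w in tokens.items())


def two_phase_search(
    docs: Dict[str, SparseVector],
    query: SparseVector,
    size: int = 10,
    prune_ratio: float = 0.4,
    expansion_rate: float = 5.0,
    max_window_size: int = 10000,
) -> List[Tuple[str, float]]:
    if not query:
        return []
    high, low = split_tokens(query, prune_ratio)

    # Phase 1: score every document with the high-weight tokens only
    # and keep the top candidates (assumed window size, see lead-in).
    window = min(int(size * expansion_rate), max_window_size)
    candidates = sorted(
        ((score(doc, high), doc_id) for doc_id, doc in docs.items()),
        reverse=True,
    )[:window]

    # Phase 2: add the low-weight token scores for the candidates only.
    rescored = sorted(
        ((s + score(docs[doc_id], low), doc_id) for s, doc_id in candidates),
        reverse=True,
    )
    return [(doc_id, s) for s, doc_id in rescored[:size]]


if __name__ == "__main__":
    docs = {
        "a": {"hello": 1.0, "world": 0.5},
        "b": {"hello": 0.2, "the": 2.0},
        "c": {"world": 1.0},
    }
    query = {"hello": 2.0, "world": 1.0, "the": 0.1}
    print(two_phase_search(docs, query, size=2))
```

In this sketch, the expensive part (scoring with every token) is limited to the candidate window, which is where the speedup is expected to come from.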

## Request fields

The following table lists all available request fields.

Field | Data type | Description
:--- | :--- | :---
`enabled` | Boolean | Controls whether the two-phase processor is enabled. Default is `true`.
`two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional.
`two_phase_parameter.prune_ratio` | Float | A ratio that determines how tokens are split into high-weight and low-weight tokens. The threshold is the token's max score * `prune_ratio`. Valid range is [0,1]. Default is `0.4`.
Collaborator:

"max score * prune_ratio" => "maximum score multiplied by its prune_ratio"?

`two_phase_parameter.expansion_rate` | Float | A rate that specifies how many documents are rescored during the second phase. The second-phase document number equals the query size (default is 10) multiplied by its expansion rate. Valid range is greater than 1.0. Default is `5.0`.
Collaborator:

If we want to use mathematical expressions, they need to be formatted as such. Otherwise, this looks like it should be something like "The second-phase document number equals the query size (default is 10) multiplied by its expansion rate."

`two_phase_parameter.max_window_size` | Integer | The maximum number of documents that can be processed using the two-phase processor. Valid range is greater than 50. Default is `10000`.
Collaborator:

I'm not following the description. Please revise and tag me on the rewrite.

`tag` | String | The processor's identifier. Optional.
`description` | String | A description of the processor. Optional.

## Example

The following example creates a search pipeline with a `neural_sparse_two_phase_processor` search request processor.

### Create search pipeline

The following example request creates a search pipeline with a `neural_sparse_two_phase_processor` search request processor and sets custom values for the two-phase parameters:

```json
PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "This processor is making two-phase processor.",
        "enabled": true,
        "two_phase_parameter": {
          "prune_ratio": custom_prune_ratio,
          "expansion_rate": custom_expansion_rate,
          "max_window_size": custom_max_window_size
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}
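
The `custom_*` placeholders in the preceding request need concrete values before the request can be sent. As a minimal sketch, the following Python snippet builds the request body and checks the values against the valid ranges listed in the request fields table. The function name `build_two_phase_pipeline` and the sample values are hypothetical, and the exact boundary checks are an assumption based on the table's wording rather than the server's own validation.

```python
import json


def build_two_phase_pipeline(
    prune_ratio: float = 0.4,
    expansion_rate: float = 5.0,
    max_window_size: int = 10000,
    enabled: bool = True,
) -> dict:
    """Build a search pipeline body for the two-phase processor.

    The range checks follow the request fields table (assumed boundaries).
    """
    if not 0.0 <= prune_ratio <= 1.0:
        raise ValueError("prune_ratio must be in the range [0,1]")
    if expansion_rate <= 1.0:
        raise ValueError("expansion_rate must be greater than 1.0")
    if max_window_size <= 50:
        raise ValueError("max_window_size must be greater than 50")

    return {
        "request_processors": [
            {
                "neural_sparse_two_phase_processor": {
                    "tag": "neural-sparse",
                    "description": "Two-phase processor with custom parameters.",
                    "enabled": enabled,
                    "two_phase_parameter": {
                        "prune_ratio": prune_ratio,
                        "expansion_rate": expansion_rate,
                        "max_window_size": max_window_size,
                    },
                }
            }
        ]
    }


if __name__ == "__main__":
    # Illustrative values only; tune them for your own workload.
    body = build_two_phase_pipeline(prune_ratio=0.3, expansion_rate=4.0, max_window_size=5000)
    print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of the `PUT /_search/pipeline/two_phase_search_pipeline` request.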

### Set search pipeline

After the two-phase pipeline is created, set the `index.search.default_pipeline` setting to the name of the pipeline for the index on which you want to use the two-phase pipeline:

Collaborator:

I'm not following the end here. Do we mean something like "to the name of the pipeline for the index on which you want to use the two-phase pipeline"?

```json
PUT /index-name/_settings
{
  "index.search.default_pipeline": "two_phase_search_pipeline"
}
```
{% include copy-curl.html %}

## Limitations

The `neural_sparse_two_phase_processor` has the following limitations.

### Version support

The `neural_sparse_two_phase_processor` can only be used with OpenSearch 2.15 or later.

### Compound query support

As of OpenSearch 2.15, only the Boolean [compound query]({{site.url}}{{site.baseurl}}/query-dsl/compound/index/) is supported.

Neural sparse queries and Boolean queries with a boost parameter (not boosting queries) are also supported.

#### Supported examples

The following examples show neural sparse queries with the supported query types.

##### Single neural sparse query

```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "<model-id>"
      }
    }
  }
}
```
{% include copy-curl.html %}

##### Neural sparse query nested in a Boolean query

```json
GET /my-nlp-index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "neural_sparse": {
            "passage_embedding": {
              "query_text": "Hi world",
              "model_id": "<model-id>"
            },
            "boost": 2.0
          }
        }
      ]
    }
  }
}
```
{% include copy-curl.html %}

## P99 latency metrics

On an OpenSearch cluster set up on three m5.4xlarge Amazon EC2 instances, OpenSearch conducted neural sparse query P99 latency tests on indexes corresponding to more than 10 datasets.

Collaborator:

"On an OpenSearch cluster set up on" => "For an OpenSearch cluster configured with"?

Collaborator:

It would be "On" or "Using" since the testing is done internally.


### Doc-only mode latency metric

In doc-only mode, the two-phase processor can significantly decrease query latency, as shown by the following latency metrics:

- Average latency without the two-phase processor: 53.56 ms
- Average latency with the two-phase processor: 38.61 ms

This results in an overall latency reduction of approximately 27.92%. Most indexes show a significant decrease in latency with the two-phase processor, with reductions ranging from 5.14% to 84.6%. The specific latency optimization values depend on the data distribution within the indexes.

### Bi-encoder mode latency metric

In bi-encoder mode, the two-phase processor can significantly decrease query latency, as shown by the following latency metrics:

- Average latency without the two-phase processor: 300.79 ms
- Average latency with the two-phase processor: 121.64 ms

This results in an overall latency reduction of approximately 59.56%. Most indexes show a significant decrease in latency with the two-phase processor, with reductions ranging from 1.56% to 82.84%. The specific latency optimization values depend on the data distribution within the indexes.
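
The overall reductions quoted for both modes follow from the average latencies as a relative difference. The short Python check below recomputes them from the rounded averages given in this section. The helper name is hypothetical, and because the inputs are already rounded to two decimal places, the doc-only result can differ from the quoted 27.92% in the last digit.

```python
def latency_reduction_pct(before_ms: float, after_ms: float) -> float:
    """Relative latency reduction, in percent, from before_ms to after_ms."""
    return (before_ms - after_ms) / before_ms * 100.0


if __name__ == "__main__":
    # Average P99 latencies reported for each mode (ms).
    doc_only = latency_reduction_pct(53.56, 38.61)
    bi_encoder = latency_reduction_pct(300.79, 121.64)
    print(f"doc-only reduction: {doc_only:.2f}%")
    print(f"bi-encoder reduction: {bi_encoder:.2f}%")
```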