
Add doc for neural-sparse-query-two-phase-processor. #7306

Merged: 13 commits on Jun 14, 2024
5 changes: 5 additions & 0 deletions _search-plugins/neural-sparse-search.md
@@ -30,6 +30,7 @@
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-sparse-search).
1. [Create and enable the two-phase processor (Optional)](#step-5-create-and-enable-two-phase-processor-optional).

## Step 1: Create an ingest pipeline

@@ -261,6 +262,10 @@
}
}
```
## Step 5: Create and enable the two-phase processor (Optional)

The `neural_sparse_two_phase_processor` is a new feature introduced in OpenSearch 2.15. It can reduce neural sparse query latency with negligible accuracy loss.

For more information, see [neural-sparse-query-two-phase-processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-sparse-query-two-phase-processor/).
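For example, the following request creates a two-phase pipeline using all default parameter values (a minimal sketch; the pipeline name is arbitrary, and all omitted fields fall back to their documented defaults):

```json
PUT /_search/pipeline/default_two_phase_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "enabled": true
      }
    }
  ]
}
```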

Member: Can we provide an example API call, like in the section "Setting a default model on an index or field"? Using all default values is also fine; this would help users a lot.

Member: We should also mention that setting this processor is strongly recommended.

Contributor (author): Added an example API call for setting a default two-phase pipeline and an explanation of why we recommend this pipeline.


## Setting a default model on an index or field

@@ -0,0 +1,124 @@
---
layout: default
title: NeuralSparse query two-phase processor
nav_order: 13
has_children: false
parent: Search processors
grand_parent: Search pipelines
---

# NeuralSparse query two-phase processor

The `neural_sparse_two_phase_processor` search request processor sets up a speed-up pipeline for [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). It accelerates the neural sparse query by splitting the original method of scoring all documents with all tokens into two steps. In the first step, high-weight tokens score the documents and filter out the top documents; in the second step, low-weight tokens rescore only those top documents.
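Conceptually, the two-step scoring can be sketched as follows. This is a minimal Python sketch over a toy in-memory index of token-weight maps, not the actual plugin implementation; the parameter names mirror the processor's `prune_ratio`, `expansion_rate`, and `max_window_size` fields.

```python
def split_tokens(query_tokens, prune_ratio=0.4):
    """Split query tokens into high- and low-weight sets.

    The threshold is the maximum token weight multiplied by prune_ratio,
    matching the processor's `prune_ratio` semantics.
    """
    threshold = max(query_tokens.values()) * prune_ratio
    high = {t: w for t, w in query_tokens.items() if w >= threshold}
    low = {t: w for t, w in query_tokens.items() if w < threshold}
    return high, low

def two_phase_search(index, query_tokens, size=10, expansion_rate=5.0,
                     max_window_size=10000, prune_ratio=0.4):
    """Two-phase scoring over a toy index: {doc_id: {token: weight}}."""
    high, low = split_tokens(query_tokens, prune_ratio)
    # Phase 1: score every document using only the high-weight tokens.
    scores = {doc_id: sum(w * doc.get(t, 0.0) for t, w in high.items())
              for doc_id, doc in index.items()}
    # Keep only the top candidates; the window size is
    # min(size * expansion_rate, max_window_size).
    window = min(int(size * expansion_rate), max_window_size)
    top = sorted(scores, key=scores.get, reverse=True)[:window]
    # Phase 2: rescore only the top candidates with the low-weight tokens.
    for doc_id in top:
        scores[doc_id] += sum(w * index[doc_id].get(t, 0.0)
                              for t, w in low.items())
    return sorted(top, key=scores.get, reverse=True)[:size]
```

Because phase 2 touches only the candidate window rather than every matching document, the bulk of the postings traversal is done with the short high-weight token list.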
Member: "Fine-tune" is usually used for ML models; maybe we should use "rescore" here.

Contributor (author): Replaced with the word "rescore".


## Request fields

The following table lists all available request fields.

Field | Data type | Description
:--- | :--- | :---
`enabled` | Boolean | Controls whether two-phase processing is enabled. Default is `true`.
`two_phase_parameter` | Object | A map of key-value pairs representing the two-phase parameters and their associated values. You can specify the value of `prune_ratio`, `expansion_rate`, `max_window_size`, or any combination of these three parameters. Optional.
`two_phase_parameter.prune_ratio` | Float | A ratio that represents how to split the high-weight tokens and low-weight tokens. The threshold is the token's maximum score multiplied by `prune_ratio`. Default is `0.4`. Valid range is [0, 1].
`two_phase_parameter.expansion_rate` | Float | The rate that specifies how many documents are rescored during the second phase. The number of second-phase documents equals the query size (default is 10) multiplied by `expansion_rate`. Default is `5.0`. Valid values are greater than 1.0.
`two_phase_parameter.max_window_size` | Int | The maximum number of documents that can be rescored in the second phase. Default is `10000`. Valid values are greater than 50.
`tag` | String | The processor's identifier. Optional.
`description` | String | A description of the processor. Optional.

## Example

The following example shows how to create a two-phase search pipeline and set it as the default pipeline for an index.

### Create a search pipeline


The following example request creates a search pipeline with a `neural_sparse_two_phase_processor` search request processor. The processor is configured with example values for the two-phase parameters:

```json
PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor.",
        "enabled": true,
        "two_phase_parameter": {
          "prune_ratio": 0.4,
          "expansion_rate": 5.0,
          "max_window_size": 10000
        }
      }
    }
  ]
}
```
{% include copy-curl.html %}

### Set the search pipeline

Next, on the target index, set the `index.search.default_pipeline` setting to the name of the pipeline:

```json
PUT /index-name/_settings
{
  "index.search.default_pipeline" : "two_phase_search_pipeline"
}
```
{% include copy-curl.html %}
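Alternatively, you can apply the pipeline to a single request, without changing the index default, by using the `search_pipeline` query parameter (shown here with a minimal neural sparse query on a hypothetical `passage_embedding` field):

```json
GET /index-name/_search?search_pipeline=two_phase_search_pipeline
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world"
      }
    }
  }
}
```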

## Limitations

The `neural_sparse_two_phase_processor` has the following limitations.

### Version support

The `neural_sparse_two_phase_processor` was introduced in OpenSearch 2.15. You can use this pipeline only in a cluster in which all nodes run OpenSearch 2.15 or later.

### Compound query support

There are 6 types of [compound queries]({{site.url}}{{site.baseurl}}/query-dsl/compound/index/). Currently, only the `bool` (Boolean) query is supported.

- [x] `bool` (Boolean)

- [ ] `boosting`
- [ ] `constant_score`
- [ ] `dis_max` (disjunction max)
- [ ] `function_score`
- [ ] `hybrid`

Note that a neural sparse query or a `bool` query with a `boost` parameter (which is not the same as a boosting query) is also supported.


#### Supported examples

The following examples show supported query types.

##### Single neural sparse query


```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world"
      }
    }
  }
}
```
{% include copy-curl.html %}

##### Neural sparse query nested in a `bool` query


```json
GET /my-nlp-index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "neural_sparse": {
            "passage_embedding": {
              "query_text": "Hi world",
              "model_id": <model-id>
            },
            "boost": 2.0
          }
        }
      ]
    }
  }
}
```
{% include copy-curl.html %}

## Metrics

In doc-only mode, the two-phase processor reduces end-to-end query latency by 20% to 50%, depending on the index's data distribution, the dataset size, and the two-phase parameters.
Member: Is this sparse_vector search latency or end-to-end latency?

Member: Also, regarding "depending on the index configuration and two-phase parameters": do you mean this depends on the dataset distribution? I think we're using the default two-phase parameters and consistent index settings in our experiments.

Contributor (author):

> is this sparse_vector search or end to end latency

End to end.

Contributor (author): Yes, I corrected the description to refer to the index-specific data distribution and dataset size.

Member:

> End to end.

I think the speed-up ratio on pure sparse vector retrieval is more important. The model inference cost and the retrieval cost are orthogonal, and the two-phase pipeline only optimizes the retrieval cost.

In bi-encoder mode, the two-phase processor can decrease end-to-end query latency by up to 90%, also depending on the index's data distribution and the two-phase parameters.