Deprecate max_token_score in neural sparse search (#6554)
* deprecated max_token_score

Signed-off-by: zhichao-aws <[email protected]>

* Update _query-dsl/specialized/neural-sparse.md

Signed-off-by: kolchfa-aws <[email protected]>

---------

Signed-off-by: zhichao-aws <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
zhichao-aws and kolchfa-aws authored Mar 1, 2024
1 parent ee2b67f commit 5f486ab
Showing 2 changed files with 4 additions and 7 deletions.
8 changes: 3 additions & 5 deletions _query-dsl/specialized/neural-sparse.md
@@ -20,8 +20,7 @@ Include the following request fields in the `neural_sparse` query:
 "neural_sparse": {
 "<vector_field>": {
 "query_text": "<query_text>",
-"model_id": "<model_id>",
-"max_token_score": "<max_token_score>"
+"model_id": "<model_id>"
 }
 }
 ```
@@ -32,7 +31,7 @@ Field | Data type | Required/Optional | Description
 :--- | :--- | :---
 `query_text` | String | Required | The query text from which to generate vector embeddings.
 `model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
-`max_token_score` | Float | Optional | The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`.
+`max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12.

 #### Example request

@@ -43,8 +42,7 @@ GET my-nlp-index/_search
 "neural_sparse": {
 "passage_embedding": {
 "query_text": "Hi world",
-"model_id": "aP2Q8ooBpBj3wT4HVS8a",
-"max_token_score": 2
+"model_id": "aP2Q8ooBpBj3wT4HVS8a"
 }
 }
 }
3 changes: 1 addition & 2 deletions _search-plugins/neural-sparse-search.md
@@ -154,8 +154,7 @@ GET my-nlp-index/_search
 "neural_sparse": {
 "passage_embedding": {
 "query_text": "Hi world",
-"model_id": "aP2Q8ooBpBj3wT4HVS8a",
-"max_token_score": 2
+"model_id": "aP2Q8ooBpBj3wT4HVS8a"
 }
 }
 }
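
Callers that still send `max_token_score` may want to drop it from their requests, matching the updated examples in this commit. The following is a minimal sketch in Python, assuming the search body is held as a plain dict before it is sent. The helper name `strip_max_token_score` is hypothetical and is not part of OpenSearch or any client library.

```python
import copy


def strip_max_token_score(body):
    """Return a copy of a search body with max_token_score removed
    from every neural_sparse clause. Hypothetical helper for illustration."""

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "neural_sparse" and isinstance(value, dict):
                    # Each entry maps a vector field name to its query parameters.
                    for field_params in value.values():
                        if isinstance(field_params, dict):
                            field_params.pop("max_token_score", None)
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    cleaned = copy.deepcopy(body)  # leave the caller's dict untouched
    walk(cleaned)
    return cleaned


# Request body in the pre-deprecation shape shown in the diff above.
old_body = {
    "query": {
        "neural_sparse": {
            "passage_embedding": {
                "query_text": "Hi world",
                "model_id": "aP2Q8ooBpBj3wT4HVS8a",
                "max_token_score": 2,
            }
        }
    }
}

new_body = strip_max_token_score(old_body)
```

The helper walks nested dicts and lists, so it should also handle a `neural_sparse` clause wrapped in a compound query such as `bool`; that nesting is an assumption about how callers build their requests, not something this commit specifies.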
