Deprecate max_token_score in neural sparse search #6554

Merged
8 changes: 3 additions & 5 deletions _query-dsl/specialized/neural-sparse.md
@@ -20,8 +20,7 @@ Include the following request fields in the `neural_sparse` query:
"neural_sparse": {
"<vector_field>": {
"query_text": "<query_text>",
- "model_id": "<model_id>",
- "max_token_score": "<max_token_score>"
+ "model_id": "<model_id>"
}
}
```
@@ -32,7 +31,7 @@ Field | Data type | Required/Optional | Description
:--- | :--- | :---
`query_text` | String | Required | The query text from which to generate vector embeddings.
`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
- `max_token_score` | Float | Optional | The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`.
+ `max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated from 2.12 release.

#### Example request

@@ -43,8 +42,7 @@ GET my-nlp-index/_search
"neural_sparse": {
"passage_embedding": {
"query_text": "Hi world",
- "model_id": "aP2Q8ooBpBj3wT4HVS8a",
- "max_token_score": 2
+ "model_id": "aP2Q8ooBpBj3wT4HVS8a"
}
}
}
3 changes: 1 addition & 2 deletions _search-plugins/neural-sparse-search.md
@@ -154,8 +154,7 @@ GET my-nlp-index/_search
"neural_sparse": {
"passage_embedding": {
"query_text": "Hi world",
- "model_id": "aP2Q8ooBpBj3wT4HVS8a",
- "max_token_score": 2
+ "model_id": "aP2Q8ooBpBj3wT4HVS8a"
}
}
}
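
For callers that still pass `max_token_score`, a client-side helper along these lines might make the migration easier. This is only a sketch and not part of this change: `build_neural_sparse_query` and `DEPRECATED_FIELDS` are hypothetical names, and the top-level `"query"` wrapper is assumed from the standard search request body, since the hunks above start below it.

```python
import json

# Hypothetical list of fields to drop; max_token_score is the field
# this change marks as deprecated.
DEPRECATED_FIELDS = {"max_token_score"}


def build_neural_sparse_query(vector_field, query_text, model_id, **extra):
    """Build a neural_sparse search body, silently dropping deprecated fields."""
    clause = {"query_text": query_text, "model_id": model_id}
    for key, value in extra.items():
        if key in DEPRECATED_FIELDS:
            continue  # skip fields that should no longer be sent
        clause[key] = value
    # Assumed request shape: {"query": {"neural_sparse": {<field>: {...}}}}
    return {"query": {"neural_sparse": {vector_field: clause}}}


body = build_neural_sparse_query(
    "passage_embedding",
    "Hi world",
    "aP2Q8ooBpBj3wT4HVS8a",
    max_token_score=2,  # legacy argument, removed from the output
)
print(json.dumps(body, indent=2))
```

The resulting body matches the updated example request: only `query_text` and `model_id` remain under `passage_embedding`.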