Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add token pruning to clients for text expansion query #2370

Merged
merged 8 commits into from
Dec 21, 2023

Conversation

kderusso
Copy link
Member

@kderusso kderusso commented Dec 18, 2023

Adds the following specification after merging elastic/elasticsearch#102862:

  • Optional pruning_config added to the text_expansion query
  • New weighted_tokens query

@kderusso kderusso marked this pull request as ready for review December 19, 2023 13:56
Copy link

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTm, thanks @kderusso

Copy link
Contributor

@Anaethelion Anaethelion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few questions, other than those it looks good!

implements AdditionalProperty<Field, FieldValue>
{
/** The field to query */
value: FieldValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docs there's this example which doesn't have a value property.

{
  "query": {
    "weighted_tokens": {
      "query_expansion": {
        "tokens": {"2161": 0.4679, "2621": 0.307, "2782": 0.1299, "2851": 0.1056, "3088": 0.3041, "3376": 0.1038, "3467": 0.4873, "3684": 0.8958, "4380": 0.334, "4542": 0.4636, "4633": 2.2805, "4785": 1.2628, "4860": 1.0655, "5133": 1.0709, "7139": 1.0016, "7224": 0.2486, "7387": 0.0985, "7394": 0.0542, "8915": 0.369, "9156": 2.8947, "10505": 0.2771, "11464": 0.3996, "13525": 0.0088, "14178": 0.8161, "16893": 0.1376, "17851": 1.5348, "19939": 0.6012},
        "pruning_config": {
          "tokens_freq_ratio_threshold": 5,
          "tokens_weight_threshold": 0.4,
          "only_score_pruned_tokens": false
        }
      }
    }
  }
}

Isn't that field provided by the SingleKeyDictionnary in which the query is embedded ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Anaethelion Thanks for the input. I wasn't quite sure how to model this SingleKeyDictionary. Does this look better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better! SingleKeyDictionary exactly reflects that layout.

I just noticed you have an AdditionalProperty that probably is not needed since you do not have a map with arbitrary keys to deserialize to. That should be removed otherwise it would allow an arbitrary key:value pair to be sent in the payload.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed!

specification/_types/query_dsl/TokenPruningConfig.ts Outdated Show resolved Hide resolved
@kderusso kderusso requested a review from Anaethelion December 21, 2023 13:44
Copy link
Contributor

@Anaethelion Anaethelion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@kderusso kderusso merged commit 6d9c9c8 into main Dec 21, 2023
5 checks passed
@kderusso kderusso deleted the kderusso/text-expansion-token-pruning branch December 21, 2023 17:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants