Extend cases for #11643 - optimization for string terms aggregation #13119

sandeshkr419 · 2024-04-08T17:25:33Z

With respect to the above PR, when optimizing the string terms aggregation for single terms, we omitted some cases:

Deleted documents in a segment
Doc frequency exists in any document of a segment

With the optimization, it should be possible to handle deleted documents in a segment gracefully: compute the aggregation using the same optimization, then traverse only the deleted documents.
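A minimal sketch of that idea in plain Python, not the actual OpenSearch or Lucene code. It assumes the per-term document frequencies read from a segment's term dictionary still include deleted documents (as Lucene's `docFreq` does until segments are merged), so visiting only the deleted documents and subtracting their terms yields the live counts. The function name `corrected_term_counts` and both inputs are hypothetical stand-ins for segment data.

```python
from collections import Counter


def corrected_term_counts(doc_freq, deleted_doc_terms):
    """Return per-term counts for live documents only.

    doc_freq: dict mapping term -> document frequency as stored in the
        segment (assumed to still count deleted documents).
    deleted_doc_terms: iterable with one entry per deleted document, each
        entry being the list of terms that document holds for the field.
    """
    # Start from the precomputed frequencies instead of visiting every doc.
    counts = Counter(doc_freq)

    # Traverse only the deleted documents and undo their contribution.
    for terms in deleted_doc_terms:
        for term in set(terms):  # a doc counts at most once per term
            counts[term] -= 1

    # Drop terms that no live document contains any more.
    return {term: n for term, n in counts.items() if n > 0}
```

The cost of the correction step grows with the number of deleted documents rather than with the segment size, which is why the share of deleted documents matters for deciding when this path is worthwhile.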

Need to figure out:

What the ideal thresholds should be for the number of deleted documents relative to the total number of documents in a segment. This will require collecting data by experimenting with different percentages of deleted documents.

Note that we discuss this on a per-segment basis because the decision to apply the optimization (or not) is also made at the segment level.
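As an illustration only, the per-segment decision could look like the sketch below. The function name `use_optimized_path` and the default `max_deleted_ratio=0.1` are placeholders, not measured or proposed values; the actual threshold is exactly what the experiments above would need to determine. `max_doc` and `num_docs` follow the usual Lucene meaning (documents including deletions, and live documents only).

```python
def use_optimized_path(max_doc, num_docs, max_deleted_ratio=0.1):
    """Decide, for one segment, whether to take the optimized path.

    max_doc: number of documents in the segment, deleted ones included.
    num_docs: number of live (non-deleted) documents in the segment.
    max_deleted_ratio: placeholder threshold; the real value is an open
        question that needs benchmarking.
    """
    if max_doc == 0:
        return False  # empty segment, nothing to aggregate

    deleted_ratio = (max_doc - num_docs) / max_doc
    return deleted_ratio <= max_deleted_ratio
```

Because the check uses only per-segment counts, different segments of the same index can take different paths.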

peternied · 2024-04-10T15:23:56Z

[Triage - attendees 1 2 3 4 5 6]
@sandeshkr419 Thanks for creating this issue; we look forward to seeing a pull request to resolve it.

From @peternied: please use the template for issue creation.

sandeshkr419 added this to Performance Roadmap Apr 8, 2024

sandeshkr419 converted this from a draft issue Apr 8, 2024

github-actions bot added the untriaged label Apr 8, 2024

sandeshkr419 added Search:Aggregations Search:Performance labels Apr 8, 2024

github-project-automation bot added this to Search Project Board Apr 8, 2024

github-project-automation bot moved this to 🆕 New in Search Project Board Apr 8, 2024

peternied removed the untriaged label Apr 10, 2024
