[Concurrent Segment Search] Doc count error needs to be computed at the slice level #11680
Comments
Copying over the explanation of how the With the changes in this PR, there are basically 3 different cases when comparing the results from non-concurrent search vs. concurrent search. I'll illustrate this with the
Now, there are basically 3 ways that using concurrent search with the 1. Some buckets may have a lower
I think it's worth mentioning that the same behavior exists in the non-concurrent path at the shard level, depending on how data is distributed across the shards. In most cases, if a shard has x, y, z as its top terms, each segment will probably (though this is not guaranteed) also have them as its top terms, and they should be reflected in the response.
Is your feature request related to a problem? Please describe
For #9246 we forced doc count error to be 0 during the shard level reduce phase as we were not eliminating any buckets at that stage. However, this logic was changed to use a `slice_size = shard_size * 1.5 + 10` heuristic as a part of #11585. This means that it's now possible to eliminate bucket candidates during the shard level reduce, so the doc count error needs to be calculated accordingly in those cases.
As an example, take this agg from the noaa OSB workload:
The `"date"` aggregation uses `size = 1`, so the computed `slice_size` heuristic will be `26`, which is fairly small compared to the cardinality of the "date" field.
Attaching the aggregation outputs with concurrent search enabled/disabled:
cs-disabled.txt
cs-enabled.txt
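For reference, the arithmetic behind the `26` figure can be sketched as follows. This is only an illustration, not the actual implementation: the helper name `suggest_queue_size` is hypothetical, and it assumes that `shard_size` is left at its default (derived from `size` with the same `* 1.5 + 10` heuristic) and that the result is truncated to an integer.

```python
def suggest_queue_size(size: int) -> int:
    """Hypothetical helper: apply the `* 1.5 + 10` heuristic, truncating to an int."""
    return int(size * 1.5 + 10)


# Assumption: shard_size is not set explicitly, so it is derived from size.
size = 1
shard_size = suggest_queue_size(size)        # 1 * 1.5 + 10 = 11.5, truncated to 11
slice_size = suggest_queue_size(shard_size)  # 11 * 1.5 + 10 = 26.5, truncated to 26

print(shard_size, slice_size)
```

With only 26 candidate buckets kept per slice, any slice whose "date" cardinality exceeds 26 will drop buckets before the shard level reduce.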
Describe the solution you'd like
Doc count error needs to be calculated in a way that includes the buckets eliminated at the slice level reduce.
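One possible way to do this, sketched below, is to apply the same approach that the terms aggregation already uses across shards: when a slice drops buckets, the doc count of its last kept bucket bounds the count of any term it dropped, and these bounds are summed during the reduce. The function names and sample data are hypothetical, and the sketch assumes buckets are ordered by descending doc count; it is not the actual fix.

```python
from collections import Counter


def slice_top_terms(term_counts, slice_size):
    """Return (kept buckets, error bound) for a single slice."""
    # Order by descending doc count, breaking ties by term for determinism.
    ordered = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ordered[:slice_size]
    # If buckets were dropped, a dropped term has at most the doc count
    # of the last bucket that was kept.
    error = top[-1][1] if top and len(ordered) > slice_size else 0
    return dict(top), error


def reduce_slices(slices, slice_size):
    """Merge per-slice buckets and sum the per-slice error bounds."""
    merged = Counter()
    error_bound = 0
    for term_counts in slices:
        top, error = slice_top_terms(term_counts, slice_size)
        merged.update(top)  # adds doc counts for terms seen in several slices
        error_bound += error
    return dict(merged), error_bound


# Hypothetical data: two slices of one shard.
slices = [
    {"a": 10, "b": 8, "c": 7},
    {"a": 9, "c": 6, "b": 5},
]
merged, error_bound = reduce_slices(slices, slice_size=2)
print(merged, error_bound)
```

In this example, "b" is reported with 8 docs although its true count is 13; the difference of 5 stays within the summed bound of 14, whereas forcing the error to 0 would hide it.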
Related component
Search:Performance
Describe alternatives you've considered
No response
Additional context
No response