Description
Is your feature request related to a problem? Please describe
In #9246 we forced the doc count error to be 0 during the shard-level reduce phase because no buckets were being eliminated at that stage. However, this logic was changed to use a `slice_size = shard_size * 1.5 + 10` heuristic as part of #11585. This means it is now possible to eliminate bucket candidates during the shard-level reduce, so the doc count error needs to be calculated accordingly in those cases.
As an example, take this agg from the noaa OSB workload:
```json
{
  "size": 0,
  "aggs": {
    "station": {
      "terms": {
        "field": "station.elevation",
        "size": 50
      },
      "aggs": {
        "date": {
          "terms": {
            "field": "date",
            "size": 1
          },
          "aggs": {
            "max": {
              "max": {
                "field": "TMAX"
              }
            }
          }
        }
      }
    }
  }
}
```
The "date" aggregation uses `size = 1`, so the computed `slice_size` heuristic yields 26, which is fairly small compared to the cardinality of the "date" field.
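To illustrate the arithmetic, here is a minimal Python sketch of the heuristic described above. The function names are illustrative, and the exact rounding in the OpenSearch implementation may differ; the point is only how `size = 1` leads to a slice queue of 26:

```python
def shard_size_for(size: int) -> int:
    # Standard terms-aggregation shard_size heuristic: size * 1.5 + 10
    return int(size * 1.5 + 10)

def slice_size_for(size: int) -> int:
    # Per #11585, the slice-level queue applies the same heuristic again:
    # slice_size = shard_size * 1.5 + 10
    return int(shard_size_for(size) * 1.5 + 10)

# For the "date" sub-aggregation above (size = 1):
#   shard_size = 1 * 1.5 + 10  = 11
#   slice_size = 11 * 1.5 + 10 = 26 (truncated)
print(slice_size_for(1))  # → 26
```

With a `slice_size` of 26 and a high-cardinality `date` field, each slice necessarily drops candidate terms, which is exactly where the unaccounted doc count error comes from.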
Attaching the aggregation outputs with concurrent search enabled/disabled:
cs-disabled.txt
cs-enabled.txt
Describe the solution you'd like
The doc count error needs to be calculated in a way that accounts for the buckets eliminated during the slice-level reduce.
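One possible shape for this, mirroring how the shard-level `doc_count_error_upper_bound` is derived from the last bucket each shard returns, would be to accumulate the doc count of the last bucket kept by each slice that filled its `slice_size` queue. The following Python sketch is hypothetical (the function and its inputs are not the actual OpenSearch internals), just to make the idea concrete:

```python
def slice_reduce_error(slice_doc_counts, slice_size):
    """Upper bound on doc counts missed by terms eliminated at the
    slice-level reduce.

    slice_doc_counts: per-slice lists of bucket doc counts, sorted
    descending, as produced by each slice's priority queue (hypothetical
    representation).
    """
    error = 0
    for counts in slice_doc_counts:
        # Only a slice that filled its queue can have dropped terms; any
        # dropped term has at most as many docs as the last kept bucket.
        if len(counts) >= slice_size:
            error += counts[-1]
    return error

# Two slices with slice_size = 3: each full queue contributes its
# smallest kept doc count to the error bound.
print(slice_reduce_error([[10, 5, 3], [8, 4, 2]], 3))  # → 5
```

This per-slice error would then need to be carried into (and combined with) the existing shard-level and coordinator-level error accounting rather than being forced to 0.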
Related component
Search:Performance
Describe alternatives you've considered
No response
Additional context
No response