Clean up how pipeline aggs check for multi-bucket by nik9000 · Pull Request #54161 · elastic/elasticsearch

nik9000 · 2020-03-25T10:33:38Z

Pipeline aggregations like stats_bucket, sum_bucket, and
percentiles_bucket only operate on buckets that have multiple buckets.
This adds support for those aggregations to geo_distance, ip_range,
auto_date_histogram, and rare_terms.

This all happened because we used a marker interface to mark compatible
aggs, MultiBucketAggregationBuilder and it was fairly easy to forget
to implement the interface.

This replaces the marker interface with an abstract method in
AggregationBuilder, bucketCardinality which makes you return NONE,
ONE, or MANY. The bucket aggregations can check for MANY. At
this point ONE and NONE amount to about the same thing, but I
suspect that'll be a useful distinction when validating bucket sorts.

Closes #53215

Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on buckets that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality` which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes elastic#53215

elasticmachine · 2020-03-25T10:33:41Z

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

nik9000 · 2020-03-25T13:03:10Z

@elasticmachine, run elasticsearch-ci/2

nik9000 · 2020-03-25T13:03:50Z

The bwc failures are legit. The fixes for these aggs need skips. The elasticsearch-ci/2 didn't look legit at first glance.

polyfractal

Rad, I like this. As a followup I wonder if we could give a similar treatment for "sequential" vs "non-sequential" aggs (e.g. date_histo has sequentially ordered buckets). Would allow us to cleanup that spot in pipeline validation where certain pipelines need sequential buckets.

(I think the merge conflicts are just little things, like removed generics, due to the VS merge. Should be easy enough to fix I hope)

polyfractal · 2020-03-27T17:21:32Z

...java/org/elasticsearch/search/aggregations/bucket/composite/CompositeAggregationBuilder.java

+         * This might be a bit of a lie though because you can't really use
+         * pipeline aggregations on composite aggregations.
+         */
+        return BucketCardinality.MANY;


Yeah... I wonder if we should set this to NONE for the time being, to prevent people from accidentally using pipelines on composite? There's an issue flying around somewhere discussing it... but last time we chatted, we decided to punt on the decision. Some pipelines are theoretically safe (bucket_script, operating on the bucket itself), but others like derivative are totally not kosher.

Yeah. Composite is a weird beast.

nik9000 · 2020-03-27T17:30:56Z

Rad, I like this. As a followup I wonder if we could give a similar treatment for "sequential" vs "non-sequential" aggs (e.g. date_histo has sequentially ordered buckets). Would allow us to cleanup that spot in pipeline validation where certain pipelines need sequential buckets.

I think so!

…c#54161) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on buckets that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality` which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes elastic#53215

* upstream/master: (447 commits) Adjust BWC version on node roles being sorted Deprecate node local storage setting (elastic#54374) Remove Unused BaseNodeRequest (elastic#54323) Decouple environment from DiscoveryNode (elastic#54373) Ensure that the output of node roles are sorted (elastic#54376) Do not stash environment in security (elastic#54372) Do not stash environment in machine learning (elastic#54371) Clean up how pipeline aggs check for multi-bucket (elastic#54161) Remove toString from Painless user tree nodes (elastic#54117) Docs: Use splitOnToken instead of custom function (elastic#48408) bwc: enable script cache in node stats (elastic#54362) bwc: Mute for script cache in node stats (elastic#54359) Test to enforce response to invalid data stream names Scripting: Use randomValueOtherThan in stats test (elastic#54356) Move transport decoding and aggregation to server (elastic#48263) Mute ScriptServiceTests.testContextCacheStats (elastic#54352) Improve checkstyle performance and readability (elastic#54308) Disable Gradle daemon when executing Windows packaging tests (elastic#54316) [Docs] Minor fix for SubmitAsyncSearchRequest.keepOnCompletion javadoc (elastic#54325) [DOCS] Fix typos in top metrics agg docs (elastic#54299) ... Conflicts: server/src/main/java/org/elasticsearch/index/IndexModule.java server/src/main/java/org/elasticsearch/index/IndexService.java server/src/main/java/org/elasticsearch/indices/IndicesService.java server/src/test/java/org/elasticsearch/index/IndexModuleTests.java

`master` wasn't sending `auto_date_histogram`'s `minimumIntervalExpression` over the wire to `7.x` nodes even though everything above `7.3` expected it. We never noticed this because we didn't have any yml tests for `auto_date_histogram` until I added some in elastic#54161. Closes elastic#54396

#54379) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on buckets that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality` which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes #53215

`master` wasn't sending `auto_date_histogram`'s `minimumIntervalExpression` over the wire to `7.x` nodes even though everything above `7.3` expected it. We never noticed this because we didn't have any yml tests for `auto_date_histogram` until I added some in #54161. Closes #54396

nik9000 · 2020-03-30T15:40:37Z

I've finished the backport but I haven't updated the branches on master to run the bwc tests properly. Before I do that I'm going to handle #54382 which I discovered while backporting.

Now that elastic#54161 is backported we can run its tests against nodes that have the fix.

Now that #54161 is backported we can run its tests against nodes that have the fix. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

nik9000 added >bug :Analytics/Aggregations Aggregations v8.0.0 v7.7.0 labels Mar 25, 2020

nik9000 requested review from not-napoleon, polyfractal and talevy March 25, 2020 10:33

bpintea added v7.8.0 and removed v7.7.0 labels Mar 25, 2020

nik9000 added 2 commits March 25, 2020 09:32

Merge branch 'master' into agg_multi_validation

af76a7f

Update skips

660e5ac

nik9000 mentioned this pull request Mar 25, 2020

Some multi-bucket aggs cannot be used in pipeline aggs #53215

Closed

$polyfractal$

polyfractal approved these changes Mar 27, 2020

View reviewed changes

nik9000 added 3 commits March 27, 2020 15:53

Merge branch 'master' into agg_multi_validation

93951b1

Merge branch 'master' into agg_multi_validation

97c642a

Nope

b3fcbba

nik9000 merged commit a0f7c4a into elastic:master Mar 28, 2020

nik9000 added the backport pending label Mar 28, 2020

nik9000 mentioned this pull request Mar 28, 2020

auto date histogram error in mixed version cluster #54382

Closed

nik9000 mentioned this pull request Mar 30, 2020

Fix a master->7.x serialization bug #54429

Merged

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Mar 30, 2020

Update skip after backport

ec86c6c

Now that elastic#54161 is backported we can run its tests against nodes that have the fix.

nik9000 mentioned this pull request Mar 30, 2020

Update skip after backport #54450

Merged

nik9000 added a commit that referenced this pull request Mar 31, 2020

Update skip after backport (#54450)

2437c5e

Now that #54161 is backported we can run its tests against nodes that have the fix. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

nik9000 removed the backport pending label Mar 31, 2020

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clean up how pipeline aggs check for multi-bucket#54161

Clean up how pipeline aggs check for multi-bucket#54161
nik9000 merged 6 commits intoelastic:masterfrom
nik9000:agg_multi_validation

nik9000 commented Mar 25, 2020

Uh oh!

elasticmachine commented Mar 25, 2020

Uh oh!

nik9000 commented Mar 25, 2020

Uh oh!

nik9000 commented Mar 25, 2020

Uh oh!

$@polyfractal$ polyfractal left a comment

Uh oh!

$@polyfractal$ polyfractal Mar 27, 2020

Uh oh!

nik9000 Mar 27, 2020

Uh oh!

nik9000 commented Mar 27, 2020

Uh oh!

nik9000 commented Mar 30, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

nik9000 commented Mar 25, 2020

Uh oh!

elasticmachine commented Mar 25, 2020

Uh oh!

nik9000 commented Mar 25, 2020

Uh oh!

nik9000 commented Mar 25, 2020

Uh oh!

polyfractal left a comment

Choose a reason for hiding this comment

Uh oh!

polyfractal Mar 27, 2020

Choose a reason for hiding this comment

Uh oh!

nik9000 Mar 27, 2020

Choose a reason for hiding this comment

Uh oh!

nik9000 commented Mar 27, 2020

Uh oh!

nik9000 commented Mar 30, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

$@polyfractal$ polyfractal left a comment

$@polyfractal$ polyfractal Mar 27, 2020