Adds hard_bounds to histogram aggregations#59175
Conversation
Adds a hard_bounds parameter to explicitly limit the buckets that a histogram can generate. This is especially useful in case of open ended ranges that can produce a very large number of buckets. Closes elastic#50109
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
not-napoleon
left a comment
There was a problem hiding this comment.
This looks really good, thanks for fielding it. I know some users will be really happy to see this land :)
I left a couple of nits, but nothing major.
| offset = in.readLong(); | ||
| extendedBounds = in.readOptionalWriteable(ExtendedBounds::new); | ||
| extendedBounds = in.readOptionalWriteable(LongBounds::new); | ||
| if (in.getVersion().onOrAfter(Version.V_8_0_0)) { |
There was a problem hiding this comment.
You'll change this after the backport, I presume?
There was a problem hiding this comment.
Yes, that's just until backport.
| /** Set hard bounds on this histogram, specifying boundaries outside which buckets cannot be created. */ | ||
| public DateHistogramAggregationBuilder hardBounds(LongBounds hardBounds) { | ||
| if (hardBounds == null) { | ||
| throw new IllegalArgumentException("[hardBounds] must not be null: [" + name + "]"); |
There was a problem hiding this comment.
Why do we not allow setting the bounds to null? It's not a required parameter.
There was a problem hiding this comment.
This is consistent with all other optional parameters in this builder like order or externed_bounds.
...ava/org/elasticsearch/search/aggregations/bucket/histogram/DateRangeHistogramAggregator.java
Outdated
Show resolved
Hide resolved
...ava/org/elasticsearch/search/aggregations/bucket/histogram/DateRangeHistogramAggregator.java
Outdated
Show resolved
Hide resolved
|
|
||
| import static org.elasticsearch.common.xcontent.ConstructingObjectParser.optionalConstructorArg; | ||
|
|
||
| public class DoubleBounds implements ToXContentFragment, Writeable { |
There was a problem hiding this comment.
Am I correct that there's no common interface between DoubleBound and LongBound because we'd have to autobox in the collection loop? If so, maybe let's just leave a comment to that effect and maybe link the two in Javadoc, so there's at least something indicating there are two implementations here.
There was a problem hiding this comment.
I am not sure I follow. Where do you expect them to intersect?
| private double offset = 0; | ||
| private double minBound = Double.POSITIVE_INFINITY; | ||
| private double maxBound = Double.NEGATIVE_INFINITY; | ||
| private DoubleBounds hardBounds; |
There was a problem hiding this comment.
Should we convert minBound and maxBound into a DoubleBounds instance? if not, should probably just leave a comment that it's done this way for legacy reasons, not out of any specific need.
There was a problem hiding this comment.
I would like to refactor minBounds and maxBounds in a follow up PR. I just didn't feel like making this one more complicated than necessary. I will add TODO comment.
| return max; | ||
| } | ||
|
|
||
| public boolean contain(long time) { |
There was a problem hiding this comment.
time seems like a confusing name for this parameter. Maybe just value like the double version uses?
...in/java/org/elasticsearch/search/aggregations/bucket/histogram/RangeHistogramAggregator.java
Outdated
Show resolved
Hide resolved
...in/java/org/elasticsearch/search/aggregations/bucket/histogram/RangeHistogramAggregator.java
Outdated
Show resolved
Hide resolved
| ); | ||
| } | ||
|
|
||
| public void testOverlappingBounds() throws Exception { |
There was a problem hiding this comment.
I think it would be good to have a test for hard bounds exactly equal to extended bounds. That's a bit of an edge case, and we want to make sure it doesn't break down the line.
…ket/histogram/DateRangeHistogramAggregator.java Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
…ket/histogram/DateRangeHistogramAggregator.java Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
…ket/histogram/RangeHistogramAggregator.java Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
…ket/histogram/RangeHistogramAggregator.java Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
|
This PR requires a bit tricky backport, so we will let it slip into 7.10 in order to not disrupt feature freeze today. |
Adds a hard_bounds parameter to explicitly limit the buckets that a histogram can generate. This is especially useful in case of open ended ranges that can produce a very large number of buckets.
Refactors extendedBounds to use DoubleBounds instead of 2 variables. This is a follow up for #59175
Refactors extendedBounds to use DoubleBounds instead of 2 variables. This is a follow up for elastic#59175
|
Thanks for fixing this @imotov 👍 |
|
Should the renaming of Was upgrading elastic and dependencies today and came across this and was looking to see if there was any impact in simply renaming |
Adds a hard_bounds parameter to explicitly limit the buckets that a histogram
can generate. This is especially useful in case of open ended ranges that can
produce a very large number of buckets.
Closes #50109