Unify precomputation of aggregations behind a common API #16733

Merged
jainankitk merged 5 commits into opensearch-project:main from msfroh:agg_precomputation_API
Jan 30, 2025

Conversation

@msfroh
Contributor

@msfroh msfroh commented Nov 27, 2024

Description

We've had a series of aggregation speedups that use the same strategy: instead of iterating through documents that match the query one-by-one, we can look at a Lucene segment and compute the aggregation directly (if some particular conditions are met).

In every case, we've hooked that into custom logic that hijacks the getLeafCollector method and throws CollectionTerminatedException. This creates the illusion that we're implementing a custom LeafCollector, when really we're not collecting at all (which is the whole point).

With this refactoring, the mechanism (hijacking getLeafCollector) is moved into AggregatorBase. Aggregators that have a strategy to precompute their answer can override tryPrecomputeAggregationForLeaf, which is expected to return true if they managed to precompute.

This should also make it easier to keep track of which aggregations have precomputation approaches (since they override this method).
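
To make the mechanism above concrete, here is a minimal, self-contained sketch. It does not use the real Lucene or OpenSearch classes: `Segment`, `CollectionTerminated`, `LeafCollector`, and `doGetLeafCollector` are simplified, hypothetical stand-ins, and the "known max" on a segment is an assumed placeholder for whatever per-segment data a real aggregator could consult. Only the names `AggregatorBase`, `getLeafCollector`, and `tryPrecomputeAggregationForLeaf` come from the description.

```java
import java.util.List;

public class PrecomputeSketch {

    /** Hypothetical stand-in for Lucene's CollectionTerminatedException. */
    static class CollectionTerminated extends RuntimeException {}

    /** Hypothetical stand-in for a segment: one numeric field plus an optional known max. */
    static class Segment {
        final long[] values;
        final Long knownMax; // null when the conditions for the shortcut are not met

        Segment(long[] values, Long knownMax) {
            this.values = values;
            this.knownMax = knownMax;
        }
    }

    interface LeafCollector {
        void collect(int doc);
    }

    abstract static class AggregatorBase {
        /** Override to answer directly from the segment; return true on success. */
        protected boolean tryPrecomputeAggregationForLeaf(Segment segment) {
            return false;
        }

        /** The normal doc-by-doc path (hypothetical name for this sketch). */
        protected abstract LeafCollector doGetLeafCollector(Segment segment);

        /** Shared mechanism: attempt precomputation first, otherwise fall back. */
        public final LeafCollector getLeafCollector(Segment segment) {
            if (tryPrecomputeAggregationForLeaf(segment)) {
                throw new CollectionTerminated(); // tells the caller to skip this segment
            }
            return doGetLeafCollector(segment);
        }
    }

    static class MaxAggregator extends AggregatorBase {
        long max = Long.MIN_VALUE;
        int docsVisited = 0;

        @Override
        protected boolean tryPrecomputeAggregationForLeaf(Segment segment) {
            if (segment.knownMax == null) {
                return false; // cannot precompute; collect documents one by one
            }
            max = Math.max(max, segment.knownMax);
            return true;
        }

        @Override
        protected LeafCollector doGetLeafCollector(Segment segment) {
            return doc -> {
                docsVisited++;
                max = Math.max(max, segment.values[doc]);
            };
        }
    }

    /** Simplified search loop: a terminated segment is skipped without visiting its docs. */
    static void search(List<Segment> segments, AggregatorBase agg) {
        for (Segment segment : segments) {
            LeafCollector collector;
            try {
                collector = agg.getLeafCollector(segment);
            } catch (CollectionTerminated e) {
                continue;
            }
            for (int doc = 0; doc < segment.values.length; doc++) {
                collector.collect(doc);
            }
        }
    }

    public static void main(String[] args) {
        MaxAggregator agg = new MaxAggregator();
        search(List.of(new Segment(new long[] {3, 9, 4}, 9L), new Segment(new long[] {7, 12, 5}, null)), agg);
        System.out.println("max=" + agg.max + " docsVisited=" + agg.docsVisited);
    }
}
```

In this sketch only the second segment is iterated document by document; the first contributes its value through the precomputation hook, and the subclass never has to throw the exception itself.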

Related Issues

N/A

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

❌ Gradle check result for 4d5c32b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sandeshkr419
Member

Regarding the implementation, I have one more alternative that I think is worth discussing. How about introducing this abstraction in ContextIndexSearcher itself?

            weight = wrapWeight(weight);
            // See please https://github.com/apache/lucene/pull/964
            collector.setWeight(weight);
            leafCollector = collector.getLeafCollector(ctx);

Basically, if we have already precomputed the aggregations, we assign the leaf collector as an EarlyTerminationCollector.

What I'm thinking about is cases with sub-aggregations that we can pre-compute, which is highly relevant for star-tree pre-computation (e.g., #16674), and whether a dedicated abstraction for star-tree preCompute in ContextIndexSearcher would make more sense.

@github-actions
Contributor

✅ Gradle check result for 4d5c32b: SUCCESS

@codecov

codecov bot commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 82.05128% with 14 lines in your changes missing coverage. Please review.

Project coverage is 72.32%. Comparing base (cd149a9) to head (1c3c990).
Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rch/search/aggregations/metrics/MinAggregator.java | 60.00% | 2 Missing and 2 partials ⚠️ |
| ...ket/terms/GlobalOrdinalsStringTermsAggregator.java | 84.21% | 2 Missing and 1 partial ⚠️ |
| ...rch/aggregations/metrics/ValueCountAggregator.java | 66.66% | 2 Missing and 1 partial ⚠️ |
| ...rch/search/aggregations/metrics/MaxAggregator.java | 80.00% | 1 Missing and 1 partial ⚠️ |
| ...rch/search/aggregations/metrics/AvgAggregator.java | 85.71% | 1 Missing ⚠️ |
| ...rch/search/aggregations/metrics/SumAggregator.java | 87.50% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16733      +/-   ##
============================================
- Coverage     72.41%   72.32%   -0.09%     
- Complexity    65626    65712      +86     
============================================
  Files          5306     5319      +13     
  Lines        304927   305722     +795     
  Branches      44257    44348      +91     
============================================
+ Hits         220804   221107     +303     
- Misses        66007    66573     +566     
+ Partials      18116    18042      -74     

@msfroh
Contributor Author

msfroh commented Jan 10, 2025

@jainankitk -- you're probably the maintainer (other than me) with the most context into this change. What do you think?

@msfroh force-pushed the agg_precomputation_API branch from c3897a0 to 19a40cc, January 29, 2025 20:06
@msfroh
Contributor Author

msfroh commented Jan 29, 2025

@expani, @sandeshkr419 -- I resolved conflicts with your recent star-tree changes. Can you please take a look?

@github-actions
Contributor

❌ Gradle check result for 19a40cc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sandeshkr419
Member

One high-level class I see missing among the metric aggregators is AvgAggregator.java, which involves similar pre-computations.

Signed-off-by: Michael Froh <froh@amazon.com>
@msfroh force-pushed the agg_precomputation_API branch from 4ac8bcb to caceb62, January 29, 2025 23:13
@github-actions
Contributor

✅ Gradle check result for caceb62: SUCCESS

Signed-off-by: Michael Froh <froh@amazon.com>
@github-actions
Contributor

✅ Gradle check result for 1c3c990: SUCCESS

@jainankitk jainankitk merged commit 2847695 into opensearch-project:main Jan 30, 2025
30 checks passed
@jainankitk jainankitk added the backport 2.x Backport to 2.x branch label Jan 30, 2025
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 30, 2025
* Unify precomputation of aggregations behind a common API

We've had a series of aggregation speedups that use the same strategy:
instead of iterating through documents that match the query
one-by-one, we can look at a Lucene segment and compute the
aggregation directly (if some particular conditions are met).

In every case, we've hooked that into custom logic that hijacks the
getLeafCollector method and throws CollectionTerminatedException. This
creates the illusion that we're implementing a custom LeafCollector,
when really we're not collecting at all (which is the whole point).

With this refactoring, the mechanism (hijacking getLeafCollector) is
moved into AggregatorBase. Aggregators that have a strategy to
precompute their answer can override tryPrecomputeAggregationForLeaf,
which is expected to return true if they managed to precompute.

This should also make it easier to keep track of which aggregations
have precomputation approaches (since they override this method).

Signed-off-by: Michael Froh <froh@amazon.com>

* Remove subaggregator check from CompositeAggregator

Not sure why I added this, when the existing implementation didn't have it.

That said, we *should* call finishLeaf() before precomputing the current leaf.

Signed-off-by: Michael Froh <froh@amazon.com>

* Resolve conflicts with star-tree changes

Signed-off-by: Michael Froh <froh@amazon.com>

* Skip precomputation when valuesSource is null

Signed-off-by: Michael Froh <froh@amazon.com>

* Add comment as suggested by @bowenlan-amzn

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
(cherry picked from commit 2847695)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
msfroh pushed a commit that referenced this pull request Jan 30, 2025
…7197)

(cherry picked from commit 2847695)

Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@sandeshkr419
Member

sandeshkr419 commented Jan 30, 2025

@msfroh Since this change is not a feature update, should we also backport it to 2.19?

One major advantage I see in backporting to 2.19 is that any critical bug fixes we need to backport to 2.19 in the future can be applied easily, without having to worry about making too many manual changes. Thoughts?

cc - @rishabh6788 (2.19 Release Manager)

@msfroh
Contributor Author

msfroh commented Jan 30, 2025

> @msfroh Since this change is not a feature update, should we also backport it to 2.19?
>
> One major advantage I see in backporting to 2.19 is that any critical bug fixes we need to backport to 2.19 in the future can be applied easily, without having to worry about making too many manual changes. Thoughts?
>
> cc - @rishabh6788 (2.19 Release Manager)

That's a good question. Part of me says, "Well, I missed the 2.19 cut-off, so too bad". On the other hand, your argument about avoiding merge conflicts is also relevant. I'll defer to @rishabh6788's judgement.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 30, 2025
@sandeshkr419
Member

sandeshkr419 commented Jan 30, 2025

Discussed with @rishabh6788 offline. We agreed to include this for the aforementioned reason. Adding the backport 2.19 label so the bot creates a backport PR.

Labels

backport 2.x (Backport to 2.x branch), backport 2.19, skip-changelog, v2.19.0 (Issues and PRs related to version 2.19.0)

Projects

Status: Done

6 participants