Streaming aggregation planning to determine the appropriate flush mode#19488
Merged
rishabhmaurya merged 9 commits intoopensearch-project:mainfrom Oct 3, 2025
Merged
Conversation
Contributor
Author
|
@bowenlan-amzn @harshavamsi @mch2 please take a look. I'm still working on testing it and fixing checks |
Contributor
|
❌ Gradle check result for c618a71: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
c618a71 to
6e5888c
Compare
Contributor
|
❌ Gradle check result for 6e5888c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
aa03fc7 to
4c608dd
Compare
mch2
reviewed
Oct 2, 2025
...-rpc/src/internalClusterTest/java/org/opensearch/streaming/aggregation/SubAggregationIT.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/streaming/StreamingCostMetrics.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/streaming/FlushModeResolver.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/streaming/FlushModeResolver.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/DefaultSearchContext.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/aggregations/AggregatorTreeEvaluator.java
Show resolved
Hide resolved
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
833b301 to
79fa652
Compare
mch2
approved these changes
Oct 2, 2025
Contributor
|
❕ Gradle check result for 79fa652: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
Contributor
opensearch-trigger-bot bot
pushed a commit
that referenced
this pull request
Oct 3, 2025
#19488) * Planning for flush mode for streaming aggs Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Address PR comments Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Fix for nested aggs and more unit tests Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Integ test to validate stream agg used using profile output Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Make StreamNumericTermsAggregator streamable Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Integ test for StreamNumericTermsAggregator Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Improve coverage and PR comments Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Minor refactor and address PR comments Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> --------- Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> (cherry picked from commit c851fdf) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
19 tasks
rishabhmaurya
pushed a commit
that referenced
this pull request
Oct 3, 2025
#19488) (#19512) * Planning for flush mode for streaming aggs * Address PR comments * Fix for nested aggs and more unit tests * Integ test to validate stream agg used using profile output * Make StreamNumericTermsAggregator streamable * Integ test for StreamNumericTermsAggregator * Improve coverage and PR comments * Minor refactor and address PR comments --------- (cherry picked from commit c851fdf) Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This was referenced Oct 9, 2025
peteralfonsi
pushed a commit
to peteralfonsi/OpenSearch
that referenced
this pull request
Oct 15, 2025
opensearch-project#19488) * Planning for flush mode for streaming aggs Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Address PR comments Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Fix for nested aggs and more unit tests Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Integ test to validate stream agg used using profile output Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Make StreamNumericTermsAggregator streamable Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Integ test for StreamNumericTermsAggregator Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Improve coverage and PR comments Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> * Minor refactor and address PR comments Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> --------- Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Add streaming aggregation planning layer. Introduces smart fallback logic for streaming aggregations to prevent coordinator overhead and performance regressions as after #19373, cluster settings based flag will controls streaming for term aggs, so its enabled for all or none for term aggs and no way to control per request.
The high level idea is to determine the cost of running a complete agg tree using streaming mode, if the estimated overhead is too high on coordinator or it may perform poorly compared to traditional approach of flushing per shard, then fallback and recreate agg tree without using Streamable aggregators.
Changes
Streamableinterface for aggregators (in future other collectors) that support streaming with cost metrics reportingFlushModeResolver- Analyzes aggregation cost metrics and decides when to use streaming vs traditional processingStreamingCostMetrics- Captures bucket count, cardinality, and document estimates for cost analysisAggregatorTreeEvaluator- Evaluates entire aggregation tree streaming feasibility and falls back to traditional aggregators when needed.Enhanced existing
StreamStringTermAggregatorto implement Streamable interface with cost metricsAdded configurable thresholds for streaming decisions (max buckets, min cardinality ratio, min bucket count)
Automatically falls back to per-shard processing when streaming would cause coordinator overload or performance regression for low cardinality cases.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.