Add PushPartialAggregationThroughJoinRuleSet #27343

Draft
aaneja wants to merge 3 commits into prestodb:master from aaneja:pushPartialAggThruJoinWithProject

aaneja (Contributor) commented Mar 16, 2026

Description

Apply partial aggregation pushdown through a Join in cases where an intervening Project currently stops the rule from applying.
Co-authored-by: @copilot

Motivation and Context

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

Summary by Sourcery

Introduce a ruleset for pushing partial aggregations (optionally with intervening projections) through joins and update optimizers to use it instead of the previous single rule.

Enhancements:

  • Refactor partial-aggregation-through-join optimization into a reusable PushPartialAggregationThroughJoinRuleSet that supports both direct and projection-wrapped joins.

Tests:

  • Extend planner rule tests to cover pushing partial aggregation with a project through a join and ensuring no pushdown occurs when projections reference both join sides.

prestodb-ci added the from:IBM (PR from IBM) label Mar 16, 2026

sourcery-ai bot (Contributor) commented Mar 16, 2026

Reviewer's Guide

Introduces PushPartialAggregationThroughJoinRuleSet to handle pushing partial aggregations (optionally with an intervening projection) through joins, wires it into PlanOptimizers, and adds tests for both the new rule set behavior and projection-spanning cases.
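To illustrate why pushing a partial aggregation below an inner join can preserve the query result, here is a toy sketch in plain Java. It is not Presto code: rows are modelled as `long[]` arrays, the join is a nested loop, and the class and method names are hypothetical. It compares `SUM(value) GROUP BY joinKey` evaluated above the join with a partial sum on the left child followed by a final sum above the join.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only (assumption): left rows are {joinKey, value}, right rows are {joinKey}.
final class PartialAggPushdownDemo {
    // Plan A: SUM(value) GROUP BY joinKey computed above the inner join.
    static Map<Long, Long> aggregateAboveJoin(List<long[]> left, List<long[]> right) {
        Map<Long, Long> result = new TreeMap<>();
        for (long[] l : left) {
            for (long[] r : right) {
                if (l[0] == r[0]) {
                    result.merge(l[0], l[1], Long::sum);
                }
            }
        }
        return result;
    }

    // Plan B: a partial SUM is computed on the left child first, the join runs on
    // the pre-aggregated rows, and a final SUM combines the partial results.
    static Map<Long, Long> aggregateBelowJoin(List<long[]> left, List<long[]> right) {
        Map<Long, Long> partial = new TreeMap<>();
        for (long[] l : left) {
            partial.merge(l[0], l[1], Long::sum);
        }
        Map<Long, Long> result = new TreeMap<>();
        for (Map.Entry<Long, Long> p : partial.entrySet()) {
            for (long[] r : right) {
                if (p.getKey() == r[0]) {
                    result.merge(p.getKey(), p.getValue(), Long::sum);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<long[]> left = List.of(new long[] {1, 10}, new long[] {1, 20}, new long[] {2, 5});
        List<long[]> right = List.of(new long[] {1}, new long[] {1}, new long[] {2}, new long[] {3});
        System.out.println(aggregateAboveJoin(left, right));
        System.out.println(aggregateBelowJoin(left, right));
    }
}
```

Both plans print `{1=60, 2=5}` for this input. The pre-aggregated plan feeds two left rows into the join instead of three, which is the kind of saving the optimization aims for.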

Class diagram for PushPartialAggregationThroughJoinRuleSet and related rules

classDiagram
    class PushPartialAggregationThroughJoinRuleSet {
        +Set~Rule~ rules()
        +PushPartialAggregationThroughJoin withoutProjectionRule()
        +PushPartialAggregationWithProjectThroughJoin withProjectionRule()
        -static Capture AGGREGATION_NODE
        -static Capture JOIN_NODE
        -static Capture PROJECT_NODE
        -static Pattern WITHOUT_PROJECTION
        -static Pattern WITH_PROJECTION
        -static boolean isSupportedAggregationNode(AggregationNode aggregationNode)
    }

    class BaseRule {
        <<abstract>>
        +boolean isEnabled(Session session)
        +Rule.Result applyPushdown(AggregationNode aggregationNode, JoinNode joinNode, Rule.Context context)
        -boolean allAggregationsOn(Map aggregations, List variables)
        -PlanNode pushPartialToLeftChild(AggregationNode node, JoinNode child, Rule.Context context)
        -PlanNode pushPartialToRightChild(AggregationNode node, JoinNode child, Rule.Context context)
        -Set getJoinRequiredVariables(JoinNode node)
        -List getPushedDownGroupingSet(AggregationNode aggregation, Set availableVariables, Set requiredJoinVariables)
        -AggregationNode replaceAggregationSource(AggregationNode aggregation, PlanNode source, List groupingKeys)
        -PlanNode pushPartialToJoin(AggregationNode aggregation, JoinNode child, PlanNode leftChild, PlanNode rightChild, Rule.Context context)
    }

    class PushPartialAggregationThroughJoin {
        +Pattern getPattern()
        +Rule.Result apply(AggregationNode aggregationNode, Captures captures, Rule.Context context)
    }

    class PushPartialAggregationWithProjectThroughJoin {
        +Pattern getPattern()
        +Rule.Result apply(AggregationNode aggregationNode, Captures captures, Rule.Context context)
        -PlanNode buildPushedProjection(ProjectNode originalProject, PlanNode joinChild, Rule.Context context)
        -JoinNode rebuildJoin(JoinNode original, PlanNode newLeft, PlanNode newRight)
    }

    class Rule {
        <<interface>>
    }

    class AggregationNode
    class JoinNode
    class ProjectNode
    class PlanNode

    PushPartialAggregationThroughJoinRuleSet --> PushPartialAggregationThroughJoin : creates
    PushPartialAggregationThroughJoinRuleSet --> PushPartialAggregationWithProjectThroughJoin : creates

    BaseRule ..|> Rule
    PushPartialAggregationThroughJoin --|> BaseRule
    PushPartialAggregationWithProjectThroughJoin --|> BaseRule

    AggregationNode --|> PlanNode
    JoinNode --|> PlanNode
    ProjectNode --|> PlanNode
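
The structure in the class diagram above (a rule set that creates two rules sharing an abstract base) can be sketched in standalone Java as follows. `ToyRule` is a hypothetical stand-in for Presto's `Rule` interface, and the boolean `sessionFlag` is an assumed simplification of the session-property check; none of this is the PR's actual code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the planner's Rule interface (assumption).
interface ToyRule {
    String describePattern();

    boolean isEnabled(boolean sessionFlag);
}

final class ToyPushPartialAggregationRuleSet {
    // Shared behaviour lives in one abstract base so both rules reuse it.
    private abstract static class BaseRule implements ToyRule {
        @Override
        public boolean isEnabled(boolean sessionFlag) {
            return sessionFlag;
        }
    }

    private static final class WithoutProjection extends BaseRule {
        @Override
        public String describePattern() {
            return "Aggregation(PARTIAL) -> Join";
        }
    }

    private static final class WithProjection extends BaseRule {
        @Override
        public String describePattern() {
            return "Aggregation(PARTIAL) -> Project -> Join";
        }
    }

    ToyRule withoutProjectionRule() {
        return new WithoutProjection();
    }

    ToyRule withProjectionRule() {
        return new WithProjection();
    }

    // Exposes both rules so an optimizer can register them together.
    Set<ToyRule> rules() {
        Set<ToyRule> rules = new LinkedHashSet<>();
        rules.add(withoutProjectionRule());
        rules.add(withProjectionRule());
        return rules;
    }

    public static void main(String[] args) {
        for (ToyRule rule : new ToyPushPartialAggregationRuleSet().rules()) {
            System.out.println(rule.describePattern() + " enabled=" + rule.isEnabled(true));
        }
    }
}
```

Keeping the individual accessors alongside `rules()` lets a test exercise one rule in isolation while the optimizer registers the whole set.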

Flow diagram for applying PushPartialAggregationWithProjectThroughJoin

flowchart TD
    A[Agg_partial_over_Project_over_Join] --> B[Match WITH_PROJECTION pattern]
    B --> C{JoinType == INNER?}
    C -- No --> Z[Result.empty]
    C -- Yes --> D[Compute leftVariables and rightVariables]
    D --> E[Iterate project assignments]
    E --> F{Any expr uses both left and right vars?}
    F -- Yes --> Z
    F -- No --> G[Determine allLeft and allRight flags]
    G --> H{!allLeft && !allRight?}
    H -- Yes --> Z
    H -- No --> I{allLeft?}
    I -- Yes --> J[Build pushed Project over left child]
    I -- No --> K[Build pushed Project over right child]
    J --> L[Rebuild Join with the new child on the chosen side]
    K --> L
    L --> M[Call applyPushdown on Agg and new Join]
    M --> N{All agg inputs from left side?}
    N -- Yes --> O[pushPartialToLeftChild]
    N -- No --> P{All agg inputs from right side?}
    P -- Yes --> Q[pushPartialToRightChild]
    P -- No --> Z
    O --> R[Build pushed Aggregation over left child]
    Q --> S[Build pushed Aggregation over right child]
    R --> T[Rebuild Join with pushed Aggregation]
    S --> T
    T --> U[restrictOutputs to original Agg outputs]
    U --> V[Return transformed plan]
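
The projection checks in the flowchart above (steps D through H) can be sketched in standalone Java. This is a simplified model rather than the PR's implementation: variables are plain strings, each project assignment is reduced to the set of variables its expression references, and treating an expression that references no variables (a constant) as compatible with either side is an assumption of this sketch.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ProjectionSideClassifier {
    enum Side { LEFT, RIGHT }

    // assignments maps each output name to the variables its expression references.
    // Returns the join side the Project could be pushed to, or empty if the rule should not fire.
    static Optional<Side> classify(
            Set<String> leftVariables,
            Set<String> rightVariables,
            Map<String, Set<String>> assignments) {
        boolean allLeft = true;
        boolean allRight = true;
        for (Set<String> referenced : assignments.values()) {
            boolean usesLeft = referenced.stream().anyMatch(leftVariables::contains);
            boolean usesRight = referenced.stream().anyMatch(rightVariables::contains);
            if (usesLeft && usesRight) {
                // A single expression spans both join sides: no pushdown.
                return Optional.empty();
            }
            if (usesRight) {
                allLeft = false;
            }
            if (usesLeft) {
                allRight = false;
            }
        }
        if (!allLeft && !allRight) {
            // Some expressions need the left side and others the right side.
            return Optional.empty();
        }
        return Optional.of(allLeft ? Side.LEFT : Side.RIGHT);
    }

    public static void main(String[] args) {
        Set<String> left = Set.of("a", "b");
        Set<String> right = Set.of("c");
        System.out.println(classify(left, right, Map.of("x", Set.of("a"))));
        System.out.println(classify(left, right, Map.of("x", Set.of("a", "c"))));
    }
}
```

In this model a projection such as `x := a + 1` classifies as `LEFT`, while `x := a + c` yields an empty result, mirroring the two test scenarios described in the reviewer's guide.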

File-Level Changes

Change:
Replace the single push-partial-aggregation-through-join rule with a rule set that handles both the direct Aggregation -> Join and the Aggregation -> Project -> Join patterns, and integrate it into the optimizer pipeline.
Details:
  • Introduce PushPartialAggregationThroughJoinRuleSet, which contains patterns for Aggregation above Join and for Aggregation above Project above Join, plus a shared BaseRule that holds the pushdown logic and the session-property enablement check.
  • Implement the PushPartialAggregationThroughJoin inner class to handle the original Aggregation(PARTIAL) -> Join case using the shared pushdown helpers.
  • Implement the PushPartialAggregationWithProjectThroughJoin inner class, which analyzes the Project assignments to determine whether they reference only one join side, builds a pushed-down Project on the chosen child, rebuilds the Join, and then applies the aggregation pushdown.
  • Remove the old PushPartialAggregationThroughJoin class and update PlanOptimizers to use new PushPartialAggregationThroughJoinRuleSet().rules() alongside PushPartialAggregationThroughExchange.
Files:
  • presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PushPartialAggregationThroughJoinRuleSet.java
  • presto-main-base/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java
  • presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PushPartialAggregationThroughJoin.java

Change:
Extend the aggregation pushdown tests to validate behavior with and without an intervening Project, including the case where projection expressions span both join sides.
Details:
  • Adapt the existing testPushesPartialAggregationThroughJoin to use new PushPartialAggregationThroughJoinRuleSet().withoutProjectionRule().
  • Add testPushesPartialAggregationWithProjectThroughJoin to verify that a Project whose expressions depend only on the left side is pushed below the join together with the partial aggregation, and that the resulting plan matches the expected structure.
  • Add testDoesNotFireWhenProjectSpansBothSides to ensure the rule does not fire when a projection expression references both the left and right join sides, using VariablesExtractor-based analysis and the new rule set.
  • Use the relational Expressions.call and Expressions.constant helpers in tests to build row expressions for arithmetic projections.
Files:
  • presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestPushPartialAggregationThroughJoin.java
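
The shared pushdown helpers described above decide which join child, if any, can host the partial aggregation and which grouping keys it should use there. The following standalone Java sketch models that decision with plain strings. The method names echo the class diagram, but the bodies are assumptions for illustration: in particular, keeping the join variables produced by the chosen child as extra grouping keys is a guess at how the pushed-down aggregation keeps the join evaluable, not a transcription of the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PushdownSideSketch {
    // True when every variable read by every aggregation is produced by the given join child.
    static boolean allAggregationsOn(List<Set<String>> aggregationInputs, Set<String> childOutputs) {
        return aggregationInputs.stream().allMatch(childOutputs::containsAll);
    }

    // Grouping keys for the pushed-down aggregation (assumed logic):
    // original grouping keys the child produces, followed by the join
    // variables the child produces, without duplicates.
    static List<String> pushedDownGroupingKeys(
            List<String> groupingKeys,
            Set<String> childOutputs,
            Set<String> joinVariables) {
        Set<String> keys = new LinkedHashSet<>();
        for (String key : groupingKeys) {
            if (childOutputs.contains(key)) {
                keys.add(key);
            }
        }
        for (String joinVariable : joinVariables) {
            if (childOutputs.contains(joinVariable)) {
                keys.add(joinVariable);
            }
        }
        return new ArrayList<>(keys);
    }

    public static void main(String[] args) {
        Set<String> leftOutputs = Set.of("g_left", "k_left", "v");
        System.out.println(allAggregationsOn(List.of(Set.of("v")), leftOutputs));
        System.out.println(pushedDownGroupingKeys(
                List.of("g_left", "g_right"), leftOutputs, Set.of("k_left", "k_right")));
    }
}
```

If neither child satisfies `allAggregationsOn`, the flow described in the reviewer's guide returns an empty result and leaves the plan unchanged.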
