Prevent unnecessary heap sort when buckets needs to be ordered by key…#17252
Prevent unnecessary heap sort when buckets needs to be ordered by key…#17252jainankitk merged 1 commit intoopensearch-project:mainfrom
Conversation
|
@msfroh Please let me know if this change looks legit to you, thank you |
|
❌ Gradle check result for b007591: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
… in NumericTermsAggregation Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
b007591 to
6141759
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #17252 +/- ##
============================================
- Coverage 72.43% 72.29% -0.14%
+ Complexity 65725 65710 -15
============================================
Files 5318 5318
Lines 305675 305680 +5
Branches 44350 44352 +2
============================================
- Hits 221408 220999 -409
- Misses 66055 66597 +542
+ Partials 18212 18084 -128 ☔ View full report in Codecov by Sentry. |
msfroh
left a comment
There was a problem hiding this comment.
Wow... That's subtle. Replacing the O(log n) pop with the O(1) next is a good find (for sufficiently large n)!
|
@msfroh looks like you already approved, if you do not have any concerns, we can merge it. |
… in NumericTermsAggregation (opensearch-project#17252) Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> (cherry picked from commit 3f793b6)
… in NumericTermsAggregation (#17252) (#17290) (cherry picked from commit 3f793b6) Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Signed-off-by: Ankit Jain <akjain@amazon.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ankit Jain <akjain@amazon.com>
We can slightly improve performance of Numeric term Aggs by avoiding double sorting - first by count and then by key before final reduce
Description
We don't need to sort the buckets by
_count. PriorityQueue will ensure topN buckets by_count, as final reduce order is always by key in NumericTermsAggregation andResultStrategy#buildResult()is anyways sorting the buckets by key before final reduce, thus sorting the buckets by count or whatever the order clause was, is redundant.Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
- [ ] API changes companion pull request created, if applicable.- [ ] Public documentation issue/PR created, if applicable.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.