Skip to content

Show heap percent threshold in cancellation message#20779

Merged
jainankitk merged 1 commit intoopensearch-project:mainfrom
dzane17:sbp-log
Mar 5, 2026
Merged

Show heap percent threshold in cancellation message#20779
jainankitk merged 1 commit intoopensearch-project:mainfrom
dzane17:sbp-log

Conversation

@dzane17
Copy link
Copy Markdown
Member

@dzane17 dzane17 commented Mar 4, 2026

Description

Fixes the confusing cancellation message in HeapUsageTracker reported
in #17947. The message previously showed the moving-average-based
allowedUsage value, which could be extremely small (hundreds of KB)
when historical tasks were lightweight. This made log messages like
heap usage exceeded [3.4gb >= 800.5kb] nonsensical to operators.

The cancellation condition requires both currentUsage >= threshold
(heap percent) and currentUsage >= allowedUsage (moving avg * variance)
to be true. The heap percent threshold is the more meaningful and
accurate threshold to display.

Before: heap usage exceeded [3.4gb >= 800.5kb]
After: heap usage exceeded [3.4gb >= 150mb]

Related Issues

Resolves #17947

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added bug Something isn't working Search:Resiliency labels Mar 4, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

PR Reviewer Guide 🔍

(Review updated until commit 8f77d6c)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections

🔀 No multiple PR themes
⚡ Recommended focus areas for review

Suspicious Test Values

The updated test assertions expect "heap usage exceeded [300b >= 0b]" and "heap usage exceeded [200b >= 0b]", where the threshold is 0b. This suggests the heap percent threshold resolves to 0 in the test setup, which may indicate the test is not properly validating the new behavior. The threshold should reflect a meaningful value (e.g., the configured heap percent threshold) to ensure the fix is actually tested correctly.

assertEquals("heap usage exceeded [300b >= 0b]", reason.get().getMessage());
Suspicious Test Values

Similar to the SearchTask test, the SearchShardTask test expects "heap usage exceeded [200b >= 0b]" with a threshold of 0b. A threshold of zero would normally prevent cancellation from triggering (since currentUsage < threshold would be false only if threshold is 0). This warrants verification that the test mock setup correctly reflects the intended threshold configuration.

assertEquals("heap usage exceeded [200b >= 0b]", reason.get().getMessage());

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

PR Code Suggestions ✨

Latest suggestions up to 8f77d6c

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Test assertion validates meaningful threshold value

The test now expects 0b as the threshold value, which suggests the
heapPercentThresholdSupplier returns 0 in the test setup, making the threshold
effectively 0 bytes. This means the test no longer validates that the cancellation
message correctly shows the configured heap percent threshold. The test setup should
mock heapPercentThresholdSupplier to return a meaningful non-zero value so the
message can be properly verified.

server/src/test/java/org/opensearch/search/backpressure/trackers/HeapUsageTrackerTests.java [73]

-assertEquals("heap usage exceeded [300b >= 0b]", reason.get().getMessage());
+assertEquals("heap usage exceeded [300b >= 200b]", reason.get().getMessage());
Suggestion importance[1-10]: 6

__

Why: The test now asserts 0b as the threshold, which means the heapPercentThresholdSupplier returns 0 in the test context, making the message validation less meaningful. The suggestion to mock a non-zero threshold value would make the test more robust and actually verify the threshold is correctly included in the message. However, the fix itself (changing from allowedUsage to threshold in the message) is the core change, and the test value depends on the existing test setup.

Low

Previous suggestions

Suggestions up to commit 4c98efd
CategorySuggestion                                                                                                                                    Impact
General
Test should validate meaningful threshold value

The test now expects 0b as the threshold value, which suggests the
heapPercentThresholdSupplier returns 0 in the test setup, making the test not
actually validate the threshold display. The test should be updated to configure a
meaningful heap percent threshold so the message reflects a real threshold value
rather than 0b, which would make the test more meaningful and catch regressions.

server/src/test/java/org/opensearch/search/backpressure/trackers/HeapUsageTrackerTests.java [73]

-assertEquals("heap usage exceeded [300b >= 0b]", reason.get().getMessage());
+assertEquals("heap usage exceeded [300b >= " + new ByteSizeValue((long)(heapPercentThreshold * HEAP_SIZE_BYTES)) + "]", reason.get().getMessage());
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern that 0b as the threshold in the test message may indicate the heapPercentThresholdSupplier returns 0, making the test less meaningful. However, the improved_code references variables (heapPercentThreshold, HEAP_SIZE_BYTES) that may not be in scope in the test method, making the proposed fix potentially incorrect as written.

Low
Suggestions up to commit 65901c1
CategorySuggestion                                                                                                                                    Impact
Possible issue
Test should validate meaningful threshold value

The test now expects 0b as the threshold value, which suggests the
heapPercentThresholdSupplier is returning 0 in the test setup. The test should be
updated to mock a meaningful threshold value so the message reflects a realistic
scenario and validates the fix properly. A threshold of 0b does not meaningfully
test that the threshold is shown correctly in the cancellation message.

server/src/test/java/org/opensearch/search/backpressure/trackers/HeapUsageTrackerTests.java [73]

-assertEquals("heap usage exceeded [300b >= 0b]", reason.get().getMessage());
+assertEquals("heap usage exceeded [300b >= 200b]", reason.get().getMessage());
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern that 0b as the threshold value in the test message may not meaningfully validate the fix. However, the improved_code proposes 200b without justification from the test setup, and the actual threshold depends on heapPercentThresholdSupplier configuration in the test. The suggestion is directionally correct but the proposed value is arbitrary.

Low
Suggestions up to commit 3dc614f
CategorySuggestion                                                                                                                                    Impact
Possible issue
Test should validate meaningful threshold value

The test now expects 0b as the threshold value, which suggests the
heapPercentThresholdSupplier is returning 0 in the test setup. The test should be
updated to mock a meaningful threshold value so the message reflects a realistic
threshold (e.g., the heap percent threshold multiplied by heap size), making the
test actually validate the new behavior introduced by the PR.

server/src/test/java/org/opensearch/search/backpressure/trackers/HeapUsageTrackerTests.java [73]

-assertEquals("heap usage exceeded [300b >= 0b]", reason.get().getMessage());
+assertEquals("heap usage exceeded [300b >= <expected_threshold_bytes>b]", reason.get().getMessage());
Suggestion importance[1-10]: 6

__

Why: The test expects 0b as the threshold, which indicates the heapPercentThresholdSupplier returns 0 in the test setup. This is a valid concern - the test should verify a meaningful threshold value to properly validate the new message format introduced in the PR. However, the improved_code uses a placeholder <expected_threshold_bytes> rather than an actual value, making it incomplete.

Low
Suggestions up to commit 7e42915
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix test to verify correct threshold value

The test now expects 0b as the threshold value, which suggests the threshold
variable is zero in the test context. This likely indicates the test is not properly
setting up the heap percent threshold, making the test verify incorrect behavior.
The test should be updated to configure a meaningful threshold value and assert the
correct threshold appears in the message.

server/src/test/java/org/opensearch/search/backpressure/trackers/HeapUsageTrackerTests.java [73]

-assertEquals("heap usage exceeded [300b >= 0b]", reason.get().getMessage());
+assertEquals("heap usage exceeded [300b >= <expected_threshold_bytes>b]", reason.get().getMessage());
Suggestion importance[1-10]: 6

__

Why: The test now asserts 0b as the threshold, which suggests the threshold variable evaluates to zero in the test context. This may indicate a real issue where the threshold isn't being properly set up, making the test verify incorrect behavior. However, the improved_code uses a placeholder <expected_threshold_bytes> rather than an actual value, so it's not directly actionable.

Low
Fix shard task test threshold assertion

Similar to the SearchTask test, the SearchShardTask test now expects 0b as the
threshold, which means the threshold is not being initialized or configured properly
in the test setup. A threshold of 0b would mean any heap usage triggers
cancellation, which is not a meaningful test scenario. The test setup should
configure a non-zero threshold and the assertion should reflect the actual
configured value.

server/src/test/java/org/opensearch/search/backpressure/trackers/HeapUsageTrackerTests.java [103]

-assertEquals("heap usage exceeded [200b >= 0b]", reason.get().getMessage());
+assertEquals("heap usage exceeded [200b >= <expected_threshold_bytes>b]", reason.get().getMessage());
Suggestion importance[1-10]: 6

__

Why: Similar to suggestion 1, the SearchShardTask test asserting 0b as the threshold may indicate the threshold isn't properly initialized in the test setup. The improved_code uses a placeholder rather than a concrete value, limiting its direct applicability.

Low

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

❌ Gradle check result for 7e42915: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

Persistent review updated to latest commit 3dc614f

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

Persistent review updated to latest commit 65901c1

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

Persistent review updated to latest commit 4c98efd

Signed-off-by: David Zane <davizane@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 4, 2026

Persistent review updated to latest commit 8f77d6c

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 5, 2026

✅ Gradle check result for 8f77d6c: SUCCESS

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.27%. Comparing base (59be6ae) to head (8f77d6c).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20779      +/-   ##
============================================
- Coverage     73.29%   73.27%   -0.02%     
- Complexity    72088    72095       +7     
============================================
  Files          5794     5794              
  Lines        329733   329733              
  Branches      47577    47577              
============================================
- Hits         241664   241599      -65     
- Misses        68612    68718     +106     
+ Partials      19457    19416      -41     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Search:Resiliency

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] SearchBackpressureService threshold calculations incorrect?

2 participants