Fix SearchPhaseExecutionException to properly initCause#20320
Fix SearchPhaseExecutionException to properly initCause#20320cwperks merged 3 commits intoopensearch-project:mainfrom
Conversation
Signed-off-by: Craig Perkins <cwperx@amazon.com>
📝 WalkthroughWalkthroughReplaces SearchPhaseExecutionException's previous deduplication logic with a new getEffectiveCause helper used by constructors; an explicit cause is preferred, otherwise the first shard failure cause is used. Tests and serialized expectations were updated to reflect the new caused_by behavior; changelog entry added. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (3)
🧰 Additional context used🧠 Learnings (1)📚 Learning: 2025-12-02T22:44:14.799ZApplied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
🔇 Additional comments (9)
Comment |
Signed-off-by: Craig Perkins <cwperx@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
server/src/main/java/org/opensearch/action/search/SearchPhaseExecutionException.java (2)
76-85: Redundant loop — both branches returncause.The loop checking for duplicate causes has no effect on the outcome. Whether a duplicate is found or not,
causeis always returned when non-null. The comment mentions "not duplicated" but the behavior is identical in both cases.Consider simplifying:
🔎 Proposed simplification
// If explicit cause provided and not duplicated in shard failures, use it if (cause != null) { - for (ShardSearchFailure failure : shardFailures) { - if (failure.getCause() == cause) { - // Cause is duplicated, but still return it to properly wire the chain - return cause; - } - } return cause; }
163-175: Stale comment references removedgetCause()override.The comment at lines 163-165 references
{@link #getCause()}with "guessing" behavior, but the overriddengetCause()method was removed in this PR. The distinction betweengetCause()andsuper.getCause()(line 174) is no longer meaningful since both now return the same value.Consider updating the comment to reflect that cause resolution now happens at construction time via
getEffectiveCause.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
CHANGELOG.mdserver/src/main/java/org/opensearch/action/search/SearchPhaseExecutionException.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: gradle-check
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: Analyze (java)
🔇 Additional comments (2)
CHANGELOG.md (1)
25-25: LGTM!The changelog entry follows the project format and is correctly placed under the Fixed section.
server/src/main/java/org/opensearch/action/search/SearchPhaseExecutionException.java (1)
62-66: LGTM!The constructor now properly initializes the cause at construction time, which ensures
initCauseis called correctly for log4j compatibility.
|
❌ Gradle check result for 4923381: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 4923381: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Looks like some test failures (all based on assuming lack of the new caused-by text): |
|
I'll push a fix for the tests soon |
|
@dbwiddis I pushed a commit to fix the tests. Note that this slightly switches the serialization of I also took the opportunity to modernize and use Text Blocks (IMO much more readable). |
|
❌ Gradle check result for 14b84e1: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 14b84e1: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
dbwiddis
left a comment
There was a problem hiding this comment.
LGTM with one question on logic sequence. OK to merge if you're happy with it.
server/src/main/java/org/opensearch/action/search/SearchPhaseExecutionException.java
Show resolved
Hide resolved
|
❕ Gradle check result for 14b84e1: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20320 +/- ##
============================================
- Coverage 73.28% 73.26% -0.02%
- Complexity 71723 71761 +38
============================================
Files 5785 5785
Lines 328137 328136 -1
Branches 47270 47269 -1
============================================
- Hits 240465 240408 -57
- Misses 68397 68485 +88
+ Partials 19275 19243 -32 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@cwperks should we backport this? I think 2.19 will suffer the same fate |
the log4j bump hasn't been merged yet. I will raise a backport for this once the log4j bump is merged |
|
#20314 is merged now. I will open up a backport on Monday. |
|
Opened backport for 2.19: #20336 |
…roject#20320) * Fix SearchPhaseExecutionException to properly initCause Signed-off-by: Craig Perkins <cwperx@amazon.com> * Add to CHANGELOG Signed-off-by: Craig Perkins <cwperx@amazon.com> * Fix tests Signed-off-by: Craig Perkins <cwperx@amazon.com> --------- Signed-off-by: Craig Perkins <cwperx@amazon.com>
…roject#20320) * Fix SearchPhaseExecutionException to properly initCause Signed-off-by: Craig Perkins <cwperx@amazon.com> * Add to CHANGELOG Signed-off-by: Craig Perkins <cwperx@amazon.com> * Fix tests Signed-off-by: Craig Perkins <cwperx@amazon.com> --------- Signed-off-by: Craig Perkins <cwperx@amazon.com>
Description
This PR is an alternative to opensearch-project/anomaly-detection#1638 and fixes an issue with the log4j upgrade from 2.21.0 to 2.25.3. This PR ensures that the SearchPhaseExecutionException calls
initCausewith the proper root cause to ensure that log4j can get cause metadata.Related Issues
See linked exception on opensearch-project/anomaly-detection#1638 that this PR resolves.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.