Skip to content

Fix a flaky test that forces materialization to hit cache#19759

Merged
andrross merged 1 commit intoopensearch-project:mainfrom
liuguoqingfz:flakytest-19730
Dec 17, 2025
Merged

Fix a flaky test that forces materialization to hit cache#19759
andrross merged 1 commit intoopensearch-project:mainfrom
liuguoqingfz:flakytest-19730

Conversation

@liuguoqingfz
Copy link
Copy Markdown
Contributor

@liuguoqingfz liuguoqingfz commented Oct 24, 2025

Description

IndexFieldData#load(LeafReaderContext) does not always build and insert the heavy field-data into the shared cache immediately. For some field types/implementations it returns a lightweight wrapper that defers the expensive build until you actually access the values, or until a global/agg path forces construction.

Related Issues

Resolves #19730

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • Tests
    • Improved index field data cache validation testing by implementing explicit materialization of field data instances prior to cache state assertions. This ensures comprehensive and accurate verification of cache behavior during test execution.

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for e931bbc: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 5e91d97: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 5e91d97: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❕ Gradle check result for 86b2ece: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@codecov
Copy link
Copy Markdown

codecov bot commented Oct 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.18%. Comparing base (930ae74) to head (0275362).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19759      +/-   ##
============================================
- Coverage     73.20%   73.18%   -0.03%     
+ Complexity    71745    71728      -17     
============================================
  Files          5795     5795              
  Lines        328304   328298       -6     
  Branches      47283    47279       -4     
============================================
- Hits         240334   240261      -73     
- Misses        68663    68800     +137     
+ Partials      19307    19237      -70     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@opensearch-trigger-bot
Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added stalled Issues that have stalled and removed stalled Issues that have stalled labels Nov 24, 2025
@liuguoqingfz
Copy link
Copy Markdown
Contributor Author

Can someone review this please...

Signed-off-by: Joe Liu <guoqing4@illinois.edu>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 16, 2025

Walkthrough

A test fix that adds explicit materialization of LeafFieldData instances by calling getBytesValues() to force cache population before asserting cache state, addressing a flaky test in the field data service tests.

Changes

Cohort / File(s) Change Summary
Test Cache Materialization Fix
server/src/test/java/org/opensearch/index/fielddata/IndexFieldDataServiceTests.java
Added explicit calls to getBytesValues() on two LeafFieldData instances to ensure cache population before assertion, fixing race condition in testIndexFieldDataCacheIsClearedAfterIndexRemoval

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Review focuses on verifying that the materialization calls correctly populate the cache before the assertion
  • Understand the timing issue causing the flakiness and validate the fix resolves it

Suggested labels

Indexing

Suggested reviewers

  • msfroh
  • dbwiddis
  • sachinpkale
  • kotwanikunal
  • cwperks
  • shwetathareja

Poem

🐰 A flaky test once danced in the night,
Cache and timing just wouldn't align right,
But now with a materialization call,
The data flows true and stable for all!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: fixing a flaky test by forcing materialization to hit the cache.
Description check ✅ Passed The description explains the root cause of the flaky test and references the linked issue #19730, though the Description section could be more detailed about the fix.
Linked Issues check ✅ Passed The PR addresses the flaky test issue #19730 by ensuring field-data materialization triggers cache insertion as required.
Out of Scope Changes check ✅ Passed All changes are scoped to fixing the flaky test in IndexFieldDataServiceTests with explicit materialization calls to ensure cache population.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/fielddata/IndexFieldDataServiceTests.java (1)

243-248: Consider applying the same fix to other tests with similar patterns.

The test method testClosingSegmentInvalidatesEntries follows a similar pattern: loading field data (lines 244-246) and immediately asserting cache state (line 248) without forcing materialization. While this test hasn't been reported as flaky, it may be susceptible to the same issue.

Consider adding materialization calls here as well:

 List<LeafFieldData> loadFields = new ArrayList<>();
 for (LeafReaderContext lrContext : reader.getContext().leaves()) {
-    loadFields.add(ifd.load(lrContext));
+    LeafFieldData leafFieldData = ifd.load(lrContext);
+    leafFieldData.getBytesValues();  // Force materialization
+    loadFields.add(leafFieldData);
 }

Similar patterns exist in:

  • testClearField (lines 335-336)
  • testFieldDataCacheListener (line 413)
  • testExceptionWhileRemovingKey (lines 589-590)
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 799fb9b and 0275362.

📒 Files selected for processing (1)
  • server/src/test/java/org/opensearch/index/fielddata/IndexFieldDataServiceTests.java (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
🔇 Additional comments (1)
server/src/test/java/org/opensearch/index/fielddata/IndexFieldDataServiceTests.java (1)

188-190: LGTM! The fix correctly addresses the flaky test.

The explicit materialization via getBytesValues() ensures that both field data instances populate the shared cache before the assertion at line 192. This eliminates the race condition where load() might return lightweight wrappers that defer cache population.

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 0275362: SUCCESS

@andrross andrross merged commit 2b4ede6 into opensearch-project:main Dec 17, 2025
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

autocut flaky-test Random test failure that succeeds on second run >test-failure Test failure from CI, local build, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[AUTOCUT] Gradle Check Flaky Test Report for IndexFieldDataServiceTests

2 participants