
Fix segment replication bug during primary relocation#18944

Merged
ashking94 merged 3 commits intoopensearch-project:mainfrom
ashking94:fix-segrep-bug
Aug 7, 2025

Conversation

@ashking94
Member

@ashking94 ashking94 commented Aug 6, 2025

Description

After the bug that caused an infinite segment replication loop was fixed in PR #18636, the FullRollingRestartIT test became flaky, as seen in #18490. On deeper analysis, I found that this happens due to a race condition during primary shard relocation. On relocation, the new primary bumps its segment infos generation and version and broadcasts the new checkpoint to all of its replicas via the checkpoint publisher. This happens around the same time that the shard_started action is sent to the active cluster manager to report that the primary handover completed successfully. In certain conditions, a replica received the latest checkpoint from the new primary before the corresponding cluster state had been applied by its cluster applier service. This led the replica to reach out to the old primary for the segment infos. The issue has a small probability of occurring, and only for indexes that receive no ingestion during relocation after the permits have been acquired on the older primary.

With this PR, the following changes prevent the issue:

  1. The fix applies only to segment replication indexes using local (node-to-node) storage. The reason it is not applied to remote store indexes is explained in point 4 below.
  2. If the checkpoint received during the get-checkpoint-info transport action is behind the checkpoint that the primary delivered during the checkpoint publish action, we fail the replication event with an appropriate reason.
  3. The current segrep flow already has built-in retries for recoverable failure modes, so the failed event is retried.
  4. This PR fails the segrep event if the checkpoint received during the get-segment-infos call is staler than the original checkpoint received during the checkpoint publish call. This check would be problematic for remote store indexes, because the new primary is not granted upload rights until the cluster applier service runs on its node after the shard_started action; that restriction exists intentionally to prevent multiple writers during primary relocation.
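The staleness check in points 2 and 4 can be sketched as below. This is an illustrative sketch, not the actual OpenSearch code: the class and method names (`CheckpointStalenessCheck`, `shouldFailReplication`) are hypothetical, and the ordering by (primary term, segment infos generation, version) is an assumption modeled on how replication checkpoints describe shard state.

```java
// Hypothetical sketch: fail the replication event when the source answers
// with a checkpoint that is behind the one that triggered replication.
public class CheckpointStalenessCheck {

    // Minimal stand-in for a replication checkpoint, ordered by
    // (primary term, segment infos generation, version).
    record Checkpoint(long primaryTerm, long segmentInfosGen, long version) {
        boolean isBehind(Checkpoint other) {
            if (primaryTerm != other.primaryTerm) return primaryTerm < other.primaryTerm;
            if (segmentInfosGen != other.segmentInfosGen) return segmentInfosGen < other.segmentInfosGen;
            return version < other.version;
        }
    }

    // Returns true when the replication event should be failed (and later
    // retried) because the source replied with a stale checkpoint, e.g. the
    // replica reached the old primary during relocation. Skipped for remote
    // store, where the new primary cannot upload until the cluster state is
    // applied on its node, so a stale-looking checkpoint is expected there.
    static boolean shouldFailReplication(Checkpoint published, Checkpoint receivedFromSource, boolean remoteStoreEnabled) {
        if (remoteStoreEnabled) {
            return false;
        }
        return receivedFromSource.isBehind(published);
    }

    public static void main(String[] args) {
        Checkpoint published = new Checkpoint(2, 7, 15);      // from the new primary's publish
        Checkpoint fromOldPrimary = new Checkpoint(1, 6, 14); // stale answer from the old primary
        System.out.println(shouldFailReplication(published, fromOldPrimary, false)); // true
        System.out.println(shouldFailReplication(published, published, false));      // false
    }
}
```

A failure raised here is recoverable, so the existing segrep retry machinery picks it up once the replica's cluster state catches up.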

Related Issues

Resolves #18490

Check List

  • Functionality includes testing.
  • [ ] API changes companion pull request created, if applicable.
  • [ ] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ashish Singh <ssashish@amazon.com>
@ashking94 ashking94 requested a review from a team as a code owner August 6, 2025 17:33
@github-actions github-actions bot added the >test-failure (Test failure from CI, local build, etc.), autocut, and flaky-test (Random test failure that succeeds on second run) labels Aug 6, 2025
@ashking94
Member Author

@mch2 @andrross @getsaurabh02 @sachinpkale @Bukhtawar - This one is a small PR for fixing a flaky test before 3.2 release. Can you help with the review?

@github-actions
Contributor

github-actions bot commented Aug 6, 2025

❌ Gradle check result for 2d76208: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@mch2
Member

mch2 commented Aug 6, 2025

@ashking94 thanks for fixing this. Not for now but with node-node segrep I'm thinking we could also change the replication source to fetch segments from the publisher of the cp vs today where we rely on the cluster state lookup of the active primary. This would allow us to replicate from non primary nodes.


@ashking94
Member Author

> @ashking94 thanks for fixing this. Not for now but with node-node segrep I'm thinking we could also change the replication source to fetch segments from the publisher of the cp vs today where we rely on the cluster state lookup of the active primary. This would allow us to replicate from non primary nodes.

Sure, Marc. This does make sense.

@github-actions
Contributor

github-actions bot commented Aug 7, 2025

✅ Gradle check result for 3c475c1: SUCCESS

@codecov

codecov bot commented Aug 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.85%. Comparing base (c01ff89) to head (3c475c1).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18944      +/-   ##
============================================
- Coverage     72.89%   72.85%   -0.04%     
- Complexity    69318    69340      +22     
============================================
  Files          5642     5642              
  Lines        318636   318640       +4     
  Branches      46107    46108       +1     
============================================
- Hits         232254   232138     -116     
- Misses        67540    67752     +212     
+ Partials      18842    18750      -92     

☔ View full report in Codecov by Sentry.

@ashking94 ashking94 merged commit 251cc36 into opensearch-project:main Aug 7, 2025
31 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.19 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.19 2.19
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.19
# Create a new branch
git switch --create backport/backport-18944-to-2.19
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 251cc3603e73405cc047e192663e8b5dbaa1c61d
# Push it to GitHub
git push --set-upstream origin backport/backport-18944-to-2.19
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.19

Then, create a pull request where the base branch is 2.19 and the compare/head branch is backport/backport-18944-to-2.19.

@ashking94 ashking94 added the backport 3.2 Backport to 3.2 branch label Aug 7, 2025
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 7, 2025
* Fix segment replication bug during primary relocation

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Fix applicable for segrep local indexes only

Signed-off-by: Ashish Singh <ssashish@amazon.com>

---------

Signed-off-by: Ashish Singh <ssashish@amazon.com>
(cherry picked from commit 251cc36)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
ashking94 added a commit to ashking94/OpenSearch that referenced this pull request Aug 7, 2025
* Fix segment replication bug during primary relocation
@ashking94
Copy link
Copy Markdown
Member Author

Raised manual backport for 2.19 - #18958

ashking94 pushed a commit that referenced this pull request Aug 7, 2025
* Fix segment replication bug during primary relocation (cherry picked from commit 251cc36)
mch2 pushed a commit that referenced this pull request Aug 7, 2025
* Fix segment replication bug during primary relocation
RajatGupta02 pushed a commit to RajatGupta02/OpenSearch that referenced this pull request Aug 18, 2025
* Fix segment replication bug during primary relocation
kh3ra pushed a commit to kh3ra/OpenSearch that referenced this pull request Sep 5, 2025
* Fix segment replication bug during primary relocation
vinaykpud pushed a commit to vinaykpud/OpenSearch that referenced this pull request Sep 26, 2025
* Fix segment replication bug during primary relocation
cuonghm2809 added a commit to cuonghm2809/OpenSearch that referenced this pull request Jan 14, 2026
During rolling restarts, replica shards may have received newer checkpoints
from the primary before the restart, but after restart, the primary may have
rolled back to an older state. The strict checkpoint validation added in opensearch-project#18944
to fix race conditions during primary relocation incorrectly rejects this
legitimate scenario, causing shards to fail allocation after 5 retries.

This fix distinguishes between two scenarios:
1. Normal replication - strict checkpoint validation applies to prevent
   accepting stale data during primary relocation (maintains opensearch-project#18944 fix)
2. Recovery (shard INITIALIZING or RELOCATING) - accepts the primary's
   current state even if it appears older than the replica's last known
   checkpoint, as this is expected during recovery from restart

Added unit tests to verify:
- Stale checkpoint is rejected during normal replication
- Stale checkpoint is accepted during shard recovery

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

Fixes opensearch-project#19234
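The two-scenario decision in this follow-up commit can be sketched as below. This is an illustrative sketch, not the actual OpenSearch code: the class and method names (`StaleCheckpointPolicy`, `acceptStaleCheckpoint`) are hypothetical, and the shard-state enum is a simplified stand-in for the shard routing states the commit message describes.

```java
// Hypothetical sketch of the follow-up fix: a stale checkpoint from the
// source is tolerated only while the shard is recovering.
public class StaleCheckpointPolicy {

    enum ShardState { INITIALIZING, RELOCATING, STARTED }

    // During recovery (INITIALIZING or RELOCATING) the primary's current
    // state is accepted even if it looks older than the replica's last
    // known checkpoint, as expected after a restart. On a started shard,
    // a stale checkpoint indicates the relocation race and must fail.
    static boolean acceptStaleCheckpoint(ShardState state) {
        return state == ShardState.INITIALIZING || state == ShardState.RELOCATING;
    }

    public static void main(String[] args) {
        System.out.println(acceptStaleCheckpoint(ShardState.INITIALIZING)); // true
        System.out.println(acceptStaleCheckpoint(ShardState.STARTED));      // false
    }
}
```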
cuonghm2809 added a commit to cuonghm2809/OpenSearch that referenced this pull request Jan 14, 2026
Same fix as the previous commit, with a 2.19 note: In 2.19, the logic is in SegmentReplicationTarget.java instead of AbstractSegmentReplicationTarget.java (which was introduced in later versions).

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>
andrross pushed a commit to cuonghm2809/OpenSearch that referenced this pull request Jan 21, 2026
andrross added a commit that referenced this pull request Jan 23, 2026
* Fix segment replication failure during rolling restart

During rolling restarts, replica shards may have received newer checkpoints
from the primary before the restart, but after restart, the primary may have
rolled back to an older state. The strict checkpoint validation added in #18944
to fix race conditions during primary relocation incorrectly rejects this
legitimate scenario, causing shards to fail allocation after 5 retries.

This fix distinguishes between two scenarios:
1. Normal replication - strict checkpoint validation applies to prevent
   accepting stale data during primary relocation (maintains #18944 fix)
2. Recovery (shard INITIALIZING or RELOCATING) - accepts the primary's
   current state even if it appears older than the replica's last known
   checkpoint, as this is expected during recovery from restart

Added unit tests to verify:
- Stale checkpoint is rejected during normal replication
- Stale checkpoint is accepted during shard recovery

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

Fixes #19234

* Fix incorrect mock chaining in unit tests

The chained mock syntax when(spyIndexShard.routingEntry().initializing())
doesn't work as intended because routingEntry() returns a real ShardRouting
object, not a mock.

Fixed by:
- Added ShardRouting import
- Created separate ShardRouting mocks for both test cases
- Properly stubbed initializing() and relocating() methods on the mock
- Stubbed routingEntry() to return the mocked ShardRouting

This ensures tests correctly verify the behavior for both:
1. Active shard (initializing=false) - should reject stale checkpoint
2. Recovering shard (initializing=true) - should accept stale checkpoint

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

* Fix ReplicationCheckpoint constructor in tests

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

* Fix code formatting

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

* Add CHANGELOG entry

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
cuonghm2809 added a commit to cuonghm2809/OpenSearch that referenced this pull request Jan 29, 2026
andrross added a commit that referenced this pull request Jan 29, 2026
#20498)

* [2.19] Fix segment replication failure during rolling restart


* Fix javadoc syntax error in SearchPhase

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

* Fix ReplicationCheckpoint constructor in unit tests

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

* Fix code formatting

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>

* Add CHANGELOG entry

Signed-off-by: Andrew Ross <andrross@amazon.com>

---------

Signed-off-by: Cuong Ha <cuong.ha@optimizely.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
tanyabti pushed a commit to tanyabti/OpenSearch that referenced this pull request Feb 24, 2026

Labels

  • autocut
  • backport 2.19
  • backport 3.2 — Backport to 3.2 branch
  • backport-failed
  • flaky-test — Random test failure that succeeds on second run
  • skip-changelog
  • >test-failure — Test failure from CI, local build, etc.

Development

Successfully merging this pull request may close these issues.

[AUTOCUT] Gradle Check Flaky Test Report for FullRollingRestartIT

4 participants