Description
Background
In the segment replication scenario, to avoid file name conflicts after a replica is promoted to primary, the segment counter is incremented by 100000 when NRTReplicationEngine is closed.
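To make the effect of that bump concrete, here is a minimal sketch. It relies on Lucene deriving a new segment's name from the segment counter written in base 36. The class and helper names are hypothetical and are not OpenSearch code, and the starting counter value is an assumption.

```java
// Sketch only: models how a segment counter maps to a segment name.
// Class and method names are hypothetical, not taken from OpenSearch.
final class SegmentCounterBump {
    // Offset taken from the description above.
    static final long COUNTER_BUMP = 100_000L;

    // Lucene names a new segment "_" + counter in base 36.
    static String segmentName(long counter) {
        return "_" + Long.toString(counter, Character.MAX_RADIX);
    }

    // Hypothetical helper: the counter value after the bump applied on engine close.
    static long bumpedCounter(long counter) {
        return counter + COUNTER_BUMP;
    }

    public static void main(String[] args) {
        long counterOnReplica = 1; // assumed: segment _0 already exists
        System.out.println(segmentName(counterOnReplica));                // _1
        System.out.println(segmentName(bumpedCounter(counterOnReplica))); // _255t
    }
}
```

The bump only moves the names of newly created segments out of the old primary's range. Files that belong to an existing segment, such as _0_2_Lucene90_0.dvm, carry that segment's name plus a generation suffix, which this counter does not appear to influence.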
In the earlier discussion #5701, it was evaluated whether to update all generation fields in SegmentCommitInfo. Since the environment where the issue occurred was deleted and it was difficult to reproduce the problem, the issue was ultimately closed.
Currently, InternalEngine only uses soft deletes, so this issue focuses on the segment replication scenario with soft deletes enabled. After studying the current segment replication implementation, I believe an OpenSearchCorruptionException can indeed be thrown in this scenario.
Reproduce
To reproduce the issue, I constructed an IT test (testPrimaryStopped_ReplicaPromoted_UpdateDoc) based on the latest main branch and submitted it to branch.
The exception stack is as follows.
[2025-12-23T18:05:06,031][ERROR][o.o.i.r.SegmentReplicationTargetService] [node_t1] [shardId [test-idx-1][0]] [replication id 611] Replication failed, timing data: {INIT=0, GET_CHECKPOINT_INFO=0, REPLICATING=0}
org.opensearch.indices.replication.common.ReplicationFailedException: Store corruption during replication
at org.opensearch.indices.replication.SegmentReplicator$2.onFailure(SegmentReplicator.java:350) [main/:?]
at org.opensearch.core.action.ActionListener$1.onFailure(ActionListener.java:90) [opensearch-core-3.5.0-SNAPSHOT.jar:3.5.0-SNAPSHOT]
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:84) [opensearch-core-3.5.0-SNAPSHOT.jar:3.5.0-SNAPSHOT]
at org.opensearch.common.util.concurrent.ListenableFuture$1.doRun(ListenableFuture.java:126) [main/:?]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [main/:?]
at org.opensearch.common.util.concurrent.OpenSearchExecutors$DirectExecutorService.execute(OpenSearchExecutors.java:341) [main/:?]
at org.opensearch.common.util.concurrent.ListenableFuture.notifyListener(ListenableFuture.java:120) [main/:?]
at org.opensearch.common.util.concurrent.ListenableFuture.lambda$done$0(ListenableFuture.java:112) [main/:?]
at java.base/java.util.ArrayList.forEach(ArrayList.java:1604) [?:?]
at org.opensearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:112) [main/:?]
at org.opensearch.common.util.concurrent.BaseFuture.set(BaseFuture.java:160) [main/:?]
at org.opensearch.common.util.concurrent.ListenableFuture.onResponse(ListenableFuture.java:141) [main/:?]
at org.opensearch.action.StepListener.innerOnResponse(StepListener.java:79) [main/:?]
at org.opensearch.core.action.NotifyOnceListener.onResponse(NotifyOnceListener.java:58) [opensearch-core-3.5.0-SNAPSHOT.jar:3.5.0-SNAPSHOT]
at org.opensearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:70) [main/:?]
at org.opensearch.telemetry.tracing.handler.TraceableTransportResponseHandler.handleResponse(TraceableTransportResponseHandler.java:73) [main/:?]
at org.opensearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1587) [main/:?]
at org.opensearch.transport.NativeMessageHandler.doHandleResponse(NativeMessageHandler.java:468) [main/:?]
at org.opensearch.transport.NativeMessageHandler.lambda$handleResponse$3(NativeMessageHandler.java:462) [main/:?]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:916) [main/:?]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614) [?:?]
at java.base/java.lang.Thread.run(Thread.java:1474) [?:?]
Caused by: org.opensearch.OpenSearchCorruptionException: Shard [test-idx-1][0] has local copies of segments that differ from the primary [name [_0_2_Lucene90_0.dvm], length [160], checksum [12940pr], writtenBy [10.3.2], name [_0_2_Lucene90_0.dvd], length [91], checksum [1aenr37], writtenBy [10.3.2]]
at org.opensearch.indices.replication.AbstractSegmentReplicationTarget.getFiles(AbstractSegmentReplicationTarget.java:251) ~[main/:?]
at org.opensearch.indices.replication.AbstractSegmentReplicationTarget.lambda$startReplication$2(AbstractSegmentReplicationTarget.java:180) ~[main/:?]
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) ~[opensearch-core-3.5.0-SNAPSHOT.jar:3.5.0-SNAPSHOT]
... 20 more
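The message in the trace suggests a comparison of per-file metadata (name, length, checksum) between the local store and the primary. The sketch below illustrates that kind of check and is not the actual AbstractSegmentReplicationTarget code. The class, the FileMeta type, and the method names are hypothetical. The first length and checksum for each _0_2 file are copied from the log, while the second set and the _0.si entry are made up for illustration. Which side of the comparison the logged values belong to is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: hypothetical types, not the OpenSearch implementation.
final class SegmentFileDiffSketch {
    // Hypothetical stand-in for the per-file metadata printed in the log.
    static final class FileMeta {
        final String name;
        final long length;
        final String checksum;

        FileMeta(String name, long length, String checksum) {
            this.name = name;
            this.length = length;
            this.checksum = checksum;
        }

        boolean sameContentAs(FileMeta other) {
            return length == other.length && checksum.equals(other.checksum);
        }
    }

    // Returns names present on both sides whose length or checksum differ.
    static List<String> differingFiles(Map<String, FileMeta> local, Map<String, FileMeta> primary) {
        List<String> differing = new ArrayList<>();
        for (FileMeta remote : primary.values()) {
            FileMeta mine = local.get(remote.name);
            if (mine != null && !mine.sameContentAs(remote)) {
                differing.add(remote.name);
            }
        }
        return differing;
    }

    private static void put(Map<String, FileMeta> store, String name, long length, String checksum) {
        store.put(name, new FileMeta(name, length, checksum));
    }

    // Builds two small stores and returns the conflicting file names.
    static List<String> demo() {
        Map<String, FileMeta> primary = new TreeMap<>();
        put(primary, "_0.si", 100, "samesum");              // made-up values
        put(primary, "_0_2_Lucene90_0.dvm", 160, "12940pr"); // values from the log
        put(primary, "_0_2_Lucene90_0.dvd", 91, "1aenr37");  // values from the log

        Map<String, FileMeta> local = new TreeMap<>();
        put(local, "_0.si", 100, "samesum");                // identical, so not reported
        put(local, "_0_2_Lucene90_0.dvm", 160, "aaaaaaa");   // made-up checksum
        put(local, "_0_2_Lucene90_0.dvd", 91, "bbbbbbb");    // made-up checksum

        return differingFiles(local, primary);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model a file with the same name and the same length is still reported when only the checksum differs, which matches the situation described in this issue.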
The test process is as follows:
- Start two data nodes.
- Create an index containing 1 primary shard and 1 replica shard. Turn off automatic refresh.
- Write 20 documents.
- Execute a refresh, which generates _0.si.
- Wait for segment replication to complete, so that both the primary and replica shards contain 20 documents.
- Mock the segment replication process so that segment replication between the primary and replica shards cannot complete.
- Update the doc with id 5 and execute a refresh. The primary shard generates _0_1_Lucene90_0.dvm.
- Update the doc with id 6 and execute a refresh. The primary shard generates _0_2_Lucene90_0.dvm.
- Restart the node where the primary shard is located.
- The replica is promoted to the new primary shard and generates _0_1_Lucene90_0.dvm through translog recovery, but its content differs from the earlier file with the same name.
- Update the doc with id 7. The new primary shard generates _0_2_Lucene90_0.dvm, but its content differs from the earlier file with the same name.
- Peer recovery fails with OpenSearchCorruptionException when performing force segment replication.
- After OpenSearchCorruptionException is thrown, file-based recovery is executed, and the cluster eventually turns green.
This test reproduces the situation where OpenSearchCorruptionException occurs, but the test still passes because recovery is retried.
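The steps above can be modeled with a small sketch that shows how two shard copies end up with identically named doc-values files holding different content. It is a toy model, not OpenSearch or Lucene code. The ShardCopy class and its methods are hypothetical. Each copy is assumed to number doc-values generations for segment _0 independently, starting at 1. Translog recovery on the promoted replica is assumed to apply the updates to docs 5 and 6 before a single refresh. The model does not track which files are still referenced after later generations are written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not the OpenSearch implementation.
final class DocValuesGenerationCollision {
    // Builds a name like _0_1_Lucene90_0.dvm: segment, generation in base 36, suffix, extension.
    static String dvFileName(String segment, long generation, String extension) {
        return segment + "_" + Long.toString(generation, Character.MAX_RADIX) + "_Lucene90_0." + extension;
    }

    // One shard copy with its own doc-values generation counter for segment _0.
    static final class ShardCopy {
        private long nextGeneration = 1;
        final Map<String, String> files = new TreeMap<>(); // file name -> simplified content

        // One refresh writes one new generation covering the given updated doc ids.
        void refreshWithUpdates(String... docIds) {
            String name = dvFileName("_0", nextGeneration++, "dvm");
            files.put(name, "updates=" + String.join(",", docIds));
        }
    }

    // Returns file names that exist on both copies but hold different content.
    static List<String> sameNameDifferentContent(ShardCopy a, ShardCopy b) {
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, String> entry : a.files.entrySet()) {
            String other = b.files.get(entry.getKey());
            if (other != null && !other.equals(entry.getValue())) {
                conflicts.add(entry.getKey());
            }
        }
        return conflicts;
    }

    static List<String> demo() {
        ShardCopy oldPrimary = new ShardCopy();
        oldPrimary.refreshWithUpdates("5"); // writes _0_1_Lucene90_0.dvm
        oldPrimary.refreshWithUpdates("6"); // writes _0_2_Lucene90_0.dvm

        // Replication was blocked, so the replica never saw those generations.
        ShardCopy promotedReplica = new ShardCopy();
        promotedReplica.refreshWithUpdates("5", "6"); // assumed translog replay, then one refresh
        promotedReplica.refreshWithUpdates("7");      // writes its own _0_2_Lucene90_0.dvm

        return sameNameDifferentContent(oldPrimary, promotedReplica);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [_0_1_Lucene90_0.dvm, _0_2_Lucene90_0.dvm]
    }
}
```

Because both copies reuse generations 1 and 2 for the same segment, a name-based comparison sees matching names with mismatched content, which is the condition reported by the exception above.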
Expected behavior
In the segment replication scenario with soft deletes enabled, OpenSearchCorruptionException should not occur.