[BUG] Dual Replication - Failover to remote replica from remote primary fails when the replication group contains a docrep index #13158
Description
Describe the bug
When a replication group has at least one docrep shard copy, failover from a remote primary to a remote replica fails with "no retention lease for tracked shard".
During the dual replication phase, RetentionLeases generated on the primary shard are synced to the docrep copy through RetentionLeaseBackgroundSyncAction, but the replication call to remote-enabled replica copies is blocked. When the primary shard copy fails over to another remote-enabled replica, the invariant() check fails.
The code flows as follows. During a failover, the activatePrimaryMode() method of ReplicationTracker is invoked:
OpenSearch/server/src/main/java/org/opensearch/index/shard/IndexShard.java
Lines 784 to 790 in 7103e56
    replicationTracker.activatePrimaryMode(getLocalCheckpoint());
    if (indexSettings.isSegRepEnabledOrRemoteNode()) {
        // force publish a checkpoint once in primary mode so that replicas not caught up to previous primary
        // are brought up to date.
        checkpointPublisher.publish(this, getLatestReplicationCheckpoint());
    }
    postActivatePrimaryMode();
This enables the primaryMode flag on the ReplicationTracker instance, updates the local and global checkpoints, creates a retention lease for the shard itself, and runs the invariant() checks:
OpenSearch/server/src/main/java/org/opensearch/index/seqno/ReplicationTracker.java
Lines 1359 to 1364 in 7103e56
    primaryMode = true;
    updateLocalCheckpoint(shardAllocationId, checkpoints.get(shardAllocationId), localCheckpoint);
    updateGlobalCheckpointOnPrimary();
    addPeerRecoveryRetentionLeaseForSolePrimary();
    assert invariant();
The invariant() method checks for retention leases against all replicated shard copies. During dual replication, all docrep shard copies are marked as replicated.
OpenSearch/server/src/main/java/org/opensearch/index/seqno/ReplicationTracker.java
Lines 958 to 975 in 7103e56
    if (primaryMode && indexSettings.isSoftDeleteEnabled() && hasAllPeerRecoveryRetentionLeases) {
        // all tracked shard copies have a corresponding peer-recovery retention lease
        for (final ShardRouting shardRouting : routingTable.assignedShards()) {
            final CheckpointState cps = checkpoints.get(shardRouting.allocationId().getId());
            if (cps.tracked && cps.replicated) {
                assert retentionLeases.contains(getPeerRecoveryRetentionLeaseId(shardRouting))
                    : "no retention lease for tracked shard [" + shardRouting + "] in " + retentionLeases;
                assert PEER_RECOVERY_RETENTION_LEASE_SOURCE.equals(
                    retentionLeases.get(getPeerRecoveryRetentionLeaseId(shardRouting)).source()
                ) : "incorrect source ["
                    + retentionLeases.get(getPeerRecoveryRetentionLeaseId(shardRouting)).source()
                    + "] for ["
                    + shardRouting
                    + "] in "
                    + retentionLeases;
            }
        }
    }
Since retention leases weren't copied over from the old primary shard instance to the remote replica, the assertion trips on the newly promoted primary here.
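To make the failure mode concrete, here is a minimal standalone sketch of the check described above. It is a simplified model, not OpenSearch code: the `Copy` class, the node names, the `leaseId` scheme, and the flag values are all assumptions chosen for illustration. It models a promoted remote replica that holds only the lease it created for itself, while a docrep copy is still tracked and marked as replicated.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model; none of these types are OpenSearch classes.
class LeaseInvariantSketch {

    /** A shard copy as seen by the tracker on the primary. */
    static final class Copy {
        final String nodeId;
        final boolean tracked;
        final boolean replicated;

        Copy(String nodeId, boolean tracked, boolean replicated) {
            this.nodeId = nodeId;
            this.tracked = tracked;
            this.replicated = replicated;
        }
    }

    // Assumed lease ID scheme, used only to key leases in this sketch.
    static String leaseId(Copy copy) {
        return "peer_recovery/" + copy.nodeId;
    }

    /** Returns the node IDs of tracked, replicated copies that have no lease. */
    static List<String> copiesMissingLease(List<Copy> copies, Set<String> leaseIds) {
        List<String> missing = new ArrayList<>();
        for (Copy copy : copies) {
            if (copy.tracked && copy.replicated && !leaseIds.contains(leaseId(copy))) {
                missing.add(copy.nodeId);
            }
        }
        return missing;
    }

    /**
     * Models the promoted remote replica: the lease sync never reached it,
     * so the only lease it knows about is the one it creates for itself.
     */
    static List<String> simulateFailover() {
        Copy newPrimary = new Copy("remote-node-2", true, true);
        Copy docrep = new Copy("docrep-node-1", true, true);

        Set<String> leasesOnNewPrimary = new HashSet<>();
        leasesOnNewPrimary.add(leaseId(newPrimary));

        return copiesMissingLease(List.of(newPrimary, docrep), leasesOnNewPrimary);
    }

    public static void main(String[] args) {
        System.out.println("copies without a lease: " + simulateFailover());
    }
}
```

In this model the docrep copy is the only entry reported as missing a lease, which corresponds to the copy named in the "no retention lease for tracked shard" assertion message.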
We need to re-create retention leases for docrep shard copies and hold off from invoking this assertion until the leases are created.
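The ordering proposed above can be sketched in a small standalone model: re-create the missing leases first, and only then evaluate the check. This is an illustration under stated assumptions, not the actual fix. `LeaseRecreationSketch`, `recreateMissingLeases`, `invariantHolds`, and the `leaseId` scheme are hypothetical names, and the map simply records whether each copy is tracked and replicated.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "re-create leases, then assert"; not OpenSearch code.
class LeaseRecreationSketch {

    // Assumed lease ID scheme, used only to key leases in this sketch.
    static String leaseId(String nodeId) {
        return "peer_recovery/" + nodeId;
    }

    /** True when every tracked and replicated copy has a lease. */
    static boolean invariantHolds(Map<String, Boolean> trackedAndReplicated, Set<String> leaseIds) {
        for (Map.Entry<String, Boolean> entry : trackedAndReplicated.entrySet()) {
            if (entry.getValue() && !leaseIds.contains(leaseId(entry.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Returns a new lease set with a lease added for each tracked and replicated copy that lacks one. */
    static Set<String> recreateMissingLeases(Map<String, Boolean> trackedAndReplicated, Set<String> existing) {
        Set<String> leases = new HashSet<>(existing);
        for (Map.Entry<String, Boolean> entry : trackedAndReplicated.entrySet()) {
            if (entry.getValue()) {
                leases.add(leaseId(entry.getKey()));
            }
        }
        return leases;
    }

    public static void main(String[] args) {
        Map<String, Boolean> copies = new LinkedHashMap<>();
        copies.put("remote-node-2", true); // newly promoted primary
        copies.put("docrep-node-1", true); // docrep copy, tracked and replicated

        Set<String> leases = new HashSet<>();
        leases.add(leaseId("remote-node-2")); // only the primary's own lease exists

        System.out.println("before re-creation: " + invariantHolds(copies, leases));
        Set<String> fixed = recreateMissingLeases(copies, leases);
        System.out.println("after re-creation: " + invariantHolds(copies, fixed));
    }
}
```

The point of the sketch is the ordering: evaluating the check before the leases are re-created fails in this model, and evaluating it afterwards passes.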
Related component
Storage:Remote
To Reproduce
N/A
Expected behavior
Failover from a remote primary to either a docrep replica or a remote replica should work seamlessly during the dual replication phase.