Keep snapshot restore state and routing table in sync #20836

Merged
ywelsch merged 3 commits into elastic:master from ywelsch:fix/snap-restore-update-entries
Oct 12, 2016

Conversation

Contributor

@ywelsch ywelsch commented Oct 10, 2016

The snapshot restore state tracks information about shards being restored from a snapshot in the cluster state. For example, it records whether a shard has been successfully restored or whether restoring it was impossible due to a corruption of the snapshot. Recording these events is usually based on changes to the shard routing table, i.e., when a shard is started after a successful restore or failed after an unsuccessful one. Until now, there were two separate communication channels for transmitting recovery failure / success: one to update the routing table and one to update the restore state. This led to issues where a shard was failed but the restore state was not updated because of connection issues between the data and master nodes. In some rare situations, the restore state could then no longer be properly cleaned up by the master, making it impossible to start new restore operations. This change updates the routing table and the restore state in the same cluster state update so that both always stay in sync. It also eliminates the extra communication channel for restore operations and uses the standard cluster state listener mechanism to update restore listeners upon successful completion of a snapshot restore.

Closes #19774
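The core idea of the fix — deriving the restore-state change from the routing change and publishing both in one atomic cluster state update, so observers can never see one without the other — can be sketched in plain Java. All classes and fields below are illustrative stand-ins, not the actual Elasticsearch types:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: an immutable "cluster state" holding both a routing
// table and a restore state. A shard-started event produces a single new
// state in which both entries have been updated together.
final class ClusterStateSketch {
    final Map<String, String> routingTable;   // shardId -> routing status
    final Map<String, String> restoreState;   // shardId -> restore progress

    ClusterStateSketch(Map<String, String> routing, Map<String, String> restore) {
        this.routingTable = Map.copyOf(routing);
        this.restoreState = Map.copyOf(restore);
    }

    // One atomic state transition: the restore entry is derived from the
    // routing change and published in the same step, instead of being sent
    // over a second, independent communication channel.
    ClusterStateSketch shardStarted(String shardId) {
        Map<String, String> newRouting = new HashMap<>(routingTable);
        newRouting.put(shardId, "STARTED");
        Map<String, String> newRestore = new HashMap<>(restoreState);
        if (newRestore.containsKey(shardId)) {
            newRestore.put(shardId, "SUCCESS");
        }
        return new ClusterStateSketch(newRouting, newRestore);
    }
}
```

Because both maps are replaced in the same transition, a lost message can no longer leave the routing table saying "failed" while the restore state still says "in progress" — the failure mode described above.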

@ywelsch ywelsch added the >bug, review, :Distributed/Snapshot/Restore (anything directly related to the `_snapshot/*` APIs), and v6.0.0-alpha1 labels on Oct 10, 2016
Contributor

@imotov imotov left a comment


Left a couple of minor comments. Otherwise, LGTM.

assert newRoutingTable.validate(newMetaData); // validates the routing table is coherent with the cluster state metadata
final ClusterState newState = ClusterState.builder(oldState).routingTable(newRoutingTable).metaData(newMetaData).build();
final RestoreInProgress restoreInProgress = allocation.custom(RestoreInProgress.TYPE);
RestoreInProgress updatedRestoreInProgress = allocation.updateRestoreInfoWithRoutingChanges(restoreInProgress);
Contributor


That doesn't seem to cause any issues, but I think moving this into the if statement below might help clarify the logic. This method can be called when no restore takes place, and we kind of plunge head-on into updating restore info without even checking whether a restore is actually taking place. We check it inside updateRestoreInfoWithRoutingChanges, but I think it might make the logic clearer if we checked it here.
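The suggested restructuring might look roughly like this — a minimal, self-contained sketch where the types and names are simplified stand-ins for the real Elasticsearch API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the reviewer's suggestion: guard the restore-state
// update with an explicit "is a restore in progress?" check at the call site,
// rather than relying only on the check buried inside the update method.
final class RestoreGuardSketch {
    // Stand-in for RestoreInProgress: shardId -> restore status
    static Map<String, String> updateRestoreInfoWithRoutingChanges(Map<String, String> restoreInProgress) {
        Map<String, String> updated = new HashMap<>(restoreInProgress);
        updated.replaceAll((shard, status) -> "SUCCESS");
        return updated;
    }

    static Map<String, String> applyRoutingChanges(Map<String, String> restoreInProgress) {
        if (restoreInProgress == null || restoreInProgress.isEmpty()) {
            // No restore taking place: skip the restore-state update entirely,
            // making the "no-op" path visible at the call site.
            return restoreInProgress;
        }
        return updateRestoreInfoWithRoutingChanges(restoreInProgress);
    }
}
```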

Contributor Author


agree

RecoverySource recoverySource = failedShard.recoverySource();
if (recoverySource.getType() == RecoverySource.Type.SNAPSHOT) {
Snapshot snapshot = ((SnapshotRecoverySource) recoverySource).snapshot();
if (Lucene.isCorruptionException(unassignedInfo.getFailure().getCause())) {
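For context, a check like the `Lucene.isCorruptionException(...)` call above typically means inspecting the failure's cause chain for a corruption marker. A minimal self-contained sketch — `CorruptStoreException` here is a hypothetical stand-in, not the real Lucene exception class:

```java
// Hedged sketch of a corruption check: walk the exception cause chain and
// look for a corruption marker exception. CorruptStoreException is a made-up
// stand-in for Lucene's CorruptIndexException family.
final class CorruptionCheckSketch {
    static final class CorruptStoreException extends RuntimeException {
        CorruptStoreException(String message) {
            super(message);
        }
    }

    static boolean isCorruptionException(Throwable failure) {
        // Unwrap nested causes so a corruption buried under wrapper
        // exceptions is still detected.
        for (Throwable cause = failure; cause != null; cause = cause.getCause()) {
            if (cause instanceof CorruptStoreException) {
                return true;
            }
        }
        return false;
    }
}
```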
Contributor


Could you add a comment explaining why we only fail in case of lucene corruption?

Contributor Author


sure

@ywelsch ywelsch merged commit 0750470 into elastic:master Oct 12, 2016
Contributor Author

ywelsch commented Oct 12, 2016

thanks for the review @imotov

ywelsch added a commit to ywelsch/elasticsearch that referenced this pull request Oct 31, 2016
@ILMostro

Is there any hope of this being backported to 1.7 version?

@jasontedor
Member

No, sorry, 1.7 is end-of-life.


4 participants