Closed
Labels
Storage:Durability, distributed framework, enhancement, v2.5.0
Description
No-op replication was discussed at a high level in #3706. The proposal can be found here, and the selected approach is discussed here.
In brief, this exercise aims to achieve the following:
- Use the replication call only for primary term validation; the call itself is a no-op.
- Since replicas no longer need the translog, replica recovery needs to be refactored: Lucene operations no longer have to be replayed. Replica-to-primary promotion and replica-to-replica recovery need similar refactoring.
- PRRL retains a certain history of operations so that, if a recovery happens, it can replay Lucene operations. Since replicas no longer replay operations, acquiring PRRL during replica recovery can be removed when remote translog is enabled.
- Global and local checkpoints are used in the context of indexing and the translog. Since the translog will be stored remotely, it is guaranteed to be available after a node failure, so the checkpoint interaction between primary and replica has to be decoupled. Global and local checkpoints should no longer be relevant with remote translog.
- Redefine in-sync shards: any replica can now be in-sync, since segments and translogs are available remotely.
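The first goal above can be illustrated with a minimal sketch of a replica that uses the replication call only to validate the primary term. This is not OpenSearch code: `ReplicaShard`, `handle_replication_request`, and `StalePrimaryTermError` are hypothetical names chosen for illustration, and the behaviour on a newer term is an assumption.

```python
from dataclasses import dataclass


class StalePrimaryTermError(Exception):
    """Raised when a request carries an older primary term than the replica knows."""


@dataclass
class ReplicaShard:
    # Hypothetical stand-in for a replica shard; not an OpenSearch class.
    primary_term: int
    applied_ops: int = 0  # never incremented: the replica does no indexing work

    def handle_replication_request(self, request_primary_term: int) -> int:
        """Validate the primary term and otherwise do nothing (no-op)."""
        if request_primary_term < self.primary_term:
            # The sender is an isolated or demoted primary; reject the request.
            raise StalePrimaryTermError(
                f"request term {request_primary_term} < replica term {self.primary_term}"
            )
        # Assumption: a newer term is adopted so later stale requests are rejected.
        self.primary_term = request_primary_term
        # No Lucene operation is applied and no translog entry is written.
        return self.primary_term


replica = ReplicaShard(primary_term=3)
replica.handle_replication_request(4)      # accepted, term moves to 4
try:
    replica.handle_replication_request(3)  # stale primary
except StalePrimaryTermError as err:
    print("rejected:", err)
```

The point of the sketch is that the request still travels to the replica, so a stale primary can be detected, while the replica performs no indexing or translog writes.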
This issue is a meta-issue for tracking the progress of implementing no-op replication. We will proceed with implementing no-op replication along the following rough plan:
- [Remote Store] Change behaviour in replica recovery for remote translog enabled indices #4318
- [Remote Store] Primary term validation with replicas - New approach POC #5033
- [Remote Store] Remove acquire PRRL during replica recovery if remote translog is enabled #4502
- [Remote Store] Primary term validation in TransportShardBulkAction on replicas when remote translog is enabled #5464
- Stop creation of empty translog in the cleanFiles method of RecoveryTarget. TODO - create issue.
- [Remote Store] Introduce replication tracker proxy layer to make checkpoint related methods as no-op #4503
- [Remote Store] Remove reliance of checkpoints from primary shard if remote translog is enabled #4504
- [Remote Store] Remove reliance of checkpoints from replica shards if remote translog is enabled #4505
- [Remote Store] Redefine the insync allocations if remote translog is enabled #4506
- Refactor the sync/async calls from primary shards to replicas that modify the checkpoints in the replica's replication tracker.
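As a rough illustration of the "replication tracker proxy layer" item (#4503), the sketch below wraps a tracker so that checkpoint updates become no-ops when remote translog is enabled. All names here (`ReplicationTracker`, `ReplicationTrackerProxy`, `update_global_checkpoint`, `remote_translog_enabled`) and the initial checkpoint value are illustrative assumptions, not the actual OpenSearch API or the design chosen in the linked issue.

```python
class ReplicationTracker:
    """Hypothetical tracker that records the highest global checkpoint seen."""

    def __init__(self) -> None:
        self.global_checkpoint = -1  # illustrative initial value

    def update_global_checkpoint(self, value: int) -> None:
        # Checkpoints only move forward.
        self.global_checkpoint = max(self.global_checkpoint, value)


class ReplicationTrackerProxy:
    """Forwards checkpoint updates unless remote translog is enabled."""

    def __init__(self, delegate: ReplicationTracker, remote_translog_enabled: bool) -> None:
        self._delegate = delegate
        self._remote_translog_enabled = remote_translog_enabled

    def update_global_checkpoint(self, value: int) -> None:
        if self._remote_translog_enabled:
            return  # no-op: checkpoints are not relied on with remote translog
        self._delegate.update_global_checkpoint(value)

    @property
    def global_checkpoint(self) -> int:
        return self._delegate.global_checkpoint


local = ReplicationTrackerProxy(ReplicationTracker(), remote_translog_enabled=False)
remote = ReplicationTrackerProxy(ReplicationTracker(), remote_translog_enabled=True)
for proxy in (local, remote):
    proxy.update_global_checkpoint(7)
print(local.global_checkpoint, remote.global_checkpoint)
```

Keeping the switch inside a proxy means callers on the primary and replica paths would not need to know whether remote translog is enabled.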