Recover all translogs during relocation handoff for remote-backed indexes #6314

Merged
gbbafna merged 1 commit into opensearch-project:main from ashking94:6214
Feb 14, 2023

@ashking94
Member

@ashking94 ashking94 commented Feb 14, 2023

Signed-off-by: Ashish Singh ssashish@amazon.com

Description

As discussed in #6214, some documents indexed on the old primary during relocation were not visible in searches on the new primary. This happened because recoverFromTranslog replayed operations only up to the last known global checkpoint, whereas it should replay up to Long.MAX_VALUE when the remote translog store is enabled.

newEngineReference.get()
.translogManager()
.recoverFromTranslog(translogRunner, newEngineReference.get().getProcessedLocalCheckpoint(), globalCheckpoint);
newEngineReference.get().refresh("reset_engine");

When remote translog is enabled, the remote store is the source of truth for translogs. During relocation we reset the engine to a writeable engine. When this happens, we initialise RemoteFsTranslog, which downloads all the translogs from the remote store. The fix is to replay all of those translogs here, rather than stopping at the global checkpoint.
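
To illustrate why the replay bound matters, here is a minimal, self-contained sketch. It is not the OpenSearch implementation: the class `TranslogReplaySketch` and its methods are hypothetical, and the model assumes (as a simplification) that replay covers every operation whose sequence number is above the local checkpoint and at or below the chosen upper bound.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of translog replay. Not OpenSearch code.
final class TranslogReplaySketch {

    // Picks the inclusive upper bound for replay. With a remote translog,
    // everything downloaded from the remote store is replayed.
    static long recoverUpToSeqNo(boolean remoteTranslogEnabled, long globalCheckpoint) {
        return remoteTranslogEnabled ? Long.MAX_VALUE : globalCheckpoint;
    }

    // Returns the sequence numbers that would be replayed: those above the
    // local checkpoint and at or below the upper bound (assumed semantics).
    static List<Long> replay(List<Long> translogSeqNos, long localCheckpoint, long upperBound) {
        List<Long> replayed = new ArrayList<>();
        for (long seqNo : translogSeqNos) {
            if (seqNo > localCheckpoint && seqNo <= upperBound) {
                replayed.add(seqNo);
            }
        }
        return replayed;
    }

    public static void main(String[] args) {
        List<Long> translog = List.of(0L, 1L, 2L, 3L, 4L);
        long localCheckpoint = 1L;
        long globalCheckpoint = 2L;

        // Bounded by the global checkpoint: operations 3 and 4 are skipped.
        System.out.println(replay(translog, localCheckpoint, recoverUpToSeqNo(false, globalCheckpoint)));

        // Bounded by Long.MAX_VALUE: every operation above the local checkpoint is replayed.
        System.out.println(replay(translog, localCheckpoint, recoverUpToSeqNo(true, globalCheckpoint)));
    }
}
```

In this toy model, operations 3 and 4 stand in for documents that were indexed on the old primary but lie above the last known global checkpoint. They are the ones that go missing when the bound is the global checkpoint.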

Issues Resolved

This fixes #6214.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…exes

Signed-off-by: Ashish Singh <ssashish@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@gbbafna gbbafna merged commit ac32259 into opensearch-project:main Feb 14, 2023
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Feb 14, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-6314-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ac322595eb1838b68e4d9a46cf97cbe56b811aed
# Push it to GitHub
git push --set-upstream origin backport/backport-6314-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-6314-to-2.x.

@ashking94
Member Author

Created a manual backport PR: #6318

Labels

  • backport 2.x: Backport to 2.x branch
  • skip-changelog
  • Storage:Durability: Issues and PRs related to the durability framework
  • v2.6.0: Issues and PRs related to version v2.6.0

Development

Successfully merging this pull request may close these issues.

[BUG] Data loss during primary relocation for remote-backed indexes