[BUG] Close refresh listeners during primary relocation of remote enabled indexes #11320

@ashking94

Describe the bug
During peer recovery relocation, we close the internal refresh listeners that implement Closeable. This ensures that all ongoing refreshes drain before the primary hand-off. A bug in the code prevents the listeners from being closed: the chain below ends by calling close() on the stream itself, not on each listener.

// Ensures all in-flight remote store operations drain, before we perform the handoff.
internalRefreshListener.stream()
    .filter(refreshListener -> refreshListener instanceof Closeable)
    .map(refreshListener -> (Closeable) refreshListener)
    .close();
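
To illustrate why the snippet above leaves the listeners open, here is a minimal, self-contained sketch. `CloseListenersSketch`, `TrackingListener`, `closeStreamOnly`, and `closeEachListener` are hypothetical names made up for this example and are not part of OpenSearch. It relies only on the documented behavior of `java.util.stream.BaseStream.close()`, which runs the stream's registered close handlers and does not touch the elements. The per-element loop is one possible shape for a fix, not necessarily the one that will be merged.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class CloseListenersSketch {

    /** Hypothetical stand-in for a refresh listener that also implements Closeable. */
    static final class TrackingListener implements Closeable {
        private boolean closed;

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Mirrors the reported snippet: close() here is BaseStream.close(), so no element is closed. */
    static void closeStreamOnly(List<Object> listeners) {
        listeners.stream()
            .filter(listener -> listener instanceof Closeable)
            .map(listener -> (Closeable) listener)
            .close();
    }

    /** One possible fix: close every Closeable listener, attempting all of them before failing. */
    static void closeEachListener(List<Object> listeners) {
        IOException failure = null;
        for (Object listener : listeners) {
            if (listener instanceof Closeable) {
                try {
                    ((Closeable) listener).close();
                } catch (IOException e) {
                    if (failure == null) {
                        failure = e;
                    } else {
                        failure.addSuppressed(e);
                    }
                }
            }
        }
        if (failure != null) {
            throw new UncheckedIOException(failure);
        }
    }

    public static void main(String[] args) {
        TrackingListener listener = new TrackingListener();
        List<Object> listeners = new ArrayList<>();
        listeners.add(listener);
        listeners.add(new Object()); // a listener that is not Closeable is skipped

        closeStreamOnly(listeners);
        System.out.println("closed after stream.close(): " + listener.isClosed());

        closeEachListener(listeners);
        System.out.println("closed after per-listener close: " + listener.isClosed());
    }
}
```

Collecting the first exception and suppressing the rest is a design choice in this sketch so that one failing listener does not stop the others from draining.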

This was caught by the following check, which runs during shard creation.

public static void verifyNoMultipleWriters(List<String> mdFiles, Function<String, Tuple<String, String>> fn) {
    Map<String, String> nodesByPrimaryTermAndGen = new HashMap<>();
    mdFiles.forEach(mdFile -> {
        Tuple<String, String> nodeIdByPrimaryTermAndGen = fn.apply(mdFile);
        if (nodeIdByPrimaryTermAndGen != null) {
            if (nodesByPrimaryTermAndGen.containsKey(nodeIdByPrimaryTermAndGen.v1())
                && (!nodesByPrimaryTermAndGen.get(nodeIdByPrimaryTermAndGen.v1()).equals(nodeIdByPrimaryTermAndGen.v2()))) {
                throw new IllegalStateException(
                    "Multiple metadata files from different nodes"
                        + "having same primary term and generations "
                        + nodeIdByPrimaryTermAndGen.v1()
                        + " detected "
                );
            }
            nodesByPrimaryTermAndGen.put(nodeIdByPrimaryTermAndGen.v1(), nodeIdByPrimaryTermAndGen.v2());
        }
    });
}
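
As a rough illustration of when this check trips, the sketch below reimplements the same logic without OpenSearch dependencies. Everything in it is an assumption made for the example: `MultipleWritersCheckSketch` and `parse` are hypothetical, `Map.Entry` stands in for `Tuple`, and the file-name layout `<primaryTerm>_<generation>@<nodeId>` is invented here and is not the real metadata file naming scheme.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MultipleWritersCheckSketch {

    /**
     * Hypothetical parser for the invented layout "<primaryTerm>_<generation>@<nodeId>".
     * Returns (termAndGeneration, nodeId), or null when the name does not match.
     */
    static Map.Entry<String, String> parse(String mdFile) {
        int at = mdFile.indexOf('@');
        if (at < 0) {
            return null;
        }
        return new SimpleImmutableEntry<>(mdFile.substring(0, at), mdFile.substring(at + 1));
    }

    /** Fails if two files share a primary term and generation but were written by different nodes. */
    static void verifyNoMultipleWriters(List<String> mdFiles, Function<String, Map.Entry<String, String>> fn) {
        Map<String, String> nodeByTermAndGen = new HashMap<>();
        for (String mdFile : mdFiles) {
            Map.Entry<String, String> parsed = fn.apply(mdFile);
            if (parsed == null) {
                continue; // unparseable names are ignored, as in the original check
            }
            String previousNode = nodeByTermAndGen.putIfAbsent(parsed.getKey(), parsed.getValue());
            if (previousNode != null && !previousNode.equals(parsed.getValue())) {
                throw new IllegalStateException(
                    "Multiple metadata files from different nodes having same primary term and generation "
                        + parsed.getKey()
                        + " detected"
                );
            }
        }
    }

    public static void main(String[] args) {
        // Same node writing twice for one term and generation: accepted.
        verifyNoMultipleWriters(List.of("5_12@nodeA", "5_12@nodeA", "5_13@nodeA"), MultipleWritersCheckSketch::parse);

        // Old and new primary both writing for the same term and generation: rejected.
        try {
            verifyNoMultipleWriters(List.of("5_12@nodeA", "5_12@nodeB"), MultipleWritersCheckSketch::parse);
        } catch (IllegalStateException e) {
            System.out.println("check failed as expected: " + e.getMessage());
        }
    }
}
```

The second call models the situation in this report: an upload from the old primary lands alongside one from the new primary for the same primary term and generation.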

To Reproduce
Simulate conditions in which a retry is in progress in RemoteStoreRefreshListener during relocation. The retried upload can then succeed after the new primary has started uploading segment and translog files.

Expected behavior
The older primary should not upload once the handoff has been done.
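
The expected behavior can be pictured with a small sketch of a closeable uploader that rejects work after `close()`. This is only an assumption about one way to get the drain-then-block behavior, using a single-permit `java.util.concurrent.Semaphore`; `DrainOnCloseSketch`, `GuardedUploader`, `tryUpload`, and `uploadCount` are hypothetical names, and the code does not describe how RemoteStoreRefreshListener is actually implemented.

```java
import java.io.Closeable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class DrainOnCloseSketch {

    /** Hypothetical uploader: at most one attempt runs at a time, and none run after close(). */
    static final class GuardedUploader implements Closeable {
        private final Semaphore permit = new Semaphore(1);
        private final AtomicBoolean closed = new AtomicBoolean();
        private final AtomicInteger uploads = new AtomicInteger();

        /** Runs one upload attempt; returns false without uploading if closed or already busy. */
        boolean tryUpload() {
            if (!permit.tryAcquire()) {
                return false;
            }
            try {
                uploads.incrementAndGet(); // placeholder for the real upload work
                return true;
            } finally {
                permit.release();
            }
        }

        /** Waits for any in-flight attempt to finish, then keeps the permit so later attempts are rejected. */
        @Override
        public void close() {
            if (closed.compareAndSet(false, true)) {
                permit.acquireUninterruptibly();
            }
        }

        int uploadCount() {
            return uploads.get();
        }
    }

    public static void main(String[] args) {
        GuardedUploader uploader = new GuardedUploader();
        System.out.println("before close: " + uploader.tryUpload());
        uploader.close(); // the hand-off point in this sketch
        System.out.println("after close: " + uploader.tryUpload());
    }
}
```

In this sketch a retry scheduled before the hand-off would call `tryUpload()` after `close()` and be turned away, which is the outcome the expected behavior asks for.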

Labels

Storage:Durability, Storage:Remote, bug, v2.12.0