NRG: peer removing all followers leaves membership in progress #7609

Merged
neilalexander merged 1 commit into main from raft-peer-remove-all-followers
Dec 4, 2025

@sciascid (Contributor) commented Dec 4, 2025

Fix the case where a node that removes all of its peers is left with membChanging set permanently. This prevents the resulting single-node cluster from admitting new nodes to form a larger cluster again. The membChanging flag prevents concurrent membership changes and is reset once a membership change is committed. In a single-node cluster, however, entries are never committed, so membChanging remains set.
The underlying problem is that single-node clusters do not work properly: entries are committed only after tracking successful append entry responses from followers. In a single-node cluster no one sends append entry responses, and the quorum check is not performed anywhere else.
The fix extracts a function, tryCommit, from trackResponse. tryCommit checks whether there is a quorum for a given index and, if so, commits the corresponding entry. We call this function if, after proposing a peer removal, we are left with a single-node cluster.
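
For readers unfamiliar with the mechanism, the following is a minimal, self-contained toy model of the behaviour described above. It is not the nats-server code. The `node` type, its fields, `newNode`, `quorum` and `proposeRemovePeer` are hypothetical names chosen for illustration, and only `tryCommit`, `trackResponse` and `membChanging` are taken from the description. The sketch also assumes the quorum is computed against the cluster size after the removal has been proposed.

```go
package main

import "fmt"

// node is a toy stand-in for a Raft leader (hypothetical, not the real type).
type node struct {
	peers        int            // cluster size, including this node
	commit       uint64         // highest committed index
	acks         map[uint64]int // index -> nodes known to have stored the entry
	membChanging bool           // true while a membership change is uncommitted
	membIndex    uint64         // index of the pending membership change
}

func newNode(peers int) *node {
	return &node{peers: peers, acks: make(map[uint64]int)}
}

// quorum is the number of acknowledgements needed to commit an entry.
func (n *node) quorum() int { return n.peers/2 + 1 }

// tryCommit commits index if a quorum has acknowledged it, and clears
// membChanging when the pending membership change becomes committed.
func (n *node) tryCommit(index uint64) bool {
	if index <= n.commit || n.acks[index] < n.quorum() {
		return false
	}
	n.commit = index
	delete(n.acks, index)
	if n.membChanging && index >= n.membIndex {
		n.membChanging = false
	}
	return true
}

// trackResponse records one follower acknowledgement and then checks quorum.
// With no followers it is never called, which is the bug being described.
func (n *node) trackResponse(index uint64) bool {
	n.acks[index]++
	return n.tryCommit(index)
}

// proposeRemovePeer appends a peer-removal entry at index. If the removal
// leaves a single-node cluster, no response will ever arrive, so the quorum
// check is run directly.
func (n *node) proposeRemovePeer(index uint64) {
	n.acks[index] = 1 // the leader's own copy counts as one acknowledgement
	n.membChanging = true
	n.membIndex = index
	n.peers--
	if n.peers == 1 {
		n.tryCommit(index)
	}
}

func main() {
	// Two nodes: removing the only follower leaves a single-node cluster.
	n := newNode(2)
	n.proposeRemovePeer(1)
	fmt.Println("single node: commit =", n.commit, "membChanging =", n.membChanging)

	// Three nodes: one follower remains, so its response drives the commit.
	m := newNode(3)
	m.proposeRemovePeer(1)
	fmt.Println("before response: membChanging =", m.membChanging)
	m.trackResponse(1)
	fmt.Println("after response: commit =", m.commit, "membChanging =", m.membChanging)
}
```

Without the `n.peers == 1` branch in `proposeRemovePeer`, the first scenario would leave `membChanging` set indefinitely, because nothing else in this model triggers the quorum check.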

Signed-off-by: Daniele Sciascia <daniele@nats.io>

@sciascid sciascid requested a review from a team as a code owner December 4, 2025 08:46

@MauriceVanVeen (Member) left a comment

LGTM

Fix the case where a node that removes all of its peers is left
with `membChanging` set permanently. This prevents the resulting
single-node cluster from admitting new nodes to form a larger
cluster again. The `membChanging` flag prevents concurrent
membership changes and is reset once a membership change is
committed. In a single-node cluster, however, entries are never
committed, so `membChanging` remains set.
The underlying problem is that single-node clusters do not work
properly: entries are committed only after tracking successful
append entry responses from followers. In a single-node cluster
no one sends append entry responses, and the quorum check is not
performed anywhere else.
The fix extracts a function, `tryCommit`, from `trackResponse`.
`tryCommit` checks whether there is a quorum for a given index
and, if so, commits the corresponding entry. We call this
function if, after proposing a peer removal, we are left with a
single-node cluster.

Signed-off-by: Daniele Sciascia <daniele@nats.io>
@sciascid sciascid force-pushed the raft-peer-remove-all-followers branch from b0ee3fc to 5a98bca Compare December 4, 2025 10:14

@neilalexander (Member) left a comment

LGTM

@neilalexander neilalexander merged commit 8cd302d into main Dec 4, 2025
68 of 70 checks passed
@neilalexander neilalexander deleted the raft-peer-remove-all-followers branch December 4, 2025 10:55
neilalexander added a commit that referenced this pull request Dec 5, 2025
Includes the following:

- #7581
- #7585
- #7586
- #7565
- #7588
- #7593
- #7589
- #7594
- #7595
- #7596
- #7597
- #7598
- #7600
- #7601
- #7602
- #7604
- #7605
- #7607
- #7609
- #7610
- #7616
- #7614

Signed-off-by: Neil Twigg <neil@nats.io>
neilalexander added a commit that referenced this pull request Jan 6, 2026
Includes the following:

- #7565
- #7589
- #7600
- #7602
- #7609
- #7610
- #7632
- #7649
- #7642
- #7658
- #7659
- #7661
- #7662
- #7663
- #7668
- #7683
- #7685
- #7686
- #7678
- #7691
- #7696
- #7698
- #7699
- #7700

Signed-off-by: Neil Twigg <neil@nats.io>
3 participants