Fix MergeSeen to filter Seen against current Members (#8009) #8011
Merged
Aaronontheweb merged 5 commits into akkadotnet:dev on Jan 26, 2026
Apply defensive fix to MergeSeen that ensures the invariant Seen ⊆ Members is always maintained. The merged seen set is now intersected with current member addresses, preventing stale entries from corrupting gossip state. Closes akkadotnet#8009
98e8044 to 905f17b
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Jan 25, 2026
Changes:
- LeaderDowningNodeThatIsUnreachableSpec: Fix bug where test tried to run on Second node after it was already exited (line 143)
- NodeDowningAndBeingRemovedSpec: Convert to async, increase outer timeout from 30s to 45s, add explicit timeouts to AwaitConditionAsync/AwaitAssertAsync
- NodeLeavingAndExitingAndBeingRemovedSpec: Convert to async, increase outer timeout from 15s to 45s for CI variability, add explicit timeouts

These tests are likely affected by PR akkadotnet#8011's MergeSeen filter fix which changes gossip convergence timing.
Aaronontheweb added a commit that referenced this pull request Jan 25, 2026
…8025)

* Fix flaky multi-node cluster tests for member removal

  Changes:
  - LeaderDowningNodeThatIsUnreachableSpec: Fix bug where test tried to run on Second node after it was already exited (line 143)
  - NodeDowningAndBeingRemovedSpec: Convert to async, increase outer timeout from 30s to 45s, add explicit timeouts to AwaitConditionAsync/AwaitAssertAsync
  - NodeLeavingAndExitingAndBeingRemovedSpec: Convert to async, increase outer timeout from 15s to 45s for CI variability, add explicit timeouts

  These tests are likely affected by PR #8011's MergeSeen filter fix which changes gossip convergence timing.

* Address review feedback: remove redundant explicit timeouts

  - Remove explicit timeout args from AwaitAssertAsync/AwaitConditionAsync calls inside WithinAsync blocks (timeouts are inherited from outer block)
  - Move address caching outside WithinAsync block for cleaner code
  - Keep CancellationToken.None as it's required by the API signature

* Fix SBR and ClusterSharding multi-node test race conditions

  SBR Tests (IndirectlyConnected3NodeSpec, IndirectlyConnected5NodeSpec, DownAllIndirectlyConnected5NodeSpec):
  - Replace polling via AwaitConditionAsync with event-driven callback
  - Use cluster.RegisterOnMemberRemoved() for immediate notification
  - The callback fires as soon as the member is removed or cluster daemon stops
  - Eliminates race between polling interval and actual state change
  - Convert remaining sync methods to async pattern

  ClusterShardingRolePartitioningSpec:
  - Wrap first message send in AwaitAssert to handle coordinator readiness
  - The coordinator may not respond to GetShardHome until HasAllRegionsRegistered()
  - GetShardHome requests are silently ignored until _aliveRegions.Count >= _minMembers
  - The retry pattern ensures we wait for coordinator readiness without timeout jiggling

* Convert ClusterShardingRolePartitioningSpec to async TestKit methods

  - Convert test methods to return Task and use await
  - Use AwaitClusterUpAsync, RunOnAsync, EnterBarrierAsync
  - Use AwaitAssertAsync and ExpectMsgAsync patterns
  - Maintains the coordinator readiness fix from previous commit
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Jan 26, 2026
akkadotnet#8011) Apply defensive fix to MergeSeen that ensures the invariant Seen ⊆ Members is always maintained. The merged seen set is now intersected with current member addresses, preventing stale entries from corrupting gossip state. Closes akkadotnet#8009
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this pull request Jan 26, 2026
Documents all 8 backported PRs for the 1.5.59 release including:
- Critical cluster gossip fix (akkadotnet#8011)
- Bug fixes for logging, inbox, persistence, and TestKit
- New features: ActivityContext capture and BroadcastHub improvements
- CoordinatedShutdown logging enhancement
This was referenced Jan 27, 2026
This was referenced Feb 9, 2026
This was referenced Feb 11, 2026
This was referenced Feb 20, 2026
Problem
ClusterMessageSerializer.GossipToProto throws ArgumentException: Unknown address when Seen contains addresses not present in Members.
Root Cause
The gossip protocol lacks tombstones for removed members. Without tombstones, removed members can be reintroduced during gossip merging, and their Seen entries persist after the member is removed from Members.
The MergeSeen method performed a blind union of seen sets without filtering against current membership.
Solution
Apply defensive fix to MergeSeen that ensures the invariant Seen ⊆ Members is always maintained:
Characteristics
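To make the Solution concrete, here is a minimal Python sketch of the idea, not the actual Akka.NET change. The Gossip class, the function names, and the string addresses are all hypothetical. The serializer is modelled on the assumption that seen entries are encoded as indices into an address table built from the members, which would explain why an address missing from Members cannot be serialized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gossip:
    """Hypothetical, simplified gossip state: two sets of node addresses."""
    members: frozenset
    seen: frozenset


def merge_seen_unfiltered(a: Gossip, b: Gossip) -> frozenset:
    # Behaviour described under Root Cause: a blind union of both seen sets.
    return a.seen | b.seen


def merge_seen_filtered(a: Gossip, b: Gossip, current_members: frozenset) -> frozenset:
    # Behaviour described under Solution: union, then intersect with the
    # current member addresses so that Seen ⊆ Members always holds.
    return (a.seen | b.seen) & current_members


def to_proto_indices(members: frozenset, seen: frozenset) -> list:
    # Assumed serializer shape (not the real GossipToProto): each seen entry
    # is encoded as an index into an address table built from the members.
    table = {addr: i for i, addr in enumerate(sorted(members))}
    indices = []
    for addr in sorted(seen):
        if addr not in table:
            raise ValueError(f"Unknown address: {addr}")
        indices.append(table[addr])
    return indices


# Node "c" was removed locally, but a remote gossip still lists it as seen.
local = Gossip(members=frozenset({"a", "b"}), seen=frozenset({"a"}))
remote = Gossip(members=frozenset({"a", "b"}), seen=frozenset({"b", "c"}))

stale = merge_seen_unfiltered(local, remote)               # still contains "c"
clean = merge_seen_filtered(local, remote, local.members)  # "c" filtered out
```

In this sketch, to_proto_indices(local.members, stale) raises ValueError because "c" has no slot in the address table, while to_proto_indices(local.members, clean) succeeds. The intersection is deliberately defensive: it does not stop a removed member from being reintroduced, it only keeps the seen set consistent with whatever membership the merge produced.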
Future Work
A more comprehensive fix using tombstones is planned for 1.6.0 - see #8015.
Closes #8009