TestKit and Akka.Remote.TestKit: diagnostic improvements and code modernization by Aaronontheweb · Pull Request #7321 · akkadotnet/akka.net

Aaronontheweb · 2024-08-16T19:42:26Z

Changes

Based on some of the racy spec output we're seeing, it looks to me like there are some cases where the BarrierCoordinator might be creating race conditions: tests can inadvertently have multiple nodes crossing different barriers at the same time despite being programmed correctly. Investigating.

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

This change follows the Akka.NET API Compatibility Guidelines.
This change follows the Akka.NET Wire Compatibility Guidelines.
I have reviewed my own pull request.

Aaronontheweb

Wasn't brave enough to enable nullability inside the MNTR, but I thought this was an acceptable set of changes and improvements to help debug some of these tests.

Aaronontheweb · 2024-08-16T20:31:01Z

src/Directory.Build.props

    <HoconVersion>2.0.3</HoconVersion>
    <ConfigurationManagerVersion>6.0.1</ConfigurationManagerVersion>
-    <MultiNodeAdapterVersion>1.5.19</MultiNodeAdapterVersion>
+    <MultiNodeAdapterVersion>1.5.25</MultiNodeAdapterVersion>


Upgrade to latest MNTR version

Aaronontheweb · 2024-08-16T20:32:05Z

src/core/Akka.Remote.TestKit.Tests/BarrierSpec.cs

Tried to migrate as many tests as I could to use async Task and the async TestKit methods, since those tend to run faster and perform less thread-blocking. The rest of the changes in this file are just compilation fixes for the changes I made to the EnterBarrier / FailBarrier messages - which I'll describe in more detail on the files where those types are defined.

Aaronontheweb · 2024-08-16T20:33:17Z

src/core/Akka.Remote.TestKit/BarrierCoordinator.cs

No substantive changes here other than leveraging the additional diagnostic data added to EnterBarrier and FailBarrier - most of this code hasn't been touched since 2014. I just modernized it a bit to clear up some compiler warnings surrounding the use of .GetHashCode() and mutable values.

Aaronontheweb · 2024-08-16T20:33:40Z

src/core/Akka.Remote.TestKit/BarrierCoordinator.cs


        public sealed class Data
        {
-            public Data(IEnumerable<Controller.NodeInfo> clients, string barrier, IEnumerable<IActorRef> arrived, Deadline deadline) :


Unused CTOR - removed it.

Aaronontheweb · 2024-08-16T20:34:19Z

src/core/Akka.Remote.TestKit/BarrierCoordinator.cs

-            public WrongBarrierException(string barrier, IActorRef client, Data barrierData)
-                : base($"[{client}] tried to enter '{barrier}' while we were waiting for '{barrierData.Barrier}'")
+            public WrongBarrierException(string barrier, IActorRef client, RoleName roleName, Data barrierData)
+                : base($"[{client}] [{roleName}] tried to enter '{barrier}' while we were waiting for '{barrierData.Barrier}'")


Expanded the description here of WHICH ROLE tried to enter the wrong barrier - this should make it much easier to debug tests in the future that have issues with nodes crossing barriers at inappropriate times.
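
To illustrate why the role name matters in that message, here is a minimal, self-contained sketch of a barrier check that reports which role arrived at the wrong barrier. `BarrierSketch` and its members are hypothetical names for illustration only; it uses plain strings and `InvalidOperationException` rather than the real `BarrierCoordinator`, `RoleName`, `Data` and `WrongBarrierException` types, and it is not the actual coordinator logic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for illustration; not the Akka.Remote.TestKit BarrierCoordinator.
public sealed class BarrierSketch
{
    private readonly int _expectedNodes;
    private readonly HashSet<string> _arrived = new HashSet<string>();
    private string _current; // barrier currently being waited on, or null if none

    public BarrierSketch(int expectedNodes) => _expectedNodes = expectedNodes;

    // Returns true once every expected node has entered the current barrier.
    public bool Enter(string roleName, string barrier)
    {
        if (_current == null) _current = barrier;

        // Including the role name tells us WHICH node raced ahead (or lagged behind).
        if (_current != barrier)
            throw new InvalidOperationException(
                $"[{roleName}] tried to enter '{barrier}' while we were waiting for '{_current}'");

        _arrived.Add(roleName);
        if (_arrived.Count < _expectedNodes) return false;

        // Everyone arrived: reset so the next barrier can start.
        _current = null;
        _arrived.Clear();
        return true;
    }
}
```

With two expected nodes, `Enter("first", "startup")` followed by `Enter("second", "after-join")` fails with a message naming `second`, which is the information the old message lacked.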

Aaronontheweb · 2024-08-16T20:38:41Z

src/core/Akka.Remote.TestKit/MsgEncoder.cs

+                    {
+                        Name = failBarrier.Name,
+                        Op = Proto.Msg.EnterBarrier.Types.BarrierOp.Fail,
+                        RoleName = failBarrier.Role.Name


Added RoleName encoding to FailBarrier

Aaronontheweb · 2024-08-16T20:39:11Z

src/core/Akka.Remote.TestKit/MultiNodeSpec.cs

        public void EnterBarrier(params string[] name)
        {
-            TestConductor.Enter(RemainingOr(TestConductor.Settings.BarrierTimeout), name.ToImmutableList());
+            TestConductor.Enter(RemainingOr(TestConductor.Settings.BarrierTimeout), Myself, name.ToImmutableList());


Whenever we call EnterBarrier during testing now, we always pass in Myself - which is set to the role of the current node in the test system.

Aaronontheweb · 2024-08-16T20:39:43Z

src/core/Akka.Remote.TestKit/Player.cs

        /// throw an exception in case of timeouts or other errors.
        /// </summary>
-        public void Enter(string name)
+        public void Enter(RoleName roleName, string name)


Had to add RoleName support to the Enter methods for transmitting barrier information to the BarrierCoordinator

Aaronontheweb · 2024-08-16T20:40:37Z

src/core/Akka.TestKit/TestKitBase_AwaitAssert.cs

+                    var stopped = Now + t;
+                    if (stopped >= stop)
+                    {
+                        Sys.Log.Warning("AwaitAssert failed, timeout [{0}] is over after [{1}] attempts and [{2}] elapsed time", max, attempts, stopped - start);


Diagnostic improvements to AwaitAssert - I wanted to understand how many times an assertion actually ran before it failed, so I capture that data and the total elapsed time the test used and pipe it into a Warning log right before we throw the assertion exception.
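
As a rough sketch of the idea (not the TestKit code itself), the retry loop below counts attempts and measures elapsed time, then emits a warning right before rethrowing the last assertion failure. `AwaitAssertSketch`, its parameters and the `warn` callback are hypothetical; the real method logs through `Sys.Log.Warning` and uses the TestKit's own clock.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration; not TestKitBase.AwaitAssert.
public static class AwaitAssertSketch
{
    // Retries `assertion` every `interval` until it passes or `max` would be exceeded.
    // Returns the number of attempts it took to pass.
    public static int AwaitAssert(Action assertion, TimeSpan max, TimeSpan interval, Action<string> warn)
    {
        var stopwatch = Stopwatch.StartNew();
        var attempts = 0;
        while (true)
        {
            attempts++;
            try
            {
                assertion();
                return attempts;
            }
            catch (Exception)
            {
                // If another poll interval would take us past the deadline, give up:
                // report how hard we tried, then rethrow the original assertion failure.
                if (stopwatch.Elapsed + interval >= max)
                {
                    warn($"AwaitAssert failed, timeout [{max}] is over after [{attempts}] attempts and [{stopwatch.Elapsed}] elapsed time");
                    throw;
                }
            }
            Thread.Sleep(interval);
        }
    }
}
```

Seeing "1 attempt" versus "40 attempts" in the log helps distinguish an assertion that never got a fair chance to run from one that was genuinely never satisfied.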

Aaronontheweb · 2024-08-16T20:41:00Z

src/protobuf/TestConductorProtocol.proto

  string name = 1;
  BarrierOp op = 2;
  int64 timeout = 3;
+  string roleName = 4;


Added roleName to the wire format of the MNTR test conductor.

Aaronontheweb · 2024-08-16T20:42:30Z

Even though I made some changes to type signatures, they're all internal, and the wire format changes are irrelevant because they only occur during testing (so everyone gets synced all at once.)

Arkatufus

LGTM

Aaronontheweb added 3 commits August 16, 2024 14:20

working on cleaning up the MNTR

b75add6

added better logging to barrier entry

761c376

more type cleanup

9ec8e10

Aaronontheweb added the multi node spec label Aug 16, 2024

Aaronontheweb added this to the 1.5.28 milestone Aug 16, 2024

Aaronontheweb added 7 commits August 16, 2024 14:42

Merge branch 'dev' into mntr-barrier-cleanup

d4bea9d

fixed issues with encoding RoleName into EnterBarrier and `FailBarrier` messages

2377a57

Merge branch 'mntr-barrier-cleanup' of https://github.com/Aaronontheweb/akka.net into mntr-barrier-cleanup

35bd5e2

added better pretty-printing for EnterBarrier and FailBarrier

fde81c5

cleaning up some mutability warnings

1275c0b

more cleanup

01d57fd

final MNTR fixes

4555e2a

Aaronontheweb changed the title ~~Akka.Remote.TestKit: resolving issues with BarrierCoordinator [WIP]~~ TestKit and Akka.Remote.TestKit: diagnostic improvements and code modernization Aug 16, 2024

Aaronontheweb added the akka-testkit Akka.NET Testkit issues label Aug 16, 2024

Aaronontheweb marked this pull request as ready for review August 16, 2024 20:30

Aaronontheweb commented Aug 16, 2024

Arkatufus approved these changes Aug 16, 2024

Merge branch 'dev' into mntr-barrier-cleanup

4254c4d

Aaronontheweb merged commit 355439e into akkadotnet:dev Aug 19, 2024

Aaronontheweb deleted the mntr-barrier-cleanup branch August 19, 2024 23:53

Arkatufus mentioned this pull request Aug 23, 2024

Update RELEASE_NOTES.md for 1.5.28-beta1 release #7329

Merged

Arkatufus mentioned this pull request Sep 4, 2024

Update RELEASE_NOTES.md for 1.5.28 release #7336

Merged

This was referenced Nov 6, 2025

Bump Akka.Persistence.Query from 1.4.43 to 1.5.55 Arkatufus/Akka.Persistence.Azure#76

Closed

Bump Akka.Persistence.TCK from 1.4.43 to 1.5.55 Arkatufus/Akka.Persistence.Azure#77

Closed
