Optimize DistributedPubSub memory allocation#7642
Optimize DistributedPubSub memory allocation#7642Aaronontheweb merged 5 commits intoakkadotnet:devfrom
Conversation
|
Comparing the DPA recording of We can see that Note that this is not a 100% fix, some of the old memory allocation has been shifted to |
| private static readonly Dictionary<string, string> TopicToEncodedMap = new(); | ||
| private static readonly Dictionary<string, string> EncodedToTopicMap = new(); | ||
| private static readonly Dictionary<MakeKeyInfo, string> MakeKeyMap = new(); | ||
| private static readonly Dictionary<string, MakeKeyInfo> MakeKeyReverseMap = new(); |
There was a problem hiding this comment.
This strikes me as ever-so-slightly unsafe, in that if there are multiple ActorSystems running in the same app domain, you could have those systems callers potentially mutating a dictionary while another is trying to read...
There was a problem hiding this comment.
I think you're right, here's an option I thought of:
- We can use
ImmutableDictionary, but then we'll lose the memory allocation improvement - Another one is
ConcurrentDictionary, but then we'll lose the performance improvement
There was a problem hiding this comment.
Though let me think about this a bit, this might actually be safe.
There was a problem hiding this comment.
On second thought, after discussing it with @Aaronontheweb, I'm going to move this as an actor non-static field instead, moving it to the Mediator and/or Topic/TopicLike class.
| /// </summary> | ||
| internal static class Utils | ||
| { | ||
| private record MakeKeyInfo(ActorPath Path, string Topic); |
There was a problem hiding this comment.
Stupid Question, have we checked this vs a private record readonly struct as far as allocations vs performance?
There was a problem hiding this comment.
Not yet, I'll give that a go 👍
|
Here's the final benchmark for this PR: OSVersion: Microsoft Windows NT 6.2.9200.0 dev
Original PR
readonly record struct
non-concurrent non-static cache (final)
|
| /// <param name="path">TBD</param> | ||
| /// <returns>TBD</returns> | ||
| public static string MakeKey(ActorPath path) | ||
| public string MakeKey(ActorPath path, string topic) |
There was a problem hiding this comment.
You could still make this static and do inlining, btw - just refactor these to be extension methods instead.
There was a problem hiding this comment.
I'm trying to avoid extension methods like a plague since it requires the outermost class declaration to be public static, didn't want to pollute the public API with internal API methods.
There was a problem hiding this comment.
Never mind, I see what you're saying, that is a good idea
|
Latest benchmark and DPA report comparison:
dev:final PR: |




This change also results in a slight bump of performance. Note that, although this reduces total memory allocation (Small Object Heap), it actually increases memory usage because it caches objects in memory.
This is the main 2 issues this PR trying to address:

ActorPath.Join()ActorPath.op_Division()Note
Although the total bytes reported in this PR is quite scary (in the Gb and Mb range), note that this number represent worst case accumulated memory allocation of a stress benchmark application where DistributedPubSub is being subjected to a burst of more than 20,000,000 messages in a very short period of time.
These are not memory leak, all of the allocated memories are reclaimed during GC.
Changes
ActorPathString.Join()and immutable operationsChecklist
For significant changes, please ensure that the following have been completed (delete if not relevant):
Latest
devBenchmarksOSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 55
This PR's Benchmarks
OSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 53