Deep clone TaskRegistry and Toolset#8973
Conversation
rokonec
left a comment
There was a problem hiding this comment.
Good job.
This would be better in another PR, but I believe that we shall alter our tests so it would catch common failure mode when people add field/property and forget to Clone it.
We have unit tests comparer but those will most probably be forgotten to update as well.
I had success in pasty with AutoFixture for seeding data and Fluent Assertion equivalency tests
| ProjectInstance first = GetSampleProjectInstance(); | ||
| ProjectInstance second = first.DeepCopy(); | ||
|
|
||
| // Task registry object should be immutable |
There was a problem hiding this comment.
Was this assertion always wrong? Have we added mutability where we shouldn't have, or just "in a way that requires we change this"?
There was a problem hiding this comment.
Unfortunately this is older then GH history so I can only provide guessing - and my guess is the ProjectInstance and owned object graph (including TaskRegistry) was always meant to be instantiated once and then used only sequantially. So immutability and deepclonning was never strongly followed - as it was mostly ok in majority of cases (a specific set of VS actions and conditions needs to be made - e.g. in-proc build or cancel and rebuild - to hit the unwanted sharing of TaskRegistry/Toolset).
TaskRegistry looks almost immutable, with exception of the internal matching caches, lazy initialized state of held TaskFactoryWrapper instances (the reason for the 3rd state corruption perf issue), plus something else (I forgot now) - so the path of DeepCopying turned out to be easier and safer than attempting to make TaskRegistry and Toolset truly immutable
There was a problem hiding this comment.
tl;dr; the added mutability shouldn't be added without adding deep cloning IMHO
rainersigwald
left a comment
There was a problem hiding this comment.
Looking at this more, this feels wrong to me. I expect the Toolset and TaskRegistry to be essentially singletons (since we no longer respect a file's toolset specification), and now there will be many of them, causing us to have a lower cache hit rate on lookups.
Do we understand yet what's causing the multithreading in task execution? Or is it one thread executing tasks and one thread evaluating projects? Or what?
| public static PropertyDictionary<ProjectPropertyInstance>? DeepClone( | ||
| this PropertyDictionary<ProjectPropertyInstance>? properties) | ||
| => properties == null ? null : new(properties.Select<ProjectPropertyInstance, ProjectPropertyInstance>(p => p.DeepClone())); |
There was a problem hiding this comment.
The extra copies seem ok based on your results so far, but we could also consider using our ImmutableDictionary, our wrapper CopyOnWriteDictionary, or a similar type to get copy-on-write behavior and minimize allocations (since it seems modifications are rare).
| } | ||
|
|
||
| internal SubToolset DeepClone() | ||
| => new SubToolset(_subToolsetVersion, _properties?.DeepClone()); |
There was a problem hiding this comment.
What causes mutation of _properties? It feels like it should be static for a subtoolset, and indeed that subtoolset instances should be single-instanced rather than cloned themselves.
There was a problem hiding this comment.
👍
The code could (and likely should) verify the IsImmutable member of all _properties and skip clonning if they are all true.
Without that unfortunately the code gives no guarantees.
Or possibly the SubToolset itself should be created immutable (with properties flipped to immutable) - as currently this is not a case
Copying ProjectInstances is exposed publicly (with coment mentioning it's deep copy) - so there might be scenarios we don't know about. Overlaping build and evaluation seems to be one of them (as copy ctor is invoked between eval and build). The mutability concern and need for deep copy was voiced in the original code as well: https://github.com/dotnet/msbuild/pull/8973/files#diff-40f77147ccc6867c10324ec4ddf8110ede2e43499d939b73ef83c7f5febb7a75L564 However, let's make sure we have better understanding and better consensus on solution - so keeping this PR open for now. |
|
Upon detailed investigation with @rainersigwald and @rokonec we'll fix this differently: TaskRegistry is expected to be shareable and logicaly immutable, however it has internal caches, that can be altered on access. so TaskRegistry internal caches will be made thread safe, the actual task registrations are not expected to be changed after initialization - so safeguard code will be added to catch unintended state changes |
|
Superseded by #9032 |
Fixes ADO#1801351 and ADO#1801341 and possibly ADO#1824802
Credit for rootcausing: @rokonec
Context
TaskRegistry(and asToolsetas well) were not properly clonned during clonningProjectInstancedata - that could cause unintended sharing ofTaskRegistrydata between multiple independent evaluation/build threads - allowing concurrent access toTaskRegistryand its objects, while such concurency is not supported.Changes Made
Made
TaskRegistryandToolsetclasses DeepClonable and properly cloning them duringProjectInstanceclonningTesting - Perf
No observable impact.
msbuild.exe /nodeReuse:false /bl