
perf(engine): batch per-test overhead cleanups (#5719)#5730

Merged
thomhurst merged 4 commits into main from perf/5719-per-test-cleanups
Apr 24, 2026

Conversation

@thomhurst
Owner

Summary

Batch of the smaller per-test overhead fixes surfaced in #5719. Individually modest (~0.2-1% each), collectively ~1% CPU across typical suites.

Closes #5719

Shipped (3 of 5)

  • Hoist EnsureEventReceiversCached out of the five per-test receiver getters. Replaced the per-getter cache-array probe with an EventReceiversBuilt bool on TestContext; TestCoordinator.ExecuteTestLifecycleAsync rebuilds once after ClassInstance is assigned. Hot-path getters (GetTestStartReceivers/GetTestEndReceivers/GetTestSkippedReceivers) are now plain field reads.
  • Cache ObjectGraphDiscoverer.TraverseInitializerProperties output per-type. The inheritance walk + HashSet<string> dedupe now runs once and is memoized in a ConcurrentDictionary<Type, InitializerPropertyInfo[]?>. ShouldSkipType is memoized too, so Namespace.StartsWith("System") no longer runs per call.
  • Drop Lazy<SemaphoreSlim> _maxParallelismSemaphore on the .NET 8+ path. Parallel.ForEachAsync caps concurrency via ParallelOptions.MaxDegreeOfParallelism; the semaphore is now only built on the netstandard2.0 fallback.
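The third item relies on Parallel.ForEachAsync enforcing its own concurrency cap. A minimal standalone sketch of that pattern (names and workload are hypothetical, not TUnit's actual scheduler):

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Sketch: on .NET 8+, ParallelOptions.MaxDegreeOfParallelism throttles
// the loop natively, so no SemaphoreSlim gate is needed.
var running = 0;
var peak = 0;

await Parallel.ForEachAsync(
    Enumerable.Range(0, 32),
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    async (item, ct) =>
    {
        var now = Interlocked.Increment(ref running);
        // Record the highest observed concurrency (lock-free max).
        int oldPeak;
        do { oldPeak = Volatile.Read(ref peak); }
        while (now > oldPeak &&
               Interlocked.CompareExchange(ref peak, now, oldPeak) != oldPeak);

        await Task.Delay(10, ct);          // stand-in for running one test
        Interlocked.Decrement(ref running);
    });

Console.WriteLine(peak <= 4);              // True: the cap is never exceeded
```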

Skipped (2 of 5)

  • Per-TestContext slot for ObjectTracker counters. Requires a robust "shared vs per-test" signal at Track/Untrack time. The existing _initializedObjects set in EventReceiverOrchestrator only flags seen-before after the second test, so the bulk of per-test objects would still hit the shared dict on first track. Needs a dedicated sharing marker on the data-source side.
  • RestoreExecutionContext version-counter skip. Implementation attempted a thread-static "last restored context + version" guard, but regressed TUnit.TestProject/Bugs/1914/AsyncHookTests (AsyncLocal-flow chain from BeforeTestDiscovery through BeforeTestSession / BeforeAssembly / BeforeClass / BeforeTest). The captured ExecutionContext on a context can carry AsyncLocal values set by hooks on the active async flow between Restore sites; the skip misses those flows when a continuation lands on a thread whose last-Restored context points elsewhere. Held back for a safer design.
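The second skipped item hinges on ExecutionContext snapshotting AsyncLocal state at capture time, which is why a "same context already restored on this thread" guard can silently drop hook-set values. A standalone sketch of that snapshot behaviour (illustrative only, not TUnit code):

```csharp
using System;
using System.Threading;

// A captured ExecutionContext carries the AsyncLocal values that were
// current at Capture() time. Skipping a Restore because the thread has
// "already seen" this context would lose exactly these values.
var hookValue = new AsyncLocal<string?>();

hookValue.Value = "set-by-hook";             // e.g. set inside a BeforeTest hook
var captured = ExecutionContext.Capture()!;  // snapshot includes the value

hookValue.Value = "changed-later";           // the flow moves on

string? seenInsideRestore = null;
ExecutionContext.Run(captured, _ =>
{
    // Restoring the captured context brings the hook-set value back.
    seenInsideRestore = hookValue.Value;
}, null);

Console.WriteLine(seenInsideRestore);        // set-by-hook
Console.WriteLine(hookValue.Value);          // changed-later (outer flow intact)
```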

Test plan

  • dotnet build TUnit.slnx -c Release — 0 errors
  • TUnit.Core.SourceGenerator.Tests — 116 passed, 1 skipped (known)
  • TUnit.TestProject BasicTests filter — 3/3 passed
  • TUnit.TestProject AsyncHookTests filter (the 1914 regression guard) — 8/8 passed
  • TUnit.TestProject DataDrivenTests filter — 15/15 passed

Three of the five batched cleanups from #5719. Individually modest
(~0.2-1% each), collectively ~1% CPU across typical suites.

Shipped:
- Hoist EnsureEventReceiversCached out of the five per-test receiver
  getters; gate with an EventReceiversBuilt bool instead of probing
  each cached-array field. TestCoordinator.ExecuteTestLifecycleAsync
  rebuilds once after ClassInstance is assigned, so the hot-path
  getters are now plain field reads.
- Cache ObjectGraphDiscoverer.TraverseInitializerProperties output
  per-type. The inheritance walk + HashSet<string> dedupe now runs
  once and is memoized; per-test traversal iterates the flattened
  InitializerPropertyInfo[] directly. ShouldSkipType is memoized too,
  so Namespace.StartsWith("System") no longer fires per call.
- Drop the Lazy<SemaphoreSlim> from TestScheduler on the .NET 8+
  path. Parallel.ForEachAsync caps concurrency via ParallelOptions;
  the semaphore is now only built on the netstandard2.0 fallback.

Skipped:
- Per-TestContext slot for ObjectTracker counters: requires a robust
  "shared vs per-test" signal at Track/Untrack time. The existing
  _initializedObjects set in EventReceiverOrchestrator only flags
  seen-before after the second test, so the bulk of per-test objects
  would still hit the shared dict on first track. Held back pending
  a dedicated sharing marker.
- RestoreExecutionContext version-counter skip: a working variant
  regressed the Bugs/1914 AsyncLocal-flow tests. The captured
  ExecutionContext on a context can carry AsyncLocal values set by
  hooks on the active async flow between Restore sites; a thread-
  static "same context + same version" skip misses those flows when
  the thread lands a continuation that diverges from its last
  Restore. Held back for a safer design.
@codacy-production

codacy-production Bot commented Apr 24, 2026

Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity −7

View in Codacy


Contributor

@claude claude Bot left a comment


Code Review: perf(engine): batch per-test overhead cleanups (#5730)

The PR is well-structured and the motivations are sound. The three shipped changes are individually correct. Below are my findings organized from most to least significant.


1. Implicit contract on null-forgiving receiver getters (correctness risk)

What changed: GetTestStartReceivers, GetTestEndReceivers, and GetTestSkippedReceivers had their EnsureEventReceiversCached() guards removed. They now directly return CachedTest*Receivers! with a null-forgiving operator, relying on the caller having previously invoked EnsureEventReceiversCached.

Why this is risky: The null-forgiving ! suppresses the compiler's nullability warning, so a missed call site would surface as a NullReferenceException at runtime rather than a compile-time diagnostic. The contract is documented in XML comments, but comments don't enforce invariants.

The current execution path maintains safety through a chain of implicit dependencies:

  • InvokeTestStartEventReceiversAsync checks _registry.HasTestStartReceivers() first
  • _registry only gets populated via RegisterReceivers → GetEligibleEventObjects → EnsureEventReceiversCached
  • So by the time any GetTest*Receivers call executes in the hot path, the cache is built

This chain holds for current callers, but it's fragile. Any future caller that queries receivers outside the test lifecycle (e.g., a new diagnostic path, a plugin API, a test) without first going through RegisterReceivers would silently get a null and crash.

Suggestion: The GetTestDiscoveryReceivers and GetTestRegisteredReceivers methods still guard themselves — the PR documents this as "out-of-lifecycle paths." A cheap middle ground for the hot-path methods would be a [Conditional("DEBUG")] assertion rather than removing all protection:

public static ITestSkippedEventReceiver[] GetTestSkippedReceivers(this TestContext testContext)
{
    Debug.Assert(testContext.EventReceiversBuilt,
        "GetTestSkippedReceivers called before EnsureEventReceiversCached");
    return testContext.CachedTestSkippedReceivers!;
}

This costs nothing in release builds but catches future violations immediately in debug/test runs.


2. GetFlattenedInitializerProperties uses a non-atomic TryGetValue + index-assign pattern

What changed: The method uses TryGetValue then dict[type] = result rather than GetOrAdd.

Why this matters: Under concurrent access, two threads can both miss TryGetValue, both compute the (identical) result, and both write to the dictionary. This is functionally correct (the computation is deterministic and idempotent), but:

  1. Inconsistency: ShouldSkipType uses GetOrAdd. Having two different patterns for the same "memoize a stable, deterministic result" use case is surprising for future readers.
  2. ConcurrentDictionary supports nullable values in GetOrAdd: The nullable reference type InitializerPropertyInfo[]? works fine. The current pattern may have been chosen to avoid the "factory called multiple times" behavior of GetOrAdd, but for an inexpensive factory (array allocation + dictionary lookup) this isn't a concern.

Suggestion: Either document why the direct-assign pattern is chosen over GetOrAdd, or switch to GetOrAdd for consistency with ShouldSkipType. The current code is not wrong, just inconsistent.
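The two memoization patterns under discussion can be put side by side in a self-contained sketch (types and the factory are hypothetical stand-ins for the real per-type reflection walk):

```csharp
using System;
using System.Collections.Concurrent;

var cache = new ConcurrentDictionary<Type, string[]?>();

// Stand-in for the real computation. null means "no source-gen data",
// an empty array means "computed, nothing found" — distinct sentinels.
static string[]? Compute(Type t)
    => t == typeof(object) ? null : Array.Empty<string>();

// Pattern A: TryGetValue + index-assign (the PR's original form).
// Racing threads may each compute and write; last writer wins, and
// since results are identical and immutable this is still correct.
static string[]? GetViaTryGet(ConcurrentDictionary<Type, string[]?> cache, Type type)
{
    if (cache.TryGetValue(type, out var cached))
        return cached;
    var computed = Compute(type);
    cache[type] = computed;
    return computed;
}

// Pattern B: GetOrAdd with a static factory (the review's suggestion).
// More self-documenting; the factory may still run more than once under
// a race, with one result discarded.
static string[]? GetViaGetOrAdd(ConcurrentDictionary<Type, string[]?> cache, Type type)
    => cache.GetOrAdd(type, static t => Compute(t));

Console.WriteLine(GetViaGetOrAdd(cache, typeof(int))!.Length);   // 0
Console.WriteLine(GetViaTryGet(cache, typeof(object)) is null);  // True
```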


3. Correctness on skip path before EnsureEventReceiversCached (confirmed safe, worth noting)

There's an early-exit at the top of ExecuteTestLifecycleAsync (checking SkipReason before CreateInstanceAsync) that calls InvokeTestSkippedEventReceiversAsync → GetTestSkippedReceivers — before the in-lifecycle EnsureEventReceiversCached. After tracing through the flow: this is safe because RegisterReceivers is always called earlier in ExecuteTestInternalAsync, and RegisterReceivers → GetEligibleEventObjects → EnsureEventReceiversCached. By the time we reach ExecuteTestLifecycleAsync, CachedTestSkippedReceivers is always non-null (at minimum an empty array).

Worth calling out in a comment near the ExecuteTestLifecycleAsync entry point, since the safety depends on the ordering of RegisterReceivers → ExecuteTestLifecycleAsync remaining intact.


4. Parallel.ForEachAsync — ConfigureAwait removal and async → Task conversion (correct)

The conversion from async Task + await ...ConfigureAwait(false) to Task + return ... eliminates one async state machine allocation per ExecuteTestsAsync call on .NET 8+. This is correct: all callers already add .ConfigureAwait(false) at their own await site, and Parallel.ForEachAsync returns a Task directly. No behavioral change.
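The conversion described above can be sketched in isolation (the method name and trivial body are hypothetical; only the before/after shape matters):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Before: an async wrapper — one state machine allocation per call.
static async Task ExecuteBeforeAsync(IEnumerable<int> items) =>
    await Parallel.ForEachAsync(items, (i, ct) => ValueTask.CompletedTask)
                  .ConfigureAwait(false);

// After: forward the Task unchanged — no state machine allocated here.
// ConfigureAwait(false) in the wrapper was redundant anyway: it only
// affected the wrapper's own continuation, and callers already apply it
// at their own await sites. Elision is safe only when nothing runs after
// the awaited call (no try/finally, no using wrapping the await).
static Task ExecuteAfterAsync(IEnumerable<int> items) =>
    Parallel.ForEachAsync(items, (i, ct) => ValueTask.CompletedTask);

await ExecuteBeforeAsync(Enumerable.Range(0, 4));
await ExecuteAfterAsync(Enumerable.Range(0, 4));
Console.WriteLine("completed");
```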


5. #if !NET8_0_OR_GREATER vs #if NET inconsistency (minor)

The scheduler uses #if !NET8_0_OR_GREATER / #if NET8_0_OR_GREATER while TestContextExtensions uses #if NET. Both are semantically correct for the Engine assembly, but mixing the two conventions adds cognitive overhead. Worth standardizing at some point.


What works well

  • The EventReceiversBuilt bool replaces a #if NET / #else branch in the fast-path check with a single TFM-agnostic condition — a clean simplification.
  • Dropping Lazy<SemaphoreSlim> on .NET 8+ is the right call — Parallel.ForEachAsync makes the semaphore purely overhead.
  • The FlattenedInitializerPropertiesCache design correctly handles the null (no source-gen) vs empty-array (source-gen with zero properties) distinction, and ClearCache() is properly updated.
  • The StringComparison.Ordinal addition to ShouldSkipType is correct hardening alongside the memoization.
  • The PR description accurately characterizes what was skipped and why — the "RestoreExecutionContext version-counter skip" analysis in particular shows the right instinct to not ship a half-baked optimization.

Summary: The three shipped changes achieve their stated goal and are functionally correct. The main concern worth addressing before merge is adding Debug.Assert guards to the hot-path receiver getters to protect the implicit invariant that EnsureEventReceiversCached has been called. Additionally, either documenting or harmonizing the TryGetValue + []= pattern in GetFlattenedInitializerProperties would improve future maintainability.

- Add Debug.Assert guards on the three hot-path receiver getters
  (GetTestStartReceivers, GetTestEndReceivers, GetTestSkippedReceivers)
  to catch invariant violations in Debug/test builds without cost in Release.
- Unify dictionary access in ObjectGraphDiscoverer: both ShouldSkipType and
  GetFlattenedInitializerProperties now use GetOrAdd with a static factory.
- Document the RegisterReceivers -> EnsureEventReceiversCached ordering
  dependency that makes the early skip-path safe on ExecuteTestLifecycleAsync.
@thomhurst
Owner Author

Addressed review feedback in 05f1046:

  1. Added Debug.Assert(testContext.EventReceiversBuilt, ...) to the three hot-path getters (GetTestStartReceivers, GetTestEndReceivers, GetTestSkippedReceivers). Catches invariant violations in Debug/tests at zero Release cost.
  2. Unified dictionary access in ObjectGraphDiscoverer: GetFlattenedInitializerProperties now uses GetOrAdd with a static factory, matching the existing ShouldSkipType pattern.
  3. Added a <remarks> block on ExecuteTestLifecycleAsync (plus a pointer comment by the skip branch) documenting the RegisterReceivers -> EnsureEventReceiversCached ordering invariant that keeps the early skip path safe.

Verified locally: dotnet build TUnit.slnx -c Release clean, source-gen snapshot tests pass (116/116), and BasicTests smoke run passes (3/3).

Contributor

@claude claude Bot left a comment


Code Review: perf(engine): batch per-test overhead cleanups

This is a follow-up review covering the updated state of the PR after the author addressed the previous round of feedback in commit `05f1046c7`.


Previous review feedback: all addressed ✓

All three concerns raised in the earlier round have been fully resolved:

  1. Debug.Assert guards on hot-path getters — Added to all three of GetTestStartReceivers, GetTestEndReceivers, and GetTestSkippedReceivers with descriptive messages. Zero release cost, correct diagnostic signal in debug/test runs.

  2. GetOrAdd consistency in ObjectGraphDiscoverer — GetFlattenedInitializerProperties now uses GetOrAdd with a static factory delegate, matching the pattern of ShouldSkipType. The TryGetValue + direct-assign pattern from the original PR is gone.

  3. ExecuteTestLifecycleAsync ordering invariant — A <remarks> block documents the RegisterReceivers → EnsureEventReceiversCached prerequisite, and an inline comment at the skip-path branch makes the reasoning locally visible.


Remaining observations

1. GetOrAdd factory may run multiple times — correct behaviour, worth noting

ConcurrentDictionary<Type, InitializerPropertyInfo[]?> stores a nullable reference. The GetOrAdd(key, factory) overload does not guarantee single invocation under concurrent access; two racing threads may both execute the factory and produce two results, with one being discarded. This is fine here because the factory is deterministic and the result is effectively immutable — warmup cost is bounded and quickly amortised. No action needed.

2. TestContext.EventReceiversBuilt reset path — correct but slightly redundant

InvalidateEventReceiverCaches() resets both EventReceiversBuilt = false and CachedClassInstance = null. The fast-path check in EnsureEventReceiversCached combines both conditions, so they're both evaluated — but EventReceiversBuilt is the authoritative sentinel. The two-field check is redundant but harmless and was present before this PR. Not worth changing now, just noting it for awareness.

3. #if NET8_0_OR_GREATER vs #if NET inconsistency — pre-existing, still unresolved

TestContextExtensions.cs uses #if NET while TestScheduler.cs uses #if NET8_0_OR_GREATER / #if !NET8_0_OR_GREATER (lines ~30, ~62, ~71). Both are semantically equivalent in the Engine assembly (targets net8.0;net9.0;net10.0;netstandard2.0), but the inconsistency adds cognitive load for readers trying to understand which directive convention to follow.

This was flagged in the first review and was not addressed. Not a merge blocker, but a follow-up cleanup commit would be appreciated.


Summary

The PR is in good shape. All material feedback from the first review has been addressed cleanly. The EventReceiversBuilt bool, the flattened-property cache, and the semaphore removal collectively achieve the stated performance goal with minimal added complexity.

The only outstanding item is the #if NET vs #if NET8_0_OR_GREATER inconsistency in TestScheduler.cs — a pre-existing minor issue that could use a follow-on cleanup.

Ready to merge.

Item 1 from #5730 was over-engineered; this simplifies it:

- Kept the EventReceiversBuilt bool on TestContext so the fast-path
  gate is a single TFM-agnostic field check (unifies the NET / non-NET
  #if branches that previously poked different cache-array fields).
- Dropped the hoisted EnsureEventReceiversCached call in
  TestCoordinator.ExecuteTestLifecycleAsync and the Debug.Assert
  ordering-guards on GetTestStartReceivers / GetTestEndReceivers /
  GetTestSkippedReceivers — the 4x fast-path savings (<1ms on a 10K
  suite) did not justify the invariant-maintenance burden.
- Added [MethodImpl(AggressiveInlining)] to EnsureEventReceiversCached
  so the cache-hit check collapses to a field load + branch at every
  call site, recovering equivalent perf with none of the baggage.
- Reverted EnsureEventReceiversCached back to private static; removed
  the load-bearing <remarks>, skip-path comment, and stale using.

Items #2 (ObjectGraphDiscoverer GetOrAdd caching) and #3 (scheduler
semaphore removal) from #5730 are unchanged.
@thomhurst
Owner Author

Over-engineered Item 1 simplified — kept EventReceiversBuilt for TFM unification, dropped the hoist + Debug.Asserts, added [MethodImpl(AggressiveInlining)] to the EnsureEventReceiversCached fast path. Same perf, none of the invariant-maintenance cost. Items #2 and #3 unchanged.

Contributor

@claude claude Bot left a comment


Code Review: third-commit update (perf(engine): simplify event-receiver cache handling)

This is a third-round review covering commit `51507250`, which postdates the second review's "Ready to merge" note.


What changed in commit 3

The author stepped back from the "hoist + Debug.Assert" approach from commit 2 and chose a simpler design:

  • EnsureEventReceiversCached reverted to private static (called inside every getter, not hoisted to the lifecycle coordinator).
  • [MethodImpl(AggressiveInlining)] added to recover the fast-path perf without the invariant-maintenance burden.
  • Debug.Assert guards on the three hot-path getters removed.
  • The ordering <remarks> block on ExecuteTestLifecycleAsync removed.

The motivation is sound: the invariant chain was fragile documentation, and inlining achieves equivalent perf by collapsing the call at each site. This is a better long-term design than load-bearing comments.


One actionable concern: AggressiveInlining on a 100-line method will not inline

The intent behind [MethodImpl(MethodImplOptions.AggressiveInlining)] is that the JIT collapses the fast-path check (EventReceiversBuilt + ReferenceEquals) into a field-load + branch at each call site, eliminating the call overhead in the 99%+ cache-hit case.

However, EnsureEventReceiversCached spans roughly 100 lines of code — two #if NET conditional blocks, a foreach loop over eligible objects, five ??=-initialized lists, and multiple SortAndFilter calls. Even with AggressiveInlining, the RyuJIT inliner has a hard size limit (tunable via env var, but the default caps out well under this method's IL footprint). The hint will almost certainly be silently ignored.

Verification: Add [MethodImpl(MethodImplOptions.NoInlining)] temporarily to force a call, then benchmark against the current code. If throughput is identical, the inlining wasn't happening.

Better approach — fast/slow split: The standard pattern for hot-path guarding is to inline only the check, and outline the slow path:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static void EnsureEventReceiversCached(TestContext testContext)
{
    var currentClassInstance = testContext.Metadata.TestDetails.ClassInstance;
    if (!testContext.EventReceiversBuilt ||
        !ReferenceEquals(testContext.CachedClassInstance, currentClassInstance))
    {
        BuildEventReceiverCaches(testContext, currentClassInstance);
    }
}

[MethodImpl(MethodImplOptions.NoInlining)]
private static void BuildEventReceiverCaches(TestContext testContext, object? currentClassInstance)
{
    // The existing cache-population and invalidation logic, verbatim.
}

Why this is better: The fast path (two field loads + two branches + return) is ~10 bytes of IL and will reliably inline. The slow path is explicitly non-inlined — NoInlining keeps it as a single copy rather than duplicated at N call sites. This is the canonical pattern in System.Collections and the ASP.NET Core pipeline for exactly this situation.

Without the split, the AggressiveInlining attribute is aspirational rather than effective, and the rationale for removing the Debug.Assert guards ("recovered equivalent perf with none of the baggage") loses its foundation.


Pre-existing inconsistency: #if NET vs #if NET8_0_OR_GREATER

Flagged in both prior reviews. TestContextExtensions.cs uses #if NET while TestScheduler.cs uses #if NET8_0_OR_GREATER / #if !NET8_0_OR_GREATER. Both are semantically equivalent in the Engine assembly (targets net8.0;net9.0;net10.0;netstandard2.0), but the inconsistency adds cognitive load. Not a blocker; a follow-on cleanup commit would be appreciated.


Everything else

The ObjectGraphDiscoverer caching (GetOrAdd + static factory for both ShouldSkipTypeCache and FlattenedInitializerPropertiesCache) and the TestScheduler semaphore removal remain correct and unchanged from the second review. The EventReceiversBuilt bool design is clean.


Summary

The third commit's design decision is sound in principle. The one concrete issue is that AggressiveInlining on a ~100-line method is unlikely to achieve the intended JIT behaviour. A fast/slow split would make the performance claim hold and preserve the simpler invariant model. Recommend addressing before merge.

…tlined builder (#5719)

RyuJIT's inliner has a per-method IL size budget (typically ~100 bytes
after accounting for CORINFO flags and other penalties). The previous
EnsureEventReceiversCached was ~100+ bytes of IL once the single-pass
categorization, invalidation check, and 5x SortAndFilter calls were
included, so [MethodImpl(AggressiveInlining)] was silently ignored —
every call site kept paying full call overhead even on the 99%+
cache-hit fast path.

Refactor into the canonical fast/slow split:

- EnsureEventReceiversCached: tiny wrapper (~10 bytes IL) that reads
  the current class instance, checks the cache-validity flag + a
  ReferenceEquals, returns on hit, otherwise tail-calls the builder.
  AggressiveInlining now reliably fires, so every call site collapses
  to a field-load + branch + return in the hot case.

- BuildEventReceiverCaches: explicitly NoInlining so the JIT keeps
  one outlined copy of the ~100 bytes of build logic rather than
  duplicating it into every caller. currentClassInstance is threaded
  through from the gate to avoid a redundant field re-read.

Also unify #if NET8_0_OR_GREATER -> #if NET in TestScheduler.cs and
TestContextExtensions.cs (both already touched on this branch) to
match the majority convention across TUnit.Engine (87 uses of
#if NET vs 42 of #if NET8_0_OR_GREATER). Both are equivalent for
the netstandard2.0;net8.0;net9.0;net10.0 target set.

No behavior change: invalidation logic, eligible-objects walk,
all 5 list builds, all 5 SortAndFilter calls, and the
CachedClassInstance/EventReceiversBuilt commit order are preserved.
@thomhurst
Owner Author

Addressed the size-budget concern on EnsureEventReceiversCached.

Refactored into the canonical fast/slow split:

  • EnsureEventReceiversCached is now a ~10-byte IL wrapper: load ClassInstance, check EventReceiversBuilt && ReferenceEquals(CachedClassInstance, currentClassInstance), return on hit, otherwise tail-call the builder. Well under RyuJIT's inliner size budget, so [MethodImpl(AggressiveInlining)] now reliably fires and every call site collapses to a field-load + branch + return in the 99%+ hot case.
  • BuildEventReceiverCaches holds the ~100 bytes of build logic (invalidation, eligible-objects walk, 5 list builds, 5 SortAndFilter calls, CachedClassInstance + EventReceiversBuilt commit). Marked [MethodImpl(NoInlining)] so the JIT keeps a single outlined copy instead of duplicating it into every caller. currentClassInstance is threaded through from the gate to avoid a redundant field re-read.

No behavioral change — all invalidation semantics and cache write order are preserved.

Also piggybacked the #if NET8_0_OR_GREATER -> #if NET unification across the two files already in this PR (TestScheduler.cs, TestContextExtensions.cs). #if NET is the majority convention across TUnit.Engine (87 occurrences vs 42), and both forms are equivalent for the netstandard2.0;net8.0;net9.0;net10.0 target set.

Verification: dotnet build TUnit.slnx -c Release (0 errors), source-generator snapshot tests pass (116/116), BasicTests smoke pass (3/3).

Contributor

@claude claude Bot left a comment


Code Review: perf(engine): batch per-test overhead cleanups

Overall this is a solid, well-reasoned performance PR. The changes are conservative, correctly scoped to hot paths, and demonstrate good judgment in deferring optimizations that had correctness risks.


ObjectGraphDiscoverer.cs — Memoized type caches

The addition of ShouldSkipTypeCache and FlattenedInitializerPropertiesCache is correct. A few observations:

Good: Using StringComparison.Ordinal in the namespace check is both a correctness improvement and a micro-perf win. The nullable InitializerPropertyInfo[]? trick to distinguish "no source-gen registration" from "empty result" is a clean sentinel that avoids a second lookup or separate flag.

Suggestion: FlattenedInitializerPropertiesCache is unbounded. For most usages this is fine (type cardinality is bounded by the assembly), but if TUnit is ever used in a plugin-loading or dynamic assembly scenario, this could grow unexpectedly. A low-priority future improvement would be to log or cap cache size, or at minimum document this assumption.

Cache invalidation: ClearCache() clears both new dictionaries — ✓ correct.


TestContext.cs — EventReceiversBuilt bool gate

Clean approach. A single sentinel bool replacing a more expensive null-inspection is the right call for a per-test hot path. The InvalidateEventReceiverCaches() reset ensures retry/class-instance-change scenarios work correctly.

No issues here.


TestContextExtensions.cs — Aggressive inline / NoInline split

This is the most architecturally interesting change. The pattern of:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static void EnsureEventReceiversCached(...) {
    if (testContext.EventReceiversBuilt && ...) return;
    BuildEventReceiverCaches(...);
}

[MethodImpl(MethodImplOptions.NoInlining)]
static void BuildEventReceiverCaches(...) { ... }

...is a well-established JIT tuning technique (sometimes called "outlined slow path"). It ensures the 99% fast-path collapses to a field-load + branch at call sites, while the cold build path doesn't bloat the caller's method body.

Good: Passing currentClassInstance through to BuildEventReceiverCaches to avoid re-reading from testContext.Metadata.TestDetails is a smart touch — avoids a redundant null-conditional chain in the already-unlikely build path.

Question / potential issue: The EventReceiversBuilt flag is set to true inside BuildEventReceiverCaches. Is there a race condition if two threads somehow call EnsureEventReceiversCached on the same TestContext concurrently before the flag is set? If TestContext is per-test and single-threaded during lifecycle setup, this is fine — but it's worth confirming that assumption is enforced (or at least documented). If it's already guaranteed by the scheduler, a brief comment would clarify intent.


TestScheduler.cs — Semaphore removed on modern .NET

Good call. Parallel.ForEachAsync with MaxDegreeOfParallelism handles the concurrency cap natively on .NET 6+. Removing the Lazy<SemaphoreSlim> eliminates dead code and frees a small allocation per-scheduler instance.

Minor: The switch from #if NET8_0_OR_GREATER to #if NET is intentional since TUnit targets net8.0;net9.0;net10.0, and #if NET covers all of these. Worth confirming there's no netstandard2.0 target in this specific project that would accidentally activate the semaphore path.


Deferred optimizations (#2 and #4)

The decision to defer both is the right call:

  • #2 (ObjectTracker slot): The "first test hits shared dict" correctness gap is real. Holding back for a cleaner design is correct.
  • #4 (RestoreExecutionContext skip): The AsyncHookTests regression is a genuine correctness signal — ExecutionContext can carry AsyncLocal values from hooks, so skipping restore in that path is unsafe. Good catch; correctness > micro-perf.

Summary

The changes are well-targeted, correctly handle cache invalidation, and include regression guards for the scenarios that were considered risky. The one item worth a follow-up comment is the thread-safety assumption on EventReceiversBuilt — if it's already guaranteed, a comment would make the code easier to maintain safely.

Approved with minor comments.

This was referenced Apr 27, 2026
intellitect-bot pushed a commit to IntelliTect/EssentialCSharp.Web that referenced this pull request Apr 27, 2026
Updated [TUnit](https://github.com/thomhurst/TUnit) from 1.39.0 to
1.40.5.

<details>
<summary>Release notes</summary>

_Sourced from [TUnit's
releases](https://github.com/thomhurst/TUnit/releases)._

## 1.40.5

<!-- Release notes generated using configuration in .github/release.yml
at v1.40.5 -->

## What's Changed
### Other Changes
* Fix reflection property injection reuse by @​thomhurst in
thomhurst/TUnit#5763
* fix(assertions): gate IsEqualTo<TValue, TOther> overload to net9+
(#​5765) by @​thomhurst in thomhurst/TUnit#5767
### Dependencies
* chore(deps): update tunit to 1.40.0 by @​thomhurst in
thomhurst/TUnit#5762


**Full Changelog**:
thomhurst/TUnit@v1.40.0...v1.40.5

## 1.40.0

<!-- Release notes generated using configuration in .github/release.yml
at v1.40.0 -->

## What's Changed
### Other Changes
* perf(engine): collapse async forwarding wrappers in test execution
(#​5714) by @​thomhurst in thomhurst/TUnit#5725
* perf(engine): skip Console.Out/Err FlushAsync when no output captured
(#​5712) by @​thomhurst in thomhurst/TUnit#5724
* perf(engine): collapse async state machines on hook cache-hit /
empty-hook path (#​5713) by @​thomhurst in
thomhurst/TUnit#5726
* perf: eliminate per-test closure + GetOrAdd factory alloc (#​5710) by
@​thomhurst in thomhurst/TUnit#5727
* perf(engine): replace global lock in EventReceiverRegistry with
lock-free CAS by @​thomhurst in
thomhurst/TUnit#5731
* perf(engine): batch per-test overhead cleanups (#​5719) by @​thomhurst
in thomhurst/TUnit#5730
* #​5733 handling all arguments for Fact and Theory by @​inyutin-maxim
in thomhurst/TUnit#5734
* fix(assertions): prefer string overload of Member() over
IEnumerable<char> (#​5702) by @​thomhurst in
thomhurst/TUnit#5721
* fix(migration): preserve comments/XML docs when removing sole
attributes (#​5698) by @​thomhurst in
thomhurst/TUnit#5739
* perf(build): trim test TFMs and skip viewer dump by default by
@​thomhurst in thomhurst/TUnit#5741
* fix(pipeline): skip TestBaseModule frameworks with missing binaries by
@​thomhurst in thomhurst/TUnit#5752
* feat(assertions): focused diff messages for IsEqualTo/IsEquivalentTo
(#​5732) by @​thomhurst in thomhurst/TUnit#5747
* fix(analyzers): remove incorrect AOT rules TUnit0300/0301/0302
(#​5722) by @​thomhurst in thomhurst/TUnit#5746
* perf(engine): lazy hook metadata registration (#​5448) by @​thomhurst
in thomhurst/TUnit#5750
* chore(templates): unify TUnit version pinning to 1.* (#​5709) by
@​thomhurst in thomhurst/TUnit#5743
* fix(templates): floating TUnit.Aspire version (#​5708) by @​thomhurst
in thomhurst/TUnit#5742
* fix(assertions): preserve specialised source in .Count(itemAssertion)
(#​5707) by @​thomhurst in thomhurst/TUnit#5749
* feat(assertions): IsEqualTo with implicitly-convertible wrappers
(#​5720) by @​thomhurst in thomhurst/TUnit#5751
* feat(aspire): add ability to manually remove resources by @​Odonno in
thomhurst/TUnit#5586
* fix(fscheck): register default CancellationToken arbitrary that
surfaces TestContext token by @​JohnVerheij in
thomhurst/TUnit#5758
* fix(engine): allow keyed NotInParallel tests to run alongside
unconstrained tests (#​5700) by @​thomhurst in
thomhurst/TUnit#5740
* perf: skip TimeoutHelper wrap when no explicit [Timeout] is set
(#​5711) by @​thomhurst in thomhurst/TUnit#5728
### Dependencies
* chore(deps): update tunit to 1.39.0 by @​thomhurst in
thomhurst/TUnit#5701
* chore(deps): update aspire to 13.2.4 by @​thomhurst in
thomhurst/TUnit#5735
* chore(deps): bump postcss from 8.5.6 to 8.5.10 in /docs by
@​dependabot[bot] in thomhurst/TUnit#5736
* chore(deps): update dependency fscheck to 3.3.3 by @​thomhurst in
thomhurst/TUnit#5760

## New Contributors
* @​inyutin-maxim made their first contribution in
thomhurst/TUnit#5734
* @​Odonno made their first contribution in
thomhurst/TUnit#5586

**Full Changelog**:
thomhurst/TUnit@v1.39.0...v1.40.0

Commits viewable in [compare
view](thomhurst/TUnit@v1.39.0...v1.40.5).
</details>

[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=TUnit&package-manager=nuget&previous-version=1.39.0&new-version=1.40.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Development

Successfully merging this pull request may close these issues.

perf: small per-test overhead cleanups (batch)
