fix: stabilize flaky tests across analyzer, OTel, and engine suites#5609
fix: stabilize flaky tests across analyzer, OTel, and engine suites#5609
Conversation
- Normalize line endings on all analyzer test sources (including stubs) to prevent cross-platform mismatches - Replace racy WhenAny timeout with explicit HasRequest check in OTLP receiver no-forward test - Serialize HookExecutorHookTests_ClassLevel via NotInParallel to avoid hook-count races - Use ConcurrentBag and reference-equality loop for KeyedDataSource shared-instance assertion
There was a problem hiding this comment.
Code Review
Good set of targeted flakiness fixes. The changes are generally clean and address real root causes. A few observations:
OtlpReceiverForwardingTests — redundant delay
After , the receiver has already finished processing the request. If WhenIdle() provides that guarantee, the forwarding decision has already been made — no forwarding will happen later. The extra await Task.Delay(500) is then dead time:
await client.PostAsync($"http://127.0.0.1:{receiver.Port}/v1/traces", content);
await receiver.WhenIdle();
// If WhenIdle() guarantees processing is complete, this delay is redundant:
await Task.Delay(500);
await Assert.That(upstream.HasRequest("/v1/traces")).IsFalse();If WhenIdle() does guarantee completion, the delay can be dropped entirely for a faster, more deterministic test. If it doesn't — i.e., forwarding can race beyond the idle point — then that's a bug in WhenIdle()'s contract that should be fixed there rather than papered over with a sleep. A comment explaining why the delay is still needed would at minimum remove ambiguity.
KeyedDataSourceTests — weakened count assertion
// Before (precise):
await Assert.That(AlphaInstances).Count().IsEqualTo(2);
// After (weaker):
await Assert.That(snapshot.Length).IsGreaterThanOrEqualTo(2);The >= 2 relaxation means a future regression (e.g., 3 or 4 instances created when only 2 are expected) would go undetected. Using ConcurrentBag was the right move for thread-safety, but the count check doesn't need to be loosened — ConcurrentBag.Count is safe to read. Consider keeping IsEqualTo(2) (or the exact expected count) to preserve regression-catching precision.
The rest looks solid
- Line ending normalization: Making
NormalizeLineEndingsinternal staticand calling it at the call site inVerifyCodeFixAsyncis the right approach — normalizing early prevents subtle platform-specific comparison failures. HasRequestmethod: Correctly reuses_stateLockand mirrors the pattern already inWaitForRequestAsync. Could use_requests.Any(r => r.Path == path)inside the lock for conciseness, but the explicit loop is equally correct.[NotInParallel]onHookExecutorHookTests_ClassLevel: Proper fix for shared mutable static state. Usingnameoffor the constraint key is good practice.
There was a problem hiding this comment.
Code Review
Good set of targeted flakiness fixes. The changes are generally clean and address real root causes. A few observations:
OtlpReceiverForwardingTests — redundant delay
After await receiver.WhenIdle(), the receiver has already finished processing the request. If WhenIdle() provides that guarantee, the forwarding decision has already been made — no forwarding will happen later. The extra await Task.Delay(500) is then dead time:
await client.PostAsync($"http://127.0.0.1:{receiver.Port}/v1/traces", content);
await receiver.WhenIdle();
// If WhenIdle() guarantees processing is complete, this delay is redundant:
await Task.Delay(500);
await Assert.That(upstream.HasRequest("/v1/traces")).IsFalse();If WhenIdle() does guarantee completion, the delay can be dropped entirely for a faster, more deterministic test. If it doesn't — i.e., forwarding can race beyond the idle point — then that's a bug in WhenIdle()'s contract that should be fixed there rather than papered over with a sleep. A comment explaining why the delay is still needed would at minimum remove ambiguity.
KeyedDataSourceTests — weakened count assertion
// Before (precise):
await Assert.That(AlphaInstances).Count().IsEqualTo(2);
// After (weaker):
await Assert.That(snapshot.Length).IsGreaterThanOrEqualTo(2);The >= 2 relaxation means a future regression (e.g., 3 or 4 instances created when only 2 are expected) would go undetected. Using ConcurrentBag was the right move for thread-safety, but the count check doesn't need to be loosened — ConcurrentBag.Count is safe to read. Consider keeping IsEqualTo(2) (or the exact expected count) to preserve regression-catching precision.
The rest looks solid
- Line ending normalization: Making
NormalizeLineEndingsinternal staticand calling it inVerifyCodeFixAsyncis the right approach — normalizing early prevents subtle platform-specific comparison failures. HasRequestmethod: Correctly reuses_stateLockand mirrors the pattern already inWaitForRequestAsync. Could use_requests.Any(r => r.Path == path)inside the lock for conciseness, but the explicit loop is equally correct.[NotInParallel]onHookExecutorHookTests_ClassLevel: Proper fix for shared mutable static state. Usingnameoffor the constraint key is good practice.
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
TIP This summary will be updated as you push new changes. Give us feedback
- WhenIdle() already awaits in-flight forwarding tasks; Task.Delay(500) was dead time - Restore IsEqualTo(2) precision on AlphaInstances snapshot — ConcurrentBag preserves exact count
There was a problem hiding this comment.
Code Review (follow-up)
Previous review feedback has been fully addressed in commit e7cc031e1d:
Task.Delay(500)removed — test now relies correctly onWhenIdle()to guarantee forwarding decisions are complete before asserting.- Count assertion restored to
IsEqualTo(2)— preserves regression-catching precision.
Overall assessment
This is a clean set of targeted fixes for four distinct flakiness root causes. The changes are minimal, correct, and avoid papering over races with sleeps or relaxed assertions.
What's solid:
- Line-ending normalization applied at the call site in rather than buried inside the verifier — the right layer to normalize.
- ** method** correctly acquires and mirrors the existing thread-safety contract in .
- **** with for the key is the proper fix for shared mutable static state.
- ** + snapshot** correctly eliminates the index-race; since both alpha tests inject the same keyed instance, and reference-equality is a valid (and now race-free) check.
One minor observation:
could use inside the lock for conciseness, but the explicit loop is equally correct and readable. This is purely cosmetic.
No blockers. This PR is good to merge.
There was a problem hiding this comment.
Code Review (follow-up)
Previous review feedback has been fully addressed in commit e7cc031e1d:
Task.Delay(500)removed — test now relies correctly onWhenIdle()to guarantee forwarding decisions are complete before asserting.- Count assertion restored to
IsEqualTo(2)— preserves regression-catching precision.
Overall assessment
This is a clean set of targeted fixes for four distinct flakiness root causes. The changes are minimal, correct, and avoid papering over races with sleeps or relaxed assertions.
What's solid:
- Line-ending normalization applied at the call site in
VerifyCodeFixAsyncrather than buried inside the verifier — the right layer to normalize. HasRequestmethod correctly acquires_stateLockand mirrors the existing thread-safety contract inWaitForRequestAsync.[NotInParallel]withnameoffor the key is the proper fix for shared mutable static state.ConcurrentBag+ snapshot correctly eliminates the index-race; since both alpha tests inject the same keyed instance,snapshot[0]andsnapshot[1]reference-equality is a valid (and now race-free) check.
One minor observation:
HasRequest could use _requests.Any(r => r.Path == path) inside the lock for conciseness, but the explicit loop is equally correct and readable. Purely cosmetic.
No blockers. This PR is good to merge.
Summary
TestCode,FixedCode, and stub sources to prevent CRLF/LF mismatches across platforms.Task.WhenAnytimeout pattern with an explicitHasRequestcheck after a short settle delay.HookExecutorHookTests_ClassLevelmarked[NotInParallel]so the shared static hook counter cannot race.KeyedDataSourceTestsusesConcurrentBagand asserts all captured instances share the same reference, removing index-order assumptions.Test plan
OtlpReceiverForwardingTests.Receiver_WithoutUpstream_DoesNotForwardpasses under loadHookExecutorHookTests_ClassLevelstable across repeated runsKeyedDataSourceTests.SameKey_ReturnsSameInstancestable across repeated runs