Skip to content

fix: stabilize flaky tests across analyzer, OTel, and engine suites#5609

Merged
thomhurst merged 2 commits intomainfrom
fix/flaky-test-stability
Apr 17, 2026
Merged

fix: stabilize flaky tests across analyzer, OTel, and engine suites#5609
thomhurst merged 2 commits intomainfrom
fix/flaky-test-stability

Conversation

@thomhurst
Copy link
Copy Markdown
Owner

Summary

  • Analyzer verifier now normalizes line endings for TestCode, FixedCode, and stub sources to prevent CRLF/LF mismatches across platforms.
  • OTLP no-forward test replaces the racy Task.WhenAny timeout pattern with an explicit HasRequest check after a short settle delay.
  • HookExecutorHookTests_ClassLevel marked [NotInParallel] so the shared static hook counter cannot race.
  • KeyedDataSourceTests uses ConcurrentBag and asserts all captured instances share the same reference, removing index-order assumptions.

Test plan

  • CI green on Windows + Linux
  • Analyzer tests pass regardless of source line-ending style
  • OtlpReceiverForwardingTests.Receiver_WithoutUpstream_DoesNotForward passes under load
  • HookExecutorHookTests_ClassLevel stable across repeated runs
  • KeyedDataSourceTests.SameKey_ReturnsSameInstance stable across repeated runs

- Normalize line endings on all analyzer test sources (including stubs) to prevent cross-platform mismatches
- Replace racy WhenAny timeout with explicit HasRequest check in OTLP receiver no-forward test
- Serialize HookExecutorHookTests_ClassLevel via NotInParallel to avoid hook-count races
- Use ConcurrentBag and reference-equality loop for KeyedDataSource shared-instance assertion
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Good set of targeted flakiness fixes. The changes are generally clean and address real root causes. A few observations:


OtlpReceiverForwardingTests — redundant delay

After , the receiver has already finished processing the request. If WhenIdle() provides that guarantee, the forwarding decision has already been made — no forwarding will happen later. The extra await Task.Delay(500) is then dead time:

await client.PostAsync($"http://127.0.0.1:{receiver.Port}/v1/traces", content);
await receiver.WhenIdle();

// If WhenIdle() guarantees processing is complete, this delay is redundant:
await Task.Delay(500);

await Assert.That(upstream.HasRequest("/v1/traces")).IsFalse();

If WhenIdle() does guarantee completion, the delay can be dropped entirely for a faster, more deterministic test. If it doesn't — i.e., forwarding can race beyond the idle point — then that's a bug in WhenIdle()'s contract that should be fixed there rather than papered over with a sleep. A comment explaining why the delay is still needed would at minimum remove ambiguity.


KeyedDataSourceTests — weakened count assertion

// Before (precise):
await Assert.That(AlphaInstances).Count().IsEqualTo(2);

// After (weaker):
await Assert.That(snapshot.Length).IsGreaterThanOrEqualTo(2);

The >= 2 relaxation means a future regression (e.g., 3 or 4 instances created when only 2 are expected) would go undetected. Using ConcurrentBag was the right move for thread-safety, but the count check doesn't need to be loosened — ConcurrentBag.Count is safe to read. Consider keeping IsEqualTo(2) (or the exact expected count) to preserve regression-catching precision.


The rest looks solid

  • Line ending normalization: Making NormalizeLineEndings internal static and calling it at the call site in VerifyCodeFixAsync is the right approach — normalizing early prevents subtle platform-specific comparison failures.
  • HasRequest method: Correctly reuses _stateLock and mirrors the pattern already in WaitForRequestAsync. Could use _requests.Any(r => r.Path == path) inside the lock for conciseness, but the explicit loop is equally correct.
  • [NotInParallel] on HookExecutorHookTests_ClassLevel: Proper fix for shared mutable static state. Using nameof for the constraint key is good practice.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Good set of targeted flakiness fixes. The changes are generally clean and address real root causes. A few observations:


OtlpReceiverForwardingTests — redundant delay

After await receiver.WhenIdle(), the receiver has already finished processing the request. If WhenIdle() provides that guarantee, the forwarding decision has already been made — no forwarding will happen later. The extra await Task.Delay(500) is then dead time:

await client.PostAsync($"http://127.0.0.1:{receiver.Port}/v1/traces", content);
await receiver.WhenIdle();

// If WhenIdle() guarantees processing is complete, this delay is redundant:
await Task.Delay(500);

await Assert.That(upstream.HasRequest("/v1/traces")).IsFalse();

If WhenIdle() does guarantee completion, the delay can be dropped entirely for a faster, more deterministic test. If it doesn't — i.e., forwarding can race beyond the idle point — then that's a bug in WhenIdle()'s contract that should be fixed there rather than papered over with a sleep. A comment explaining why the delay is still needed would at minimum remove ambiguity.


KeyedDataSourceTests — weakened count assertion

// Before (precise):
await Assert.That(AlphaInstances).Count().IsEqualTo(2);

// After (weaker):
await Assert.That(snapshot.Length).IsGreaterThanOrEqualTo(2);

The >= 2 relaxation means a future regression (e.g., 3 or 4 instances created when only 2 are expected) would go undetected. Using ConcurrentBag was the right move for thread-safety, but the count check doesn't need to be loosened — ConcurrentBag.Count is safe to read. Consider keeping IsEqualTo(2) (or the exact expected count) to preserve regression-catching precision.


The rest looks solid

  • Line ending normalization: Making NormalizeLineEndings internal static and calling it in VerifyCodeFixAsync is the right approach — normalizing early prevents subtle platform-specific comparison failures.
  • HasRequest method: Correctly reuses _stateLock and mirrors the pattern already in WaitForRequestAsync. Could use _requests.Any(r => r.Path == path) inside the lock for conciseness, but the explicit loop is equally correct.
  • [NotInParallel] on HookExecutorHookTests_ClassLevel: Proper fix for shared mutable static state. Using nameof for the constraint key is good practice.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Apr 17, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 complexity

Metric Results
Complexity 0

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

- WhenIdle() already awaits in-flight forwarding tasks; Task.Delay(500) was dead time
- Restore IsEqualTo(2) precision on AlphaInstances snapshot — ConcurrentBag preserves exact count
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review (follow-up)

Previous review feedback has been fully addressed in commit e7cc031e1d:

  • Task.Delay(500) removed — test now relies correctly on WhenIdle() to guarantee forwarding decisions are complete before asserting.
  • Count assertion restored to IsEqualTo(2) — preserves regression-catching precision.

Overall assessment

This is a clean set of targeted fixes for four distinct flakiness root causes. The changes are minimal, correct, and avoid papering over races with sleeps or relaxed assertions.

What's solid:

  • Line-ending normalization applied at the call site in rather than buried inside the verifier — the right layer to normalize.
  • ** method** correctly acquires and mirrors the existing thread-safety contract in .
  • **** with for the key is the proper fix for shared mutable static state.
  • ** + snapshot** correctly eliminates the index-race; since both alpha tests inject the same keyed instance, and reference-equality is a valid (and now race-free) check.

One minor observation:

could use inside the lock for conciseness, but the explicit loop is equally correct and readable. This is purely cosmetic.

No blockers. This PR is good to merge.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review (follow-up)

Previous review feedback has been fully addressed in commit e7cc031e1d:

  • Task.Delay(500) removed — test now relies correctly on WhenIdle() to guarantee forwarding decisions are complete before asserting.
  • Count assertion restored to IsEqualTo(2) — preserves regression-catching precision.

Overall assessment

This is a clean set of targeted fixes for four distinct flakiness root causes. The changes are minimal, correct, and avoid papering over races with sleeps or relaxed assertions.

What's solid:

  • Line-ending normalization applied at the call site in VerifyCodeFixAsync rather than buried inside the verifier — the right layer to normalize.
  • HasRequest method correctly acquires _stateLock and mirrors the existing thread-safety contract in WaitForRequestAsync.
  • [NotInParallel] with nameof for the key is the proper fix for shared mutable static state.
  • ConcurrentBag + snapshot correctly eliminates the index-race; since both alpha tests inject the same keyed instance, snapshot[0] and snapshot[1] reference-equality is a valid (and now race-free) check.

One minor observation:

HasRequest could use _requests.Any(r => r.Path == path) inside the lock for conciseness, but the explicit loop is equally correct and readable. Purely cosmetic.

No blockers. This PR is good to merge.

This was referenced Apr 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant