Skip to content

Add reusable named-events timeline correlation query API#229

Merged
PrzemyslawKlys merged 3 commits intomasterfrom
feature/ix-timeline-correlation-core
Feb 20, 2026
Merged

Add reusable named-events timeline correlation query API#229
PrzemyslawKlys merged 3 commits intomasterfrom
feature/ix-timeline-correlation-core

Conversation

@PrzemyslawKlys
Copy link
Member

Summary

  • add new EventViewerX.Reports.Correlation contracts for reusable named-events timeline/correlation queries
  • add NamedEventsTimelineQueryExecutor.TryBuildAsync(...) to build ordered timeline rows, correlation groups, and time buckets without hardcoded report templates
  • support explicit correlation dimensions (who, object_affected, computer, action, etc.), optional uncorrelated filtering, group caps, bucket sizing, and payload projection
  • add focused unit tests for request validation and error contracts

Validation

  • dotnet build .\\Sources\\EventViewerX\\EventViewerX.csproj -c Release
  • dotnet test .\\Sources\\EventViewerX.Tests\\EventViewerX.Tests.csproj -c Release --filter "FullyQualifiedName~TestNamedEventsTimelineQueryExecutor"

@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@claude
Copy link

claude bot commented Feb 20, 2026

Claude finished @PrzemyslawKlys's task —— View job


Code Review Complete

Reviewing PR #229 for:

  • Read repository guidelines (CLAUDE.md) - No repo guidelines found
  • Analyze contract definitions (NamedEventsTimelineQueryContracts.cs)
  • Review executor implementation (NamedEventsTimelineQueryExecutor.cs)
  • Examine test coverage (TestNamedEventsTimelineQueryExecutor.cs)
  • Evaluate code quality and best practices
  • Check for potential bugs and security concerns
  • Assess performance considerations
  • Provide comprehensive feedback

Summary

This PR introduces a well-architected timeline correlation query API with solid defensive coding practices. The implementation demonstrates strong attention to performance optimization and error handling. Overall code quality is high with good separation of concerns.

Code Quality & Best Practices ✅

Strengths:

  • Excellent separation of contracts from implementation (NamedEventsTimelineQueryContracts.cs:1-372)
  • Comprehensive input validation with clear error messages (NamedEventsTimelineQueryExecutor.cs:303-486)
  • Proper use of readonly interfaces and immutable collections for public APIs
  • Good naming conventions and XML documentation coverage
  • Performance optimizations with caching strategies (PayloadExtractionPlanCache at line 38, SnakeCaseCache at line 39)

Minor Issues:

  • Consider using IAsyncEnumerable<T> instead of await foreach for better async composition in TryBuildAsync (NamedEventsTimelineQueryExecutor.cs:105-113)
  • Magic numbers like MaxCorrelationKeys = 8 (line 14) could benefit from XML documentation explaining the rationale

Security Concerns ✅

Good Security Practices:

  • No direct SQL injection risks - uses strongly typed parameters
  • Proper input sanitization and validation throughout
  • Safe string handling with null checks and trimming
  • Bounded cache sizes prevent memory exhaustion attacks (MaxSnakeCaseCacheEntries = 4096)
  • Reasonable resource limits (MaxGroupsCap = 2000, MaxThreadsCap = 8)

Cryptographic Usage:

  • SHA-256 usage for correlation IDs is appropriate (NamedEventsTimelineQueryExecutor.cs:588-592)
  • Only uses first 8 bytes of hash, which is documented as suitable for investigation-scale grouping (line 137-139)
  • Proper disposal of SHA256 instance with using statement

Performance Considerations ✅

Optimizations Implemented:

  • Reflection metadata caching in PayloadExtractionPlanCache prevents repeated reflection calls (line 634-636)
  • Snake case conversion caching with bounded size (lines 793-802)
  • Efficient StringBuilder capacity estimation (line 573)
  • Proper use of StringComparer.OrdinalIgnoreCase for case-insensitive operations

Potential Improvements:

  • Consider using ArrayPool<T> for temporary collections if processing very large datasets
  • The ToArray() calls in group materialization (lines 262-264) could be optimized to use ToList() if order doesn't matter

Potential Bugs & Issues 🔍

Threading Safety:

  • ConcurrentDictionary usage is correct for the caches (lines 38-39)
  • No apparent race conditions in the main execution flow

Error Handling:

  • Comprehensive exception handling with proper error categorization (lines 152-167)
  • Good graceful handling of reflection failures with debug logging (lines 648-653)

Edge Cases:

  • Null input handling is thorough throughout
  • UTC date parsing is robust with culture-invariant parsing (lines 726-738)
  • Bucket time calculations handle edge cases properly (lines 618-627)

Test Coverage 🟡

Current Tests Cover:

  • Input validation scenarios (all 5 test methods)
  • Error condition handling
  • Invalid argument detection

Missing Test Coverage:

  • No tests for successful execution paths with actual data
  • No tests for correlation grouping logic
  • No tests for timeline ordering
  • No tests for bucket calculations
  • No performance/load tests for caching behavior

Recommended Additional Tests:

[Fact]
public async Task TryBuildAsync_ShouldReturnTimelineWithValidRequest()

[Fact]  
public async Task TryBuildAsync_ShouldCorrelateEventsByKeys()

[Fact]
public async Task TryBuildAsync_ShouldRespectMaxEventsLimit()

Specific Code Issues

  1. NamedEventsTimelineQueryExecutor.cs:494 - Null guard for item parameter is good, but consider using ArgumentNullException.ThrowIfNull(item) (.NET 6+) for consistency

  2. NamedEventsTimelineQueryExecutor.cs:591 - Consider using Convert.ToHexString() (.NET 5+) instead of BitConverter.ToString() for better performance

  3. NamedEventsTimelineQueryExecutor.cs:280-283 - The Distinct() and OrderBy() chain could be optimized by using ToHashSet() followed by OrderBy()

Recommendations

  1. Add Integration Tests: The current test suite only covers validation. Add tests that verify the actual correlation logic works correctly.

  2. Performance Testing: Consider adding benchmarks for the caching mechanisms and large dataset processing.

  3. Documentation: The correlation ID documentation is excellent (lines 137-139). Consider similar detail for other complex algorithms.

  4. Monitoring: Consider adding telemetry/metrics for cache hit rates and query performance.

This is a solid implementation that follows good practices. The main gap is comprehensive testing of the happy path scenarios and the correlation logic itself.

@PrzemyslawKlys
Copy link
Member Author

Addressed review feedback in latest push (8f2487b):

  • Added explicit documentation that CorrelationId uses an 8-byte SHA-256 prefix and advised retaining raw correlation values for high-volume archival use.
  • Switched Distinct() on named events to explicit comparer.
  • Added debug visibility for reflection property read failures instead of silent swallow.
  • Removed duplicate correlation-value computation in timeline materialization by caching once on accumulator rows.
  • Improved StringBuilder capacity estimation in BuildCorrelationToken.
  • Hardened date parsing in ReadPayloadUtc with invariant DateTimeOffset.TryParse + UTC normalization.
  • Added explicit null guard for ToAccumulator item input.

Validation rerun:

  • dotnet build .\\Sources\\EventViewerX\\EventViewerX.csproj -c Release
  • dotnet test .\\Sources\\EventViewerX.Tests\\EventViewerX.Tests.csproj -c Release --filter "FullyQualifiedName~TestNamedEventsTimelineQueryExecutor"

Deferred by design for now:

  • strict payload key schema validation against all possible rule payloads (currently remains permissive to avoid false negatives across heterogeneous event types).

@PrzemyslawKlys
Copy link
Member Author

Follow-up updates from the latest performance/code-quality review are now pushed in commit 0607da4.

Addressed in this commit:

  • Cached reflection metadata for payload extraction (ExtractPayload) via PayloadExtractionPlanCache, so we no longer rebuild field/property maps per event.
  • Added bounded snake_case memoization (SnakeCaseCache, capped) to avoid repeating normalization work for common keys/type names.
  • Replaced the "-" literal in correlation-id hex normalization with a named constant (HexSeparator).
  • Removed two avoidable list->array copies in result shaping (EffectiveMachines, CorrelationKeys) by returning existing normalized lists as IReadOnlyList.

Intentionally deferred:

  • await foreach batching for very high-volume streams. That changes flow-control/memory behavior and is better done as a dedicated benchmarked change (separate PR) to avoid mixing behavioral/perf risk in this one.

@PrzemyslawKlys PrzemyslawKlys merged commit 48967e5 into master Feb 20, 2026
5 checks passed
@PrzemyslawKlys PrzemyslawKlys deleted the feature/ix-timeline-correlation-core branch February 20, 2026 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant