Add testing readiness review and action plan #2
Merged
Chris0Jeky merged 1 commit into main on Nov 18, 2025
Chris0Jeky added a commit that referenced this pull request on Feb 16, 2026
…ove-testing-strategy Add testing readiness review and action plan
This was referenced Mar 29, 2026
This was referenced Apr 8, 2026
Chris0Jeky added a commit that referenced this pull request on Apr 9, 2026
TryConsumeAtomicAsync now includes ExpiresAt > now in the WHERE clause to close the TOCTOU race window between application-level expiry check and SQL execution. DeleteExpiredAsync now uses raw SQL instead of loading all rows into memory (DoS prevention). Also deletes consumed codes to prevent unbounded table growth. Uses EF Core SQLite DateTimeOffset format for correct string comparison. Addresses findings #2 (CRITICAL), #4 (HIGH), #6 (HIGH), #13 (LOW).
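The pattern this commit message describes can be illustrated with a small sketch: the expiry check moves into the same SQL statement that marks the code as consumed, so no gap remains between "check" and "use". This is not the repository's C#/EF Core code. The `codes` table, its columns, the `fmt` helper, and the timestamp format are all hypothetical, chosen only so that string comparison in SQLite matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE codes (
    code       TEXT PRIMARY KEY,
    expires_at TEXT NOT NULL,
    consumed   INTEGER NOT NULL DEFAULT 0
)
"""

def fmt(ts: datetime) -> str:
    # Fixed-width UTC text so that lexicographic order equals time order.
    # Assumption: this is NOT necessarily the format EF Core uses.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S.%f")

def try_consume_atomic(conn: sqlite3.Connection, code: str, now: datetime) -> bool:
    # Single statement: the row is updated only if it is unconsumed AND unexpired,
    # so the expiry check cannot go stale before the write happens.
    cur = conn.execute(
        "UPDATE codes SET consumed = 1 "
        "WHERE code = ? AND consumed = 0 AND expires_at > ?",
        (code, fmt(now)),
    )
    conn.commit()
    return cur.rowcount == 1

def delete_expired(conn: sqlite3.Connection, now: datetime) -> int:
    # Set-based delete in SQL; no rows are loaded into application memory.
    # Consumed codes are removed as well to keep the table from growing.
    cur = conn.execute(
        "DELETE FROM codes WHERE expires_at <= ? OR consumed = 1",
        (fmt(now),),
    )
    conn.commit()
    return cur.rowcount
```

With this shape, a second call to `try_consume_atomic` for the same code returns `False`, as does a call for a code whose `expires_at` is already in the past.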
This was referenced Apr 16, 2026
Chris0Jeky added a commit that referenced this pull request on Apr 22, 2026
- Fix config path: WorkerSettings:MaxBatchSize -> Workers:MaxBatchSize
- Document queue backlog threshold divergence from HealthController's dynamic formula Math.Max(MaxBatchSize * 20, 100)
- Fix PromQL examples: metrics are Histograms, not gauges -- use _sum/_count series with appropriate caveats
- Add threshold reconciliation section explaining differences with CLOUD_REFERENCE_ARCHITECTURE.md alarm stubs
- Fix Known Gap #2: use exact default (30s) instead of approximate (~30s) and show the full Math.Max formula
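The dynamic queue backlog formula quoted in this commit message, `Math.Max(MaxBatchSize * 20, 100)`, can be read as "20 batches' worth of items, but never below 100". A minimal Python rendering of that arithmetic follows. The function name is hypothetical, and the batch sizes used below are illustrative inputs, not the project's configured defaults.

```python
def backlog_threshold(max_batch_size: int) -> int:
    # Mirrors Math.Max(MaxBatchSize * 20, 100) from the commit message:
    # scale with the batch size, with a floor of 100.
    return max(max_batch_size * 20, 100)
```

For a batch size of 5 or less the floor applies and the threshold stays at 100. Above that, the threshold grows linearly, so a batch size of 10 gives 200. A static alert threshold will therefore diverge from the health check whenever the configured batch size changes.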
Chris0Jeky added a commit that referenced this pull request on Apr 22, 2026
* docs: define monitoring and alerting rules (OPS-30)
  Add docs/ops/ALERTING_RULES.md with 10 alert rules covering API error rate, latency, worker heartbeat, disk, memory, queue backlog, database connectivity, health endpoint, CPU, and Redis backplane. Each rule specifies metric source, threshold, evaluation window, priority (P1/P2), runbook steps, and escalation triggers. Includes integration guidance for Grafana, AWS CloudWatch, PagerDuty, and external uptime monitoring with example PromQL queries and Terraform alarm definitions.
  Closes #868
* docs: add ALERTING_RULES.md to ops README index
  Cross-reference the new alerting rules document from the ops directory index alongside the existing observability docs.
* docs: update OBSERVABILITY_BASELINE alert thresholds and cross-reference
  Update the alert threshold baseline section to match the authoritative thresholds in ALERTING_RULES.md and add a callout directing operators to the comprehensive alerting rules document.
* docs: add known gaps section to alerting rules
  Document three known gaps found during adversarial review:
  1. OutboundWebhookDeliveryWorker not monitored by health endpoint
  2. Health endpoint staleness thresholds differ from alert thresholds
  3. No dedicated LLM provider error rate alert
  Also clarify that Alert 3 applies to workers with OTLP metric emission (LlmQueueToProposalWorker and ProposalHousekeepingWorker only).
* fix: correct alerting rules accuracy issues from adversarial review
  - Fix config path: WorkerSettings:MaxBatchSize -> Workers:MaxBatchSize
  - Document queue backlog threshold divergence from HealthController's dynamic formula Math.Max(MaxBatchSize * 20, 100)
  - Fix PromQL examples: metrics are Histograms, not gauges -- use _sum/_count series with appropriate caveats
  - Add threshold reconciliation section explaining differences with CLOUD_REFERENCE_ARCHITECTURE.md alarm stubs
  - Fix Known Gap #2: use exact default (30s) instead of approximate (~30s) and show the full Math.Max formula
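The note that the metrics are Histograms rather than gauges matters because a histogram exposes cumulative `_sum` and `_count` series, not a current value. A mean over a window therefore has to be derived from the change in both series. The sketch below shows that arithmetic on two scrapes. The function name and the sample numbers are hypothetical, and it illustrates the general pattern rather than the queries in ALERTING_RULES.md.

```python
from typing import Optional

def windowed_mean(sum_start: float, count_start: int,
                  sum_end: float, count_end: int) -> Optional[float]:
    # Both series are cumulative, so the window's mean is
    # (increase in _sum) / (increase in _count).
    observations = count_end - count_start
    if observations <= 0:
        # No new observations (or a counter reset): the mean is undefined.
        return None
    return (sum_end - sum_start) / observations
```

As an illustration, if `_sum` moves from 10.0 to 25.0 while `_count` moves from 4 to 9, the mean for that window is 3.0. Two caveats apply: a mean hides tail behaviour, and windows with no observations yield no value at all.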
10 tasks
Chris0Jeky added a commit that referenced this pull request on Apr 25, 2026
Summary
Testing
Codex Task