[NA] [E2E] fix: increase online scoring test timeouts for GitHub-hosted runners by AndreiCautisanu · Pull Request #6024 · comet-ml/opik

AndreiCautisanu · 2026-04-01T11:37:59Z

Details

Online scoring E2E tests were consistently timing out (60s) on GitHub-hosted runners in post-merge CI while passing on self-hosted runners. Investigation via Allure TestOps showed the scoring pipeline (rule activation → LLM API call → score storage) takes longer on resource-constrained ubuntu-latest runners compared to self-hosted automation-q5pjv-* pods.

Test timeout increased from 60s to 120s
Polling attempts increased from 15 to 25
Page refresh wait increased from 2s to 3s

Change checklist

User facing
Documentation update

Issues

NA — flaky test fix for post-merge CI

AI-WATERMARK

AI-WATERMARK: yes

Tools: Claude Code
Model(s): Claude Opus 4.6
Scope: investigation + fix
Human verification: code review, will verify in next post-merge run

Testing

Verified by analyzing Allure TestOps data:

Post-merge [OPIK-374] Add bulk actions to all tables #804 (launch 53489): 6/8 online scoring tests timed out at ~62s
Post-merge [OPIK-515] implement opik client auth_check method #805 (launch 53491): 2/8 timed out at ~63s
Local Happy Paths (launch 53456): all passed in 28-51s
New timeout budget: ~15s setup + 25 retries × ~3.5s ≈ 102s, within 120s limit
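
The budget arithmetic in the last line can be checked with a small sketch. The type and function names here are hypothetical and do not come from the test suite; the figures are the approximate ones quoted above (about 15s of setup, 25 retries at about 3.5s each, and a 120s test timeout).

```typescript
// Hypothetical helper for reasoning about the polling budget.
// None of these names exist in the Opik test suite.
interface PollingBudget {
  setupMs: number;      // approximate time before polling starts
  maxAttempts: number;  // number of polling attempts
  perAttemptMs: number; // approximate cost of one attempt (refresh wait plus page work)
}

// Worst case: setup plus every attempt running to completion.
function worstCaseMs(budget: PollingBudget): number {
  return budget.setupMs + budget.maxAttempts * budget.perAttemptMs;
}

// Approximate figures quoted in the PR description.
const newBudget: PollingBudget = {
  setupMs: 15_000,
  maxAttempts: 25,
  perAttemptMs: 3_500,
};

const TEST_TIMEOUT_MS = 120_000;

// Time left before the test timeout fires in the worst case.
const headroomMs = TEST_TIMEOUT_MS - worstCaseMs(newBudget);
```

With these inputs the worst case is 102.5s, which leaves roughly 17.5s of headroom under the 120s limit.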

Documentation

N/A

…ed runners

Online scoring E2E tests were consistently timing out (60s) on GitHub-hosted runners in post-merge CI, while passing on self-hosted runners. The scoring pipeline (rule activation -> trace creation -> LLM API call -> score storage) takes longer on resource-constrained GitHub-hosted runners.

- Increase test timeout from 60s to 120s
- Increase polling attempts from 15 to 25
- Increase page refresh wait from 2s to 3s

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

baz-reviewer · 2026-04-01T11:41:04Z

tests_end_to_end/typescript-tests/tests/online-scoring/online-scoring.spec.ts

 // Timeout constants
 const RULE_ACTIVATION_TIMEOUT = 10000; // 10 seconds for rule to fully activate in backend
-const PAGE_REFRESH_TIMEOUT = 2000; // 2 seconds wait after page refresh
+const PAGE_REFRESH_TIMEOUT = 3000; // 3 seconds wait after page refresh


Should we replace PAGE_REFRESH_TIMEOUT = 3000 and any page.waitForTimeout uses with retry-capable assertions or helper waits per .agents/skills/playwright-e2e/test-conventions.md, and remove the sleep in the moderation scoring retry loop?

Finding type: AI Coding Guidelines | Severity: 🟢 Low

Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In tests_end_to_end/typescript-tests/tests/online-scoring/online-scoring.spec.ts around lines 10 to 10, the PAGE_REFRESH_TIMEOUT constant is set to 3000ms which introduces a hard-coded delay >2s; per test conventions remove or reduce this constant and do not use it for page.waitForTimeout. Replace any uses of PAGE_REFRESH_TIMEOUT and any page.waitForTimeout calls (search around lines 120-140 and specifically 129-133) with retry-capable helpers such as expect(...).toBeVisible({ timeout: 2000 }) or helperClient.waitUntil/polling helpers. Also in the moderation scoring retry loop around lines 106 to 113 (and any loop using sleeps), remove the sleep-based waits and implement polling using expect/assert-with-timeout or a helper that retries until the moderation column appears or a total timeout is reached, ensuring no single hard-coded pause exceeds 2s.
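
The deadline-bounded polling the reviewer asks for could look roughly like the sketch below. `pollUntil` is a hypothetical helper written for illustration and is not part of the Opik test suite or of Playwright. In a real Playwright test, the built-in retrying assertions (for example `expect(locator).toBeVisible({ timeout })` or `expect.poll`) would likely be the first choice.

```typescript
// Hypothetical polling helper: retries a probe until it yields a value
// or the total time budget would be exceeded by another wait.
async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  options: { timeoutMs: number; intervalMs: number },
): Promise<T> {
  const deadline = Date.now() + options.timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    const value = await probe();
    if (value !== undefined) {
      return value; // condition met, stop immediately
    }
    // Give up if waiting one more interval would pass the deadline.
    if (Date.now() + options.intervalMs > deadline) {
      throw new Error(
        `condition not met after ${attempts} attempts within ${options.timeoutMs}ms`,
      );
    }
    await new Promise<void>((resolve) => setTimeout(resolve, options.intervalMs));
  }
}
```

Compared with a fixed number of fixed-length sleeps, this form returns as soon as the probe succeeds and is bounded by a single total timeout, so a slow runner only costs extra time when the score is actually late.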

github-actions bot added the tests (Including test files, or tests related like configuration.) and typescript (*.ts *.tsx) labels Apr 1, 2026

github-actions bot assigned AndreiCautisanu Apr 1, 2026

AndreiCautisanu marked this pull request as ready for review April 1, 2026 11:38

AndreiCautisanu requested review from a team as code owners April 1, 2026 11:38

NatZol approved these changes Apr 1, 2026

baz-reviewer bot reviewed Apr 1, 2026

AndreiCautisanu merged commit 5f2de55 into main Apr 1, 2026
10 of 11 checks passed

AndreiCautisanu deleted the andreicautisanu/NA-increase-online-scoring-test-timeouts branch April 1, 2026 11:41
