[OPIK-5750] [SDK] feat: add error visibility to runner TUI for crashes and crash loops #6154

Merged
petrotiurin merged 6 commits into main from petrotiurin/OPIK-5750-runner-tui-error-visibility on Apr 9, 2026

@petrotiurin petrotiurin commented Apr 9, 2026

Details

The runner supervisor silently restarted the child process on crashes without notifying the user via the TUI. This adds two missing visibility paths:

  1. When a child crashes and is restarted, the TUI now shows "Restarting: process exited with code N" with the reason.
  2. When crash-loop protection activates, the TUI shows a red error: "Crash loop detected — waiting for file change to retry".

A general-purpose on_error callback is added to Supervisor (separate from on_child_restart) so the TUI can distinguish informational restarts from error states.
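
As a rough illustration of the callback split described above, here is a minimal sketch. It is not the code from this PR: `SketchSupervisor`, `SketchTUI`, `handle_child_exit`, and the `is_stable` guard are hypothetical stand-ins, and the callback signatures are assumptions. Only the names `on_child_restart`, `on_error`, `child_restarted`, and `error` come from the description.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SketchTUI:
    """Hypothetical stand-in for RunnerTUI; collects lines instead of rendering."""

    lines: List[str] = field(default_factory=list)

    def child_restarted(self, reason: str) -> None:
        # Informational path: the child crashed and is being restarted.
        self.lines.append(f"Restarting: {reason}")

    def error(self, message: str) -> None:
        # Error path: the real TUI is described as showing this in red.
        self.lines.append(f"ERROR: {message}")


class SketchSupervisor:
    """Hypothetical stand-in for Supervisor; models only the exit decision."""

    def __init__(
        self,
        is_stable: Callable[[], bool],
        on_child_restart: Optional[Callable[[str], None]] = None,
        on_error: Optional[Callable[[str], None]] = None,
    ) -> None:
        self._is_stable = is_stable  # assumed crash-loop guard
        self._on_child_restart = on_child_restart
        self._on_error = on_error

    def handle_child_exit(self, returncode: int) -> str:
        """Return the action taken: 'shutdown', 'restart', or 'wait'."""
        if returncode == 0:
            # Clean exit: nothing to report, the runner shuts down.
            return "shutdown"
        if self._is_stable():
            if self._on_child_restart is not None:
                self._on_child_restart(f"process exited with code {returncode}")
            return "restart"
        if self._on_error is not None:
            # Message wording approximates the one quoted in the description.
            self._on_error("Crash loop detected: waiting for file change to retry")
        return "wait"
```

Wiring `tui.child_restarted` to `on_child_restart` and `tui.error` to `on_error` keeps the two paths separate, so the TUI can style a one-off restart differently from a crash loop.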

With this change: [image]

Change checklist

  • User facing
  • Documentation update

Issues

  • OPIK-5750

AI-WATERMARK

AI-WATERMARK: yes

  • Tools: Claude Code
  • Model(s): Claude Opus 4.6
  • Scope: full implementation
  • Human verification: code review + manual testing

Testing

  • Verified pre-commit hooks pass (ruff, ruff-format, mypy)
  • Changes are in callback wiring and TUI rendering — manual testing with opik connect against a crashing child process validates the error and restart messages appear

Documentation

N/A

…s and crash loops

The supervisor now notifies the TUI when a child process crashes and is
restarted, and when crash-loop protection activates. Adds a general-purpose
on_error callback to Supervisor and a corresponding error() method on
RunnerTUI. The child_restarted message now includes the reason.

Implements OPIK-5750

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
github-actions bot added the python and Python SDK labels on Apr 9, 2026
petrotiurin and others added 2 commits April 9, 2026 13:19
…isibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
github-actions bot added the tests label on Apr 9, 2026

github-actions bot commented Apr 9, 2026

Python SDK Unit Tests Results (Python 3.11)

2 820 tests   2 820 ✅  1m 47s ⏱️
    1 suites      0 💤
    1 files        0 ❌

Results for commit d51a459.

♻️ This comment has been updated with latest results.

petrotiurin and others added 2 commits April 9, 2026 13:39
…akiness

All tests that call sup.run() now pass watch=False to _make_supervisor to
avoid the file watcher interfering with test timing and shutdown events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…or child exit logic

TestChildExit now uses mocks to test exit handling logic directly instead of
spawning actual Python processes. Tests now run in 0.05s instead of 8+ seconds.

- test_restarts_on_nonzero_exit_if_stable: verify callback fires on crash restart
- test_triggers_error_on_crash_loop: verify error callback fires when guard unstable
- test_clean_exit_zero_no_restart: verify clean exit sets shutdown event

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
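
The mock-based approach in the TestChildExit commit above might look roughly like the sketch below. It does not reproduce the actual tests: `handle_child_exit` here is a hypothetical free function standing in for `_handle_child_exit`, and the `guard.is_stable()` interface is an assumption. The point is that the callbacks and the guard are `MagicMock` objects, so no real child process is spawned.

```python
from unittest.mock import MagicMock


def handle_child_exit(returncode, guard, on_child_restart, on_error):
    """Hypothetical stand-in for the exit-handling logic under test."""
    if returncode == 0:
        return  # clean exit: neither callback fires
    if guard.is_stable():
        on_child_restart(f"process exited with code {returncode}")
    else:
        on_error("crash loop detected")


def test_restarts_on_nonzero_exit_if_stable():
    guard = MagicMock()
    guard.is_stable.return_value = True  # guard reports the child as stable
    on_child_restart, on_error = MagicMock(), MagicMock()

    handle_child_exit(3, guard, on_child_restart, on_error)

    on_child_restart.assert_called_once_with("process exited with code 3")
    on_error.assert_not_called()


def test_triggers_error_on_crash_loop():
    guard = MagicMock()
    guard.is_stable.return_value = False  # guard reports a crash loop
    on_child_restart, on_error = MagicMock(), MagicMock()

    handle_child_exit(1, guard, on_child_restart, on_error)

    on_error.assert_called_once()
    on_child_restart.assert_not_called()
```

Because nothing here waits on a subprocess or a file watcher, tests of this shape finish almost instantly, which is consistent with the speed-up reported in the commit message.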
The test had a nonsensical if 0 == 0: placeholder that didn't actually test anything.
The behavior it attempted to document (exit code 0 → shutdown) is already covered by
_main_loop integration tests in TestShutdown. Removed to keep TestChildExit focused
on testing _handle_child_exit behavior with fast unit tests.

Implements OPIK-5750

github-actions bot commented Apr 9, 2026

Python SDK E2E Tests Results (Python 3.11)

365 tests   363 ✅  16m 32s ⏱️
  1 suites    2 💤
  1 files      0 ❌

Results for commit 7a72b31.

♻️ This comment has been updated with latest results.

@petrotiurin petrotiurin marked this pull request as ready for review April 9, 2026 14:08
@petrotiurin petrotiurin requested a review from a team as a code owner April 9, 2026 14:08
@petrotiurin petrotiurin merged commit b19974e into main Apr 9, 2026
215 of 225 checks passed
@petrotiurin petrotiurin deleted the petrotiurin/OPIK-5750-runner-tui-error-visibility branch April 9, 2026 14:49