[OPIK-5205] [SDK] fix: improve error messages for opik connect CLI command #6004

Merged: petrotiurin merged 2 commits into main from worktree-opik_connect_error_messages on Mar 31, 2026
Conversation

@petrotiurin
Contributor

Details

Improves the opik connect CLI command to fail fast with clear, actionable error messages instead of confusing failures deep in the connection flow:

  • Validates that a command is provided before making any API calls, with usage example
  • Checks the command exists on PATH and is executable before connecting to the server
  • Clarifies the OPIK_RUNNER_ID missing error to tell users to use opik connect instead of setting env vars manually
  • Warns on the first poll failure (previously silent at debug level), then drops to debug for subsequent retries to avoid spam
  • Includes the actual timeout value in job timeout warnings and error reports
  • Adds debug log when skipping cancelled jobs for traceability
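The PATH and executability checks described above could look roughly like the following sketch. It is not the PR's actual code: the helper name resolve_command and the error wording are hypothetical, and only standard-library calls (shutil.which, os.access) are used. The exit status 2 matches the manual test noted under Testing.

```python
import os
import shutil
import sys


def resolve_command(command: list[str]) -> str:
    """Return the resolved path of command[0], or exit with status 2.

    Hypothetical helper for illustration; the name and messages are
    assumptions, not taken from the Opik SDK.
    """
    if not command:
        print("Error: Missing command.", file=sys.stderr)
        raise SystemExit(2)

    program = command[0]
    # shutil.which searches PATH and only returns files that are executable.
    resolved = shutil.which(program)
    if resolved is not None:
        return resolved

    # Distinguish "file exists but is not executable" from "not found".
    if os.path.isfile(program) and not os.access(program, os.X_OK):
        print(f"Error: '{program}' is not executable.", file=sys.stderr)
    else:
        print(f"Error: command '{program}' not found on PATH.", file=sys.stderr)
    raise SystemExit(2)
```

Running a check like this before any API call means a mistyped command fails locally and immediately, rather than after the runner has already connected to the server.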

Change checklist

  • User facing
  • Documentation update

Issues

  • OPIK-5205

AI-WATERMARK

AI-WATERMARK: yes

  • Tools: Claude Code
  • Model(s): claude-sonnet-4-6
  • Scope: full implementation
  • Human verification: code review + manual testing

Testing

  • python -m pytest tests/unit/runner/ -v — all 32 tests pass
  • Pre-commit hooks pass (ruff, ruff-format, mypy)
  • Manually verified opik connect with no command, nonexistent command, and non-executable file shows correct errors with exit code 2

Documentation

N/A

…mmand

- Fail fast with clear message when no command is provided (before API call)
- Validate command exists on PATH and is executable before connecting
- Improve OPIK_RUNNER_ID missing error to explain opik connect is required
- Warn on first poll failure instead of silently debug-logging
- Include timeout value in job timeout warning and error report
- Log debug when skipping cancelled jobs
- Update tests to supply command argument and mock os.execvpe

Implements OPIK-5205: [SDK] improve: better error messages for opik connect with missing or invalid executable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
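
The "warn on first poll failure, then drop to debug" behaviour from this commit can be illustrated with a small sketch. The class name PollFailureLogger and the log message are hypothetical, and resetting the streak after a successful poll is an assumption the PR text does not confirm.

```python
import logging

LOGGER = logging.getLogger("runner_sketch")


class PollFailureLogger:
    """Warn on the first failure in a streak, then log at debug level.

    Hypothetical helper for illustration; not the SDK's actual class.
    """

    def __init__(self, logger: logging.Logger = LOGGER) -> None:
        self._logger = logger
        self._failing = False

    def on_failure(self, error: Exception) -> int:
        """Log the failure and return the level that was used."""
        level = logging.DEBUG if self._failing else logging.WARNING
        self._failing = True
        self._logger.log(level, "Polling for jobs failed: %s", error)
        return level

    def on_success(self) -> None:
        # Assumption: a successful poll ends the streak, so the next
        # failure is reported at warning level again.
        self._failing = False
```

The first failure is visible at the default log level, while repeated retries against an unreachable server stay quiet unless debug logging is enabled.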
github-actions bot added the python, tests, and Python SDK labels on Mar 31, 2026
Comment on lines 24 to +34
) -> None:
    """Connect a local runner to Opik and exec the user command."""
    if not command:
        click.echo(
            "Error: Missing command.\n\n"
            "Usage: opik connect [OPTIONS] COMMAND [ARGS]...\n\n"
            "Example: opik connect --pair <code> python3 main.py",
            err=True,
        )
        raise SystemExit(2)

Contributor

connect rejects bare opik connect but the onboarding UI still instructs opik connect --pair <code> — should we restore the no-command mode or update the onboarding/docs/UX to require and document a concrete COMMAND?

Finding type: Breaking Changes | Severity: 🔴 High


Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
sdks/python/src/opik/cli/connect.py around lines 24-34, the connect function was changed
to reject calls with no COMMAND and exits with SystemExit(2). Restore backward
compatibility by removing the early error/exit for missing command: if command is empty,
do not validate or exec an executable; instead set the env (as already built later),
print a short informational message like "No command specified. Set env vars and
exiting.", call client.end(), and return successfully. Only perform the executable
resolution, access checks, and os.execvpe when command is non-empty; keep other error
handling unchanged.

Contributor Author

UI will be updated. There's no reason to run opik connect without the command as it becomes a no-op.

…cript runner loop

- Warn on first poll failure (API or network), drop to debug for subsequent retries
- Debug log when skipping cancelled jobs
- Include timeout value in job timeout warning and error message
- Escalate reportJobResult failure from debug to warn

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
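
Including the timeout value in both the warning and the reported error, as this commit describes, might be sketched as follows. This is a Python illustration only: the function run_job, the result dictionary shape, and the message wording are hypothetical and do not reflect the runner's real job format.

```python
import logging
import subprocess
import sys

LOGGER = logging.getLogger("runner_sketch")


def run_job(argv: list[str], timeout_seconds: float) -> dict:
    """Run a job as a subprocess and report a timeout with its actual limit.

    Hypothetical helper for illustration; the result shape is an assumption.
    """
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # The same message goes to the log and to the error report, so the
        # user sees which limit was hit in both places.
        message = f"Job timed out after {timeout_seconds} seconds"
        LOGGER.warning(message)
        return {"status": "failed", "error": message}

    if completed.returncode != 0:
        return {"status": "failed", "error": completed.stderr.strip()}
    return {"status": "completed", "error": None}
```

For example, run_job([sys.executable, "-c", "import time; time.sleep(5)"], 0.2) returns a failed result whose error text names the 0.2 second limit.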
@petrotiurin
Contributor Author

Langchain issues unrelated.

@petrotiurin petrotiurin marked this pull request as ready for review March 31, 2026 15:55
@petrotiurin petrotiurin requested a review from a team as a code owner March 31, 2026 15:55
@petrotiurin petrotiurin merged commit 3fb9640 into main Mar 31, 2026
222 of 237 checks passed
@petrotiurin petrotiurin deleted the worktree-opik_connect_error_messages branch March 31, 2026 16:38
Labels

Python SDK, python, tests, TypeScript SDK, typescript
