
feat(web): fix jobs UI parity for non-sandbox mode #491

Merged
think-in-universe merged 2 commits into main from worktree-jobs-ui-parity-fixes on Mar 3, 2026

@henrypark133
Collaborator

Summary

  • Agent jobs now broadcast live SSE events to the web UI Activity tab (previously only sandbox jobs had real-time streaming)
  • Agent job restart via scheduler.dispatch_job — creates a real job with failure context instead of sending a chat message
  • Follow-up prompts for agent jobs via WorkerMessage::UserMessage injection through the scheduler
  • Capability flags (can_restart, can_prompt, job_kind) in job detail API, with conditional UI rendering
  • Rate-limit resilience: Retry-After header parsing (both seconds and HTTP-date), capped retries (10 consecutive), and marking the job Stuck instead of looping indefinitely
  • Plan interruption on user message — breaks out of plan execution and falls through to direct selection loop for LLM re-evaluation
  • Correct SSE status in mark_completed/mark_failed/mark_stuck (previously all broadcast status: "completed")
  • SseManager preserved across rebuild_state calls via from_sender() constructor
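
The capped rate-limit retry described above can be sketched as a small counter. This is a minimal illustration, not the code from this PR: the `CallOutcome`, `NextStep`, and `RateLimitTracker` names are hypothetical, and whether the real worker trips on the 10th or the 11th consecutive error is an assumption. Only the constant name and its value of 10 come from the PR.

```rust
/// Hypothetical outcome of one LLM call, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallOutcome {
    Ok,
    RateLimited,
}

/// Hypothetical instruction for the worker loop after a call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NextStep {
    Continue,
    RetryAfterBackoff,
    MarkStuck,
}

/// Cap named in the PR: 10 consecutive rate-limit errors.
pub const MAX_CONSECUTIVE_RATE_LIMITS: usize = 10;

/// Counts consecutive rate-limit errors and resets on success.
#[derive(Debug, Default)]
pub struct RateLimitTracker {
    consecutive: usize,
}

impl RateLimitTracker {
    pub fn observe(&mut self, outcome: CallOutcome) -> NextStep {
        match outcome {
            // A successful call clears the streak.
            CallOutcome::Ok => {
                self.consecutive = 0;
                NextStep::Continue
            }
            CallOutcome::RateLimited => {
                self.consecutive += 1;
                // Assumption: the job is marked Stuck on reaching the cap.
                if self.consecutive >= MAX_CONSECUTIVE_RATE_LIMITS {
                    NextStep::MarkStuck
                } else {
                    NextStep::RetryAfterBackoff
                }
            }
        }
    }
}
```

Keeping the counter separate from the sleep logic makes the cap easy to test without waiting on real backoff delays.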

Test plan

  • cargo clippy --all --benches --tests --examples --all-features — zero warnings
  • cargo test — all pass (1 pre-existing, unrelated failure in registry::artifacts)
  • Start without sandbox (SANDBOX_ENABLED=false), verify jobs tab works
  • Create a job via chat, verify Activity tab shows live SSE events
  • Verify failed agent jobs show Retry button in detail view (not list view)
  • Verify follow-up prompts work for in-progress agent jobs
  • Verify prompt bar hidden for completed/stuck jobs
  • Verify rate-limit errors show retry status in Activity tab instead of failing

🤖 Generated with Claude Code

The web gateway Jobs UI was built primarily for sandbox (Docker) jobs.
When running without sandbox (common for NEAR AI hosted envs), multiple
features were broken. This change fixes all of them:

- Agent jobs now broadcast live SSE events to the web UI (Activity tab)
- Agent job restart via scheduler.dispatch_job (not chat message)
- Follow-up prompts for agent jobs via WorkerMessage injection
- Capability flags (can_restart, can_prompt, job_kind) in job detail API
- Rate-limit retry with cap (10 consecutive) and Retry-After header parsing
- Plan interruption on user message (breaks out of plan, re-evaluates)
- Correct SSE status field in mark_completed/mark_failed/mark_stuck
- SseManager preserved across rebuild_state calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added labels size: XL (500+ changed lines); scope: agent (Agent core: agent loop, router, scheduler); scope: channel/web (Web gateway channel); scope: db (Database trait / abstraction); scope: db/postgres (PostgreSQL backend); scope: llm (LLM integration); risk: medium (Business logic, config, or moderate-risk modules); contributor: core (20+ merged PRs), and removed label size: XL (500+ changed lines), on Mar 3, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the web UI's interaction with agent jobs by introducing real-time event streaming, enabling job restarts with failure context, and allowing users to send follow-up prompts. It also makes the system more resilient to LLM rate limits and refines the UI's dynamic display of job capabilities, bringing non-sandbox jobs closer to parity with sandbox mode.

Highlights

  • Live SSE Events for Agent Jobs: Agent jobs now broadcast live Server-Sent Events (SSE) to the web UI Activity tab, providing real-time streaming previously only available for sandbox jobs.
  • Agent Job Restart Functionality: Agent jobs can now be restarted via scheduler.dispatch_job, creating a new job with failure context instead of just sending a chat message.
  • Follow-up Prompts for Agent Jobs: The system now supports follow-up prompts for agent jobs through WorkerMessage::UserMessage injection via the scheduler.
  • UI Capability Flags: New capability flags (can_restart, can_prompt, job_kind) have been added to the job detail API to enable conditional UI rendering for restart and prompt functionalities.
  • Rate-Limit Resilience: Improved rate-limit handling includes Retry-After header parsing (both seconds and HTTP-date formats), capped retries (10 consecutive), and marking jobs as 'Stuck' instead of looping indefinitely.
  • Plan Interruption by User Message: User messages can now interrupt ongoing plan execution, causing the agent to break out of the plan and fall through to a direct selection loop for LLM re-evaluation.
  • Correct SSE Status Broadcasts: The mark_completed, mark_failed, and mark_stuck methods now broadcast the correct SSE status, addressing a previous issue where all three broadcast status: "completed".
  • SSE Manager Preservation: The SseManager is now preserved across rebuild_state calls using a from_sender() constructor, ensuring sender handles remain valid.
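
The capability flags and per-state SSE status highlighted above could be derived along these lines. This is a hedged sketch, not the PR's implementation: the `JobState` enum, the `"agent"` value for job_kind, and every status string other than `"completed"` are assumptions. The rules encoded here (Retry for failed jobs, prompting only while in progress) follow the test plan, not the actual handler code.

```rust
/// Hypothetical job states; the real enum in the codebase may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobState {
    InProgress,
    Completed,
    Failed,
    Stuck,
}

/// Capability flags named in the PR (can_restart, can_prompt, job_kind).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JobCapabilities {
    pub can_restart: bool,
    pub can_prompt: bool,
    pub job_kind: &'static str,
}

/// Maps each state to its own status string, so a failed or stuck job
/// is no longer reported as "completed". Strings other than
/// "completed" are assumed for illustration.
pub fn sse_status(state: JobState) -> &'static str {
    match state {
        JobState::InProgress => "in_progress",
        JobState::Completed => "completed",
        JobState::Failed => "failed",
        JobState::Stuck => "stuck",
    }
}

/// Derives UI capabilities for an agent (non-sandbox) job.
pub fn agent_job_capabilities(state: JobState) -> JobCapabilities {
    JobCapabilities {
        // Assumption: only failed jobs offer the Retry button.
        can_restart: matches!(state, JobState::Failed),
        // Follow-up prompts only make sense while the job is running.
        can_prompt: matches!(state, JobState::InProgress),
        // Assumed value; the PR does not list the job_kind strings.
        job_kind: "agent",
    }
}
```

Computing the flags on the server keeps the front end to simple checks such as job.can_restart and job.can_prompt.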

Changelog

  • src/agent/agent_loop.rs
    • Added sse_tx field to AgentDeps struct to enable SSE broadcast.
    • Configured Scheduler to receive the sse_tx sender during initialization.
  • src/agent/dispatcher.rs
    • Updated test setups to include sse_tx: None in AgentDeps initialization.
  • src/agent/scheduler.rs
    • Imported SseEvent for use in SSE broadcasting.
    • Added UserMessage variant to WorkerMessage enum for injecting user prompts.
    • Introduced sse_tx field to Scheduler struct for broadcasting live events.
    • Implemented set_sse_sender method to configure the SSE broadcast channel.
    • Added send_message method to allow sending follow-up user messages to running jobs.
    • Passed sse_tx to WorkerDeps when creating a new worker.
  • src/agent/worker.rs
    • Imported SseEvent for use in SSE broadcasting.
    • Added sse_tx field to WorkerDeps struct for broadcasting live events.
    • Modified log_event to not only persist events to DB but also broadcast them as SSE events for live web UI updates, mapping various event types to SseEvent variants.
    • Updated the worker's main loop (run) to handle WorkerMessage::UserMessage for injecting follow-up prompts.
    • Implemented rate-limit retry logic in select_tools and respond_with_tools methods, including Retry-After header parsing and a cap on consecutive retries.
    • Added logic to reset the rate-limit counter after a successful LLM interaction.
    • Modified execute_plan to handle WorkerMessage::UserMessage, interrupting the plan and returning control to the direct selection loop.
    • Adjusted execute_plan to fall through to the direct selection loop if the plan completes but work remains, rather than marking the job stuck.
    • Corrected the status field in result SSE events for mark_completed, mark_failed, and mark_stuck to reflect the actual job state.
  • src/channels/web/handlers/jobs.rs
    • Modified jobs_detail_handler to retrieve job mode and is_claude_code status.
    • Added can_restart, can_prompt, and job_kind fields to JobDetailResponse for both sandbox and agent jobs, with conditional logic for their values.
    • Updated jobs_restart_handler to support restarting both sandbox jobs (via job manager) and agent jobs (via scheduler), enriching agent job restarts with failure context.
    • Refactored jobs_prompt_handler to route follow-up prompts to either Claude Code sandbox jobs (via prompt queue) or agent jobs (via scheduler), with appropriate error handling for unsupported scenarios.
  • src/channels/web/mod.rs
    • Added scheduler field to GatewayState to hold the scheduler slot.
    • Implemented SseManager::from_sender to preserve the existing broadcast channel across rebuild_state calls, ensuring continuous SSE streaming.
    • Added with_scheduler method to GatewayChannel to inject the scheduler into the gateway state.
  • src/channels/web/server.rs
    • Added scheduler field of type SchedulerSlot to GatewayState.
  • src/channels/web/sse.rs
    • Added from_sender constructor to SseManager to allow reusing an existing broadcast sender, crucial for maintaining SSE connections across state rebuilds.
  • src/channels/web/static/app.js
    • Removed the restart button from the job list view.
    • Modified the job detail view to conditionally display a 'Retry' button based on job.can_restart and changed its text from 'Restart' to 'Retry'.
    • Implemented conditional rendering for the activity input bar (prompt bar) based on job.can_prompt.
    • Added null checks for sendBtn, doneBtn, and input before attaching event listeners to the activity input bar.
  • src/channels/web/types.rs
    • Added can_restart, can_prompt, and job_kind fields to JobDetailResponse struct for UI capability flags.
  • src/channels/web/ws.rs
    • Updated test setups to include scheduler: None in GatewayState initialization.
  • src/db/libsql/jobs.rs
    • Implemented get_agent_job_failure_reason for LibSqlBackend to retrieve the failure reason for a specific agent job.
  • src/db/mod.rs
    • Added get_agent_job_failure_reason trait method to the JobStore trait.
  • src/db/postgres.rs
    • Implemented get_agent_job_failure_reason for PgBackend by delegating to the underlying store.
  • src/history/store.rs
    • Implemented get_agent_job_failure_reason to query the database for an agent job's failure reason.
  • src/llm/nearai_chat.rs
    • Added logic to parse the Retry-After header from LLM API responses, supporting both delay-seconds and HTTP-date formats.
    • Passed the parsed retry_after_header to the LlmError::RateLimited error variant.
  • src/main.rs
    • Wired the scheduler_slot into the GatewayChannel using the new with_scheduler method.
    • Ensured the sse_sender is correctly cloned and passed to the extension manager.
    • Passed the sse_sender to AgentDeps during agent initialization.
  • src/testing.rs
    • Updated test harness builder to include sse_tx: None in AgentDeps initialization.
  • tests/openai_compat_integration.rs
    • Updated test setups to include scheduler: None in GatewayState initialization.
  • tests/ws_gateway_integration.rs
    • Updated test setups to include scheduler: None in GatewayState initialization.
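
The Retry-After parsing listed for src/llm/nearai_chat.rs accepts both delay-seconds and HTTP-date values. A standard-library-only sketch of that idea follows. The function names are hypothetical, only the IMF-fixdate form of HTTP-date is handled, and the current time is passed in as Unix seconds so the result is deterministic; the real code may well use a date library instead.

```rust
use std::time::Duration;

/// Parses a Retry-After value given the current Unix time in seconds.
/// Accepts delay-seconds ("120") or an IMF-fixdate HTTP-date
/// ("Sun, 06 Nov 1994 08:49:37 GMT"). Returns None if neither parses.
pub fn parse_retry_after(value: &str, now_unix: u64) -> Option<Duration> {
    let value = value.trim();
    if let Ok(secs) = value.parse::<u64>() {
        return Some(Duration::from_secs(secs));
    }
    let target = parse_imf_fixdate(value)?;
    // A date in the past means "retry now", hence the saturating subtraction.
    Some(Duration::from_secs(target.saturating_sub(now_unix)))
}

/// Converts an IMF-fixdate string to Unix seconds.
fn parse_imf_fixdate(s: &str) -> Option<u64> {
    const MONTHS: [&str; 12] = [
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
    ];
    let parts: Vec<&str> = s.split_whitespace().collect();
    if parts.len() != 6 || parts[5] != "GMT" {
        return None;
    }
    let day: i64 = parts[1].parse().ok()?;
    let month = MONTHS.iter().position(|m| *m == parts[2])? as i64 + 1;
    let year: i64 = parts[3].parse().ok()?;
    let mut hms = parts[4].split(':').map(|p| p.parse::<i64>().ok());
    let (hour, min, sec) = (hms.next()??, hms.next()??, hms.next()??);
    if hms.next().is_some() {
        return None;
    }
    if !(1..=31).contains(&day)
        || !(0..24).contains(&hour)
        || !(0..60).contains(&min)
        || !(0..=60).contains(&sec)
    {
        return None;
    }
    let total = days_from_civil(year, month, day) * 86_400 + hour * 3_600 + min * 60 + sec;
    u64::try_from(total).ok()
}

/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (month + 9) % 12; // March is month 0 in this scheme
    let doy = (153 * mp + 2) / 5 + day - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}
```

For example, `parse_retry_after("120", now)` yields a 120-second delay regardless of `now`, while an HTTP-date yields the remaining time until that instant.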

Activity

  • Verified zero warnings with cargo clippy --all --benches --tests --examples --all-features.
  • Confirmed all cargo test cases pass (excluding one pre-existing failure in registry::artifacts).
  • Tested starting the application without sandbox mode (SANDBOX_ENABLED=false) and verified the jobs tab functionality.
  • Created a job via chat and confirmed live SSE events are displayed in the Activity tab.
  • Verified that failed agent jobs correctly show a 'Retry' button in the detail view.
  • Confirmed that follow-up prompts function as expected for in-progress agent jobs.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request brings the web UI for non-sandbox jobs much closer to parity with sandbox jobs, enabling live SSE event broadcasting, job restart functionality, and follow-up prompts. The changes also improve rate-limit resilience with capped, Retry-After-aware retries and make the web gateway's state management more robust by preserving SSE broadcast channels across rebuilds. The UI now conditionally renders prompt bars and restart buttons based on job capabilities. Overall, the changes are well-implemented and contribute positively to the agent's functionality and user interface.

Note: Security Review did not run due to the size of the PR.

Comment thread src/agent/worker.rs
.unwrap_or(50) as usize;
let max_iterations = max_iterations.min(MAX_WORKER_ITERATIONS);
let mut iteration = 0;
const MAX_CONSECUTIVE_RATE_LIMITS: usize = 10;

medium

The value 10 for MAX_CONSECUTIVE_RATE_LIMITS is a hardcoded threshold. While this value might be reasonable, consider making this limit configurable via the agent's configuration. This would allow for easier tuning of the rate-limiting behavior in different deployment environments without requiring code changes and recompilation, improving operational flexibility.
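
One way the suggestion above might look is to read the cap from the environment with a fallback to the current value. This is only a sketch: the environment variable name `AGENT_MAX_CONSECUTIVE_RATE_LIMITS` and both function names are hypothetical, and the project's real configuration mechanism may be different.

```rust
/// Current hardcoded value from the PR, used as the fallback.
pub const DEFAULT_MAX_CONSECUTIVE_RATE_LIMITS: usize = 10;

/// Parses an optional raw setting. Missing, non-numeric, or zero
/// values fall back to the default so a bad setting cannot disable
/// retries or remove the cap.
pub fn parse_rate_limit_cap(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|n| *n > 0)
        .unwrap_or(DEFAULT_MAX_CONSECUTIVE_RATE_LIMITS)
}

/// Reads the cap from a hypothetical environment variable.
pub fn rate_limit_cap_from_env() -> usize {
    let raw = std::env::var("AGENT_MAX_CONSECUTIVE_RATE_LIMITS").ok();
    parse_rate_limit_cap(raw.as_deref())
}
```

Splitting the parsing from the environment lookup keeps the fallback rules testable without mutating process-wide state.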

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@think-in-universe think-in-universe merged commit f18fb51 into main Mar 3, 2026
14 checks passed
@think-in-universe think-in-universe deleted the worktree-jobs-ui-parity-fixes branch March 3, 2026 14:23
@github-actions github-actions Bot mentioned this pull request Mar 3, 2026
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
* feat(web): fix jobs UI parity for non-sandbox mode

* style: fix formatting in db/mod.rs and nearai_chat.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Labels

contributor: core (20+ merged PRs); risk: medium (Business logic, config, or moderate-risk modules); scope: agent (Agent core: agent loop, router, scheduler); scope: channel/web (Web gateway channel); scope: db/postgres (PostgreSQL backend); scope: db (Database trait / abstraction); scope: llm (LLM integration); size: XL (500+ changed lines)

2 participants