Skip to content

feat: restart#531

Merged
nickpismenkov merged 14 commits intomainfrom
feat/restart
Mar 6, 2026
Merged

feat: restart#531
nickpismenkov merged 14 commits intomainfrom
feat/restart

Conversation

@nickpismenkov
Copy link
Copy Markdown
Contributor

No description provided.

@github-actions github-actions bot added size: XL 500+ changed lines scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel scope: tool Tool infrastructure scope: tool/builtin Built-in tools risk: medium Business logic, config, or moderate-risk modules contributor: regular 2-5 merged PRs and removed size: XL 500+ changed lines labels Mar 4, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new feature that enables users to gracefully restart the application process directly from the web interface. It provides a robust mechanism for initiating a restart via a system command and a dedicated tool, coupled with a comprehensive frontend experience that includes confirmation, progress indication, and automatic UI updates upon successful reconnection.

Highlights

  • Graceful Restart Command: Implemented a new /restart command in the agent, allowing users to initiate a graceful process restart directly.
  • Restart Tool Integration: Introduced a dedicated RestartTool that dispatches a system job to handle the restart, ensuring a clean exit and automatic re-launch of the process.
  • Interactive Frontend UI: Developed a comprehensive frontend experience for the restart functionality, including a new restart button, a confirmation modal, and a progress modal to inform the user during the restart sequence.
  • Client-Side Restart Handling: Added JavaScript logic to manage the restart state, display appropriate modals, and automatically update the UI upon successful server reconnection after a restart.
Changelog
  • src/agent/commands.rs
    • Added /restart command to the agent's command list.
    • Implemented logic to dispatch a system job with restart tool metadata when /restart is invoked.
  • src/agent/submission.rs
    • Added parsing for the /restart system command.
  • src/channels/web/static/app.js
    • Added isRestarting state variable to track restart status.
    • Included /restart in the list of slash commands for autocomplete.
    • Implemented logic in connectSSE to handle post-restart reconnection, closing modals and updating status.
    • Added triggerRestart, confirmRestart, cancelRestart, and showRestartModal functions to manage the restart UI flow.
    • Integrated restart modal display logic into addMessage, appendToLastAssistant, and addToolCard functions based on restart-related messages or tool execution.
  • src/channels/web/static/index.html
    • Added HTML for the 'Restart Confirmation Modal' and 'Restart Progress Modal'.
    • Introduced a 'Restart' button with an SVG icon to the main application header.
  • src/channels/web/static/style.css
    • Added comprehensive CSS styling for the new restart button, confirmation modal, and progress modal, including animations.
  • src/tools/builtin/mod.rs
    • Declared and exported the new restart module and RestartTool.
  • src/tools/builtin/restart.rs
    • Created the RestartTool struct and implemented the Tool trait.
    • Defined the tool's name, description, and parameters schema (for delay_secs).
    • Implemented the execute method to spawn a background task that sleeps and then exits the process cleanly, facilitating a graceful restart.
  • src/tools/registry.rs
    • Imported RestartTool.
    • Added RestartTool to the list of PROTECTED_TOOL_NAMES.
    • Registered RestartTool in the ToolRegistry.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions bot added the size: XL 500+ changed lines label Mar 4, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a graceful restart feature, accessible via a /restart command and a UI button, with frontend updates for user confirmation and progress indication. While the feature is functional, significant security vulnerabilities exist: the RestartTool does not properly implement approval checks for destructive operations, and the /restart command lacks proper authorization in multi-user environments. These issues could be exploited to cause a Denial of Service (DoS) via prompt injection or unauthorized access. Beyond security, the review also highlights opportunities to improve JavaScript code maintainability by reducing duplication and enhance the robustness of the Rust tool implementation through parameter validation.

Comment thread src/tools/builtin/restart.rs
Comment thread src/agent/commands.rs
Comment thread src/channels/web/static/app.js Outdated
Comment thread src/channels/web/static/app.js Outdated
Comment thread src/channels/web/static/app.js Outdated
Copy link
Copy Markdown
Collaborator

@zmanian zmanian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature needs significant revision before merge. The concept is sound -- supervised restart is useful for deployments -- but the security surface is too exposed.

Critical Issues

1. No approval gate on a destructive operation (Security)
RestartTool doesn't override requires_approval(), inheriting ApprovalRequirement::Never. The LLM can invoke this tool without user confirmation. An injected prompt could trigger a restart loop (DoS). Override to return ApprovalRequirement::Always.

2. std::process::exit(0) is a hard exit with no cleanup
This terminates immediately -- no destructors, no flushing buffered I/O (database writes, logs), no graceful shutdown of tokio tasks or HTTP server, no checkpointing in-progress jobs. Use a proper shutdown signal path (CancellationToken or watch channel) so Axum drains, database connections close, and in-flight jobs checkpoint.

3. No authorization on /restart command
The slash command dispatches without permission checks. In multi-user environments (web gateway), any authenticated user can restart the entire instance. On non-web channels (TUI, REPL, webhooks), /restart fires with zero confirmation since the web modal is the only gate.

Must-fix

  • Add requires_approval returning Always on RestartTool
  • Replace std::process::exit(0) with graceful shutdown, or at minimum document why hard exit is acceptable
  • Add authorization gate on /restart slash command
  • Clamp delay_secs to the declared schema bounds (1-30) -- currently not validated in code despite schema declaring min/max
  • Add tests (project policy: regression tests required)

Other items

  • Remove hardcoded agent.near.ai link in the progress modal -- self-hosted instances won't have logs there
  • Consolidate the 3 duplicated JS restart detection checks ('restart initiated' string matching) -- fragile approach
  • CSS additions look fine and consistent with existing design system

@nickpismenkov nickpismenkov requested a review from zmanian March 4, 2026 23:37
@nickpismenkov nickpismenkov added contributor: core 20+ merged PRs and removed contributor: regular 2-5 merged PRs labels Mar 4, 2026
Comment on lines +5 to +11
//! IronClaw runs inside a Docker container with an entrypoint loop that monitors exit codes:
//! - **Exit code 0** (clean): Reset failure counter, wait `IRONCLAW_RESTART_DELAY` (default 5s), restart
//! - **Exit code ≠ 0** (failure): Increment failure counter, exit after `IRONCLAW_MAX_FAILURES` (default 10)
//!
//! This tool triggers a restart by calling `std::process::exit(0)` after a brief delay, allowing
//! the HTTP response to be flushed before the process terminates. The entrypoint loop then
//! detects the clean exit and automatically restarts the process.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing Nick! but how would this work for customers not running it via docker? Just worried that we have this tool where we have specific exit codes that restarts containers for our setup that might not extend to other people's setup including local ones.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add env var

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new “restart” capability intended to let the IronClaw process restart cleanly (exit code 0) via a dedicated built-in tool and a web UI control.

Changes:

  • Introduces a new built-in restart tool and registers it in the tool registry.
  • Adds /restart command handling and a web UI Restart button + confirmation/progress modals.
  • Expands schema-validation coverage to include the new tool.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/tool_schema_validation.rs Adds restart to the expected core tool set validated in tests.
src/tools/registry.rs Registers RestartTool and protects the restart name from shadowing.
src/tools/builtin/restart.rs Implements the new tool (delay + std::process::exit(0)) and adds unit tests.
src/tools/builtin/mod.rs Wires the new builtin module + re-export.
src/channels/web/static/style.css Adds styling for the restart button, confirmation modal, and loader.
src/channels/web/static/index.html Adds restart confirmation and progress modals + restart button in the header.
src/channels/web/static/app.js Implements restart UI flow and SSE-based “restart complete” behavior.
src/agent/submission.rs Parses /restart as a system command.
src/agent/commands.rs Adds a web-only restart authorization check (currently bypassed) and dispatches a restart job.
Comments suppressed due to low confidence (3)

src/tools/builtin/restart.rs:63

  • parameters_schema() omits a top-level required field. In the built-in tools, schemas consistently include "required": [...] (often [] when all params are optional). Adding "required": [] here would keep the schema format consistent and avoids downstream tooling needing to special-case missing required.
    fn parameters_schema(&self) -> serde_json::Value {
        serde_json::json!({
            "type": "object",
            "properties": {
                "delay_secs": {
                    "type": "integer",
                    "description": "Seconds to wait before exiting (default: 2, min: 1, max: 30)",
                    "minimum": 1,
                    "maximum": 30
                }
            }
        })

src/tools/builtin/restart.rs:31

  • The #[allow(unused_imports)] is masking the fact that ApprovalRequirement is only needed for the test module. Prefer moving ApprovalRequirement into a #[cfg(test)] use ... (or importing it inside the test module) and drop the allow lint so genuine unused imports in this module aren’t accidentally hidden.
use crate::context::JobContext;
#[allow(unused_imports)]
use crate::tools::tool::{ApprovalRequirement, Tool, ToolError, ToolOutput};

src/channels/web/static/index.html:56

  • The restart confirmation UI is missing basic dialog accessibility semantics (e.g. role="dialog", aria-modal="true", and aria-labelledby pointing at the title). Also, the close button uses a visual “×” without an aria-label, which can be unclear to screen readers. Adding these attributes (and ideally focus management) would make the modal usable with assistive tech.
  <div id="restart-confirm-modal" class="restart-modal" style="display: none;">
    <div class="restart-modal-overlay" onclick="cancelRestart()"></div>
    <div class="restart-modal-content">
      <div class="restart-modal-header">
        <h2>Restart IronClaw Instance</h2>
        <button class="restart-modal-close" onclick="cancelRestart()" title="Close">×</button>
      </div>
      <div class="restart-modal-body">
        <p class="restart-modal-description">
          Are you sure you want to restart the IronClaw instance? This will gracefully restart the process.
        </p>
        <div class="restart-modal-warning">
          <span class="restart-modal-warning-icon">⚠️</span>
          <p>Any in-progress jobs may be interrupted. The restart will complete within a few seconds.</p>
        </div>
      </div>
      <div class="restart-modal-footer">
        <button class="restart-modal-btn cancel" onclick="cancelRestart()">Cancel</button>
        <button class="restart-modal-btn confirm" onclick="confirmRestart()">Confirm Restart</button>
      </div>

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/agent/commands.rs Outdated
Comment thread src/tools/registry.rs
Comment thread src/tools/builtin/restart.rs
Comment thread src/agent/submission.rs
@github-actions github-actions bot added contributor: regular 2-5 merged PRs and removed contributor: core 20+ merged PRs labels Mar 5, 2026
@github-actions github-actions bot added scope: sandbox Docker sandbox scope: docs Documentation labels Mar 5, 2026
Comment thread deploy/env.example Outdated
henrypark133
henrypark133 previously approved these changes Mar 5, 2026
henrypark133
henrypark133 previously approved these changes Mar 5, 2026
@nickpismenkov nickpismenkov enabled auto-merge (squash) March 5, 2026 02:27
@nickpismenkov
Copy link
Copy Markdown
Contributor Author

@zmanian hey, could you please take a look again? thanks!

Copy link
Copy Markdown
Collaborator

@zmanian zmanian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: restart

Improvements since last review

Good progress addressing prior feedback:

  • Channel check (channel != "gateway") in handle_system_command for /restart
  • IRONCLAW_IN_DOCKER env var gating for non-Docker deployments
  • RestartTool excluded from register_builtin_tools() so it doesn't appear in tool_definitions() sent to the LLM
  • IRONCLAW_DISABLE_RESTART env var for test safety

Blocker: get("restart") special case in registry.rs defeats the LLM-hiding strategy

The tool is intentionally excluded from tool_definitions() so the LLM doesn't see it. But registry.get() has a special case that always returns it:

// registry.rs ~line 163
pub async fn get(&self, name: &str) -> Option<Arc<dyn Tool>> {
    // ...normal lookup...
    if name == "restart" {
        return Some(Arc::new(RestartTool));  // Always returns it
    }
    None
}

The worker's tool execution loop calls registry.get(tool_name) to resolve tools. If the LLM hallucinated or was prompt-injected into calling a tool named "restart", the worker would find it here and execute it. Combined with requires_approval() -> Never, it runs without approval.

Fix: Remove the get() special case and the definitions_for() special case (~15 lines total). The /restart command in commands.rs already instantiates RestartTool directly -- it never goes through the registry. The feature works exactly the same way without these lines.

Minor items (non-blocking)

  1. unsafe { std::env::set_var() } in tests -- mutates global state, can cause flaky parallel tests. Consider #[serial_test::serial] or a config struct parameter.

  2. String-based restart detection in JS -- data.content.toLowerCase().includes('restart initiated') is fragile. A dedicated SSE event type (e.g., restart_initiated) would be more reliable.

  3. Debug logging -- the added tracing::debug! calls that log message content in chat_send_handler and SubmissionParser could leak PII in production logs. Consider removing content-logging lines.

@nickpismenkov
Copy link
Copy Markdown
Contributor Author

Before (vulnerable):
pub async fn get(&self, name: &str) -> Option<Arc> {
let tools = self.tools.read().await;
if let Some(tool) = tools.get(name) {
return Some(Arc::clone(tool));
}
if name == "restart" { // ← VULNERABILITY: always returns RestartTool
return Some(Arc::new(RestartTool));
}
None
}

After (fixed):
pub async fn get(&self, name: &str) -> Option<Arc> {
let tools = self.tools.read().await;
tools.get(name).map(Arc::clone) // ← Normal lookup only
}

Changes made:

  • ✅ Removed special case from get()
  • ✅ Removed special case from tool_definitions_for()
  • ✅ Removed RestartTool import from registry.rs
  • ✅ Removed explanatory comments about the special handling

Result: RestartTool can no longer be reached via registry.get("restart"). If the LLM hallucinates a call to "restart", the worker's execute_tool_inner() will get None from the registry and
properly return a ToolError::NotFound.

@nickpismenkov nickpismenkov requested a review from zmanian March 6, 2026 00:48
zmanian
zmanian previously approved these changes Mar 6, 2026
@nickpismenkov nickpismenkov merged commit 9ae04f1 into main Mar 6, 2026
15 checks passed
@nickpismenkov nickpismenkov deleted the feat/restart branch March 6, 2026 01:12
logicminds pushed a commit to logicminds/ironclaw that referenced this pull request Mar 6, 2026
* feat: restart

* review fixes

* add IRONCLAW_IN_DOCKER env variable

* review fixes

* fix tests

* set default value as false
This was referenced Mar 6, 2026
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
* feat: restart

* review fixes

* add IRONCLAW_IN_DOCKER env variable

* review fixes

* fix tests

* set default value as false
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: regular 2-5 merged PRs risk: medium Business logic, config, or moderate-risk modules scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel scope: docs Documentation scope: sandbox Docker sandbox scope: tool/builtin Built-in tools scope: tool Tool infrastructure size: XL 500+ changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Let people restart IronClaw from chat or IronClaw UI

4 participants