Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new feature that enables users to gracefully restart the application process directly from the web interface. It provides a robust mechanism for initiating a restart via a system command and a dedicated tool, coupled with a comprehensive frontend experience that includes confirmation, progress indication, and automatic UI updates upon successful reconnection. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a graceful restart feature, accessible via a /restart command and a UI button, with frontend updates for user confirmation and progress indication. While the feature is functional, significant security vulnerabilities exist: the RestartTool does not properly implement approval checks for destructive operations, and the /restart command lacks proper authorization in multi-user environments. These issues could be exploited to cause a Denial of Service (DoS) via prompt injection or unauthorized access. Beyond security, the review also highlights opportunities to improve JavaScript code maintainability by reducing duplication and enhance the robustness of the Rust tool implementation through parameter validation.
zmanian
left a comment
There was a problem hiding this comment.
This feature needs significant revision before merge. The concept is sound -- supervised restart is useful for deployments -- but the security surface is too exposed.
Critical Issues
1. No approval gate on a destructive operation (Security)
RestartTool doesn't override requires_approval(), inheriting ApprovalRequirement::Never. The LLM can invoke this tool without user confirmation. An injected prompt could trigger a restart loop (DoS). Override to return ApprovalRequirement::Always.
2. std::process::exit(0) is a hard exit with no cleanup
This terminates immediately -- no destructors, no flushing buffered I/O (database writes, logs), no graceful shutdown of tokio tasks or HTTP server, no checkpointing in-progress jobs. Use a proper shutdown signal path (CancellationToken or watch channel) so Axum drains, database connections close, and in-flight jobs checkpoint.
3. No authorization on /restart command
The slash command dispatches without permission checks. In multi-user environments (web gateway), any authenticated user can restart the entire instance. On non-web channels (TUI, REPL, webhooks), /restart fires with zero confirmation since the web modal is the only gate.
Must-fix
- Add
requires_approvalreturningAlwaysonRestartTool - Replace
std::process::exit(0)with graceful shutdown, or at minimum document why hard exit is acceptable - Add authorization gate on
/restartslash command - Clamp
delay_secsto the declared schema bounds (1-30) -- currently not validated in code despite schema declaring min/max - Add tests (project policy: regression tests required)
Other items
- Remove hardcoded
agent.near.ailink in the progress modal -- self-hosted instances won't have logs there - Consolidate the 3 duplicated JS restart detection checks (
'restart initiated'string matching) -- fragile approach - CSS additions look fine and consistent with existing design system
| //! IronClaw runs inside a Docker container with an entrypoint loop that monitors exit codes: | ||
| //! - **Exit code 0** (clean): Reset failure counter, wait `IRONCLAW_RESTART_DELAY` (default 5s), restart | ||
| //! - **Exit code ≠ 0** (failure): Increment failure counter, exit after `IRONCLAW_MAX_FAILURES` (default 10) | ||
| //! | ||
| //! This tool triggers a restart by calling `std::process::exit(0)` after a brief delay, allowing | ||
| //! the HTTP response to be flushed before the process terminates. The entrypoint loop then | ||
| //! detects the clean exit and automatically restarts the process. |
There was a problem hiding this comment.
This is amazing Nick! but how would this work for customers not running it via docker? Just worried that we have this tool where we have specific exit codes that restarts containers for our setup that might not extend to other people's setup including local ones.
There was a problem hiding this comment.
Pull request overview
Adds a new “restart” capability intended to let the IronClaw process restart cleanly (exit code 0) via a dedicated built-in tool and a web UI control.
Changes:
- Introduces a new built-in
restarttool and registers it in the tool registry. - Adds
/restartcommand handling and a web UI Restart button + confirmation/progress modals. - Expands schema-validation coverage to include the new tool.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/tool_schema_validation.rs | Adds restart to the expected core tool set validated in tests. |
| src/tools/registry.rs | Registers RestartTool and protects the restart name from shadowing. |
| src/tools/builtin/restart.rs | Implements the new tool (delay + std::process::exit(0)) and adds unit tests. |
| src/tools/builtin/mod.rs | Wires the new builtin module + re-export. |
| src/channels/web/static/style.css | Adds styling for the restart button, confirmation modal, and loader. |
| src/channels/web/static/index.html | Adds restart confirmation and progress modals + restart button in the header. |
| src/channels/web/static/app.js | Implements restart UI flow and SSE-based “restart complete” behavior. |
| src/agent/submission.rs | Parses /restart as a system command. |
| src/agent/commands.rs | Adds a web-only restart authorization check (currently bypassed) and dispatches a restart job. |
Comments suppressed due to low confidence (3)
src/tools/builtin/restart.rs:63
parameters_schema()omits a top-levelrequiredfield. In the built-in tools, schemas consistently include"required": [...](often[]when all params are optional). Adding"required": []here would keep the schema format consistent and avoids downstream tooling needing to special-case missingrequired.
fn parameters_schema(&self) -> serde_json::Value {
serde_json::json!({
"type": "object",
"properties": {
"delay_secs": {
"type": "integer",
"description": "Seconds to wait before exiting (default: 2, min: 1, max: 30)",
"minimum": 1,
"maximum": 30
}
}
})
src/tools/builtin/restart.rs:31
- The
#[allow(unused_imports)]is masking the fact thatApprovalRequirementis only needed for the test module. Prefer movingApprovalRequirementinto a#[cfg(test)] use ...(or importing it inside the test module) and drop the allow lint so genuine unused imports in this module aren’t accidentally hidden.
use crate::context::JobContext;
#[allow(unused_imports)]
use crate::tools::tool::{ApprovalRequirement, Tool, ToolError, ToolOutput};
src/channels/web/static/index.html:56
- The restart confirmation UI is missing basic dialog accessibility semantics (e.g.
role="dialog",aria-modal="true", andaria-labelledbypointing at the title). Also, the close button uses a visual “×” without anaria-label, which can be unclear to screen readers. Adding these attributes (and ideally focus management) would make the modal usable with assistive tech.
<div id="restart-confirm-modal" class="restart-modal" style="display: none;">
<div class="restart-modal-overlay" onclick="cancelRestart()"></div>
<div class="restart-modal-content">
<div class="restart-modal-header">
<h2>Restart IronClaw Instance</h2>
<button class="restart-modal-close" onclick="cancelRestart()" title="Close">×</button>
</div>
<div class="restart-modal-body">
<p class="restart-modal-description">
Are you sure you want to restart the IronClaw instance? This will gracefully restart the process.
</p>
<div class="restart-modal-warning">
<span class="restart-modal-warning-icon">⚠️</span>
<p>Any in-progress jobs may be interrupted. The restart will complete within a few seconds.</p>
</div>
</div>
<div class="restart-modal-footer">
<button class="restart-modal-btn cancel" onclick="cancelRestart()">Cancel</button>
<button class="restart-modal-btn confirm" onclick="confirmRestart()">Confirm Restart</button>
</div>
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@zmanian hey, could you please take a look again? thanks! |
zmanian
left a comment
There was a problem hiding this comment.
Review: feat: restart
Improvements since last review
Good progress addressing prior feedback:
- Channel check (
channel != "gateway") inhandle_system_commandfor/restart IRONCLAW_IN_DOCKERenv var gating for non-Docker deployments- RestartTool excluded from
register_builtin_tools()so it doesn't appear intool_definitions()sent to the LLM IRONCLAW_DISABLE_RESTARTenv var for test safety
Blocker: get("restart") special case in registry.rs defeats the LLM-hiding strategy
The tool is intentionally excluded from tool_definitions() so the LLM doesn't see it. But registry.get() has a special case that always returns it:
// registry.rs ~line 163
pub async fn get(&self, name: &str) -> Option<Arc<dyn Tool>> {
// ...normal lookup...
if name == "restart" {
return Some(Arc::new(RestartTool)); // Always returns it
}
None
}The worker's tool execution loop calls registry.get(tool_name) to resolve tools. If the LLM hallucinated or was prompt-injected into calling a tool named "restart", the worker would find it here and execute it. Combined with requires_approval() -> Never, it runs without approval.
Fix: Remove the get() special case and the definitions_for() special case (~15 lines total). The /restart command in commands.rs already instantiates RestartTool directly -- it never goes through the registry. The feature works exactly the same way without these lines.
Minor items (non-blocking)
-
unsafe { std::env::set_var() }in tests -- mutates global state, can cause flaky parallel tests. Consider#[serial_test::serial]or a config struct parameter. -
String-based restart detection in JS --
data.content.toLowerCase().includes('restart initiated')is fragile. A dedicated SSE event type (e.g.,restart_initiated) would be more reliable. -
Debug logging -- the added
tracing::debug!calls that log message content inchat_send_handlerandSubmissionParsercould leak PII in production logs. Consider removing content-logging lines.
|
Before (vulnerable): After (fixed): Changes made:
Result: RestartTool can no longer be reached via registry.get("restart"). If the LLM hallucinates a call to "restart", the worker's execute_tool_inner() will get None from the registry and |
* feat: restart * review fixes * add IRONCLAW_IN_DOCKER env variable * review fixes * fix tests * set default value as false
* feat: restart * review fixes * add IRONCLAW_IN_DOCKER env variable * review fixes * fix tests * set default value as false
No description provided.