feat: per-template capability sandbox (#222) by qhkm · Pull Request #279 · qhkm/zeptoclaw

qhkm · 2026-03-07T06:48:32Z

Summary

Adds three new optional fields to AgentTemplate for declarative per-template sandboxing:

shell_allowlist — restrict which shell binaries the agent can execute (Strict mode). Applied to ShellTool, CustomTool, AND PluginTool via a shared ShellSecurityConfig built once at kernel boot.
max_token_budget — per-agent-run token cap. Resolved via min(template, global) so templates can restrict but never expand beyond global.
max_tool_calls — hard cap on total tool calls across the entire agent run via a dedicated ToolCallLimitTracker (AtomicU32). Intentionally NOT wired into LoopGuard.

Also adds TOML template loading — .toml files in the template directory are now parsed alongside .json.

All fields are optional with None defaults — zero behavior change for existing templates.

Key design decisions

Decision	Choice
Shell enforcement scope	Shared config injected into ShellTool, CustomTool, AND PluginTool
`max_tool_calls` enforcement	Dedicated atomic counter, NOT LoopGuard (which only counts repetitions)
Token budget naming	`max_token_budget` (not `max_tokens` — avoids collision with response generation field)
Template can restrict, not expand	`min()` for numeric limits

Files changed

src/config/templates.rs — 3 new fields + TOML loading + 7 tests
src/kernel/registrar.rs — build_shell_config() + ToolDeps.template + 4 tests
src/kernel/mod.rs — pass template to ToolDeps
src/tools/custom.rs — with_security() constructor + 1 test
src/tools/plugin.rs — security field + validate_command() before sh -c + 1 test
src/agent/budget.rs — resolve_token_budget() + 4 tests
src/agent/tool_call_limit.rs — NEW: ToolCallLimitTracker + 5 tests
src/agent/mod.rs — module + re-export
src/agent/loop.rs — tracker field + enforcement in both streaming/non-streaming loops
src/config/types.rs — max_tool_calls on AgentDefaults
src/config/mod.rs — env var override
src/config/validate.rs — known field
src/cli/common.rs — template overrides for budget + tool calls

Test plan

3055 lib tests pass (22 new tests added)
127 doc tests pass
cargo clippy -- -D warnings clean
cargo fmt -- --check clean
Shell allowlist: template with ["git"] blocks curl; empty list blocks all; None uses default
Token budget: min(template, global) logic verified for all 4 branches
Tool call limit: tracker increments per-call, stops at limit, unlimited when None

Closes #222

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Per-run tool-call limits with enforcement, truncation, logging, and final-synthesis fallback.
- Templates can specify max token budgets, max tool calls, and shell command allowlists.
- Token-budget resolution combines global and template overrides.
- Shell security configs applied to tools; commands validated against allowlists.
- Templates loadable from JSON or TOML; env var to override default max tool calls.
Tests
- Comprehensive unit tests for budgeting, tool-call limits, template parsing, and security behavior.

…s fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…mTool, PluginTool Build ShellSecurityConfig from template shell_allowlist and inject it into all 3 shell execution paths. When a template defines an allowlist, Strict mode ensures only listed binaries can execute. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add resolve_token_budget() helper that picks the effective budget from global config and optional template override (lower-wins when both set). Wire it into the template override block in create_agent so templates can cap per-session token usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds ToolCallLimitTracker (AtomicU32) that enforces per-template max_tool_calls limits. Wired into AgentLoop with enforcement in both streaming and non-streaming tool execution loops. The tracker is initialized from config.agents.defaults.max_tool_calls, which is set from the template's max_tool_calls field. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-03-07T06:48:45Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Threads per-template sandbox data through boot/registration and runtime: adds shell allowlists, per-template token-budget resolution, and a per-run tool-call limiter; registers tools with security configs and enforces tool-call caps and shell command validation at execution.

Changes

Cohort / File(s)	Summary
Token Budget `src/agent/budget.rs`	Add `pub fn resolve_token_budget(global: u64, template: Option<u64>) -> u64` and unit tests to compute effective token budget between global and template overrides.
Tool Call Limiter `src/agent/tool_call_limit.rs`, `src/agent/mod.rs`, `src/agent/loop.rs`	Add `ToolCallLimitTracker` (atomic counter + optional limit); re-export in `agent` mod; wire into `AgentLoop` constructors and tool-execution paths with pre-checks, truncation, increments, logging, and metrics.
Template Extensions & Loading `src/config/templates.rs`	Extend `AgentTemplate` with `shell_allowlist`, `max_token_budget`, `max_tool_calls`; add TOML support for templates; update built-in templates and test coverage.
Config Defaults & CLI Overrides `src/config/types.rs`, `src/config/mod.rs`, `src/config/validate.rs`, `src/cli/common.rs`	Add `max_tool_calls: Option<u32>` to agent defaults (serde/default/Default); env var override `ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS`; validation acceptance; CLI wiring to apply template token and tool-call overrides.
Kernel & Registrar Wiring `src/kernel/mod.rs`, `src/kernel/registrar.rs`	Propagate `AgentTemplate` via `ToolDeps.template`; add `build_shell_config(template: Option<&AgentTemplate>) -> ShellSecurityConfig` and reuse shell security config for tool registration.
Tool Security Integration `src/tools/plugin.rs`, `src/tools/custom.rs`	Add `with_security` constructors and `ShellSecurityConfig` field; validate commands via `security.validate_command()` before execution; update tests.
Tests & Docs `src/config/...`, `src/kernel/...`, `src/tools/...`, `src/agent/...`	New/updated tests for TOML loading, shell allowlists, token budget resolution, tool-call limits, and security enforcement; comments/docs updated to reflect TOML and new fields.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI / Template loader
    participant Config as Config Loader
    participant Kernel as ZeptoKernel::boot
    participant Registrar as registrar::register_all_tools
    participant Tools as Tool Factory (Shell/Plugin/Custom)
    participant AgentLoop as AgentLoop runtime
    participant ToolLimit as ToolCallLimitTracker

    CLI->>Config: load template (JSON/TOML) with shell_allowlist, max_token_budget, max_tool_calls
    Config->>Kernel: provide AgentTemplate + defaults
    Kernel->>Registrar: boot(pass ToolDeps { template: AgentTemplate })
    Registrar->>Registrar: build_shell_config(template)
    Registrar->>Tools: register tools with ShellSecurityConfig
    Tools->>Tools: store security config per tool
    AgentLoop->>ToolLimit: remaining()
    alt remaining >= batch
        AgentLoop->>Tools: execute tool batch
        Tools->>Tools: security.validate_command(command)
        Tools-->>AgentLoop: results
        AgentLoop->>ToolLimit: increment(batch)
    else remaining < batch
        AgentLoop->>AgentLoop: truncate/skip tool calls, log, run final synthesis
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

security: add structured command parsing to shell.rs #269 — related changes to ShellSecurityConfig and command validation used by registrar and tools.
fix: harden spawned task crash isolation #265 — overlaps in AgentLoop tool-execution flow (timeouts/isolation vs. tool-call limits).
refactor(kernel): extract thin kernel from create_agent_with_template() #224 — prior registrar/AgentLoop refactor that this PR extends by propagating templates and adding tooling limits.

Poem

🐰 I hopped through templates, tidy and neat,
Shell rules snug, token caps light on my feet.
I counted tool calls, careful and bright,
Truncated the batch, then logged through the night.
Safe hops, small bounds — a rabbit’s delight.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: per-template capability sandbox' accurately and concisely describes the main change—adding optional template fields for capability sandboxing, including shell allowlists, token budgets, and tool call limits.
Linked Issues check	✅ Passed	The PR successfully implements all primary coding objectives from issue `#222`: shell command allowlist enforcement via ShellSecurityConfig, per-template token budget resolution, tool call limiting via ToolCallLimitTracker, TOML template loading, and integration into kernel registration and enforcement points.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with issue `#222` requirements: template field additions (shell_allowlist, max_token_budget, max_tool_calls), enforcement mechanism implementation, TOML parsing support, and kernel/tool integration. No unrelated modifications detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/template-capability-sandbox

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agent/budget.rs`:
- Around line 172-180: The resolve_token_budget function treats 0 as a valid
template cap and thus allows Some(0) to disable a finite global cap; change the
branching in resolve_token_budget so that 0 always means “unlimited” (i.e., if
global == 0 return tpl, else if tpl == 0 return global, otherwise return
global.min(tpl)), and add a regression test (e.g.,
test_resolve_token_budget_template_unlimited_does_not_bypass_global) asserting
resolve_token_budget(100_000, Some(0)) == 100_000 to prevent regressions.

In `@src/agent/loop.rs`:
- Around line 397-398: The ToolCallLimitTracker is stored on AgentLoop and never
reset, causing one request to exhaust tool-call budget for subsequent runs;
remove the struct field usage and instead create a fresh local tracker at the
start of each run by instantiating
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) inside
process_message() and process_message_streaming(), replace references to the old
self.tool_call_limit with the new local variable, and remove the tool_call_limit
field from AgentLoop so each message run has its own per-agent-run tracker.
- Around line 1011-1021: The current logic increments self.tool_call_limit via
self.tool_call_limit.increment(...) before executing tool calls and then breaks
if self.tool_call_limit.is_exceeded(), which makes max_tool_calls behave like
N-1 and can leave response.content empty; fix by checking the limit against the
upcoming call count before incrementing (or use a would_exceed-like check) and
only call self.tool_call_limit.increment(...) after allowing/executing the tool
calls; additionally, when you do break because the limit would be exceeded,
ensure you set or preserve response.content (e.g., set a clear “tool call limit
reached” message or retain prior assistant content) so the final assistant reply
is not empty.

In `@src/config/mod.rs`:
- Around line 200-204: The current env var parsing for
ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS only accepts numeric values and ignores
empty strings, so you cannot clear an existing Option<u32>; update the logic
around reading std::env::var so that if the variable is present and
val.trim().is_empty() you set self.agents.defaults.max_tool_calls = None,
otherwise attempt val.parse::<u32>() and set Some(v) on success (and keep
current behavior for parse failures or surface/log an error as appropriate).
Target the block that reads the env var and assigns
self.agents.defaults.max_tool_calls to implement this empty-string -> None
behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b1a44f85-a4b7-42a0-a7e3-c2ba9033c509

📥 Commits

Reviewing files that changed from the base of the PR and between d8d9cb2 and 7a3bdbc.

📒 Files selected for processing (13)

src/agent/budget.rs
src/agent/loop.rs
src/agent/mod.rs
src/agent/tool_call_limit.rs
src/cli/common.rs
src/config/mod.rs
src/config/templates.rs
src/config/types.rs
src/config/validate.rs
src/kernel/mod.rs
src/kernel/registrar.rs
src/tools/custom.rs
src/tools/plugin.rs

coderabbitai · 2026-03-07T06:58:10Z

+    /// Per-agent-run tool call limit tracker.
+    tool_call_limit: ToolCallLimitTracker,


⚠️ Potential issue | 🔴 Critical

Create a fresh tool-call tracker for each message run.

The field comment says “per-agent-run”, but this tracker is initialized once on the long-lived AgentLoop and never reset. One request can exhaust the budget for every later request handled by the same agent instance.

💡 Fix direction

pub struct AgentLoop { /// Per-session token budget tracker. token_budget: Arc<TokenBudget>, - /// Per-agent-run tool call limit tracker. - tool_call_limit: ToolCallLimitTracker, /// Tool approval gate for policy-based tool gating. approval_gate: Arc<ApprovalGate>,

let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);

Instantiate that local tracker at the start of process_message() and process_message_streaming() instead of storing it on the struct.

Also applies to: 477-479, 514-516, 546-547, 582-583

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 397 - 398, The ToolCallLimitTracker is stored on AgentLoop and never reset, causing one request to exhaust tool-call budget for subsequent runs; remove the struct field usage and instead create a fresh local tracker at the start of each run by instantiating ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) inside process_message() and process_message_streaming(), replace references to the old self.tool_call_limit with the new local variable, and remove the tool_call_limit field from AgentLoop so each message run has its own per-agent-run tracker.

coderabbitai · 2026-03-07T06:58:10Z

+        if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") {
+            if let Ok(v) = val.parse::<u32>() {
+                self.agents.defaults.max_tool_calls = Some(v);
+            }
+        }


⚠️ Potential issue | 🟡 Minor

Allow the env override to clear max_tool_calls as well.

This only handles numeric values, so a deployment cannot use ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS to remove a cap that is already set in config.json. For an Option<u32>, an empty value should clear the override instead of being ignored.

Suggested fix

if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") { - if let Ok(v) = val.parse::<u32>() { + let val = val.trim(); + if val.is_empty() { + self.agents.defaults.max_tool_calls = None; + } else if let Ok(v) = val.parse::<u32>() { self.agents.defaults.max_tool_calls = Some(v); } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") {

if let Ok(v) = val.parse::<u32>() {

self.agents.defaults.max_tool_calls = Some(v);

}

}

if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") {

let val = val.trim();

if val.is_empty() {

self.agents.defaults.max_tool_calls = None;

} else if let Ok(v) = val.parse::<u32>() {

self.agents.defaults.max_tool_calls = Some(v);

}

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/config/mod.rs` around lines 200 - 204, The current env var parsing for ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS only accepts numeric values and ignores empty strings, so you cannot clear an existing Option<u32>; update the logic around reading std::env::var so that if the variable is present and val.trim().is_empty() you set self.agents.defaults.max_tool_calls = None, otherwise attempt val.parse::<u32>() and set Some(v) on success (and keep current behavior for parse failures or surface/log an error as appropriate). Target the block that reads the env var and assigns self.agents.defaults.max_tool_calls to implement this empty-string -> None behavior.

…nd loop - resolve_token_budget: treat Some(0) as None so template cannot disable a finite global budget (0 = unlimited in TokenBudget) - agent loop: move tool_call_limit increment+check after execution in both streaming and non-streaming paths (fixes off-by-one that blocked the first allowed call) - common.rs: intersect template max_tool_calls with global via min() instead of direct override Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (2)

src/agent/loop.rs (2)

397-398: ⚠️ Potential issue | 🔴 Critical

Instantiate the tool-call tracker per request, not on AgentLoop.

This tracker is stateful, but these lines make it live for the entire agent instance. One run can exhaust the budget for later runs, including other sessions processed by the same loop.

Possible fix direction

 pub struct AgentLoop {
     /// Per-session token budget tracker.
     token_budget: Arc<TokenBudget>,
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     /// Tool approval gate for policy-based tool gating.
     approval_gate: Arc<ApprovalGate>,

-        let tool_call_limit = ToolCallLimitTracker::new(config.agents.defaults.max_tool_calls);
         let approval_gate = Arc::new(ApprovalGate::new(config.approval.clone()));
...
-            tool_call_limit,
             approval_gate,

let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);

Create that local tracker at the start of process_message() and process_message_streaming(), then use the local instance throughout each run.

Also applies to: 478-479, 515-515, 546-547, 583-583

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop.rs` around lines 397 - 398, The ToolCallLimitTracker is
currently stored on AgentLoop as the field tool_call_limit, causing its state to
persist across runs; instead, remove that field and instantiate a fresh local
tracker at the start of each run (e.g., inside process_message() and
process_message_streaming()) using
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls), then pass
that local tracker through the call chain for the duration of the single request
so each run gets an independent budget.

1298-1308: ⚠️ Potential issue | 🟠 Major

Don’t end the run with the tool-call turn as the final assistant reply.

When this branch trips, the loop exits before a post-tool LLM turn. The returned response.content is still from the tool-call response, which is often empty when tool_calls are present, so exact-limit runs can finish with a blank or stale answer.

Also applies to: 1785-1795

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop.rs` around lines 1298 - 1308, The current branch increments
tool_call_limit and breaks immediately when exceeded, causing the loop to return
the last tool-call response (often empty) instead of performing the required
post-tool LLM turn; change the behavior in the tool_call_limit handling (the
block that touches self.tool_call_limit, response.tool_calls, and does the info!
log) to not break immediately but instead set a flag (e.g., stop_after_llm_turn
or pending_post_tool_llm) or adjust the loop control so the loop completes one
more iteration to run the post-tool LLM turn and produce an assistant response
derived from the LLM rather than the tool-call response; apply the same fix to
the analogous branch around the other occurrence noted (the block at 1785-1795).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/agent/loop.rs`:
- Around line 397-398: The ToolCallLimitTracker is currently stored on AgentLoop
as the field tool_call_limit, causing its state to persist across runs; instead,
remove that field and instantiate a fresh local tracker at the start of each run
(e.g., inside process_message() and process_message_streaming()) using
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls), then pass
that local tracker through the call chain for the duration of the single request
so each run gets an independent budget.
- Around line 1298-1308: The current branch increments tool_call_limit and
breaks immediately when exceeded, causing the loop to return the last tool-call
response (often empty) instead of performing the required post-tool LLM turn;
change the behavior in the tool_call_limit handling (the block that touches
self.tool_call_limit, response.tool_calls, and does the info! log) to not break
immediately but instead set a flag (e.g., stop_after_llm_turn or
pending_post_tool_llm) or adjust the loop control so the loop completes one more
iteration to run the post-tool LLM turn and produce an assistant response
derived from the LLM rather than the tool-call response; apply the same fix to
the analogous branch around the other occurrence noted (the block at 1785-1795).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a59dc65-836d-4832-9a27-af9be6b1286a

📥 Commits

Reviewing files that changed from the base of the PR and between 7a3bdbc and b25c70e.

📒 Files selected for processing (3)

src/agent/budget.rs
src/agent/loop.rs
src/cli/common.rs

The limit check was only performed after the entire batch of tool calls had already executed, allowing max_tool_calls=0 to still run one batch and any budget smaller than the batch size to be overshot. Now: - Check is_exceeded() BEFORE building futures — breaks immediately if limit already reached (handles max_tool_calls=0 correctly) - Truncate response.tool_calls to remaining() budget before building futures, so a batch of 10 with 3 remaining only executes 3 - Truncation happens before compute_tool_result_budget so the per-tool budget is computed from the actual execution count - Add ToolCallLimitTracker::remaining() method with tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The limit check and batch truncation happened after the assistant tool-call message was already written to the session transcript. This caused: - max_tool_calls=0 to leave an orphaned tool-call message with no matching results - Partial truncation to record tool IDs that were never executed, producing an inconsistent transcript for the next LLM call - metrics.record_tool_calls() to over-report the original batch size Now in both paths (non-streaming and streaming): 1. is_exceeded() check + break — before anything is written 2. truncate() — before anything is written 3. record_tool_calls() — after truncation, reflects actual count 4. assistant message — built from the truncated tool_calls list Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (1)

src/agent/loop.rs (1)
397-398: ⚠️ Potential issue | 🔴 Critical

Per-run tracker must be instantiated per run, not stored on the struct.

The comment says "per-agent-run" but tool_call_limit is initialized once when AgentLoop is constructed and never reset. This causes one request to exhaust the tool-call budget for all subsequent requests handled by the same agent instance.
💡 Recommended fix

Remove the field from AgentLoop and instantiate a fresh tracker at the start of each process_message() and process_message_streaming():
 pub struct AgentLoop {
     // ...
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     // ...
 }
Then in process_message() and process_message_streaming():
// At the start of the method, after acquiring session lock:
let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
Replace all self.tool_call_limit references with the local tool_call_limit variable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop.rs` around lines 397 - 398, The tool-call tracker is currently
stored on the AgentLoop struct as tool_call_limit but should be recreated per
request; remove the tool_call_limit field from the AgentLoop struct and any
initialization in its constructor, then at the start of each request-handling
method (process_message and process_message_streaming) instantiate a local
ToolCallLimitTracker (e.g. via
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) after
acquiring the session lock) and replace all uses of self.tool_call_limit with
the new local tool_call_limit variable so each run gets a fresh per-run tracker.

🧹 Nitpick comments (1)

src/agent/loop.rs (1)
1321-1331: Post-execution increment is correct; consider setting response content on limit break.

The increment-after-execute pattern correctly avoids the off-by-one issue. However, when breaking due to limit, response.content may be empty (LLM responses with tool calls often have minimal content), potentially leaving the user without a meaningful final message.

Compare to the loop guard breaks at lines 1335-1345 which explicitly set response.content before breaking.
💡 Optional: Set informative content on limit break
 if self.tool_call_limit.is_exceeded() {
     info!(
         count = self.tool_call_limit.count(),
         limit = ?self.tool_call_limit.limit(),
         "Tool call limit reached, stopping tool execution"
     );
+    if response.content.is_empty() {
+        response.content = "Tool call limit reached.".to_string();
+    }
     break;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop.rs` around lines 1321 - 1331, The tool call limit is
incremented after executing tools (self.tool_call_limit.increment(...)) which is
correct, but when is_exceeded() triggers the loop break the response may lack
user-facing content; update the break path to set a meaningful response.content
(similar to the other loop guard breaks) before breaking—e.g., assign a concise
informative message to response.content indicating the tool call limit was
reached and include debugging info (self.tool_call_limit.count() /
self.tool_call_limit.limit()) so callers of the function receive a final message
when the loop exits due to tool_call_limit.is_exceeded().

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/agent/loop.rs`:
- Around line 397-398: The tool-call tracker is currently stored on the
AgentLoop struct as tool_call_limit but should be recreated per request; remove
the tool_call_limit field from the AgentLoop struct and any initialization in
its constructor, then at the start of each request-handling method
(process_message and process_message_streaming) instantiate a local
ToolCallLimitTracker (e.g. via
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) after
acquiring the session lock) and replace all uses of self.tool_call_limit with
the new local tool_call_limit variable so each run gets a fresh per-run tracker.

---

Nitpick comments:
In `@src/agent/loop.rs`:
- Around line 1321-1331: The tool call limit is incremented after executing
tools (self.tool_call_limit.increment(...)) which is correct, but when
is_exceeded() triggers the loop break the response may lack user-facing content;
update the break path to set a meaningful response.content (similar to the other
loop guard breaks) before breaking—e.g., assign a concise informative message to
response.content indicating the tool call limit was reached and include
debugging info (self.tool_call_limit.count() / self.tool_call_limit.limit()) so
callers of the function receive a final message when the loop exits due to
tool_call_limit.is_exceeded().

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 48c26f3d-b43f-4aa7-9341-dd86f28766e5

📥 Commits

Reviewing files that changed from the base of the PR and between b25c70e and c4aa4dd.

📒 Files selected for processing (2)

src/agent/loop.rs
src/agent/tool_call_limit.rs

🚧 Files skipped from review as they are similar to previous changes (1)

src/agent/tool_call_limit.rs

coderabbitai

♻️ Duplicate comments (2)

src/agent/loop.rs (2)

1012-1018: ⚠️ Potential issue | 🟠 Major

Return a fallback assistant message when the cap aborts the tool loop.

If the limit is already exhausted here, this breaks without updating response.content. On tool-call turns that content is often empty, so the caller can get a blank final assistant reply. Line 1565 in the streaming path has the same gap.

Suggested fix

-            if self.tool_call_limit.is_exceeded() {
+            if self.tool_call_limit.is_exceeded() {
+                if let Some(limit) = self.tool_call_limit.limit() {
+                    response.content = format!(
+                        "Stopped tool execution after reaching the max_tool_calls limit ({}).",
+                        limit
+                    );
+                }
                 info!(
                     count = self.tool_call_limit.count(),
                     limit = ?self.tool_call_limit.limit(),
                     "Tool call limit already reached, skipping tool execution"
                 );
                 break;
             }

Apply the same change to the streaming branch before its break.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop.rs` around lines 1012 - 1018, When the tool-call cap causes
the loop to abort (the branch that checks self.tool_call_limit.is_exceeded()),
ensure you set a fallback assistant message into response.content before
breaking so callers do not receive an empty final assistant reply; locate the
non-streaming tool loop where the code currently logs and breaks and assign a
suitable assistant content string (e.g., a short "Tool call limit reached,
aborting" assistant message) to response.content, and mirror the exact same
assignment in the streaming branch (the analogous branch around the streaming
path) before its break so both paths return the fallback content.

397-398: ⚠️ Potential issue | 🔴 Critical

Reset the tool-call tracker per request, not per AgentLoop.

This counter now lives for the entire agent instance, so one request can exhaust max_tool_calls for every later request handled by the same AgentLoop. That contradicts the "per-agent-run" contract.

Suggested fix

 pub struct AgentLoop {
     /// Per-session token budget tracker.
     token_budget: Arc<TokenBudget>,
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     /// Tool approval gate for policy-based tool gating.
     approval_gate: Arc<ApprovalGate>,

 pub async fn process_message(&self, msg: &InboundMessage) -> Result<String> {
+    let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
     // Acquire a per-session lock to serialize concurrent messages for the
     // same session key. Different sessions can still proceed concurrently.

 pub async fn process_message_streaming(
     &self,
     msg: &InboundMessage,
 ) -> Result<tokio::sync::mpsc::Receiver<crate::providers::StreamEvent>> {
+    let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
     use crate::providers::StreamEvent;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop.rs` around lines 397 - 398, The ToolCallLimitTracker currently
stored as the AgentLoop struct field (tool_call_limit: ToolCallLimitTracker)
causes the counter to persist across multiple requests; change the lifecycle so
the tracker is created or reset at the start of each agent run/request instead
of living on AgentLoop. Specifically, remove or stop using the persistent
tool_call_limit field for per-request counting and instead instantiate a fresh
ToolCallLimitTracker (or call its reset method) inside the entry point that
handles a single run/request (e.g., AgentLoop::run or the request-handling
method), and wire that per-run tracker into any functions that reference
tool_call_limit so each request starts with a fresh counter.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/agent/loop.rs`:
- Around line 1012-1018: When the tool-call cap causes the loop to abort (the
branch that checks self.tool_call_limit.is_exceeded()), ensure you set a
fallback assistant message into response.content before breaking so callers do
not receive an empty final assistant reply; locate the non-streaming tool loop
where the code currently logs and breaks and assign a suitable assistant content
string (e.g., a short "Tool call limit reached, aborting" assistant message) to
response.content, and mirror the exact same assignment in the streaming branch
(the analogous branch around the streaming path) before its break so both paths
return the fallback content.
- Around line 397-398: The ToolCallLimitTracker currently stored as the
AgentLoop struct field (tool_call_limit: ToolCallLimitTracker) causes the
counter to persist across multiple requests; change the lifecycle so the tracker
is created or reset at the start of each agent run/request instead of living on
AgentLoop. Specifically, remove or stop using the persistent tool_call_limit
field for per-request counting and instead instantiate a fresh
ToolCallLimitTracker (or call its reset method) inside the entry point that
handles a single run/request (e.g., AgentLoop::run or the request-handling
method), and wire that per-run tracker into any functions that reference
tool_call_limit so each request starts with a fresh counter.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e8de97e-8224-411d-bc1f-e6a8bc594812

📥 Commits

Reviewing files that changed from the base of the PR and between c4aa4dd and f45ef38.

📒 Files selected for processing (1)

src/agent/loop.rs

…atch When the tool call limit was reached after executing the last allowed batch, the loop broke immediately and returned the stale tool-call stub content from the previous LLM response. The executed tool results were in the session but the user got the assistant's earlier text instead of a final answer derived from those results. Non-streaming path: now makes one final LLM call with empty tool definitions (vec![]) so the model is forced to produce a text synthesis of the tool results before returning. Streaming path: clears response.tool_calls before breaking so the post-loop code enters the streaming final call branch, which re-issues the full conversation (with tool results in session) as a proper streamed response. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…oken budget Two issues in the tool-call-limit post-batch path: 1. Streaming: the post-loop final streaming branch rebuilt full tool definitions and passed them to chat_stream, allowing the model to emit ToolCalls after the limit was supposedly enforced. Now tracks tool_limit_hit flag and passes vec![] when the cap was reached. 2. Non-streaming: the synthesis LLM call bypassed the token budget gate that guards regular loop iterations. Now checks token_budget.is_exceeded() before issuing the synthesis call; if over budget, returns a descriptive message instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add MockBatchToolProvider and 4 e2e tests for per-template max_tool_calls enforcement: - zero budget blocks all tool execution - exact budget allows full batch then synthesizes - over-budget batch gets truncated - unlimited (None) allows normal flow Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two review fixes: 1. (High) ToolCallLimitTracker persisted across process_message() calls, so once any conversation exhausted max_tool_calls the agent was permanently blocked. Add reset() and call it at the top of both process_message and process_message_streaming. 2. (Medium) GitTool shelled out via Command::new("git") bypassing ShellSecurityConfig, so templates with shell_allowlist=[] could still run git commands. Thread ShellSecurityConfig into GitTool and validate before execution. Tests: ToolCallLimitTracker::reset unit test, 3 git allowlist unit tests, e2e regression test for counter reset between runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

TokenBudget persisted across process_message() calls on the same AgentLoop, so once any conversation exhausted max_token_budget, later runs on the same agent were permanently blocked. Add token_budget.reset() alongside tool_call_limit.reset() at the top of both process_message and process_message_streaming. Regression test: test_token_budget_resets_between_runs verifies the second run on the same agent succeeds after the first exhausts the budget. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

qhkm and others added 5 commits March 7, 2026 14:20

feat(templates): add shell_allowlist, max_token_budget, max_tool_call…

3f2a5f9

…s fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(templates): support TOML template files alongside JSON

f4151d0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot reviewed Mar 7, 2026

View reviewed changes

qhkm and others added 2 commits March 7, 2026 16:35

coderabbitai bot reviewed Mar 7, 2026

View reviewed changes

qhkm and others added 5 commits March 7, 2026 16:44

qhkm merged commit 5244281 into main Mar 7, 2026
9 checks passed

qhkm deleted the feat/template-capability-sandbox branch March 7, 2026 17:21

coderabbitai bot mentioned this pull request Mar 7, 2026

feat: add grep, find, and unified diff edit tools #283

Merged

7 tasks

This was referenced Mar 8, 2026

🦞 OpenClaw 生态日报 2026-03-08 duanyytop/agents-radar#96

Open

🦞 OpenClaw 生态日报 2026-03-08 rollysys/agents-radar#53

Open

This was referenced Mar 13, 2026

feat: stream by default + response metadata footer #342

Merged

feat: interactive approval prompts, streaming parity, safer defaults #353

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: per-template capability sandbox (#222)#279

feat: per-template capability sandbox (#222)#279
qhkm merged 13 commits intomainfrom
feat/template-capability-sandbox

qhkm commented Mar 7, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Mar 7, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot Mar 7, 2026

Uh oh!

Uh oh!

coderabbitai bot Mar 7, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		/// Per-agent-run tool call limit tracker.
		tool_call_limit: ToolCallLimitTracker,

Conversation

qhkm commented Mar 7, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key design decisions

Files changed

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qhkm commented Mar 7, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 7, 2026 •

edited

Loading