feat: per-template capability sandbox (#222) · PR #279
Conversation
…s fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…mTool, PluginTool Build ShellSecurityConfig from template shell_allowlist and inject it into all 3 shell execution paths. When a template defines an allowlist, Strict mode ensures only listed binaries can execute. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add resolve_token_budget() helper that picks the effective budget from global config and optional template override (lower-wins when both set). Wire it into the template override block in create_agent so templates can cap per-session token usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds ToolCallLimitTracker (AtomicU32) that enforces per-template max_tool_calls limits. Wired into AgentLoop with enforcement in both streaming and non-streaming tool execution loops. The tracker is initialized from config.agents.defaults.max_tool_calls, which is set from the template's max_tool_calls field. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

Threads per-template sandbox data through boot/registration and runtime: adds shell allowlists, per-template token-budget resolution, and a per-run tool-call limiter; registers tools with security configs and enforces tool-call caps and shell command validation at execution.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI / Template loader
    participant Config as Config Loader
    participant Kernel as ZeptoKernel::boot
    participant Registrar as registrar::register_all_tools
    participant Tools as Tool Factory (Shell/Plugin/Custom)
    participant AgentLoop as AgentLoop runtime
    participant ToolLimit as ToolCallLimitTracker
    CLI->>Config: load template (JSON/TOML) with shell_allowlist, max_token_budget, max_tool_calls
    Config->>Kernel: provide AgentTemplate + defaults
    Kernel->>Registrar: boot(pass ToolDeps { template: AgentTemplate })
    Registrar->>Registrar: build_shell_config(template)
    Registrar->>Tools: register tools with ShellSecurityConfig
    Tools->>Tools: store security config per tool
    AgentLoop->>ToolLimit: remaining()
    alt remaining >= batch
        AgentLoop->>Tools: execute tool batch
        Tools->>Tools: security.validate_command(command)
        Tools-->>AgentLoop: results
        AgentLoop->>ToolLimit: increment(batch)
    else remaining < batch
        AgentLoop->>AgentLoop: truncate/skip tool calls, log, run final synthesis
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agent/budget.rs`:
- Around line 172-180: The resolve_token_budget function treats 0 as a valid
template cap and thus allows Some(0) to disable a finite global cap; change the
branching in resolve_token_budget so that 0 always means “unlimited” (i.e., if
global == 0 return tpl, else if tpl == 0 return global, otherwise return
global.min(tpl)), and add a regression test (e.g.,
test_resolve_token_budget_template_unlimited_does_not_bypass_global) asserting
resolve_token_budget(100_000, Some(0)) == 100_000 to prevent regressions.
In `@src/agent/loop.rs`:
- Around line 397-398: The ToolCallLimitTracker is stored on AgentLoop and never
reset, causing one request to exhaust tool-call budget for subsequent runs;
remove the struct field usage and instead create a fresh local tracker at the
start of each run by instantiating
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) inside
process_message() and process_message_streaming(), replace references to the old
self.tool_call_limit with the new local variable, and remove the tool_call_limit
field from AgentLoop so each message run has its own per-agent-run tracker.
- Around line 1011-1021: The current logic increments self.tool_call_limit via
self.tool_call_limit.increment(...) before executing tool calls and then breaks
if self.tool_call_limit.is_exceeded(), which makes max_tool_calls behave like
N-1 and can leave response.content empty; fix by checking the limit against the
upcoming call count before incrementing (or use a would_exceed-like check) and
only call self.tool_call_limit.increment(...) after allowing/executing the tool
calls; additionally, when you do break because the limit would be exceeded,
ensure you set or preserve response.content (e.g., set a clear “tool call limit
reached” message or retain prior assistant content) so the final assistant reply
is not empty.
In `@src/config/mod.rs`:
- Around line 200-204: The current env var parsing for
ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS only accepts numeric values and ignores
empty strings, so you cannot clear an existing Option<u32>; update the logic
around reading std::env::var so that if the variable is present and
val.trim().is_empty() you set self.agents.defaults.max_tool_calls = None,
otherwise attempt val.parse::<u32>() and set Some(v) on success (and keep
current behavior for parse failures or surface/log an error as appropriate).
Target the block that reads the env var and assigns
self.agents.defaults.max_tool_calls to implement this empty-string -> None
behavior.
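The 0-means-unlimited branching the first finding asks for can be sketched as a standalone function. This is a minimal sketch assuming a `resolve_token_budget(global: u64, template: Option<u64>) -> u64` signature; the real helper lives in src/agent/budget.rs and may differ.

```rust
/// Hypothetical sketch of the reviewer's suggested semantics: 0 always
/// means "unlimited", so a template cap of Some(0) can never widen or
/// disable a finite global cap.
fn resolve_token_budget(global: u64, template: Option<u64>) -> u64 {
    match template {
        // No template override: the global setting wins (0 = unlimited).
        None => global,
        Some(tpl) => {
            if global == 0 {
                tpl // only the template constrains the run
            } else if tpl == 0 {
                global // template "unlimited" cannot bypass a finite global cap
            } else {
                global.min(tpl) // lower-wins when both are finite
            }
        }
    }
}

fn main() {
    // Regression case from the review: Some(0) must not disable the global cap.
    assert_eq!(resolve_token_budget(100_000, Some(0)), 100_000);
    assert_eq!(resolve_token_budget(100_000, Some(50_000)), 50_000);
    assert_eq!(resolve_token_budget(0, Some(50_000)), 50_000);
    assert_eq!(resolve_token_budget(0, None), 0);
}
```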
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b1a44f85-a4b7-42a0-a7e3-c2ba9033c509
📒 Files selected for processing (13)
- src/agent/budget.rs
- src/agent/loop.rs
- src/agent/mod.rs
- src/agent/tool_call_limit.rs
- src/cli/common.rs
- src/config/mod.rs
- src/config/templates.rs
- src/config/types.rs
- src/config/validate.rs
- src/kernel/mod.rs
- src/kernel/registrar.rs
- src/tools/custom.rs
- src/tools/plugin.rs
```rust
/// Per-agent-run tool call limit tracker.
tool_call_limit: ToolCallLimitTracker,
```
Create a fresh tool-call tracker for each message run.
The field comment says “per-agent-run”, but this tracker is initialized once on the long-lived AgentLoop and never reset. One request can exhaust the budget for every later request handled by the same agent instance.
💡 Fix direction
```diff
 pub struct AgentLoop {
     /// Per-session token budget tracker.
     token_budget: Arc<TokenBudget>,
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     /// Tool approval gate for policy-based tool gating.
     approval_gate: Arc<ApprovalGate>,
```

```rust
let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
```

Instantiate that local tracker at the start of process_message() and process_message_streaming() instead of storing it on the struct.
Also applies to: 477-479, 514-516, 546-547, 582-583
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agent/loop.rs` around lines 397 - 398, The ToolCallLimitTracker is stored
on AgentLoop and never reset, causing one request to exhaust tool-call budget
for subsequent runs; remove the struct field usage and instead create a fresh
local tracker at the start of each run by instantiating
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) inside
process_message() and process_message_streaming(), replace references to the old
self.tool_call_limit with the new local variable, and remove the tool_call_limit
field from AgentLoop so each message run has its own per-agent-run tracker.
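For context on what the fix above moves around, a minimal per-run limiter along the lines the PR describes (an AtomicU32 counter plus an optional cap) might look like this. The real type is in src/agent/tool_call_limit.rs; the method names here are assumptions based on the review comments.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Sketch of a per-run tool-call limiter: a shared atomic counter checked
/// against an optional cap (None = unlimited).
struct ToolCallLimitTracker {
    count: AtomicU32,
    limit: Option<u32>,
}

impl ToolCallLimitTracker {
    fn new(limit: Option<u32>) -> Self {
        Self { count: AtomicU32::new(0), limit }
    }

    /// Record n executed tool calls.
    fn increment(&self, n: u32) {
        self.count.fetch_add(n, Ordering::SeqCst);
    }

    /// True once the cap has been reached (never true when unlimited).
    fn is_exceeded(&self) -> bool {
        matches!(self.limit, Some(l) if self.count.load(Ordering::SeqCst) >= l)
    }

    /// Calls still allowed; u32::MAX when unlimited.
    fn remaining(&self) -> u32 {
        match self.limit {
            None => u32::MAX,
            Some(l) => l.saturating_sub(self.count.load(Ordering::SeqCst)),
        }
    }
}

fn main() {
    let t = ToolCallLimitTracker::new(Some(3));
    assert_eq!(t.remaining(), 3);
    t.increment(2);
    assert!(!t.is_exceeded());
    assert_eq!(t.remaining(), 1);
    t.increment(1);
    assert!(t.is_exceeded());
    assert_eq!(ToolCallLimitTracker::new(None).remaining(), u32::MAX);
}
```

Creating one of these as a local at the top of each run, as the comment suggests, makes the "per-agent-run" contract hold by construction.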
```rust
if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") {
    if let Ok(v) = val.parse::<u32>() {
        self.agents.defaults.max_tool_calls = Some(v);
    }
}
```
Allow the env override to clear max_tool_calls as well.
This only handles numeric values, so a deployment cannot use ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS to remove a cap that is already set in config.json. For an Option<u32>, an empty value should clear the override instead of being ignored.
Suggested fix
```diff
 if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") {
-    if let Ok(v) = val.parse::<u32>() {
+    let val = val.trim();
+    if val.is_empty() {
+        self.agents.defaults.max_tool_calls = None;
+    } else if let Ok(v) = val.parse::<u32>() {
         self.agents.defaults.max_tool_calls = Some(v);
     }
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
if let Ok(val) = std::env::var("ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS") {
    let val = val.trim();
    if val.is_empty() {
        self.agents.defaults.max_tool_calls = None;
    } else if let Ok(v) = val.parse::<u32>() {
        self.agents.defaults.max_tool_calls = Some(v);
    }
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/config/mod.rs` around lines 200 - 204, The current env var parsing for
ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS only accepts numeric values and ignores
empty strings, so you cannot clear an existing Option<u32>; update the logic
around reading std::env::var so that if the variable is present and
val.trim().is_empty() you set self.agents.defaults.max_tool_calls = None,
otherwise attempt val.parse::<u32>() and set Some(v) on success (and keep
current behavior for parse failures or surface/log an error as appropriate).
Target the block that reads the env var and assigns
self.agents.defaults.max_tool_calls to implement this empty-string -> None
behavior.
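The empty-string-clears semantics described above can be isolated in a pure helper, which makes the three outcomes (leave config untouched, clear the cap, set the cap) easy to test. The helper name and its tri-state return type are invented for illustration; the actual fix edits the env-var block in src/config/mod.rs directly.

```rust
/// Sketch of the suggested parsing for
/// ZEPTOCLAW_AGENTS_DEFAULTS_MAX_TOOL_CALLS, factored out of the config
/// loader. Outer None = leave the existing config value untouched (var
/// unset or unparsable); Some(None) = explicitly clear the cap;
/// Some(Some(v)) = set it.
fn parse_max_tool_calls(raw: Option<&str>) -> Option<Option<u32>> {
    match raw {
        None => None,
        Some(val) => {
            let val = val.trim();
            if val.is_empty() {
                Some(None) // empty string clears an existing Option<u32>
            } else {
                val.parse::<u32>().ok().map(Some) // parse failure keeps current behavior
            }
        }
    }
}

fn main() {
    assert_eq!(parse_max_tool_calls(None), None);
    assert_eq!(parse_max_tool_calls(Some("")), Some(None));
    assert_eq!(parse_max_tool_calls(Some(" 7 ")), Some(Some(7)));
    assert_eq!(parse_max_tool_calls(Some("not-a-number")), None);
}
```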
…nd loop

- resolve_token_budget: treat Some(0) as None so a template cannot disable a finite global budget (0 = unlimited in TokenBudget)
- agent loop: move tool_call_limit increment+check after execution in both streaming and non-streaming paths (fixes off-by-one that blocked the first allowed call)
- common.rs: intersect template max_tool_calls with global via min() instead of direct override

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
♻️ Duplicate comments (2)
src/agent/loop.rs (2)
397-398: ⚠️ Potential issue | 🔴 Critical

Instantiate the tool-call tracker per request, not on AgentLoop.

This tracker is stateful, but these lines make it live for the entire agent instance. One run can exhaust the budget for later runs, including other sessions processed by the same loop.
Possible fix direction
```diff
 pub struct AgentLoop {
     /// Per-session token budget tracker.
     token_budget: Arc<TokenBudget>,
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     /// Tool approval gate for policy-based tool gating.
     approval_gate: Arc<ApprovalGate>,
```

```diff
-    let tool_call_limit = ToolCallLimitTracker::new(config.agents.defaults.max_tool_calls);
     let approval_gate = Arc::new(ApprovalGate::new(config.approval.clone()));
     ...
-    tool_call_limit,
     approval_gate,
```

```rust
let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
```

Create that local tracker at the start of process_message() and process_message_streaming(), then use the local instance throughout each run.

Also applies to: 478-479, 515-515, 546-547, 583-583
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 397 - 398, The ToolCallLimitTracker is currently stored on AgentLoop as the field tool_call_limit, causing its state to persist across runs; instead, remove that field and instantiate a fresh local tracker at the start of each run (e.g., inside process_message() and process_message_streaming()) using ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls), then pass that local tracker through the call chain for the duration of the single request so each run gets an independent budget.
1298-1308: ⚠️ Potential issue | 🟠 Major

Don't end the run with the tool-call turn as the final assistant reply.

When this branch trips, the loop exits before a post-tool LLM turn. The returned `response.content` is still from the tool-call response, which is often empty when `tool_calls` are present, so exact-limit runs can finish with a blank or stale answer.

Also applies to: 1785-1795
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 1298 - 1308, The current branch increments tool_call_limit and breaks immediately when exceeded, causing the loop to return the last tool-call response (often empty) instead of performing the required post-tool LLM turn; change the behavior in the tool_call_limit handling (the block that touches self.tool_call_limit, response.tool_calls, and does the info! log) to not break immediately but instead set a flag (e.g., stop_after_llm_turn or pending_post_tool_llm) or adjust the loop control so the loop completes one more iteration to run the post-tool LLM turn and produce an assistant response derived from the LLM rather than the tool-call response; apply the same fix to the analogous branch around the other occurrence noted (the block at 1785-1795).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/agent/loop.rs`:
- Around line 397-398: The ToolCallLimitTracker is currently stored on AgentLoop
as the field tool_call_limit, causing its state to persist across runs; instead,
remove that field and instantiate a fresh local tracker at the start of each run
(e.g., inside process_message() and process_message_streaming()) using
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls), then pass
that local tracker through the call chain for the duration of the single request
so each run gets an independent budget.
- Around line 1298-1308: The current branch increments tool_call_limit and
breaks immediately when exceeded, causing the loop to return the last tool-call
response (often empty) instead of performing the required post-tool LLM turn;
change the behavior in the tool_call_limit handling (the block that touches
self.tool_call_limit, response.tool_calls, and does the info! log) to not break
immediately but instead set a flag (e.g., stop_after_llm_turn or
pending_post_tool_llm) or adjust the loop control so the loop completes one more
iteration to run the post-tool LLM turn and produce an assistant response
derived from the LLM rather than the tool-call response; apply the same fix to
the analogous branch around the other occurrence noted (the block at 1785-1795).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a59dc65-836d-4832-9a27-af9be6b1286a
📒 Files selected for processing (3)
- src/agent/budget.rs
- src/agent/loop.rs
- src/cli/common.rs
The limit check was only performed after the entire batch of tool calls had already executed, allowing max_tool_calls=0 to still run one batch and any budget smaller than the batch size to be overshot. Now:

- Check is_exceeded() BEFORE building futures — breaks immediately if limit already reached (handles max_tool_calls=0 correctly)
- Truncate response.tool_calls to remaining() budget before building futures, so a batch of 10 with 3 remaining only executes 3
- Truncation happens before compute_tool_result_budget so the per-tool budget is computed from the actual execution count
- Add ToolCallLimitTracker::remaining() method with tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
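The check-then-truncate ordering this commit describes can be sketched with simplified types (plain strings standing in for the real tool-call structs, and a bare `remaining` count standing in for the tracker):

```rust
/// Sketch of the batch-planning step: given the remaining tool-call
/// budget, decide which calls from the batch actually execute.
/// An empty result means the whole batch is skipped.
fn plan_batch(remaining: u32, tool_calls: Vec<String>) -> Vec<String> {
    if remaining == 0 {
        return Vec::new(); // max_tool_calls = 0 blocks the whole batch
    }
    let mut calls = tool_calls;
    // A batch of 10 with 3 remaining executes only the first 3; metrics
    // and the transcript are then built from this truncated list.
    calls.truncate(remaining as usize);
    calls
}

fn main() {
    let batch: Vec<String> = (0..10).map(|i| format!("call-{i}")).collect();
    assert!(plan_batch(0, batch.clone()).is_empty());
    assert_eq!(plan_batch(3, batch.clone()).len(), 3);
    assert_eq!(plan_batch(100, batch).len(), 10);
}
```

Doing this planning before writing the assistant message (as the next commit enforces) keeps the transcript consistent with what actually ran.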
The limit check and batch truncation happened after the assistant tool-call message was already written to the session transcript. This caused:

- max_tool_calls=0 to leave an orphaned tool-call message with no matching results
- Partial truncation to record tool IDs that were never executed, producing an inconsistent transcript for the next LLM call
- metrics.record_tool_calls() to over-report the original batch size

Now in both paths (non-streaming and streaming):

1. is_exceeded() check + break — before anything is written
2. truncate() — before anything is written
3. record_tool_calls() — after truncation, reflects actual count
4. assistant message — built from the truncated tool_calls list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
♻️ Duplicate comments (1)
src/agent/loop.rs (1)
397-398: ⚠️ Potential issue | 🔴 Critical

Per-run tracker must be instantiated per run, not stored on the struct.

The comment says "per-agent-run" but `tool_call_limit` is initialized once when `AgentLoop` is constructed and never reset. This causes one request to exhaust the tool-call budget for all subsequent requests handled by the same agent instance.

💡 Recommended fix

Remove the field from `AgentLoop` and instantiate a fresh tracker at the start of each `process_message()` and `process_message_streaming()`:

```diff
 pub struct AgentLoop {
     // ...
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     // ...
 }
```

Then in `process_message()` and `process_message_streaming()`:

```rust
// At the start of the method, after acquiring session lock:
let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
```

Replace all `self.tool_call_limit` references with the local `tool_call_limit` variable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 397 - 398, The tool-call tracker is currently stored on the AgentLoop struct as tool_call_limit but should be recreated per request; remove the tool_call_limit field from the AgentLoop struct and any initialization in its constructor, then at the start of each request-handling method (process_message and process_message_streaming) instantiate a local ToolCallLimitTracker (e.g. via ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) after acquiring the session lock) and replace all uses of self.tool_call_limit with the new local tool_call_limit variable so each run gets a fresh per-run tracker.
🧹 Nitpick comments (1)
src/agent/loop.rs (1)
1321-1331: Post-execution increment is correct; consider setting response content on limit break.

The increment-after-execute pattern correctly avoids the off-by-one issue. However, when breaking due to limit, `response.content` may be empty (LLM responses with tool calls often have minimal content), potentially leaving the user without a meaningful final message.

Compare to the loop guard breaks at lines 1335-1345 which explicitly set `response.content` before breaking.

💡 Optional: Set informative content on limit break

```diff
 if self.tool_call_limit.is_exceeded() {
     info!(
         count = self.tool_call_limit.count(),
         limit = ?self.tool_call_limit.limit(),
         "Tool call limit reached, stopping tool execution"
     );
+    if response.content.is_empty() {
+        response.content = "Tool call limit reached.".to_string();
+    }
     break;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 1321 - 1331, The tool call limit is incremented after executing tools (self.tool_call_limit.increment(...)) which is correct, but when is_exceeded() triggers the loop break the response may lack user-facing content; update the break path to set a meaningful response.content (similar to the other loop guard breaks) before breaking—e.g., assign a concise informative message to response.content indicating the tool call limit was reached and include debugging info (self.tool_call_limit.count() / self.tool_call_limit.limit()) so callers of the function receive a final message when the loop exits due to tool_call_limit.is_exceeded().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/agent/loop.rs`:
- Around line 397-398: The tool-call tracker is currently stored on the
AgentLoop struct as tool_call_limit but should be recreated per request; remove
the tool_call_limit field from the AgentLoop struct and any initialization in
its constructor, then at the start of each request-handling method
(process_message and process_message_streaming) instantiate a local
ToolCallLimitTracker (e.g. via
ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls) after
acquiring the session lock) and replace all uses of self.tool_call_limit with
the new local tool_call_limit variable so each run gets a fresh per-run tracker.
---
Nitpick comments:
In `@src/agent/loop.rs`:
- Around line 1321-1331: The tool call limit is incremented after executing
tools (self.tool_call_limit.increment(...)) which is correct, but when
is_exceeded() triggers the loop break the response may lack user-facing content;
update the break path to set a meaningful response.content (similar to the other
loop guard breaks) before breaking—e.g., assign a concise informative message to
response.content indicating the tool call limit was reached and include
debugging info (self.tool_call_limit.count() / self.tool_call_limit.limit()) so
callers of the function receive a final message when the loop exits due to
tool_call_limit.is_exceeded().
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 48c26f3d-b43f-4aa7-9341-dd86f28766e5
📒 Files selected for processing (2)
- src/agent/loop.rs
- src/agent/tool_call_limit.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/agent/tool_call_limit.rs
♻️ Duplicate comments (2)
src/agent/loop.rs (2)
1012-1018: ⚠️ Potential issue | 🟠 Major

Return a fallback assistant message when the cap aborts the tool loop.

If the limit is already exhausted here, this breaks without updating `response.content`. On tool-call turns that content is often empty, so the caller can get a blank final assistant reply. Line 1565 in the streaming path has the same gap.

Suggested fix

```diff
 if self.tool_call_limit.is_exceeded() {
+    if let Some(limit) = self.tool_call_limit.limit() {
+        response.content = format!(
+            "Stopped tool execution after reaching the max_tool_calls limit ({}).",
+            limit
+        );
+    }
     info!(
         count = self.tool_call_limit.count(),
         limit = ?self.tool_call_limit.limit(),
         "Tool call limit already reached, skipping tool execution"
     );
     break;
 }
```

Apply the same change to the streaming branch before its `break`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 1012 - 1018, When the tool-call cap causes the loop to abort (the branch that checks self.tool_call_limit.is_exceeded()), ensure you set a fallback assistant message into response.content before breaking so callers do not receive an empty final assistant reply; locate the non-streaming tool loop where the code currently logs and breaks and assign a suitable assistant content string (e.g., a short "Tool call limit reached, aborting" assistant message) to response.content, and mirror the exact same assignment in the streaming branch (the analogous branch around the streaming path) before its break so both paths return the fallback content.
397-398: ⚠️ Potential issue | 🔴 Critical

Reset the tool-call tracker per request, not per AgentLoop.

This counter now lives for the entire agent instance, so one request can exhaust `max_tool_calls` for every later request handled by the same `AgentLoop`. That contradicts the "per-agent-run" contract.

Suggested fix

```diff
 pub struct AgentLoop {
     /// Per-session token budget tracker.
     token_budget: Arc<TokenBudget>,
-    /// Per-agent-run tool call limit tracker.
-    tool_call_limit: ToolCallLimitTracker,
     /// Tool approval gate for policy-based tool gating.
     approval_gate: Arc<ApprovalGate>,
```

```diff
 pub async fn process_message(&self, msg: &InboundMessage) -> Result<String> {
+    let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
     // Acquire a per-session lock to serialize concurrent messages for the
     // same session key. Different sessions can still proceed concurrently.
```

```diff
 pub async fn process_message_streaming(
     &self,
     msg: &InboundMessage,
 ) -> Result<tokio::sync::mpsc::Receiver<crate::providers::StreamEvent>> {
+    let tool_call_limit = ToolCallLimitTracker::new(self.config.agents.defaults.max_tool_calls);
     use crate::providers::StreamEvent;
```
Verify each finding against the current code and only fix it if needed. In `@src/agent/loop.rs` around lines 397 - 398, The ToolCallLimitTracker currently stored as the AgentLoop struct field (tool_call_limit: ToolCallLimitTracker) causes the counter to persist across multiple requests; change the lifecycle so the tracker is created or reset at the start of each agent run/request instead of living on AgentLoop. Specifically, remove or stop using the persistent tool_call_limit field for per-request counting and instead instantiate a fresh ToolCallLimitTracker (or call its reset method) inside the entry point that handles a single run/request (e.g., AgentLoop::run or the request-handling method), and wire that per-run tracker into any functions that reference tool_call_limit so each request starts with a fresh counter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/agent/loop.rs`:
- Around line 1012-1018: When the tool-call cap causes the loop to abort (the
branch that checks self.tool_call_limit.is_exceeded()), ensure you set a
fallback assistant message into response.content before breaking so callers do
not receive an empty final assistant reply; locate the non-streaming tool loop
where the code currently logs and breaks and assign a suitable assistant content
string (e.g., a short "Tool call limit reached, aborting" assistant message) to
response.content, and mirror the exact same assignment in the streaming branch
(the analogous branch around the streaming path) before its break so both paths
return the fallback content.
- Around line 397-398: The ToolCallLimitTracker currently stored as the
AgentLoop struct field (tool_call_limit: ToolCallLimitTracker) causes the
counter to persist across multiple requests; change the lifecycle so the tracker
is created or reset at the start of each agent run/request instead of living on
AgentLoop. Specifically, remove or stop using the persistent tool_call_limit
field for per-request counting and instead instantiate a fresh
ToolCallLimitTracker (or call its reset method) inside the entry point that
handles a single run/request (e.g., AgentLoop::run or the request-handling
method), and wire that per-run tracker into any functions that reference
tool_call_limit so each request starts with a fresh counter.
…atch

When the tool call limit was reached after executing the last allowed batch, the loop broke immediately and returned the stale tool-call stub content from the previous LLM response. The executed tool results were in the session but the user got the assistant's earlier text instead of a final answer derived from those results.

Non-streaming path: now makes one final LLM call with empty tool definitions (vec![]) so the model is forced to produce a text synthesis of the tool results before returning.

Streaming path: clears response.tool_calls before breaking so the post-loop code enters the streaming final call branch, which re-issues the full conversation (with tool results in session) as a proper streamed response.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…oken budget

Two issues in the tool-call-limit post-batch path:

1. Streaming: the post-loop final streaming branch rebuilt full tool definitions and passed them to chat_stream, allowing the model to emit ToolCalls after the limit was supposedly enforced. Now tracks a tool_limit_hit flag and passes vec![] when the cap was reached.
2. Non-streaming: the synthesis LLM call bypassed the token budget gate that guards regular loop iterations. Now checks token_budget.is_exceeded() before issuing the synthesis call; if over budget, returns a descriptive message instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add MockBatchToolProvider and 4 e2e tests for per-template max_tool_calls enforcement:

- zero budget blocks all tool execution
- exact budget allows full batch then synthesizes
- over-budget batch gets truncated
- unlimited (None) allows normal flow

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two review fixes:
1. (High) ToolCallLimitTracker persisted across process_message() calls,
so once any conversation exhausted max_tool_calls the agent was
permanently blocked. Add reset() and call it at the top of both
process_message and process_message_streaming.
2. (Medium) GitTool shelled out via Command::new("git") bypassing
ShellSecurityConfig, so templates with shell_allowlist=[] could still
run git commands. Thread ShellSecurityConfig into GitTool and validate
before execution.
Tests: ToolCallLimitTracker::reset unit test, 3 git allowlist unit tests,
e2e regression test for counter reset between runs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TokenBudget persisted across process_message() calls on the same AgentLoop, so once any conversation exhausted max_token_budget, later runs on the same agent were permanently blocked. Add token_budget.reset() alongside tool_call_limit.reset() at the top of both process_message and process_message_streaming. Regression test: test_token_budget_resets_between_runs verifies the second run on the same agent succeeds after the first exhausts the budget. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
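The reset-between-runs behavior these two commits implement can be illustrated with a toy per-run budget. Field and method names here are assumptions standing in for the real TokenBudget and ToolCallLimitTracker APIs.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Toy per-run counter: both real trackers follow this shape, and both
/// are now cleared at the top of each process_message() call so one run
/// cannot permanently block the next. 0 cap = unlimited.
struct RunBudget {
    used: AtomicU64,
    max: u64,
}

impl RunBudget {
    fn is_exceeded(&self) -> bool {
        self.max != 0 && self.used.load(Ordering::SeqCst) >= self.max
    }
    /// Called at the start of every message run.
    fn reset(&self) {
        self.used.store(0, Ordering::SeqCst);
    }
}

fn main() {
    let budget = RunBudget { used: AtomicU64::new(0), max: 1_000 };
    budget.used.fetch_add(1_000, Ordering::SeqCst); // first run exhausts it
    assert!(budget.is_exceeded());
    budget.reset(); // top of the second process_message()
    assert!(!budget.is_exceeded()); // second run proceeds normally
}
```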
Summary
Adds three new optional fields to `AgentTemplate` for declarative per-template sandboxing:

- `shell_allowlist` — restrict which shell binaries the agent can execute (Strict mode). Applied to ShellTool, CustomTool, AND PluginTool via a shared `ShellSecurityConfig` built once at kernel boot.
- `max_token_budget` — per-agent-run token cap. Resolved via `min(template, global)` so templates can restrict but never expand beyond global.
- `max_tool_calls` — hard cap on total tool calls across the entire agent run via a dedicated `ToolCallLimitTracker` (AtomicU32). Intentionally NOT wired into LoopGuard.

Also adds TOML template loading — `.toml` files in the template directory are now parsed alongside `.json`.

All fields are optional with `None` defaults — zero behavior change for existing templates.

Key design decisions

- per-run `max_tool_calls` enforcement
- `max_token_budget` (not `max_tokens` — avoids collision with response generation field)
- `min()` for numeric limits

Files changed

- `src/config/templates.rs` — 3 new fields + TOML loading + 7 tests
- `src/kernel/registrar.rs` — `build_shell_config()` + ToolDeps.template + 4 tests
- `src/kernel/mod.rs` — pass template to ToolDeps
- `src/tools/custom.rs` — `with_security()` constructor + 1 test
- `src/tools/plugin.rs` — security field + `validate_command()` before `sh -c` + 1 test
- `src/agent/budget.rs` — `resolve_token_budget()` + 4 tests
- `src/agent/tool_call_limit.rs` — NEW: `ToolCallLimitTracker` + 5 tests
- `src/agent/mod.rs` — module + re-export
- `src/agent/loop.rs` — tracker field + enforcement in both streaming/non-streaming loops
- `src/config/types.rs` — `max_tool_calls` on AgentDefaults
- `src/config/mod.rs` — env var override
- `src/config/validate.rs` — known field
- `src/cli/common.rs` — template overrides for budget + tool calls

Test plan

- `cargo clippy -- -D warnings` clean
- `cargo fmt -- --check` clean
- `["git"]` blocks `curl`; empty list blocks all; `None` uses default
- `min(template, global)` logic verified for all 4 branches

Closes #222
🤖 Generated with Claude Code