feat(routines): approval context for autonomous job execution#577
feat(routines): approval context for autonomous job execution#577ilblackdragon merged 16 commits intomainfrom
Conversation
Routines and background jobs were unable to use any tools that required approval (file ops, shell, message, http), making them effectively useless. This adds an ApprovalContext system that lets autonomous jobs pre-authorize tools at dispatch time. - Add ApprovalContext enum with Autonomous variant that auto-approves UnlessAutoApproved tools and optionally pre-authorizes Always tools - Add tool_permissions field to RoutineAction::FullJob for pre-authorizing Always-gated tools (e.g. destructive shell, cross-channel messaging) - Add Scheduler::dispatch_job_with_context() to thread approval context through to workers - Set message tool default channel/target from routine NotifyConfig so routines can send results without cross-channel approval - Fix Completed→Completed state transition error in worker (plan marks job completed, then direct loop or outer run() tries again) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a 3-turn trace fixture and test that exercises: - Turn 1: routine_create with full_job mode and tool_permissions - Turn 2: Simulated digest workflow with echo + memory_write - Turn 3: Verification via memory_search [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the system's capability to manage tool execution within autonomous jobs, such as scheduled routines. By introducing a granular Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new ApprovalContext to allow autonomous jobs to execute tools that require approval by pre-authorizing them, which is a great enhancement for routines. However, the current implementation has two significant security flaws: RoutineCreateTool defaults to Never approval, potentially allowing prompt injection to create routines with arbitrary permissions, and the Autonomous context automatically permits all UnlessAutoApproved tools, bypassing tool_permissions restrictions. Additionally, there are opportunities to reduce code duplication for better maintainability, and a potential issue where subtasks might not inherit the approval context, which could impact complex autonomous jobs.
| match self { | ||
| Self::Autonomous { allowed_tools } => match requirement { | ||
| ApprovalRequirement::Never => false, | ||
| ApprovalRequirement::UnlessAutoApproved => false, |
There was a problem hiding this comment.
The ApprovalContext::is_blocked method is implemented such that all tools with ApprovalRequirement::UnlessAutoApproved are automatically permitted in an Autonomous context (used for routines). This means that any tool categorized as UnlessAutoApproved (which often includes sensitive operations like file writes or network requests) will be allowed to run in any routine without explicit authorization in tool_permissions. This significantly increases the risk of autonomous jobs performing sensitive actions without explicit user consent, especially if an attacker can manipulate the routine's prompt via injection.
| ApprovalRequirement::UnlessAutoApproved => false, | |
| ApprovalRequirement::UnlessAutoApproved => true, // Block UnlessAutoApproved in autonomous context by default |
There was a problem hiding this comment.
This is by design. Routines run autonomously and need basic tools like file read/write, http, and memory to function. Only Always-approval tools (shell, message) require explicit listing in tool_permissions. Blocking UnlessAutoApproved would break all routine functionality since even memory_write and read_file use that level. The defense-in-depth here is: (1) routines are created by the user with explicit tool_permissions, (2) the routine prompt is user-authored, (3) tool outputs still go through the safety layer.
| tools, | ||
| context_manager, | ||
| safety, | ||
| None, |
There was a problem hiding this comment.
When calling execute_tool_task from spawn_subtask, None is passed for approval_context. This means subtasks will not inherit the ApprovalContext from their parent job and will use default approval logic. This could cause tools to be incorrectly blocked in autonomous routines that spawn subtasks. The ApprovalContext should be passed down to subtasks, which may require modifying the Task::ToolExec enum.
There was a problem hiding this comment.
Added TODO comment in 55d4acb. spawn_subtask is only called by spawn_batch which is not used in any production code path — only in scheduler tests. Will propagate ApprovalContext when subtasks are actually used in routine/autonomous paths.
| pub async fn dispatch_job_with_context( | ||
| &self, | ||
| user_id: &str, | ||
| title: &str, | ||
| description: &str, | ||
| metadata: Option<serde_json::Value>, | ||
| approval_context: ApprovalContext, | ||
| ) -> Result<Uuid, JobError> { | ||
| let job_id = self | ||
| .context_manager | ||
| .create_job_for_user(user_id, title, description) | ||
| .await?; | ||
|
|
||
| if let Some(meta) = metadata { | ||
| self.context_manager | ||
| .update_context(job_id, |ctx| { | ||
| ctx.metadata = meta; | ||
| }) | ||
| .await?; | ||
| } | ||
|
|
||
| if let Some(ref store) = self.store { | ||
| let ctx = self.context_manager.get_context(job_id).await?; | ||
| store.save_job(&ctx).await.map_err(|e| JobError::Failed { | ||
| id: job_id, | ||
| reason: format!("failed to persist job: {e}"), | ||
| })?; | ||
| } |
There was a problem hiding this comment.
There's significant code duplication between dispatch_job and dispatch_job_with_context. The logic for creating the job, updating metadata, and persisting it is identical in both functions. To improve maintainability, consider extracting this common logic into a private helper function. This aligns with the principle of consolidating related sequences of operations into a single reusable method to improve code consistency and maintainability.
References
- Consolidate related sequences of operations, such as creating, persisting, and scheduling a job, into a single reusable method to improve code consistency and maintainability.
There was a problem hiding this comment.
Resolved in 2bc6c0e — extracted dispatch_job_inner private helper.
| let blocked = match &deps.approval_context { | ||
| Some(ctx) => ctx.is_blocked(tool_name, requirement), | ||
| None => requirement.is_required(), | ||
| }; |
There was a problem hiding this comment.
This approval check logic is duplicated in src/agent/scheduler.rs inside execute_tool_task. To avoid duplication, this logic could be extracted into a static helper method on ApprovalContext itself. For example:
// in ApprovalContext
pub fn is_blocked_or_default(context: &Option<Self>, tool_name: &str, requirement: ApprovalRequirement) -> bool {
match context {
Some(ctx) => ctx.is_blocked(tool_name, requirement),
None => requirement.is_required(),
}
}Then both call sites could be simplified to let blocked = ApprovalContext::is_blocked_or_default(&deps.approval_context, tool_name, requirement);.
There was a problem hiding this comment.
Resolved in 55d4acb — extracted ApprovalContext::is_blocked_or_default() static helper, exactly as suggested.
| let tool_permissions = params | ||
| .get("tool_permissions") | ||
| .and_then(|v| v.as_array()) | ||
| .map(|arr| { | ||
| arr.iter() | ||
| .filter_map(|v| v.as_str().map(String::from)) | ||
| .collect() | ||
| }) | ||
| .unwrap_or_default(); |
There was a problem hiding this comment.
Resolved in 55d4acb — extracted parse_tool_permissions() helper in routine.rs, used from both call sites.
- Add `with_routines()` to TestRigBuilder that passes a RoutineConfig to Agent::new, enabling routine tool registration during agent startup - Add Turn 2 (routine_list) to the trace to verify routine persistence in the database after routine_create - Fix formatting issues flagged by CI (cargo fmt) [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ontext Extract shared logic into private `dispatch_job_inner` to prevent divergence when dispatch behavior changes in the future. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| tools, | ||
| context_manager, | ||
| safety, | ||
| None, | ||
| tool_parent_id, | ||
| &tool_name, | ||
| params, |
There was a problem hiding this comment.
execute_tool_task is now approval-context aware, but this call site always passes None, which forces the legacy behavior (blocking all non-Never tools). This defeats the new autonomous ApprovalContext in scheduler-driven tool execution paths. Pass through the relevant approval_context (e.g., from the job/worker deps or message payload) instead of hardcoding None.
There was a problem hiding this comment.
Same as Gemini's comment — TODO added in 55d4acb. Not in active code paths currently.
| /// Regression test: calling mark_completed on an already-Completed job must | ||
| /// not surface as an error. The outer `run()` guards should skip the second | ||
| /// transition instead of attempting Completed → Completed. | ||
| #[tokio::test] | ||
| async fn test_mark_completed_twice_does_not_error() { | ||
| let worker = make_worker(vec![]).await; | ||
|
|
||
| // Transition to InProgress first (required by state machine) | ||
| worker | ||
| .context_manager() | ||
| .update_context(worker.job_id, |ctx| { | ||
| ctx.transition_to(JobState::InProgress, None) | ||
| }) | ||
| .await | ||
| .unwrap() | ||
| .unwrap(); | ||
|
|
||
| // First mark_completed should succeed | ||
| worker.mark_completed().await.unwrap(); | ||
|
|
||
| // Verify state is Completed | ||
| let ctx = worker | ||
| .context_manager() | ||
| .get_context(worker.job_id) | ||
| .await | ||
| .unwrap(); | ||
| assert_eq!(ctx.state, JobState::Completed); | ||
|
|
||
| // Second mark_completed should fail (Completed → Completed is invalid) | ||
| let result = worker.mark_completed().await; | ||
| assert!( | ||
| result.is_err(), | ||
| "Completed → Completed transition should be rejected by state machine" | ||
| ); |
There was a problem hiding this comment.
This test’s name/docstring claims the second completion “must not surface as an error”, but the assertion explicitly expects an error on the second mark_completed(). Either (a) update the test to exercise the real regression path described (e.g., calling run()/execution loop such that a second completion attempt is safely ignored), or (b) rename/reword the test to reflect that mark_completed() is expected to error when called twice.
There was a problem hiding this comment.
Resolved in 55d4acb — renamed to test_mark_completed_twice_returns_error and updated docstring.
| /// Approval context for tool execution in different environments. | ||
| /// | ||
| /// Interactive sessions use session-level auto-approve lists managed by the UI. | ||
| /// Autonomous contexts (routines, background jobs) use pre-authorized tool sets | ||
| /// configured at creation time. | ||
| #[derive(Debug, Clone)] | ||
| pub enum ApprovalContext { | ||
| /// Autonomous job with no interactive user. `UnlessAutoApproved` tools are | ||
| /// pre-approved. `Always` tools are blocked unless listed in `allowed_tools`. | ||
| Autonomous { | ||
| /// Tool names that are pre-authorized even for `Always` approval. | ||
| allowed_tools: std::collections::HashSet<String>, | ||
| }, | ||
| } |
There was a problem hiding this comment.
The doc comment describes both interactive and autonomous environments, but the enum currently only models the autonomous case. Either add an explicit interactive variant (if this type is intended to represent both modes), or update the docs to avoid implying interactive-session behavior is represented by ApprovalContext.
There was a problem hiding this comment.
Resolved in 55d4acb — updated doc to clarify it only models autonomous mode.
| for tool in &[ | ||
| "routine_create", | ||
| "routine_list", | ||
| "memory_write", | ||
| "memory_search", | ||
| ] { | ||
| assert!( | ||
| started.iter().any(|s| s == *tool), | ||
| "{tool} not called: {started:?}" | ||
| ); | ||
| } |
There was a problem hiding this comment.
The trace fixture’s expects.tools_used includes echo, and turn 3 uses an echo tool call, but this verification list omits it. Add \"echo\" here (or adjust the fixture expectations) so the test consistently validates the intended tool sequence.
There was a problem hiding this comment.
Resolved in 8eafa57 — echo replaced with http tool throughout (fixture, test, and expects).
…test
- Add `routine_fire` tool that calls `RoutineEngine::fire_manual` to
trigger a routine on demand. Registered alongside the other 5 routine
tools (now 6 total).
- Rewrite the routine_news_digest E2E trace to exercise the full
execution stack end-to-end:
1. routine_create (manual trigger, full_job, tool_permissions: [message])
2. routine_fire → RoutineEngine → Scheduler::dispatch_job_with_context
→ autonomous Worker consuming TraceLlm steps
3. Worker calls echo → memory_write → message (broadcast to test channel)
4. Test verifies the message broadcast arrived, proving ApprovalContext
correctly allowed the Always-approval message tool
- Register message tools in TestRig so routines can send messages to
the test channel via channel_manager.broadcast().
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rker http calls Propagate http_interceptor from AgentDeps → Scheduler → WorkerDeps → JobContext so that routine workers (and any scheduler-dispatched workers) can use the ReplayingHttpInterceptor for mock HTTP responses during tests. Changes: - Add http_interceptor field to Scheduler and WorkerDeps - Set job_ctx.http_interceptor in Worker before tool execution - Add with_http_exchanges() builder method to TestRigBuilder - Replace echo tool with http tool in routine_news_digest trace - Test now exercises real http tool with mock response → memory_write → message Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Regression test: calling mark_completed on an already-Completed job must | ||
| /// not surface as an error. The outer `run()` guards should skip the second | ||
| /// transition instead of attempting Completed → Completed. | ||
| #[tokio::test] | ||
| async fn test_mark_completed_twice_does_not_error() { | ||
| let worker = make_worker(vec![]).await; | ||
|
|
||
| // Transition to InProgress first (required by state machine) | ||
| worker | ||
| .context_manager() | ||
| .update_context(worker.job_id, |ctx| { | ||
| ctx.transition_to(JobState::InProgress, None) | ||
| }) | ||
| .await | ||
| .unwrap() | ||
| .unwrap(); | ||
|
|
||
| // First mark_completed should succeed | ||
| worker.mark_completed().await.unwrap(); | ||
|
|
||
| // Verify state is Completed | ||
| let ctx = worker | ||
| .context_manager() | ||
| .get_context(worker.job_id) | ||
| .await | ||
| .unwrap(); | ||
| assert_eq!(ctx.state, JobState::Completed); | ||
|
|
||
| // Second mark_completed should fail (Completed → Completed is invalid) | ||
| let result = worker.mark_completed().await; | ||
| assert!( | ||
| result.is_err(), | ||
| "Completed → Completed transition should be rejected by state machine" | ||
| ); |
There was a problem hiding this comment.
The new test name/docstring says marking a Completed job completed again “does not error”, but the assertions expect an error (and the comment even states the Completed → Completed transition is invalid). Either rename/reword the test to match what it actually verifies, or change it to exercise the real regression (the worker run loop should not attempt a second Completed transition when execution_loop already completed the job).
There was a problem hiding this comment.
Resolved in 55d4acb — renamed test and updated docstring (same as earlier Copilot comment).
| // Set the message tool's default channel/target from the routine's notify config | ||
| // so the LLM can send results without triggering cross-channel approval. | ||
| if let Some(channel) = &routine.notify.channel { | ||
| scheduler | ||
| .tools() | ||
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | ||
| .await; | ||
| } | ||
|
|
||
| let metadata = serde_json::json!({ "max_iterations": max_iterations }); | ||
|
|
There was a problem hiding this comment.
Setting the message tool’s default channel/target via ToolRegistry mutates shared global state. Because the scheduler can run multiple jobs concurrently, this can race with other jobs/interactive turns and cause messages to be sent to the wrong channel/target when the message tool is invoked without explicit params (and it also affects requires_approval() decisions). Prefer storing the routine’s notify channel/target in the JobContext (e.g., metadata) and applying it per-job immediately before message tool approval/execute, or require routines to always pass explicit channel/target instead of relying on global defaults.
| // Set the message tool's default channel/target from the routine's notify config | |
| // so the LLM can send results without triggering cross-channel approval. | |
| if let Some(channel) = &routine.notify.channel { | |
| scheduler | |
| .tools() | |
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | |
| .await; | |
| } | |
| let metadata = serde_json::json!({ "max_iterations": max_iterations }); | |
| // Attach the routine's notify configuration to the job metadata instead of | |
| // mutating global message tool defaults, to avoid cross-job races. | |
| let metadata = serde_json::json!({ | |
| "max_iterations": max_iterations, | |
| "notify": { | |
| "channel": routine.notify.channel, | |
| "user": routine.notify.user, | |
| }, | |
| }); |
There was a problem hiding this comment.
Valid concern. Added TODO comment in 55d4acb noting the race. Will move notify config into JobContext metadata in a follow-up.
| // Find the main conversation reply (from turn 2). | ||
| let fire_reply = &responses[1].content.to_lowercase(); | ||
| assert!( | ||
| fire_reply.contains("fired") || fire_reply.contains("running"), | ||
| "Turn 2: expected fired/running, got: {fire_reply}" |
There was a problem hiding this comment.
This test assumes the “fire” conversational reply will always be at responses[1]. Because the routine worker runs asynchronously and broadcasts to the same channel, the broadcast message could arrive before/after the conversational reply depending on scheduling, making the index-based assertion flaky. Consider locating the fire confirmation response by content (similar to message_broadcast) rather than relying on a fixed position.
| // Find the main conversation reply (from turn 2). | |
| let fire_reply = &responses[1].content.to_lowercase(); | |
| assert!( | |
| fire_reply.contains("fired") || fire_reply.contains("running"), | |
| "Turn 2: expected fired/running, got: {fire_reply}" | |
| // Find the main conversation reply (from turn 2) by content, since | |
| // the routine worker runs asynchronously and may interleave messages. | |
| let fire_reply = responses.iter().find(|r| { | |
| let content = r.content.to_lowercase(); | |
| content.contains("fired") || content.contains("running") | |
| }); | |
| assert!( | |
| fire_reply.is_some(), | |
| "Turn 2: expected fired/running, got: {:?}", | |
| responses.iter().map(|r| &r.content).collect::<Vec<_>>() |
There was a problem hiding this comment.
Resolved in 55d4acb — changed to content-based find() lookup, exactly as suggested.
| // Exercises the full routine execution stack: | ||
| // routine_create → routine_fire → RoutineEngine::fire_manual → | ||
| // Scheduler::dispatch_job_with_context → Worker (autonomous) → | ||
| // echo + memory_write + message (broadcast to test channel) |
There was a problem hiding this comment.
Comment says the routine worker executes “echo + memory_write + message”, but the trace fixture/expected tool chain here is http → memory_write → message. Updating the comment to match the actual tools exercised will avoid confusion when debugging this test.
| // echo + memory_write + message (broadcast to test channel) | |
| // http + memory_write + message (broadcast to test channel) |
There was a problem hiding this comment.
Resolved in 55d4acb — updated to http + memory_write + message.
…-context # Conflicts: # tests/support/test_rig.rs
- Extract `ApprovalContext::is_blocked_or_default()` helper to deduplicate approval check logic in worker.rs and scheduler.rs - Extract `parse_tool_permissions()` helper to deduplicate JSON array parsing in routine.rs and builtin/routine.rs - Fix test name: `test_mark_completed_twice_does_not_error` → `test_mark_completed_twice_returns_error` (matches actual behavior) - Fix ApprovalContext doc comment to clarify it only models autonomous mode - Fix flaky index-based assertion in routine_news_digest test — now uses content-based search instead of fixed position - Fix stale comment: echo → http in routine test header - Add TODO for subtask approval context propagation (latent, not in active code paths) - Add TODO for global message tool context race in routine_engine Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Set the message tool's default channel/target from the routine's notify config | ||
| // so the LLM can send results without triggering cross-channel approval. | ||
| // TODO: This mutates shared global state and can race with concurrent jobs. | ||
| // Move notify config into JobContext metadata and apply per-job instead. | ||
| if let Some(channel) = &routine.notify.channel { | ||
| scheduler | ||
| .tools() | ||
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | ||
| .await; | ||
| } | ||
|
|
||
| let metadata = serde_json::json!({ "max_iterations": max_iterations }); | ||
|
|
There was a problem hiding this comment.
set_message_tool_context() mutates the global MessageTool defaults (shared ToolRegistry state). In concurrent routine runs or while an interactive session is active, this can race and cause messages to be sent with the wrong channel/target and can also change message.requires_approval() outcomes across jobs. Prefer making channel/target a per-job/per-call input (e.g., pass explicitly in tool args or derive from JobContext metadata) rather than mutating shared global state.
| // Set the message tool's default channel/target from the routine's notify config | |
| // so the LLM can send results without triggering cross-channel approval. | |
| // TODO: This mutates shared global state and can race with concurrent jobs. | |
| // Move notify config into JobContext metadata and apply per-job instead. | |
| if let Some(channel) = &routine.notify.channel { | |
| scheduler | |
| .tools() | |
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | |
| .await; | |
| } | |
| let metadata = serde_json::json!({ "max_iterations": max_iterations }); | |
| // Embed the routine's notify config into the job metadata so that any | |
| // message tools can derive channel/target per job instead of mutating | |
| // shared global state. | |
| let metadata = serde_json::json!({ | |
| "max_iterations": max_iterations, | |
| "notify": { | |
| "channel": routine.notify.channel, | |
| "user": routine.notify.user, | |
| }, | |
| }); |
There was a problem hiding this comment.
Duplicate of the earlier comment on the same line. TODO already added in 55d4acb noting the race. Will move to per-job metadata in a follow-up.
| "model_name": "advanced-routine-news-digest", | ||
| "expects": { | ||
| "tools_used": ["routine_create", "routine_fire", "http", "memory_write", "message"], | ||
| "all_tools_succeeded": true, | ||
| "min_responses": 2 |
There was a problem hiding this comment.
PR description says the E2E trace fixture is a 3-turn flow that includes a verification step using memory_search, but this trace has 2 turns and no memory_search tool usage. Please either update the PR description to match the actual fixture/test, or extend the fixture/test to include the documented verification step.
There was a problem hiding this comment.
Updated the PR description to match the actual implementation: 2-turn fixture with http → memory_write → message (no memory_search). The fixture evolved from the original 3-turn design during implementation.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let exchanges = if self.http_exchanges.is_empty() { | ||
| http_exchanges | ||
| } else { | ||
| self.http_exchanges |
There was a problem hiding this comment.
TestRigBuilder::build(self) conditionally moves self.http_exchanges here, but the function continues to access other self.* fields afterward (e.g. self.enable_routines, and later self.max_tool_iterations). This will fail to compile due to a partial move of self. Consider destructuring self into locals at the top of build (including enable_routines and http_exchanges) or changing the method to take mut self and using std::mem::take(&mut self.http_exchanges) so self remains usable for subsequent fields.
| self.http_exchanges | |
| self.http_exchanges.clone() |
There was a problem hiding this comment.
Already resolved in 35847fe — self is destructured into locals at the top of build().
| let run_id = self.engine.fire_manual(routine.id).await.map_err(|e| { | ||
| ToolError::ExecutionFailed(format!("failed to fire routine '{}': {e}", name)) | ||
| })?; |
There was a problem hiding this comment.
routine_fire ultimately calls RoutineEngine::fire_manual, which (unlike cron/event firing paths) does not apply the routine cooldown guardrail. This means a manual trigger can spam runs even when cooldown_secs is set, which could cause unexpected load/cost. If cooldown is meant to be respected for manual fires too, add a cooldown check in fire_manual or enforce it in this tool before firing; otherwise clarify in the tool description/docs that manual fires bypass cooldown.
There was a problem hiding this comment.
By design — documented in description ('bypassing schedule, trigger type, and cooldown') since c138926.
… fragility Destructure TestRigBuilder at the top of build() instead of accessing self.* fields after moving self.http_exchanges. While the prior code compiled (remaining fields are Copy), it was fragile and would break if any non-Copy field were added. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Manual fires are explicitly user-initiated and intentionally bypass cooldown checks (which only apply to automated cron/event triggers). Updated tool description and fire_manual docstring to make this clear. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/tools/builtin/routine.rs:105
routine_createnow acceptstool_permissions, which are later used to allowApprovalRequirement::Alwaystools in autonomous jobs. However, this tool does not overriderequires_approval(), so it defaults toApprovalRequirement::Neverand can be invoked without any approval in interactive sessions—effectively allowing an LLM to self-grant elevated autonomous tool permissions. Consider requiring at leastUnlessAutoApproved, and ideallyAlwayswhenaction_type == "full_job"andtool_permissionsis non-empty (or contains high-risk tools likeshell/cross-channelmessage).
"cooldown_secs": {
"type": "integer",
"description": "Minimum seconds between fires (default: 300)"
},
"tool_permissions": {
"type": "array",
"items": { "type": "string" },
"description": "Tool names pre-authorized for Always-approval tools in full_job mode (e.g. ['shell', 'message']). UnlessAutoApproved tools are automatically permitted in routines."
}
},
"required": ["name", "trigger_type", "prompt"]
})
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #[async_trait] | ||
| impl Tool for RoutineFireTool { | ||
| fn name(&self) -> &str { | ||
| "routine_fire" | ||
| } | ||
|
|
||
| fn description(&self) -> &str { | ||
| "Manually trigger a routine to run immediately, bypassing schedule, trigger type, and cooldown." | ||
| } | ||
|
|
There was a problem hiding this comment.
routine_fire can immediately dispatch a routine run (bypassing schedule/trigger/cooldown per the description), but the tool does not override requires_approval() and therefore defaults to ApprovalRequirement::Never. With full_job routines + tool_permissions, this can become an approval bypass path for Always-gated tools. Add a requires_approval() implementation (at least UnlessAutoApproved, potentially Always) so interactive sessions must explicitly approve firing routines.
There was a problem hiding this comment.
Good catch. Added requires_approval() -> UnlessAutoApproved to routine_fire in fc15ff0.
Two fixes for message tool failures in autonomous routine jobs: 1. MessageTool::requires_approval() now returns UnlessAutoApproved when the explicit channel param matches the default channel (was Always, causing "requires authentication" errors for routine workers). 2. routine_create tool now accepts notify_channel and notify_user params, wired into NotifyConfig. Without these, routines had channel: None, so set_message_tool_context was never called, causing "No channel specified" errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-context Resolved conflicts in message.rs (kept same-channel UnlessAutoApproved fix over main's Always behavior) and test_rig.rs (combined extra_tools from main with enable_routines/http_exchanges from this branch, added trace_llm_ref tracking). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let name = require_str(¶ms, "name")?; | ||
|
|
||
| let routine = self | ||
| .store | ||
| .get_routine_by_name(&ctx.user_id, name) |
There was a problem hiding this comment.
name is a &str borrowed from params and is then held across .await points, which will fail to compile in an async function due to borrowing a local across suspension. Convert name to an owned String immediately (e.g., let name = require_str(...)? .to_string();) and then pass &name to async calls / use it in error messages and the result JSON.
| let name = require_str(¶ms, "name")?; | |
| let routine = self | |
| .store | |
| .get_routine_by_name(&ctx.user_id, name) | |
| let name = require_str(¶ms, "name")?.to_string(); | |
| let routine = self | |
| .store | |
| .get_routine_by_name(&ctx.user_id, &name) |
There was a problem hiding this comment.
False positive — params is owned by the function, so the &str borrow from it lives for the function's lifetime. This compiles and runs correctly (verified by CI).
| // Set the message tool's default channel/target from the routine's notify config | ||
| // so the LLM can send results without triggering cross-channel approval. | ||
| // TODO: This mutates shared global state and can race with concurrent jobs. | ||
| // Move notify config into JobContext metadata and apply per-job instead. | ||
| if let Some(channel) = &routine.notify.channel { | ||
| scheduler | ||
| .tools() | ||
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | ||
| .await; | ||
| } | ||
|
|
||
| let metadata = serde_json::json!({ "max_iterations": max_iterations }); | ||
|
|
||
| // Build approval context: UnlessAutoApproved tools are auto-approved for routines; | ||
| // Always tools require explicit listing in tool_permissions. | ||
| let approval_context = ApprovalContext::autonomous_with_tools(tool_permissions.iter().cloned()); | ||
|
|
There was a problem hiding this comment.
This updates shared/global message tool defaults, which can race when multiple jobs/routines run concurrently (e.g., one routine can accidentally change another routine’s default channel/target, impacting both routing and approval behavior). Prefer setting per-job message context (e.g., store notify channel/user in JobContext/metadata and have MessageTool resolve defaults from the executing job context rather than global mutable state).
| // Set the message tool's default channel/target from the routine's notify config | |
| // so the LLM can send results without triggering cross-channel approval. | |
| // TODO: This mutates shared global state and can race with concurrent jobs. | |
| // Move notify config into JobContext metadata and apply per-job instead. | |
| if let Some(channel) = &routine.notify.channel { | |
| scheduler | |
| .tools() | |
| .set_message_tool_context(Some(channel.clone()), Some(routine.notify.user.clone())) | |
| .await; | |
| } | |
| let metadata = serde_json::json!({ "max_iterations": max_iterations }); | |
| // Build approval context: UnlessAutoApproved tools are auto-approved for routines; | |
| // Always tools require explicit listing in tool_permissions. | |
| let approval_context = ApprovalContext::autonomous_with_tools(tool_permissions.iter().cloned()); | |
| // Attach notify configuration to the job metadata so the worker/message tool | |
| // can resolve per-job defaults instead of mutating shared global state. | |
| let metadata = serde_json::json!({ | |
| "max_iterations": max_iterations, | |
| "notify": { | |
| "channel": routine.notify.channel, | |
| "user": routine.notify.user, | |
| }, | |
| }); | |
| // Build approval context: UnlessAutoApproved tools are auto-approved for routines; | |
| // Always tools require explicit listing in tool_permissions. | |
| let approval_context = ApprovalContext::autonomous_with_tools(tool_permissions.iter().cloned()); | |
| let approval_context = ApprovalContext::autonomous_with_tools(tool_permissions.iter().cloned()); |
There was a problem hiding this comment.
Duplicate of earlier comment. TODO already added in 55d4acb. Will move to per-job metadata in a follow-up.
| } | ||
|
|
||
| #[test] | ||
| fn test_approval_context_never_always_passes() { |
There was a problem hiding this comment.
Test name is confusing (“never_always”). Consider renaming to something more specific to the behavior under test (e.g., test_approval_context_never_is_not_blocked) to improve readability and intent.
| fn test_approval_context_never_always_passes() { | |
| fn test_approval_context_never_is_not_blocked() { |
There was a problem hiding this comment.
Renamed to test_approval_context_never_is_not_blocked in fc15ff0.
The message tool only sends to user-owned channels via ChannelManager::broadcast (TUI, Telegram, Slack, web gateway, etc.). It cannot reach arbitrary external services, so approval adds friction with no security benefit. This also eliminates the routine context errors entirely since approval is never checked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- routine_fire now returns UnlessAutoApproved since firing a routine can dispatch a full_job with pre-authorized Always-gated tools - Rename test_approval_context_never_always_passes to test_approval_context_never_is_not_blocked for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Enable the routines system so `routine_create` and other routine tools | ||
| /// are registered and functional. | ||
| pub fn with_routines(mut self) -> Self { | ||
| self.enable_routines = true; | ||
| self | ||
| } |
There was a problem hiding this comment.
with_routines() claims it enables routines so routine tools are registered/functional, but build() currently registers routine tools unconditionally (and uses a RoutineEngine constructed with scheduler: None). This means routine tools may be present even when routines are “disabled”, but full_job execution will fail (no scheduler), and the builder flag doesn’t actually control tool availability. Gate routine tool registration/engine creation behind enable_routines (or rely on Agent::run() to register them when routine_config.enabled).
There was a problem hiding this comment.
Updated docstring to clarify that with_routines() enables scheduler wiring, not tool registration. Fixed in 1a38775.
| fn requires_approval(&self, _params: &serde_json::Value) -> ApprovalRequirement { | ||
| // Message tool only sends to user-owned channels (TUI, Telegram, Slack, web | ||
| // gateway, etc.) via ChannelManager::broadcast. It cannot reach arbitrary | ||
| // external services, so no approval is needed. | ||
| ApprovalRequirement::Never |
There was a problem hiding this comment.
The new comment says the message tool “cannot reach arbitrary external services”, but ChannelManager::broadcast can send to external integrations (Telegram/Slack/web gateway/etc.) depending on configured channels. Either reintroduce an approval requirement for higher-risk sends (e.g., cross-channel/explicit channel), or adjust the comment + surrounding security assumptions (e.g., routine tool_permissions docs and routine engine comments) to match the new behavior.
There was a problem hiding this comment.
Reworded to 'only delivers to channels the user has configured' in 1a38775.
| "tool_permissions": { | ||
| "type": "array", | ||
| "items": { "type": "string" }, | ||
| "description": "Tool names pre-authorized for Always-approval tools in full_job mode (e.g. ['shell', 'message']). UnlessAutoApproved tools are automatically permitted in routines." |
There was a problem hiding this comment.
The tool_permissions schema/description still uses message as an example of an Always-approval tool (“cross-channel messaging”), but MessageTool::requires_approval() now returns Never. Update this description/examples to reflect the current approval model (e.g., use shell / HTTP with credentials as examples) so users don’t think message needs to be pre-authorized.
| "description": "Tool names pre-authorized for Always-approval tools in full_job mode (e.g. ['shell', 'message']). UnlessAutoApproved tools are automatically permitted in routines." | |
| "description": "Tool names pre-authorized for tools that require explicit approval in full_job mode (e.g. ['shell', 'http_request_with_auth']). Tools that are automatically approved do not need to be listed here." |
There was a problem hiding this comment.
Good catch — updated example to ['shell'] since message is now Never. Fixed in 1a38775.
| // so this is a meaningful escalation that warrants auto-approval gating. | ||
| ApprovalRequirement::UnlessAutoApproved |
There was a problem hiding this comment.
routine_fire returns ApprovalRequirement::UnlessAutoApproved, but in autonomous jobs ApprovalContext::Autonomous auto-approves UnlessAutoApproved tools. That means a routine full_job can invoke routine_fire (and spawn more jobs/runs) without being listed in tool_permissions, despite the comment calling it a “meaningful escalation”. Consider changing this to ApprovalRequirement::Always so autonomous contexts block it unless explicitly permitted, and interactive sessions still prompt.
| // so this is a meaningful escalation that warrants auto-approval gating. | |
| ApprovalRequirement::UnlessAutoApproved | |
| // so this is a meaningful escalation that warrants explicit approval in all contexts. | |
| ApprovalRequirement::Always |
There was a problem hiding this comment.
By design — routine_fire is an interactive tool (the user asks the agent to fire a routine). Routines don't fire other routines, so this never runs in autonomous context. In interactive mode, UnlessAutoApproved prompts once then auto-approves.
- Remove 'message' from tool_permissions example (no longer Always) - Reword message tool approval comment for accuracy - Clarify with_routines() docstring re: tool registration vs engine wiring Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#577) * feat(routines): add approval context for autonomous job execution Routines and background jobs were unable to use any tools that required approval (file ops, shell, message, http), making them effectively useless. This adds an ApprovalContext system that lets autonomous jobs pre-authorize tools at dispatch time. - Add ApprovalContext enum with Autonomous variant that auto-approves UnlessAutoApproved tools and optionally pre-authorizes Always tools - Add tool_permissions field to RoutineAction::FullJob for pre-authorizing Always-gated tools (e.g. destructive shell, cross-channel messaging) - Add Scheduler::dispatch_job_with_context() to thread approval context through to workers - Set message tool default channel/target from routine NotifyConfig so routines can send results without cross-channel approval - Fix Completed→Completed state transition error in worker (plan marks job completed, then direct loop or outer run() tries again) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(routines): add E2E trace for routine news digest workflow Add a 3-turn trace fixture and test that exercises: - Turn 1: routine_create with full_job mode and tool_permissions - Turn 2: Simulated digest workflow with echo + memory_write - Turn 3: Verification via memory_search [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): wire RoutineEngine into test rig for routine_create E2E - Add `with_routines()` to TestRigBuilder that passes a RoutineConfig to Agent::new, enabling routine tool registration during agent startup - Add Turn 2 (routine_list) to the trace to verify routine persistence in the database after routine_create - Fix formatting issues flagged by CI (cargo fmt) [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(scheduler): deduplicate dispatch_job and dispatch_job_with_context Extract shared logic into private `dispatch_job_inner` to prevent divergence when dispatch behavior changes in the future. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(routines): add routine_fire tool and real E2E routine execution test - Add `routine_fire` tool that calls `RoutineEngine::fire_manual` to trigger a routine on demand. Registered alongside the other 5 routine tools (now 6 total). - Rewrite the routine_news_digest E2E trace to exercise the full execution stack end-to-end: 1. routine_create (manual trigger, full_job, tool_permissions: [message]) 2. routine_fire → RoutineEngine → Scheduler::dispatch_job_with_context → autonomous Worker consuming TraceLlm steps 3. Worker calls echo → memory_write → message (broadcast to test channel) 4. Test verifies the message broadcast arrived, proving ApprovalContext correctly allowed the Always-approval message tool - Register message tools in TestRig so routines can send messages to the test channel via channel_manager.broadcast(). [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(routines): wire HttpInterceptor through scheduler for routine worker http calls Propagate http_interceptor from AgentDeps → Scheduler → WorkerDeps → JobContext so that routine workers (and any scheduler-dispatched workers) can use the ReplayingHttpInterceptor for mock HTTP responses during tests. Changes: - Add http_interceptor field to Scheduler and WorkerDeps - Set job_ctx.http_interceptor in Worker before tool execution - Add with_http_exchanges() builder method to TestRigBuilder - Replace echo tool with http tool in routine_news_digest trace - Test now exercises real http tool with mock response → memory_write → message Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments from Copilot on PR nearai#577 - Extract `ApprovalContext::is_blocked_or_default()` helper to deduplicate approval check logic in worker.rs and scheduler.rs - Extract `parse_tool_permissions()` helper to deduplicate JSON array parsing in routine.rs and builtin/routine.rs - Fix test name: `test_mark_completed_twice_does_not_error` → `test_mark_completed_twice_returns_error` (matches actual behavior) - Fix ApprovalContext doc comment to clarify it only models autonomous mode - Fix flaky index-based assertion in routine_news_digest test — now uses content-based search instead of fixed position - Fix stale comment: echo → http in routine test header - Add TODO for subtask approval context propagation (latent, not in active code paths) - Add TODO for global message tool context race in routine_engine Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix formatting in is_blocked_or_default test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(test_rig): destructure self in build() to avoid partial-move fragility Destructure TestRigBuilder at the top of build() instead of accessing self.* fields after moving self.http_exchanges. While the prior code compiled (remaining fields are Copy), it was fragile and would break if any non-Copy field were added. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: clarify that routine_fire bypasses cooldown Manual fires are explicitly user-initiated and intentionally bypass cooldown checks (which only apply to automated cron/event triggers). Updated tool description and fire_manual docstring to make this clear. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(routines): fix message tool approval in routine context Two fixes for message tool failures in autonomous routine jobs: 1. MessageTool::requires_approval() now returns UnlessAutoApproved when the explicit channel param matches the default channel (was Always, causing "requires authentication" errors for routine workers). 2. routine_create tool now accepts notify_channel and notify_user params, wired into NotifyConfig. Without these, routines had channel: None, so set_message_tool_context was never called, causing "No channel specified" errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(message): remove approval requirement from message tool The message tool only sends to user-owned channels via ChannelManager::broadcast (TUI, Telegram, Slack, web gateway, etc.). It cannot reach arbitrary external services, so approval adds friction with no security benefit. This also eliminates the routine context errors entirely since approval is never checked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments — routine_fire approval + test rename - routine_fire now returns UnlessAutoApproved since firing a routine can dispatch a full_job with pre-authorized Always-gated tools - Rename test_approval_context_never_always_passes to test_approval_context_never_is_not_blocked for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review nits — update stale docs and comments - Remove 'message' from tool_permissions example (no longer Always) - Reword message tool approval comment for accuracy - Clarify with_routines() docstring re: tool registration vs engine wiring Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…#577) * feat(routines): add approval context for autonomous job execution Routines and background jobs were unable to use any tools that required approval (file ops, shell, message, http), making them effectively useless. This adds an ApprovalContext system that lets autonomous jobs pre-authorize tools at dispatch time. - Add ApprovalContext enum with Autonomous variant that auto-approves UnlessAutoApproved tools and optionally pre-authorizes Always tools - Add tool_permissions field to RoutineAction::FullJob for pre-authorizing Always-gated tools (e.g. destructive shell, cross-channel messaging) - Add Scheduler::dispatch_job_with_context() to thread approval context through to workers - Set message tool default channel/target from routine NotifyConfig so routines can send results without cross-channel approval - Fix Completed→Completed state transition error in worker (plan marks job completed, then direct loop or outer run() tries again) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test(routines): add E2E trace for routine news digest workflow Add a 3-turn trace fixture and test that exercises: - Turn 1: routine_create with full_job mode and tool_permissions - Turn 2: Simulated digest workflow with echo + memory_write - Turn 3: Verification via memory_search [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(test): wire RoutineEngine into test rig for routine_create E2E - Add `with_routines()` to TestRigBuilder that passes a RoutineConfig to Agent::new, enabling routine tool registration during agent startup - Add Turn 2 (routine_list) to the trace to verify routine persistence in the database after routine_create - Fix formatting issues flagged by CI (cargo fmt) [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(scheduler): deduplicate dispatch_job and dispatch_job_with_context Extract shared logic into private `dispatch_job_inner` to prevent divergence when dispatch behavior changes in the future. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(routines): add routine_fire tool and real E2E routine execution test - Add `routine_fire` tool that calls `RoutineEngine::fire_manual` to trigger a routine on demand. Registered alongside the other 5 routine tools (now 6 total). - Rewrite the routine_news_digest E2E trace to exercise the full execution stack end-to-end: 1. routine_create (manual trigger, full_job, tool_permissions: [message]) 2. routine_fire → RoutineEngine → Scheduler::dispatch_job_with_context → autonomous Worker consuming TraceLlm steps 3. Worker calls echo → memory_write → message (broadcast to test channel) 4. Test verifies the message broadcast arrived, proving ApprovalContext correctly allowed the Always-approval message tool - Register message tools in TestRig so routines can send messages to the test channel via channel_manager.broadcast(). [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(routines): wire HttpInterceptor through scheduler for routine worker http calls Propagate http_interceptor from AgentDeps → Scheduler → WorkerDeps → JobContext so that routine workers (and any scheduler-dispatched workers) can use the ReplayingHttpInterceptor for mock HTTP responses during tests. Changes: - Add http_interceptor field to Scheduler and WorkerDeps - Set job_ctx.http_interceptor in Worker before tool execution - Add with_http_exchanges() builder method to TestRigBuilder - Replace echo tool with http tool in routine_news_digest trace - Test now exercises real http tool with mock response → memory_write → message Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments from Copilot on PR nearai#577 - Extract `ApprovalContext::is_blocked_or_default()` helper to deduplicate approval check logic in worker.rs and scheduler.rs - Extract `parse_tool_permissions()` helper to deduplicate JSON array parsing in routine.rs and builtin/routine.rs - Fix test name: `test_mark_completed_twice_does_not_error` → `test_mark_completed_twice_returns_error` (matches actual behavior) - Fix ApprovalContext doc comment to clarify it only models autonomous mode - Fix flaky index-based assertion in routine_news_digest test — now uses content-based search instead of fixed position - Fix stale comment: echo → http in routine test header - Add TODO for subtask approval context propagation (latent, not in active code paths) - Add TODO for global message tool context race in routine_engine Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix formatting in is_blocked_or_default test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(test_rig): destructure self in build() to avoid partial-move fragility Destructure TestRigBuilder at the top of build() instead of accessing self.* fields after moving self.http_exchanges. While the prior code compiled (remaining fields are Copy), it was fragile and would break if any non-Copy field were added. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: clarify that routine_fire bypasses cooldown Manual fires are explicitly user-initiated and intentionally bypass cooldown checks (which only apply to automated cron/event triggers). Updated tool description and fire_manual docstring to make this clear. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(routines): fix message tool approval in routine context Two fixes for message tool failures in autonomous routine jobs: 1. MessageTool::requires_approval() now returns UnlessAutoApproved when the explicit channel param matches the default channel (was Always, causing "requires authentication" errors for routine workers). 2. routine_create tool now accepts notify_channel and notify_user params, wired into NotifyConfig. Without these, routines had channel: None, so set_message_tool_context was never called, causing "No channel specified" errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor(message): remove approval requirement from message tool The message tool only sends to user-owned channels via ChannelManager::broadcast (TUI, Telegram, Slack, web gateway, etc.). It cannot reach arbitrary external services, so approval adds friction with no security benefit. This also eliminates the routine context errors entirely since approval is never checked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments — routine_fire approval + test rename - routine_fire now returns UnlessAutoApproved since firing a routine can dispatch a full_job with pre-authorized Always-gated tools - Rename test_approval_context_never_always_passes to test_approval_context_never_is_not_blocked for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review nits — update stale docs and comments - Remove 'message' from tool_permissions example (no longer Always) - Reword message tool approval comment for accuracy - Clarify with_routines() docstring re: tool registration vs engine wiring Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
ApprovalContext— new enum that distinguishes interactive sessions (human approves tools) from autonomous jobs (pre-authorized tool lists). Tools withApprovalRequirement::Alwaysare blocked in autonomous mode unless explicitly listed intool_permissions.tool_permissions—routine_createnow accepts atool_permissionsarray, and the routine engine builds anApprovalContext::Autonomousfrom it before dispatching full_job routines.routine_firetool — new tool to manually trigger a routine, enabling deterministic E2E testing of the full routine execution stack.http_interceptorflows fromAgentDeps→Scheduler→WorkerDeps→JobContext, enabling mock HTTP responses in routine worker tests.InvalidTransitionerrors when a job's execution loop andrun()both try to mark a job completed.routine_news_digest.jsontesting routine creation and firing, with the routine worker exercising http (mock) → memory_write → message broadcast.Fixes
Tool message requires authenticationin autonomous routine jobsCannot transition from completed to completedstate machine errorReview feedback addressed
ApprovalContext::is_blocked_or_default()helper (dedup approval checks)parse_tool_permissions()helper (dedup JSON parsing)test_mark_completed_twice_returns_error)find()Changes
src/tools/tool.rsApprovalContextenum +is_blocked()+is_blocked_or_default()+ 5 testssrc/tools/mod.rsApprovalContextsrc/agent/worker.rssrc/agent/scheduler.rsdispatch_job_inner()dedup,dispatch_job_with_context(), http_interceptor propagation, approval inexecute_tool_task+ 3 testssrc/agent/agent_loop.rssrc/agent/routine.rstool_permissionsfield onFullJobaction +parse_tool_permissions()helpersrc/agent/routine_engine.rsApprovalContextfrom permissions, set message tool contextsrc/tools/builtin/routine.rsRoutineFireTool+ parsetool_permissionsvia shared helpersrc/tools/registry.rsroutine_firetoolsrc/tools/schema_validator.rsroutine_fireschema entrytests/fixtures/llm_traces/advanced/routine_news_digest.jsontests/e2e_advanced_traces.rsroutine_news_digesttest with mock HTTP exchangestests/support/test_rig.rswith_routines()+with_http_exchanges()builder methodsTest plan
cargo check --all-featurespassescargo clippy --all --all-features— zero warningscargo fmt -- --check— cleanApprovalContext::is_blocked()+is_blocked_or_default()(5 tests intool.rs)worker.rs)scheduler.rs)tool_permissionsroutine_news_digestpasses — exercises full stack: routine_create → routine_fire → http (mock) → memory_write → message broadcast🤖 Generated with Claude Code