feat(tools): shell tool security hardening#1097
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens the exec tool by replacing the regex/os/exec-based shell executor with an in-process mvdan.cc/sh/v3 interpreter plus AST/runtime interception to classify and block risky commands, sanitize environment variables, and restrict shell redirection file access to the workspace. It also updates configuration/docs and ensures cron-executed commands use the same security path.
Changes:
- Introduces
pkg/tools/shell/runner with risk classification, env allowlisting, and redirection sandboxing; adaptsExecToolto use it (including async/background support). - Deprecates legacy deny/allow regex config fields and adds new config knobs (
risk_threshold,risk_overrides,arg_modifiers,env_allowlist,env_set) with updated defaults and docs. - Adds/updates tests for runner, sandbox, env sanitization, async exec behavior, and cron integration.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/tools/shell_tool.go | New ExecTool adapter using the in-process shell runner + async/background support + config wiring. |
| pkg/tools/shell_tool_test.go | Tests sync/async execution and callback behavior. |
| pkg/tools/cron_exec_test.go | Ensures scheduled (cron) commands go through the same blocking path. |
| pkg/tools/shell/runner.go | New interpreter-backed execution pipeline with ExecHandlers, timeout handling, and output capture/truncation. |
| pkg/tools/shell/runner_test.go | End-to-end runner tests (blocking, timeout, env, pipelines, etc.). |
| pkg/tools/shell/risk.go | Risk levels, command table, arg modifiers, overrides, and blocked-command error formatting. |
| pkg/tools/shell/risk_test.go | Unit tests for base table, modifiers, overrides, parsing, and matching helpers. |
| pkg/tools/shell/env.go | Builds sanitized environment (default allowlist + prefixes + env_set override). |
| pkg/tools/shell/env_test.go | Verifies secret filtering, allowlist extension, and env_set behavior. |
| pkg/tools/shell/sandbox.go | OpenHandler implementation restricting redirections to workspace (with symlink-escape prevention + safe paths). |
| pkg/tools/shell/sandbox_test.go | Tests workspace allow/deny, safe paths, symlink escape, and dotted-name regression. |
| pkg/config/config.go | Adds new exec config fields and keeps old ones as deprecated/ignored. |
| pkg/config/defaults.go | Updates default exec config to use risk_threshold: "medium". |
| docs/tools_configuration.md | Documents new exec tool configuration + behavior model. |
| docs/design/DDR-shell-tool-hardening.md | Design Decision Record documenting threat model/approach/migration. |
| config/config.example.json | Updates example config to new exec tool fields. |
| pkg/tools/shell.go | Removes legacy regex-based executor implementation. |
| pkg/tools/shell_test.go | Removes legacy exec tool tests (superseded by new runner/tool tests). |
| pkg/tools/shell_timeout_unix_test.go | Removes legacy process-group timeout test (needs replacement coverage under new runner). |
| pkg/tools/shell_process_unix.go | Removes legacy Unix process-group termination helpers. |
| pkg/tools/shell_process_windows.go | Removes legacy Windows process termination helpers. |
| go.mod | Adds mvdan.cc/sh/v3 dependency (and updates module requirements). |
| go.sum | Updates dependency checksums for new modules/transitives. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Does this fix the fact that skills scripts get the entire environment including all the secrets API keys etc. |
if it is executed with same shell tool yes. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 30 out of 31 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The async callback created in the agent loop was only logging and doing nothing, forcing SubagentManager to import bus and deliver results directly. This moves result delivery responsibility to the loop side where it belongs, and removes the bus dependency from SubagentManager. Ref: sipeed#1070 (comment by @blib), sipeed#1097 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The async callback created in the agent loop was only logging and doing nothing, forcing SubagentManager to import bus and deliver results directly. This moves result delivery responsibility to the loop side where it belongs, and removes the bus dependency from SubagentManager. Ref: sipeed#1070 (comment by @blib), sipeed#1097 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace regex deny-patterns with in-process POSIX interpreter (mvdan.cc/sh/v3) and four-tier risk classification. Add env sanitization, file-access sandboxing, and argument-aware risk modifiers. Cron commands now share the same security path. Signed-off-by: Boris Bliznioukov <blib@mail.com>
… sandboxing for shell commands Signed-off-by: Boris Bliznioukov <blib@mail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Boris Bliznioukov <blib@mail.com>
…ests for path handling Signed-off-by: Boris Bliznioukov <blib@mail.com>
…commands; add Windows-specific tests and deprecate old config fields Signed-off-by: Boris Bliznioukov <blib@mail.com>
Signed-off-by: Boris Bliznioukov <blib@mail.com>
…t import Signed-off-by: Boris Bliznioukov <blib@mail.com>
…round delivery Signed-off-by: Boris Bliznioukov <blib@mail.com>
…llback handling Signed-off-by: Boris Bliznioukov <blib@mail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ling Signed-off-by: Boris Bliznioukov <blib@mail.com>
Signed-off-by: Boris Bliznioukov <blib@mail.com>
f8e0da6 to
88da084
Compare
…mand_DeepPath for clarity Signed-off-by: Boris Bliznioukov <blib@mail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ce risk classification Signed-off-by: Boris Bliznioukov <blib@mail.com>
|
I am extracting the environment sanitisation into a separate PR, breaking things down into smaller more digestable chunks. |
|
@blib Hi! This PR has had no activity for over 2 weeks, so I'm closing it for now to keep things organized. Feel free to reopen anytime if you'd like to continue. |
📝 Description
Replace the regex-based deny-pattern shell executor with an AST-based risk classifier built on an in-process POSIX interpreter (
mvdan.cc/sh/v3).The old approach used
os/execwith regex deny lists — brittle, bypassable via shell metacharacters, and platform-specific (separate_unix/_windowsfiles). The new design parses every command into an AST before execution, classifying each resolved binary against a four-tier risk table (low→medium→high→critical). Commands above the configured threshold are blocked before any process is spawned.Key changes:
rmishigh,rm -rfiscritical). User-configurable overrides and extra modifiers via config.env_setsupport.OpenHandlerrestricts shell redirections to workspace directory, with symlink-escape prevention.enable_deny_patterns/custom_deny_patternsfields deprecated with warnings; newrisk_threshold,risk_overrides,arg_modifiers,env_allowlist,env_set.Design documented in DDR (Design Decision Record):
docs/design/DDR-shell-tool-hardening.md🗣️ Type of Change
🤖 AI Code Generation
🔗 Related Issue
Supersedes: #820
Addresses shell tool security hardening — see
docs/design/DDR-shell-tool-hardening.mdfor full threat model and acceptance criteria.📚 Technical Context (Skip for Docs)
a=sudo; $a rm -rf /), command substitution, and shell metacharacters. Moving to an in-process AST interpreter (mvdan.cc/sh/v3) lets us classify every resolved binary at execution time, eliminating regex evasion entirely. The four-tier risk model replaces the binary allow/deny with graduated control, and env sanitization prevents credential leakage.🧪 Test Environment
☑️ Checklist