99.6% Token Reduction through CLI-based scripts and progressive tool discovery for Model Context Protocol (MCP) servers.
Note: This project is optimized for Claude Code with native Skills support. The core runtime works with any AI agent. Scripts with CLI arguments achieve 99.6% token reduction.
An enhanced implementation of Anthropic's Code Execution with MCP pattern, optimized for Claude Code, combining the best ideas from the MCP community and adding significant improvements:
- Scripts with CLI Args: Reusable Python workflows with command-line parameters (99.6% token reduction)
- Multi-Transport: Full support for stdio, SSE, and HTTP MCP servers
- Container Sandboxing: Optional rootless isolation with security controls
- Type Safety: Pydantic models throughout with full validation
- Production-Ready: 129 passing tests, comprehensive error handling
Native Skills Support: This project includes proper Claude Code Skills integration:
- `.claude/skills/` - Skills in Claude Code's native format (SKILL.md + workflow.py)
- Auto-discovery - Claude Code automatically finds and validates Skills
- 2 Generic Examples - simple-fetch, multi-tool-pipeline (templates for custom workflows)
- Format Compliant - YAML frontmatter, validation rules, progressive disclosure
Dual-layer architecture:
- Layer 1: Claude Code Skills (`.claude/skills/`) - Native discovery and format
- Layer 2: Scripts (`./scripts/`) - CLI-based Python workflows with argparse
Token efficiency:
- Core runtime: 98.7% reduction (Anthropic's filesystem pattern)
- Scripts with CLI args: 99.6% reduction (no file editing needed)
Note: Scripts work with any AI agent. Claude Code Skills provide native auto-discovery for Claude Code users.
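The headline percentages follow directly from the token counts quoted in the comparison table later in this README (~27,300 tokens for upfront schema loading, ~110 for CLI scripts); the 98.7% figure is Anthropic's, from an example starting at roughly 150,000 tokens. A quick sanity check:

```python
# Sanity-checking the headline reduction figures from the token counts
# cited in this README. The 150,000-token baseline is from Anthropic's
# own example, not this project's measurements.
def reduction(before: int, after: int) -> float:
    return round(100 * (1 - after / before), 1)

cli_scripts = reduction(27_300, 110)    # scripts with CLI args vs full schemas
anthropic = reduction(150_000, 2_000)   # Anthropic's reported example
print(cli_scripts, anthropic)           # 99.6 98.7
```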
This project builds upon and merges ideas from:
- ipdelete/mcp-code-execution - Original implementation of Anthropic's PRIMARY pattern
  - Filesystem-based progressive disclosure
  - Type-safe Pydantic wrappers
  - Schema discovery system
  - Lazy server connections
- elusznik/mcp-server-code-execution-mode - Production security patterns
  - Container sandboxing architecture
  - Comprehensive security controls
  - Production deployment patterns
Our contribution: Merged the best of both, added CLI-based scripts pattern, implemented multi-transport support, and refined the architecture for maximum efficiency.
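The type-safe Pydantic wrappers inherited from ipdelete's implementation look roughly like this sketch for a `git_log` tool — an input model that validates arguments before they reach the server. Field names and constraints here are illustrative assumptions, not the generated file's exact contents:

```python
# Illustrative sketch of a generated typed wrapper's input model.
# Field names/constraints are assumptions for demonstration only.
from pydantic import BaseModel, Field


class GitLogInput(BaseModel):
    repo_path: str = Field(description="Path to the git repository")
    max_count: int = Field(default=10, ge=1, description="Commits to return")


# Validation happens at construction time; bad values raise ValidationError
params = GitLogInput(repo_path=".", max_count=5)
print(params.model_dump())
```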
Native Skills format in .claude/skills/ directory:
```
.claude/skills/
├── simple-fetch/
│   ├── SKILL.md       # YAML frontmatter + markdown instructions
│   └── workflow.py    # → symlink to ../../scripts/simple_fetch.py
└── multi-tool-pipeline/
    ├── SKILL.md       # Multi-tool orchestration example
    └── workflow.py    # → symlink to ../../scripts/multi_tool_pipeline.py
```
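Laying out a new Skill in this shape can be scripted. The sketch below mirrors the tree above using a temporary directory as a stand-in for the project root; `my-skill` and `my_script.py` are hypothetical names, and the relative symlink depth is adjusted for this layout:

```python
# Sketch: creating a Skill directory that symlinks workflow.py to an
# existing script. Names are hypothetical; tempdir stands in for the repo.
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "scripts").mkdir()
(root / "scripts" / "my_script.py").write_text("# workflow body\n")

skill = root / ".claude" / "skills" / "my-skill"
skill.mkdir(parents=True)
(skill / "SKILL.md").write_text("---\nname: my-skill\ndescription: example\n---\n")
# Three levels up from the skill directory back to the project root
(skill / "workflow.py").symlink_to(Path("../../../scripts/my_script.py"))

print((skill / "workflow.py").read_text())
```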
How it works:
- Claude Code auto-discovers Skills in `.claude/skills/`
- Reads SKILL.md (follows Claude Code's format spec)
- Executes workflow.py (which is a script) with CLI arguments
- Returns results
Benefits:
- ✅ Native Claude Code discovery
- ✅ Standard SKILL.md format (YAML + markdown)
- ✅ Validation compliant (name, description rules)
- ✅ Progressive disclosure compatible
- ✅ Generic examples as templates
Documentation: See .claude/skills/README.md for details
CLI-based Python workflows that agents execute with parameters:
```bash
# Simple example (generic template)
uv run python -m runtime.harness scripts/simple_fetch.py \
  --url "https://example.com"

# Pipeline example (generic template)
uv run python -m runtime.harness scripts/multi_tool_pipeline.py \
  --repo-path "." \
  --max-commits 5
```

Benefits over writing scripts from scratch:
- 18x fewer tokens: ~110 vs ~2,000
- 24x faster: 5 seconds vs 2 minutes
- Immutable templates: No file editing
- Reusable workflows: Same logic, different parameters
What's included:
- 2 generic template scripts (simple_fetch.py, multi_tool_pipeline.py)
- Complete pattern documentation
Full support for all MCP transport types:
```json
{
  "mcpServers": {
    "local-tool": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git"]
    },
    "jina": {
      "type": "sse",
      "url": "https://mcp.jina.ai/sse",
      "headers": {"Authorization": "Bearer YOUR_KEY"}
    },
    "exa": {
      "type": "http",
      "url": "https://mcp.exa.ai/mcp",
      "headers": {"x-api-key": "YOUR_KEY"}
    }
  }
}
```

Optional rootless container execution with comprehensive security:
```bash
# Sandbox mode with security controls
uv run python -m runtime.harness workspace/script.py --sandbox
```

Security features:
- Rootless execution (UID 65534:65534)
- Network isolation (--network none)
- Read-only root filesystem
- Memory/CPU/PID limits
- Capability dropping (--cap-drop ALL)
- Timeout enforcement
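Taken together, these controls might combine into a single rootless container invocation along these lines. This is an illustrative sketch of the flag set, not the exact command this project's sandbox builds:

```python
# Sketch: combining the documented security controls into one rootless
# podman command. Flag values mirror the list above; the project's actual
# sandbox implementation may assemble this differently.
def sandbox_cmd(script: str, image: str = "python:3.11-slim",
                memory: str = "512m") -> list[str]:
    return [
        "podman", "run", "--rm",
        "--user", "65534:65534",               # rootless: nobody
        "--network", "none",                   # network isolation
        "--read-only",                         # read-only root filesystem
        "--tmpfs", "/tmp",                     # writable scratch space
        "--memory", memory,                    # memory limit
        "--pids-limit", "128",                 # PID limit
        "--cap-drop", "ALL",                   # drop all capabilities
        "--security-opt", "no-new-privileges",
        image, "python", script,
    ]


cmd = sandbox_cmd("workspace/script.py")
print(" ".join(cmd))
```

Timeout enforcement would sit outside the container command itself, e.g. via `subprocess.run(cmd, timeout=30)`.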
- Python 3.11 or 3.12 (3.14 not recommended due to anyio compatibility issues)
- uv package manager (v0.5.0+)
- Claude Code (optional, for Skills auto-discovery)
- Git (for cloning repository)
- Docker or Podman (optional, for sandbox mode)
If you don't have uv installed:
```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Verify installation
uv --version
```

```bash
# Clone repository
git clone https://github.com/yourusername/mcp-code-execution-enhanced.git
cd mcp-code-execution-enhanced

# Install dependencies (creates .venv automatically)
uv sync

# Verify installation
uv run python -c "from runtime.mcp_client import get_mcp_client_manager; print('✅ Installation successful')"
```

Important for Claude Code users: This project uses its own `mcp_config.json` for MCP server configuration, separate from Claude Code's global configuration (`~/.claude.json`). To avoid conflicts, use different servers in each configuration or disable overlapping servers in `~/.claude.json` while using this project.
Create mcp_config.json from the example:
```bash
# Copy example config (includes git + fetch for examples)
cp mcp_config.example.json mcp_config.json
```

This config works out of the box:
```json
{
  "mcpServers": {
    "git": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    },
    "fetch": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  },
  "sandbox": {
    "enabled": false
  }
}
```

To add more servers: Edit `mcp_config.json` and add your own MCP servers. See docs/TRANSPORTS.md for examples of stdio, SSE, and HTTP transports.
```bash
# Auto-generate typed Python wrappers from your MCP servers
uv run mcp-generate

# This creates ./servers/<server_name>/<tool>.py files
# Example: servers/git/git_log.py, servers/fetch/fetch.py
```

```bash
# Test with a simple script
uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com"

# If you configured a git server, test the pipeline
uv run python -m runtime.harness scripts/multi_tool_pipeline.py --repo-path "." --max-commits 5
```

If you want to use container sandboxing:
```bash
# Install Podman (recommended, rootless)
sudo apt-get install -y podman   # Ubuntu/Debian
brew install podman              # macOS

# OR install Docker
curl -fsSL https://get.docker.com | sh

# Verify
podman --version   # or docker --version

# Test sandbox mode
uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com" --sandbox
```

If using Claude Code, the Skills are already configured in `.claude/skills/` and will be auto-discovered. No additional setup needed!
To use:
- Claude Code will automatically find Skills in `.claude/skills/`
- Just ask Claude to use them naturally
- Example: "Fetch https://example.com" → Claude discovers and uses the simple-fetch Skill
For multi-step workflows (research, data processing, synthesis):
- Discover scripts: `ls ./scripts/` → see available script templates
- Read documentation: `cat ./scripts/simple_fetch.py` → see CLI args and pattern
- Execute with parameters: `uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com"`
Generic template scripts (scripts/):
- `simple_fetch.py` - Basic single-tool execution pattern
- `multi_tool_pipeline.py` - Multi-tool chaining pattern
Note: These are templates - use them as examples to create workflows for your specific MCP servers and use cases.
For simple tasks or novel workflows:
- Explore tools: `ls ./servers/` → discover available MCP tools
- Write script: Create a Python script using tool imports
- Execute: `uv run python -m runtime.harness workspace/script.py`
Example script:
```python
import asyncio

from runtime.mcp_client import call_mcp_tool


async def main():
    # Call the git server's git_log tool through the MCP client
    result = await call_mcp_tool(
        "git__git_log",
        {"repo_path": ".", "max_count": 10},
    )
    print(f"Fetched {len(result)} commits")
    return result


if __name__ == "__main__":
    asyncio.run(main())
```

Traditional Approach (High Token Usage):
Agent → MCP Server → [Full Tool Schemas 27,300 tokens] → Agent
Scripts with CLI Args (99.6% Reduction - PREFERRED):
Agent → Discovers scripts → Reads script docs → Executes with CLI args
Script → Multi-server orchestration → Returns results
Tokens: ~110 (script discovery + documentation)
Time: ~5 seconds
Script Writing (98.7% Reduction - ALTERNATIVE):
Agent → Discovers tools → Writes script
Script → MCP Server → Returns data
Agent → Processes/summarizes
Tokens: ~2,000 (tool discovery + script writing)
Time: ~2 minutes
- `runtime/mcp_client.py`: Lazy-loading MCP client manager with multi-transport support
- `runtime/harness.py`: Dual-mode script execution (direct/sandbox)
- `runtime/generate_wrappers.py`: Auto-generate typed wrappers from MCP schemas
- `runtime/sandbox/`: Container sandboxing with security controls
- `scripts/`: CLI-based workflow templates with 2 generic examples
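Tool calls address tools with a qualified `server__tool` name (e.g. `git__git_log` in the example script above). A splitter along these lines recovers the two parts; the helper name is illustrative, not this project's actual API:

```python
# Sketch: parsing the "server__tool" naming convention used by
# call_mcp_tool. Illustrative helper, not the project's actual code.
def split_tool_name(qualified: str) -> tuple[str, str]:
    server, sep, tool = qualified.partition("__")
    if not sep or not tool:
        raise ValueError(f"expected 'server__tool', got {qualified!r}")
    return server, tool


print(split_tool_name("git__git_log"))  # ('git', 'git_log')
```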
DON'T: Write scripts from scratch each time.
DO: Use pre-written scripts with CLI arguments.
"""
SCRIPT: Your Script Name
DESCRIPTION: What it does
CLI ARGUMENTS:
--query Research query (required)
--limit Max results (default: 10)
USAGE:
uv run python -m runtime.harness scripts/your_script.py \
--query "your question" \
--limit 5
"""
import argparse
import asyncio
import sys
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument("--query", required=True)
parser.add_argument("--limit", type=int, default=10)
# Filter script path from args
args_to_parse = [arg for arg in sys.argv[1:] if not arg.endswith(".py")]
return parser.parse_args(args_to_parse)
async def main():
args = parse_args()
# Your workflow logic here
return result
if __name__ == "__main__":
asyncio.run(main())See scripts/README.md for complete documentation.
```json
{
  "type": "stdio",
  "command": "uvx",
  "args": ["mcp-server-name"],
  "env": {"API_KEY": "your-key"}
}
```

```json
{
  "type": "sse",
  "url": "https://mcp.example.com/sse",
  "headers": {"Authorization": "Bearer YOUR_KEY"}
}
```

```json
{
  "type": "http",
  "url": "https://mcp.example.com/mcp",
  "headers": {"x-api-key": "YOUR_KEY"}
}
```

See docs/TRANSPORTS.md for detailed information.
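All three transport snippets above share a `type` discriminator, so a client manager can dispatch on that field. The function and return shapes below are an illustrative sketch, not this project's actual API:

```python
# Sketch: dispatching on the shared "type" field of a server config.
# Illustrative shapes only; real code would construct transport objects.
def make_transport(cfg: dict):
    kind = cfg.get("type", "stdio")  # stdio is the usual default
    if kind == "stdio":
        return ("stdio", cfg["command"], cfg.get("args", []))
    if kind in ("sse", "http"):
        return (kind, cfg["url"], cfg.get("headers", {}))
    raise ValueError(f"unknown transport type: {kind!r}")


print(make_transport({"type": "http", "url": "https://mcp.example.com/mcp"}))
```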
```json
{
  "sandbox": {
    "enabled": true,
    "runtime": "auto",
    "image": "python:3.11-slim",
    "memory_limit": "512m",
    "timeout": 30
  }
}
```

- Rootless execution: UID 65534:65534 (nobody)
- Network isolation: `--network none`
- Filesystem: Read-only root, writable tmpfs
- Resource limits: Memory, CPU, PID constraints
- Capabilities: All dropped (`--cap-drop ALL`)
- Security: `no-new-privileges`, SELinux labels
See SECURITY.md for complete security documentation.
```bash
# Run all tests (129 total)
uv run pytest

# Unit tests only
uv run pytest tests/unit/

# Integration tests (requires Docker/Podman for sandbox tests)
uv run pytest tests/integration/

# With coverage
uv run pytest --cov=src/runtime
```

- `README.md` (this file) - Overview and quick start
- `CLAUDE.md` - Quick reference for Claude Code
- `AGENTS.md.template` - Template for adapting to other AI frameworks
- `scripts/README.md` - Scripts system guide
- `scripts/SKILLS.md` - Complete scripts documentation
- `docs/USAGE.md` - Comprehensive user guide
- `docs/ARCHITECTURE.md` - Technical architecture
- `docs/CONFIGURATION.md` - MCP server configuration management (Claude Code vs project)
- `docs/TRANSPORTS.md` - Transport-specific details
- `SECURITY.md` - Security architecture and best practices
```bash
# Type checking
uv run mypy src/

# Formatting
uv run black src/ tests/

# Linting
uv run ruff check src/ tests/
```

```bash
# Generate wrappers from tool definitions
uv run mcp-generate

# (Optional) Generate discovery config with LLM parameter generation
uv run mcp-generate-discovery

# (Optional) Execute safe tools and infer schemas
uv run mcp-discover

# Execute a script with MCP tools available
uv run mcp-exec workspace/script.py

# Execute in sandbox mode
uv run mcp-exec workspace/script.py --sandbox
```

| Approach | Tokens | Time | Use Case |
|---|---|---|---|
| Traditional | 27,300 | N/A | All tool schemas loaded upfront |
| Scripts with CLI Args | 110 | 5 sec | Multi-step workflows (PREFERRED) |
| Script Writing | 2,000 | 2 min | Novel workflows (ALTERNATIVE) |
Scripts with CLI args achieve a 99.6% reduction, exceeding the 98.7% reported by Anthropic.
From ipdelete/mcp-code-execution:
- ✅ Filesystem-based progressive disclosure
- ✅ Type-safe Pydantic wrappers
- ✅ Lazy server connections
- ✅ Schema discovery system
From elusznik/mcp-server-code-execution-mode:
- ✅ Container sandboxing architecture
- ✅ Security controls and policies
- ✅ Production deployment patterns
Enhanced in this project:
- ✅ Scripts pattern: CLI-based immutable templates (99.6% reduction)
- ✅ Multi-transport: stdio + SSE + HTTP support (100% server coverage)
- ✅ Dual-mode execution: Direct (fast) + Sandbox (secure)
- ✅ Python 3.11 stable: Avoiding 3.14 anyio compatibility issues
- ✅ Comprehensive testing: 129 tests covering all features
- ✅ Enhanced documentation: Complete guides for all features
Scripts with CLI Arguments:
- Scripts are immutable templates executed with CLI arguments
- No file editing required (parameters via `--query`, `--num-urls`, etc.)
- Reusable across different queries and contexts
- Pre-tested and documented workflows
Multi-Transport:
- Single codebase supports all transport types
- Automatic transport detection
- Unified configuration format
- Seamless server connections
Dual-Mode Execution:
- Direct mode: Fast, full access (development)
- Sandbox mode: Secure, isolated (production)
- Same code, different security postures
- Runtime selection via flag or config
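The "flag or config" selection can be sketched as a simple precedence rule: an explicit CLI flag wins, otherwise the sandbox config decides. Names here are illustrative, not this project's actual code:

```python
# Sketch: runtime mode selection with CLI-flag precedence over config.
# Illustrative helper, not the harness's actual implementation.
def pick_mode(cli_sandbox: bool | None, config: dict) -> str:
    if cli_sandbox is not None:  # explicit --sandbox / --no-sandbox wins
        return "sandbox" if cli_sandbox else "direct"
    enabled = config.get("sandbox", {}).get("enabled", False)
    return "sandbox" if enabled else "direct"


print(pick_mode(None, {"sandbox": {"enabled": True}}))  # sandbox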
Minimal configuration:

```json
{
  "mcpServers": {
    "git": {
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    }
  }
}
```

Full configuration:

```json
{
  "mcpServers": {
    "local-stdio": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-name"],
      "env": {"API_KEY": "key"},
      "disabled": false
    },
    "remote-sse": {
      "type": "sse",
      "url": "https://mcp.example.com/sse",
      "headers": {"Authorization": "Bearer KEY"},
      "disabled": false
    },
    "remote-http": {
      "type": "http",
      "url": "https://mcp.example.com/mcp",
      "headers": {"x-api-key": "KEY"},
      "disabled": false
    }
  },
  "sandbox": {
    "enabled": false,
    "runtime": "auto",
    "image": "python:3.11-slim",
    "memory_limit": "512m",
    "cpu_limit": "1.0",
    "timeout": 30,
    "max_timeout": 120
  }
}
```

- Lazy Loading: Servers connect only when tools are called
- Type Safety: Pydantic models for all tool inputs/outputs
- Defensive Coding: Handles variable MCP response structures
- Auto-generated Wrappers: Typed Python functions from MCP schemas
- Field Normalization: Handles inconsistent API casing
- Scripts Pattern: CLI-based reusable workflows
- Multi-Transport: stdio, SSE, and HTTP support
- Container Sandboxing: Optional rootless isolation
- Comprehensive Testing: 129 tests with full coverage
- Complete Documentation: Guides for every feature
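The lazy-loading behavior of `runtime/mcp_client.py` can be sketched as connect-on-first-use with a connection cache. Class and method names below are illustrative, not the project's actual API:

```python
# Sketch of lazy server connections: a transport is opened only on the
# first tool call and cached afterwards. Illustrative names only.
class LazyClientManager:
    def __init__(self, config: dict):
        self._config = config
        self._connections: dict[str, object] = {}

    def _connect(self, server: str) -> object:
        # Real code would open a stdio/SSE/HTTP transport here
        return object()

    def get(self, server: str) -> object:
        if server not in self._connections:  # first use triggers the connect
            self._connections[server] = self._connect(server)
        return self._connections[server]


manager = LazyClientManager({"git": {"type": "stdio"}})
conn = manager.get("git")
print(conn is manager.get("git"))  # → True (same cached connection)
```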
See the examples/ directory for:
- `example_progressive_disclosure.py` - Classic token reduction pattern
- `example_tool_chaining.py` - LLM orchestration pattern
- `example_sandbox_usage.py` - Container sandboxing demo
- `example_sandbox_simple.py` - Basic sandbox usage
See the scripts/ directory for production-ready workflows.
"MCP server not configured"
- Check that `mcp_config.json` server names match your calls
"Connection closed"
- Verify the server command: `which <command>`
- Check server logs for startup errors
"Module not found"
- Run `uv run mcp-generate` to regenerate wrappers
- Ensure `src/` is on PYTHONPATH (the harness handles this)
Import errors in skills
- Skills must be run via harness (sets PYTHONPATH)
- Don't run skills directly: `python scripts/script.py` ❌
- Correct: `uv run python -m runtime.harness scripts/script.py` ✅
Python 3.14 compatibility:
- Not recommended due to anyio <4.9.0 breaking changes
- Use Python 3.11 or 3.12 for stability
- See issue tracker for updates
We welcome contributions! Areas of interest:
- New skills: Add more workflow templates
- MCP server support: Test with different servers
- Documentation: Improve guides and examples
- Testing: Expand test coverage
- Performance: Optimize token usage further
```bash
# Install with dev dependencies
uv sync --all-extras

# Run quality checks
uv run black src/ tests/
uv run mypy src/
uv run ruff check src/ tests/
uv run pytest
```

MIT License - see LICENSE file for details
- ipdelete/mcp-code-execution - Anthropic's PRIMARY pattern
- elusznik/mcp-server-code-execution-mode - Production security
| Feature | Original (ipdelete) | Bridge (elusznik) | Enhanced (this) |
|---|---|---|---|
| Progressive Disclosure | ✅ PRIMARY | ✅ PRIMARY | ✅ PRIMARY |
| Token Reduction | 98.7% | ~95% | 99.6% |
| Type Safety | ✅ Pydantic | | ✅ Enhanced |
| Sandboxing | ❌ None | ✅ Required | ✅ Optional |
| Multi-Transport | ❌ stdio only | ❌ stdio only | ✅ stdio/SSE/HTTP |
| Scripts Pattern | ❌ None | ❌ None | ✅ Yes + examples |
| CLI Execution | ❌ None | ❌ None | ✅ Immutable |
| Test Coverage | | | ✅ Comprehensive |
| Python 3.11 | ✅ Yes | | ✅ Stable |
- ✅ AI agents needing to orchestrate multiple MCP tools
- ✅ Research workflows (web search → read → synthesize)
- ✅ Data processing pipelines (fetch → transform → output)
- ✅ Code discovery (search → analyze → recommend)
- ✅ Production deployments requiring security isolation
- ✅ Teams needing reproducible research workflows
- ❌ Single tool calls (use MCP directly instead)
- ❌ Real-time interactive tools (better suited for direct integration)
- ❌ GUI applications (command-line focused)
- Install Python 3.11+ and uv
- Clone repository
- Run `uv sync`
- Create `mcp_config.json` with your MCP servers
- Run `uv run mcp-generate` to create wrappers
- Try a skill: `uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com"`
- Read `AGENTS.md` for the operational guide
- Explore `scripts/` for available workflows
- Review `docs/` for detailed documentation
Q: Why Skills and CLI scripts instead of writing scripts from scratch? A: Scripts with CLI args achieve 99.6% token reduction vs 98.7% for ad-hoc script writing, and execute ~24x faster (~5 sec vs ~2 min). They're pre-tested, documented, and immutable.
Q: Can I use this without Claude Code? A: Yes, with limitations. The core runtime (script writing, 98.7% reduction) works with any AI agent, and scripts with CLI args (99.6% reduction) can be executed by any agent; only the Skills auto-discovery layer is specific to Claude Code.
Q: Can I still write custom scripts? A: Yes! Scripts with CLI args are PREFERRED for common workflows (with Claude Code), but custom scripts are fully supported for novel use cases and other AI agents.
Q: What's the difference from the original projects? A: We merged the best of both (progressive disclosure + security), added CLI-based scripts pattern, multi-transport support, and refined the architecture.
Q: Why Python 3.11 instead of 3.14? A: anyio <4.9.0 has compatibility issues with Python 3.14's asyncio changes. 3.11 is stable and well-tested.
Q: Is sandboxing required? A: No, it's optional. Use direct mode for development (fast), sandbox mode for production (secure).
Q: How do I add my own MCP servers?
A: Add them to mcp_config.json, run uv run mcp-generate, and they're ready to use!
- Explore scripts: `ls scripts/` and `cat scripts/simple_fetch.py`
- Try examples: Run the example skills or create your own
- Read CLAUDE.md: Quick operational guide (for Claude Code users)
- Review docs/: Deep dive into architecture
- Create custom skill: Follow the template for your use case