Paper: Tracking Capabilities for Safer Agents (arXiv:2603.00991)
TACIT (Tracked Agent Capabilities In Types) is a safety harness for AI agents. Instead of calling tools directly, agents write code in Scala 3 with capture checking: a type system that statically tracks capabilities and enforces that agent code cannot forge access rights, cannot perform effects beyond its budget, and cannot leak information from pure sub-computations. It provides an MCP interface, so that it can be easily used by all MCP-compatible agents.
The framework has three main components:
- Scala 3 compiler. Agent-submitted code is validated and type-checked with capture checking enabled in safe mode, which enforces a capability-safe language subset.
- Scala REPL. A local REPL instance executes compiled code and manages state across interactions. Supports both stateless one-shot execution and stateful sessions.
- Capability safety library. A typed API that serves as the sole gateway through which agent code interacts with the real world: file system, process execution, network, and sub-agents. The library is extensible: add new capabilities by modifying only the library code, without changing the MCP server itself.
TACIT provides a standard MCP server that communicates via JSON-RPC over stdio. It works with any MCP-compatible agent, including Claude Code, OpenCode, GitHub Copilot, and others.
Requires JDK 17+
Use the release download script to get started quickly (no local build required). It will download the latest server and library JARs from GitHub releases and place them in the current directory.
# Download the script directly (no git clone required)
curl -fsSL https://raw.githubusercontent.com/lampepfl/TACIT/refs/heads/main/download_release.sh -o download_release.sh
chmod +x download_release.sh
# Run it
./download_release.shOptional:
# Or use wget instead of curl
wget -q https://raw.githubusercontent.com/lampepfl/TACIT/refs/heads/main/download_release.sh -O download_release.sh
chmod +x download_release.sh
# Download into a custom directory
./download_release.sh ./dist
# Use latest pre-release instead of latest stable release
./download_release.sh --pre-release ./distBy default, this downloads:
| JAR | Default path |
|---|---|
| MCP Server | ./TACIT.jar |
| Library | ./TACIT-library.jar |
Requires JDK 17+ and sbt 1.12+.
git clone https://github.com/lampepfl/TACIT.git
cd TACIT
./build.shOptional:
# Build and copy JARs into a custom directory
./build.sh ./dist
# Show full sbt output while building
./build.sh --verboseThis builds and copies two JARs:
| JAR | Path |
|---|---|
| MCP Server | ./TACIT.jar (or ./dist/TACIT.jar) |
| Library | ./TACIT-library.jar (or ./dist/TACIT-library.jar) |
Add TACIT as an MCP server in your agent's configuration. Replace the paths below with absolute paths to your JARs (downloaded or built).
Claude Code
Add to your project's .mcp.json (or ~/.claude.json for global):
{
"mcpServers": {
"tacit": {
"command": "java",
"args": [
"-jar", "/path/to/TACIT.jar",
"--library-jar", "/path/to/TACIT-library.jar"
]
}
}
}OpenCode
Add to your opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"tacit": {
"type": "local",
"enabled": true,
"command": [
"java",
"-jar", "/path/to/TACIT.jar",
"--library-jar", "/path/to/TACIT-library.jar"
]
}
}
}GitHub Copilot (VS Code)
Add to your .vscode/mcp.json:
{
"servers": {
"tacit": {
"command": "java",
"args": [
"-jar", "/path/to/TACIT.jar",
"--library-jar", "/path/to/TACIT-library.jar"
]
}
}
}Your agent can now use TACIT's tools to execute sandboxed Scala code.
To fully benefit from TACIT's capability-based safety, disable the agent's built-in file, shell, and network tools so that all operations go through the sandboxed REPL.
Claude Code
Launch with --disallowedTools to block built-in tools:
claude --disallowedTools "Bash,Read,Write,Edit,WebFetch"Or add to your project's .claude/settings.json:
{
"permissions": {
"disallowedTools": ["Bash", "Read", "Write", "Edit", "WebFetch"]
}
}OpenCode
Set built-in tool permissions to "deny" in opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"permission": {
"*": "ask",
"bash": "deny",
"read": "deny",
"edit": "deny",
"glob": "deny",
"grep": "deny",
"list": "deny",
"tacit*": "allow"
},
"mcp": {
"tacit": { "..." : "..." }
}
}GitHub Copilot (VS Code)
In your VS Code settings.json, restrict the tools available to Copilot:
{
"github.copilot.chat.agent.tools": {
"terminal": false,
"fs_read": false,
"fs_write": false
}
}The server can be configured via CLI flags or a JSON config file. Pass flags directly in your agent's MCP args, or use --config to point to a JSON file.
| Flag | Description |
|---|---|
-r/--record <dir> |
Log every execution to disk |
-s/--strict |
Block file ops (cat, ls, rm, etc.) through exec |
--classified-paths <paths> |
Comma-separated classified (protected) paths |
--library-jar <path> |
Required. Path to the library JAR (TACIT-library.jar) |
-q/--quiet |
Suppress startup banner and request/response logging |
--no-wrap |
Disable wrapping user code in def run() = ... ; run() |
--no-session |
Disable session-related tools |
-c/--config <path> |
JSON config file (flags after --config override file values) |
--llm-base-url <url> |
LLM API base URL |
--llm-api-key <key> |
LLM API key |
--llm-model <name> |
LLM model name |
{
"recordPath": "/tmp/recordings",
"strictMode": true,
"quiet": false,
"wrappedCode": true,
"sessionEnabled": false,
"classifiedPaths": ["/home/user/project/secrets"],
"libraryJarPath": "/path/to/TACIT-library.jar",
"llm": {
"baseUrl": "https://api.example.com",
"apiKey": "sk-...",
"model": "gpt-..."
}
}| Tool | Parameters | Description |
|---|---|---|
execute_scala |
code |
Execute a Scala snippet in a fresh REPL (stateless) |
create_repl_session |
- | Create a persistent REPL session, returns session_id |
execute_in_session |
session_id, code |
Execute code in an existing session (stateful) |
list_sessions |
- | List active session IDs |
delete_repl_session |
session_id |
Delete a session |
show_interface |
- | Show the full capability API reference |
1. create_repl_session → session_id: "abc-123"
2. execute_in_session(code: "val x = 42") → x: Int = 42
3. execute_in_session(code: "x * 2") → val res0: Int = 84
4. delete_repl_session(session_id: "abc-123")
TACIT's type system provides three safety guarantees that hold regardless of whether the agent is misaligned, hallucinating, or under prompt injection attack:
| Property | What it means |
|---|---|
| Capability safety | Capabilities cannot be forged or forgotten. The agent can only access resources through capabilities explicitly granted to it. |
| Capability completeness | Capabilities regulate all safety-relevant effects. The agent interacts with the world only through its granted capabilities. |
| Local purity | Specific computations can be enforced as side-effect-free. This prevents information leakage when agents process classified data. |
The library exposes three capability request methods, each scoping access to a block. Capabilities cannot escape their scoped block. This is enforced at compile time by the capture checker.
// File system: scoped to a root directory
requestFileSystem("/tmp/work") {
val f = access("data.txt")
f.write("hello")
val lines = f.readLines()
grep("data.txt", "hello")
find(".", "*.txt")
}
// Process execution: scoped to an allowlist of commands
requestExecPermission(Set("ls", "cat")) {
val result = exec("ls", List("-la"))
println(result.stdout)
}
// Network: scoped to an allowlist of hosts
requestNetwork(Set("api.example.com")) {
val body = httpGet("https://api.example.com/data")
httpPost("https://api.example.com/submit", """{"key":"value"}""")
}Consider a typical code agent working on a project directory. Some files are ordinary (source code, build configs, READMEs). Others are sensitive: API keys in .env, credentials in secrets/, internal documents. The agent is powered by a cloud-hosted LLM (a third-party service). We want the agent to use or process the sensitive data (summarize internal docs, rotate keys, process reports) but never leak it to the cloud provider.
TACIT solves this through the Classified[T] type. Files under designated classified paths (configured via --classified-paths) return their content wrapped in Classified[String] instead of plain String. The type system then enforces the key property:
Pure-only access. Classified.map accepts only pure functions (T -> U), meaning no effects, not capturing any capability. You can transform the data, but you cannot send it anywhere.
Any attempt to exfiltrate classified data is rejected at compile time:
requestFileSystem("/project") {
val secret = readClassified("secrets/api-key.txt")
// Compile error: map captures the file capability, not a pure function
secret.map: s =>
access("exfil.txt").write(s) // error: capturing f is not allowed
s
// Compile error: print out the classified content to the cloud LLM
secret.map: s =>
println(s) // error: capturing IOCapability is not allowed
s
}So how can the agent do useful work with classified data? Through a dual LLM design: a separate trusted local LLM processes classified content. The framework provides a chat overload that accepts Classified[String] and returns Classified[String]. The trusted LLM sees the content, but the result stays wrapped and can never flow back to the untrusted cloud model.
requestFileSystem("/project") {
// OK: read classified content
val doc = readClassified("secrets/contract-v2.txt")
// OK: pure transformation
val upper = doc.map(_.trim)
// OK: send to trusted local LLM, result stays Classified
val summary = chat("Summarize the following document:", doc)
// summary: Classified[String], content is still protected
// OK: write back to a classified file
writeClassified("secrets/summary.txt", summary)
}Agent-generated code is compiled under Scala 3's safe mode (import language.experimental.safe), which enforces a capability-safe language subset:
- No unchecked type casts or pattern matches
- No features from the
caps.unsafemodule - No
@uncheckedannotations - No runtime reflection
- Compile with capture checking and explicit nulls enabled, tracking all mutation effects
- Global objects and functions accessible only if they are implemented safely
These restrictions prevent agents from "forgetting" capabilities through unsafe casts, reflection, or type system holes. Code that does not pass compilation is never executed.
The safe mode is an experimental feature and still under active development and testing. We current use a static code validator that checks for forbidden patterns to enforce the safe mode subset. We are planning to migrate to the official Scala 3 safe mode once soon.
A secondary LLM is available through the chat method, no capability scope required. Safety comes from the Classified type system: chat(String): String for regular data, chat(Classified[String]): Classified[String] for sensitive data.
// Regular chat
val answer = chat("What is 2 + 2?")
// Classified chat: input and output stay wrapped
requestFileSystem("/secrets") {
val secret = readClassified("/secrets/key.txt")
val result = chat(secret.map(s => s"Summarize: $s"))
// result is Classified[String], cannot be printed or leaked
}Configure via CLI flags (--llm-base-url, --llm-api-key, --llm-model) or a JSON config file (--config). Any OpenAI-compatible API is supported.
We evaluate TACIT on safety and expressiveness (see paper Section 4 for full details).
Safety (RQ1). In classified mode (secrets wrapped in Classified[String]), both Claude Sonnet 4.6 and MiniMax M2.5 achieve 100% security across all 131 trials. Every injection and malicious task is blocked by the type system. Utility remains high (99.2% for Sonnet, 90.0% for MiniMax).
Expressiveness (RQ2). On τ2-bench and SWE-bench Lite, agents using TACIT's capability-safe harness match or slightly exceed standard tool-calling baselines across all tested models (gpt-oss-120b, MiniMax M2.5, DeepSeek V3.2), demonstrating that writing type-safe Scala does not degrade agentic performance.
The library (library/) defines the capability API that user code can call inside the REPL.
To implement custom permissions and fine-grained access control, you can add new capabilities (e.g., database access, message queues, server management) by modifying the library and rebuilding just the library JAR.
library/
├── Interface.scala # Public API trait (what user code sees)
├── impl/
│ ├── InterfaceImpl.scala # Wires everything together (exports Ops objects)
│ ├── FileOps.scala # grep, grepRecursive, find
│ ├── ProcessOps.scala # exec, execOutput
│ ├── WebOps.scala # httpGet, httpPost
│ ├── LlmOps.scala # chat
│ ├── RealFileSystem.scala # FileSystem on real disk
│ ├── VirtualFileSystem.scala # In-memory FileSystem (for testing)
│ ├── ClassifiedImpl.scala # Classified[T] wrapper implementation
│ ├── CommandValidator.scala # Command allowlist enforcement
│ └── LlmConfig.scala # LLM configuration case class
└── test/ # Library-level tests
Here is an example of adding a hypothetical requestDatabase capability.
// Add a result type
case class QueryResult(columns: List[String], rows: List[List[String]])
// Add a capability class
class DatabasePermission(val connectionString: String) extends caps.SharedCapability
// Add methods to the Interface trait
trait Interface:
// ... existing methods ...
def requestDatabase[T](connectionString: String)(op: DatabasePermission^ ?=> T)(using IOCapability): T
def query(sql: String)(using DatabasePermission): QueryResultKey points:
- The capability class must extend
caps.SharedCapability. This is what enables Scala 3's capture checker to prevent the capability from escaping its scoped block. - The
request*method takes a blockopthat receives the capability as a context parameter (?=>). The^mark means the capability is tracked by the capture checker. - Operation methods (like
query) take the capability as ausingparameter, so they can only be called inside the correspondingrequest*block.
Create library/impl/DatabaseOps.scala:
package tacit.library
import language.experimental.captureChecking
object DatabaseOps:
def query(sql: String)(using perm: DatabasePermission): QueryResult =
// Your implementation here
// perm.connectionString has the connection info
???In library/impl/InterfaceImpl.scala, export your new operations and implement the request* method:
class InterfaceImpl(...) extends Interface:
export FileOps.*
export ProcessOps.*
export WebOps.*
export DatabaseOps.* // ← add this
// ... existing methods ...
def requestDatabase[T](connectionString: String)(op: DatabasePermission^ ?=> T)(using IOCapability): T =
val perm = new DatabasePermission(connectionString)
op(using perm)If your new API wraps a Java/Scala library that users should not call directly, add forbidden patterns to src/main/scala/executor/CodeValidator.scala:
ForbiddenPattern("db-jdbc", raw"java\.sql\b".r, "Direct JDBC access is forbidden; use requestDatabase"),
ForbiddenPattern("db-driver", raw"DriverManager".r, "DriverManager is forbidden; use requestDatabase"),This ensures user code goes through the capability API instead of bypassing it.
If your new API requires external libraries, add them to the lib project in build.sbt:
lazy val lib = project
.in(file("library"))
.settings(
// ... existing settings ...
libraryDependencies ++= Seq(
"com.openai" % "openai-java" % "4.23.0",
"org.postgresql" % "postgresql" % "42.7.3", // ← add your dep
),
)sbt "lib/assembly"You do not need to rebuild the server JAR unless you changed CodeValidator (step 4) or other server-side code. Just point the server at the new library JAR:
java -jar server.jar --library-jar new-library.jar-
Capabilities must extend
caps.SharedCapability. This is what makes capture checking work. Without it, the compiler cannot track the capability's scope and users could leak it out of therequest*block. -
Capture checking is experimental. The project uses
-language:experimental.captureChecking. Compiler behavior may change across Scala 3 nightly versions. If you hit unexpected errors, check if the issue is with capture checking by temporarily removing the flag. -
The library uses Scala 3 nightly. The build automatically fetches the latest Scala 3 nightly. This means your code must be compatible with bleeding-edge Scala. Pin a specific version in
build.sbt(val scala3Version = "3.x.y") if you need stability. -
Interface.scalais bundled as a resource. The server copiesInterface.scalainto its resources at build time so theshow_interfacetool can display it. If you add new APIs, users will see them viashow_interfaceautomatically, no extra work needed. -
Forbidden patterns run on user code, not library code. The validator in
CodeValidator.scalaonly checks user-submitted code. The library itself can freely usejava.io,java.net,ProcessBuilder, etc. in its implementation. But if your new API wraps a Java API, you should add a corresponding forbidden pattern so users cannot bypass your capability wrapper. -
The library JAR is a fat JAR.
sbt "lib/assembly"produces a JAR that includes all of the library's dependencies (e.g.,openai-java). If you add a dependency, it will be bundled automatically. -
Server depends on library types at compile time. The server depends on the interface type to run the REPL. Make sure your change is compatible with the server's expected interface.
-
Test your API at the library level first. The
library/test/directory contains library-level tests using MUnit. Test your new operations there before doing integration tests through the MCP server. SeeLibrarySuite.test.scalafor examples.
Requirements
- JDK 17+
- sbt 1.12+
sbt clean # Clean build artifacts
sbt compile # Compile
sbt test # Run all tests
sbt "testOnly *McpServerSuite" # Run a single suite
sbt assembly # Build both JARs (server + library)
sbt "lib/assembly" # Build library JAR onlyRunning the server directly (without an agent)
# Basic
java -jar target/scala-*/TACIT-assembly-*.jar \
--library-jar library/target/scala-*/TACIT-library.jar
# With logging
java -jar server.jar --library-jar library.jar --record ./log
# With JSON config
java -jar server.jar --library-jar library.jar --config config.json@misc{odersky2026trackingcapabilitiessaferagents,
title={Tracking Capabilities for Safer Agents},
author={Martin Odersky and Yaoyu Zhao and Yichen Xu and Oliver Bračevac and Cao Nguyen Pham},
year={2026},
eprint={2603.00991},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2603.00991},
}Apache-2.0

