Description
What would you like to be added?
Add an optional “client-side redaction before request” feature: before sending any prompt/history/tool output to any LLM provider, scan outgoing text and replace detected secrets with placeholders; locally restore placeholders when displaying model output and before executing tool/function call arguments.
Goal: secrets never leave the local machine (are never transmitted to the LLM provider), while keeping tool execution correct.
Why is this needed?
qwen-code aggregates local context and tool outputs into prompts. It’s easy to accidentally send real secrets to third-party providers, e.g. pasting .env content or reading secret files via tools (tokens, private keys, etc.) that then enter the prompt history.
Sandboxing restricts actions/permissions, but it doesn’t prevent sensitive strings from being included in the prompt. A last-mile client-side redaction layer helps reduce accidental secret leakage.
Additional context
1) Scope
Outbound (before API request):
- user input
- full history / context that will be included in the request (including tool outputs, IDE context, etc.)
Inbound (after model response):
- model text (optionally restore before rendering)
- tool/function call arguments (must restore before execution to avoid writing placeholders into files/commands)
2) Placeholder format (keep consistent with VibeGuard)
Placeholder format:
__VG_{CATEGORY}_{hash12}__
hash12:
- first 12 lowercase hex chars of HMAC-SHA256(sessionSecret, original)
- stable within a session, non-reversible for providers
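A minimal sketch of how such a placeholder could be derived, assuming Node's built-in `crypto` module. The helper name `makePlaceholder` and the 32-byte random session secret are illustrative assumptions, not existing qwen-code or VibeGuard APIs.

```typescript
import { createHmac, randomBytes } from "node:crypto";

// Hypothetical helper: builds a placeholder in the __VG_{CATEGORY}_{hash12}__ format.
function makePlaceholder(sessionSecret: Buffer, category: string, original: string): string {
  const hash12 = createHmac("sha256", sessionSecret)
    .update(original, "utf8")
    .digest("hex") // Node emits lowercase hex
    .slice(0, 12); // keep the first 12 hex chars
  return `__VG_${category}_${hash12}__`;
}

// Assumption: one random secret per session, kept only in memory.
const demoSecret = randomBytes(32);
const first = makePlaceholder(demoSecret, "OPENAI_KEY", "sk-example");
const second = makePlaceholder(demoSecret, "OPENAI_KEY", "sk-example");
console.log(first === second); // same secret + same input give the same placeholder
```

Because the hash is keyed with a secret that never leaves the client, the provider cannot confirm a guessed value by recomputing the hash, while the client still gets a stable placeholder for repeated occurrences of the same secret.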
Mapping:
- in-memory only, bidirectional (placeholder ↔ original)
- TTL cleanup + max size cap to prevent unbounded memory growth
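One way the mapping could look, as a sketch only. The class name `PlaceholderStore`, its method names, and the "evict the oldest entry" policy are assumptions for illustration; the issue only asks for TTL cleanup and a size cap.

```typescript
// Hypothetical in-memory, bidirectional store with TTL expiry and a size cap.
class PlaceholderStore {
  private byPlaceholder = new Map<string, { original: string; expiresAt: number }>();
  private byOriginal = new Map<string, string>();
  private ttlMs: number;
  private maxSize: number;

  constructor(ttlMs: number, maxSize: number) {
    this.ttlMs = ttlMs;
    this.maxSize = maxSize;
  }

  set(placeholder: string, original: string, now: number = Date.now()): void {
    this.sweep(now);
    this.remove(placeholder); // re-inserting refreshes TTL and insertion order
    if (this.byPlaceholder.size >= this.maxSize) {
      // Map preserves insertion order, so the first key is the oldest entry.
      const oldest = this.byPlaceholder.keys().next().value;
      if (oldest !== undefined) this.remove(oldest);
    }
    this.byPlaceholder.set(placeholder, { original, expiresAt: now + this.ttlMs });
    this.byOriginal.set(original, placeholder);
  }

  getOriginal(placeholder: string, now: number = Date.now()): string | undefined {
    const entry = this.byPlaceholder.get(placeholder);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.remove(placeholder);
      return undefined;
    }
    return entry.original;
  }

  getPlaceholder(original: string, now: number = Date.now()): string | undefined {
    const ph = this.byOriginal.get(original);
    if (ph === undefined) return undefined;
    return this.getOriginal(ph, now) !== undefined ? ph : undefined;
  }

  private remove(placeholder: string): void {
    const entry = this.byPlaceholder.get(placeholder);
    if (!entry) return;
    this.byPlaceholder.delete(placeholder);
    this.byOriginal.delete(entry.original);
  }

  private sweep(now: number): void {
    this.byPlaceholder.forEach((entry, ph) => {
      if (entry.expiresAt <= now) this.remove(ph);
    });
  }
}
```

A caveat with any TTL or size cap: if an entry is evicted while the model still refers to its placeholder, restoration fails for that value, so the limits probably need to be generous relative to session length.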
3) Hook points (implementation sketch)
Provider-agnostic “send before” point:
- Intercept right before any provider request is sent (so no MITM is needed and all providers are covered).
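The outbound step at this hook point could be sketched roughly as follows. The function `redactOutbound`, the `RedactionRule` shape, and the plain `Map` used for the mapping are hypothetical names chosen for this example; how the hook is actually wired into each provider client is left open.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical rule shape: the pattern must carry the "g" flag so every match is replaced.
interface RedactionRule {
  category: string;
  pattern: RegExp;
}

function redactOutbound(
  text: string,
  rules: RedactionRule[],
  sessionSecret: Buffer,
  mapping: Map<string, string>, // placeholder -> original, kept in memory only
  exclude: string[] = [],
): string {
  let out = text;
  for (const rule of rules) {
    out = out.replace(rule.pattern, (match) => {
      if (exclude.includes(match)) return match; // allow-listed values pass through
      const hash12 = createHmac("sha256", sessionSecret)
        .update(match, "utf8")
        .digest("hex")
        .slice(0, 12);
      const placeholder = `__VG_${rule.category}_${hash12}__`;
      mapping.set(placeholder, match);
      return placeholder;
    });
  }
  return out;
}

const demoMapping = new Map<string, string>();
const demoRules: RedactionRule[] = [
  { category: "GITHUB_TOKEN", pattern: /ghp_[A-Za-z0-9]{36}/g },
];
const redacted = redactOutbound(
  "token=ghp_" + "a".repeat(36),
  demoRules,
  Buffer.from("demo-session-secret"),
  demoMapping,
);
console.log(redacted); // token=__VG_GITHUB_TOKEN_<12 hex chars>__
```

Rules run in order over already-redacted text, so a later pattern that happens to match inside an earlier placeholder would corrupt it; a real implementation would likely need to skip existing placeholders.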
Tool arg restore:
- Deep-walk tool/function-call args and restore placeholders in string fields before tool scheduling/execution.
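A possible shape for the deep-walk, shown as a sketch. `restoreDeep` and `restoreString` are hypothetical helpers, and the placeholder regex assumes categories consist of uppercase letters, digits, and underscores.

```typescript
// Matches placeholders of the form __VG_{CATEGORY}_{hash12}__ (assumed category charset).
const PLACEHOLDER_RE = /__VG_[A-Z0-9_]+_[0-9a-f]{12}__/g;

function restoreString(value: string, mapping: Map<string, string>): string {
  // Unknown placeholders are left untouched rather than dropped.
  return value.replace(PLACEHOLDER_RE, (ph) => mapping.get(ph) ?? ph);
}

// Recursively restores placeholders in string values; object keys are not rewritten.
function restoreDeep(value: unknown, mapping: Map<string, string>): unknown {
  if (typeof value === "string") return restoreString(value, mapping);
  if (Array.isArray(value)) return value.map((v) => restoreDeep(v, mapping));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) out[k] = restoreDeep(v, mapping);
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}

const toolMapping = new Map([["__VG_OPENAI_KEY_0123456789ab__", "sk-real"]]);
const toolArgs = {
  command: "curl -H 'Authorization: Bearer __VG_OPENAI_KEY_0123456789ab__'",
  env: ["KEY=__VG_OPENAI_KEY_0123456789ab__"],
  retries: 3,
};
const restoredArgs = restoreDeep(toolArgs, toolMapping) as typeof toolArgs;
console.log(restoredArgs.command);
```

The walk returns a new structure instead of mutating the arguments in place, so the redacted form can still be used for history and logging.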
Streaming:
- For streaming UI restore, consider a small tail buffer so placeholders split across chunks can still be restored correctly.
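The tail-buffer idea could be sketched like this. The class `StreamRestorer`, the `push`/`flush` names, and the 80-character hold limit are assumptions made for illustration; the approach is to hold back any trailing text that could still be the start of a placeholder and emit everything before it.

```typescript
const COMPLETE_RE = /__VG_[A-Z0-9_]+_[0-9a-f]{12}__/g;
// Trailing text that might be the beginning of a placeholder split across chunks.
const PENDING_RE = /_{1,2}$|__V$|__VG$|__VG_[A-Za-z0-9_]*$/;
const MAX_HOLD = 80; // assumed upper bound on placeholder length

class StreamRestorer {
  private buf = "";
  private mapping: Map<string, string>;

  constructor(mapping: Map<string, string>) {
    this.mapping = mapping;
  }

  // Returns the text that is safe to render now; the rest stays buffered.
  push(chunk: string): string {
    const restored = this.restore(this.buf + chunk);
    const pending = PENDING_RE.exec(restored);
    const cut =
      pending && restored.length - pending.index <= MAX_HOLD
        ? pending.index
        : restored.length;
    this.buf = restored.slice(cut);
    return restored.slice(0, cut);
  }

  // Call once the stream ends to emit whatever is still held back.
  flush(): string {
    const rest = this.restore(this.buf);
    this.buf = "";
    return rest;
  }

  private restore(text: string): string {
    return text.replace(COMPLETE_RE, (ph) => this.mapping.get(ph) ?? ph);
  }
}

const streamMapping = new Map([["__VG_OPENAI_KEY_0123456789ab__", "sk-real"]]);
const restorer = new StreamRestorer(streamMapping);
const pieces = [
  restorer.push("key is __VG_OPEN"),
  restorer.push("AI_KEY_0123456"),
  restorer.push("789ab__ done"),
  restorer.flush(),
];
console.log(pieces.join("")); // key is sk-real done
```

The hold limit keeps a long identifier that merely starts with `__VG_` from stalling the UI indefinitely; past that length the text is emitted as-is.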
4) Configuration (default off)
Example (illustrative only; field naming can follow project conventions):
{
  "security": {
    "redaction": {
      "enabled": true,
      "keywords": [
        { "value": "sk-xxx", "category": "OPENAI_KEY" }
      ],
      "patterns": [
        { "regex": "ghp_[A-Za-z0-9]{36}", "category": "GITHUB_TOKEN" }
      ],
      "builtins": ["email", "ipv4", "uuid"],
      "exclude": ["localhost", "127.0.0.1"],
      "ttlMinutes": 60,
      "maxSize": 10000
    }
  }
}
5) Non-goals (for v1, to keep the PR small)
- Not trying to auto-detect every secret perfectly; start with configurable keywords/regex + a few built-ins
- No on-disk modifications; redaction is done ephemerally in memory at request time
- Further redaction of session recordings/logs can be discussed as a follow-up
6) Prior art / reference implementation
7) Questions for maintainers
- Prefer implementing this in core (provider-agnostic), or wait for a hooks/plugin mechanism and implement as a plugin?
- Should the configuration live under security.redaction or privacy.redaction?
- Is the __VG_{CATEGORY}_{hash12}__ placeholder + session HMAC mapping approach acceptable?