Feature Request: Client-side secret redaction before sending prompts (placeholder replace + session restore) #2010

@inkdust2021

What would you like to be added?

Add an optional client-side “redaction before request” feature. Before any prompt, history, or tool output is sent to an LLM provider, scan the outgoing text and replace detected secrets with placeholders. Restore the placeholders locally when displaying model output and before executing tool/function call arguments.

Goal: secrets never leave the local machine (are never transmitted to the LLM provider), while keeping tool execution correct.

Why is this needed?

qwen-code aggregates local context and tool outputs into prompts, so it is easy to accidentally send real secrets to third-party providers. For example, a user may paste .env content, or a tool may read a file containing secrets (tokens, private keys, etc.), and that content then enters the prompt history.

Sandboxing restricts actions/permissions, but it doesn’t prevent sensitive strings from being included in the prompt. A last-mile client-side redaction layer helps reduce accidental secret leakage.

Additional context

1) Scope

Outbound (before API request):

  • user input
  • full history / context that will be included in the request (including tool outputs, IDE context, etc.)

Inbound (after model response):

  • model text (optionally restore before rendering)
  • tool/function call arguments (must restore before execution to avoid writing placeholders into files/commands)

2) Placeholder format (keep consistent with VibeGuard)

Placeholder format:

  • __VG_{CATEGORY}_{hash12}__

hash12:

  • first 12 lowercase hex chars of HMAC-SHA256(sessionSecret, original)
  • stable within a session; not reversible by providers, who never see sessionSecret
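
The placeholder derivation above can be sketched in a few lines. This is a minimal illustration, not a proposed final API: the function name makePlaceholder and the way sessionSecret is generated (32 random bytes per process) are assumptions made for the example. Only the __VG_{CATEGORY}_{hash12}__ format and the HMAC-SHA256 truncation come from the description above.

```typescript
import { createHmac, randomBytes } from "node:crypto";

// Assumption: one random key per session, kept in memory only.
const sessionSecret: Buffer = randomBytes(32);

/**
 * Build a placeholder of the form __VG_{CATEGORY}_{hash12}__.
 * hash12 is the first 12 hex characters of HMAC-SHA256(key, original);
 * Node's "hex" digest encoding is already lowercase.
 */
function makePlaceholder(
  category: string,
  original: string,
  key: Buffer = sessionSecret,
): string {
  const hash12 = createHmac("sha256", key)
    .update(original, "utf8")
    .digest("hex")
    .slice(0, 12);
  return `__VG_${category}_${hash12}__`;
}
```

Because the hash is keyed, the same secret maps to the same placeholder within a session but to a different one in another session. Twelve hex characters carry 48 bits, so collisions between distinct secrets in one session are unlikely but not impossible; an implementation may want to detect them when inserting into the mapping.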

Mapping:

  • in-memory only, bidirectional placeholder ↔ original
  • TTL cleanup + max size cap to prevent unbounded memory growth

3) Hook points (implementation sketch)

Provider-agnostic pre-send hook:

  • Intercept right before any provider request is sent, so no MITM proxy is needed and all providers are covered.

Tool arg restore:

  • Deep-walk tool/function-call args and restore placeholders in string fields before tool scheduling/execution.
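
A deep walk of this kind might look like the sketch below. The names Json, restoreString, and restoreArgs are hypothetical, and the placeholder regex assumes categories consist of uppercase letters, digits, and underscores, which matches the examples in this issue but is not specified anywhere.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Assumption: categories use only A-Z, 0-9 and "_"; hash12 is 12 lowercase hex chars.
const PLACEHOLDER_RE = /__VG_[A-Z0-9_]+?_[0-9a-f]{12}__/g;

/** Replace every known placeholder in one string; unknown placeholders are left as-is. */
function restoreString(
  text: string,
  lookup: (placeholder: string) => string | undefined,
): string {
  // A replacer function is used so "$" sequences in a secret are inserted literally.
  return text.replace(PLACEHOLDER_RE, (placeholder) => lookup(placeholder) ?? placeholder);
}

/** Recursively copy tool/function-call args, restoring placeholders in string values. */
function restoreArgs(
  value: Json,
  lookup: (placeholder: string) => string | undefined,
): Json {
  if (typeof value === "string") return restoreString(value, lookup);
  if (Array.isArray(value)) return value.map((item) => restoreArgs(item, lookup));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) out[key] = restoreArgs(value[key], lookup);
    return out;
  }
  return value; // numbers, booleans, null
}
```

The walk returns a new structure rather than mutating the original, so the redacted args that stay in the conversation history are not overwritten with real secrets. Object keys are deliberately not restored here; whether they should be is an open design choice.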

Streaming:

  • For streaming UI restore, consider a small tail buffer so placeholders split across chunks can still be restored correctly.
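
The tail-buffer idea could be sketched as follows. Everything here is an assumption for illustration: the class name StreamRestorer, the 64-character upper bound on placeholder length, and the rule of holding back only the last incomplete placeholder candidate. Text is emitted as soon as it cannot be part of a placeholder; a trailing fragment such as "__VG_GITHUB_" is kept until later chunks complete it or the stream ends.

```typescript
const PREFIX = "__VG_";
const PLACEHOLDER_RE = /__VG_[A-Z0-9_]+?_[0-9a-f]{12}__/g;
const ANCHORED_RE = /^__VG_[A-Z0-9_]+?_[0-9a-f]{12}__/;
const MAX_PLACEHOLDER_LEN = 64; // assumption: no placeholder is longer than this

/** Index up to which text can be emitted without splitting a possible placeholder. */
function safeCut(text: string): number {
  let floor = 0; // nothing before this index may be held back
  const idx = text.lastIndexOf(PREFIX);
  if (idx !== -1) {
    const complete = ANCHORED_RE.exec(text.slice(idx));
    if (complete) floor = idx + complete[0].length;
    else if (text.length - idx < MAX_PLACEHOLDER_LEN) return idx; // still growing
    else floor = idx + PREFIX.length; // too long to be a placeholder, give up on it
  }
  // Hold back a trailing fragment of the prefix itself ("_", "__", "__V", "__VG").
  for (let k = Math.min(PREFIX.length - 1, text.length - floor); k >= 1; k--) {
    if (text.endsWith(PREFIX.slice(0, k))) return text.length - k;
  }
  return text.length;
}

class StreamRestorer {
  private buffer = "";
  private readonly lookup: (placeholder: string) => string | undefined;

  constructor(lookup: (placeholder: string) => string | undefined) {
    this.lookup = lookup;
  }

  /** Feed one streamed chunk; returns the text that is safe to render now. */
  push(chunk: string): string {
    this.buffer += chunk;
    const cut = safeCut(this.buffer);
    const ready = this.buffer.slice(0, cut);
    this.buffer = this.buffer.slice(cut);
    return this.restore(ready);
  }

  /** Call at end of stream to emit whatever is still held back. */
  flush(): string {
    const rest = this.restore(this.buffer);
    this.buffer = "";
    return rest;
  }

  private restore(text: string): string {
    return text.replace(PLACEHOLDER_RE, (p) => this.lookup(p) ?? p);
  }
}
```

The held-back tail is bounded by the maximum placeholder length, so rendering latency stays small even if the model emits text that merely resembles a placeholder prefix.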

4) Configuration (default off)

Example (illustrative only; field naming can follow project conventions):

{
  "security": {
    "redaction": {
      "enabled": true,
      "keywords": [
        { "value": "sk-xxx", "category": "OPENAI_KEY" }
      ],
      "patterns": [
        { "regex": "ghp_[A-Za-z0-9]{36}", "category": "GITHUB_TOKEN" }
      ],
      "builtins": ["email", "ipv4", "uuid"],
      "exclude": ["localhost", "127.0.0.1"],
      "ttlMinutes": 60,
      "maxSize": 10000
    }
  }
}
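
To make the intent of the fields concrete, here is a rough sketch of how a redaction pass might consume such a configuration. It is illustrative only: the RedactionConfig interface mirrors a subset of the example above, the function name redact is hypothetical, built-in detectors and TTL handling are omitted, and applying exclude to every match (not only to built-ins) is an assumption.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical typed view of part of the example configuration.
interface RedactionConfig {
  enabled: boolean;
  keywords: { value: string; category: string }[];
  patterns: { regex: string; category: string }[];
  exclude: string[];
}

/** Escape a literal string so it can be embedded in a RegExp. */
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/**
 * Replace configured keywords and regex matches with placeholders and record
 * placeholder -> original in `mapping` so they can be restored locally later.
 */
function redact(
  text: string,
  cfg: RedactionConfig,
  key: Buffer,
  mapping: Map<string, string>,
): string {
  if (!cfg.enabled) return text;

  const toPlaceholder = (category: string) => (match: string): string => {
    if (cfg.exclude.includes(match)) return match; // explicitly allowed value
    const hash12 = createHmac("sha256", key).update(match, "utf8").digest("hex").slice(0, 12);
    const placeholder = `__VG_${category}_${hash12}__`;
    mapping.set(placeholder, match);
    return placeholder;
  };

  let out = text;
  // Exact keywords first, then user-supplied regular expressions.
  for (const { value, category } of cfg.keywords) {
    if (value.length === 0) continue;
    out = out.replace(new RegExp(escapeRegExp(value), "g"), toPlaceholder(category));
  }
  for (const { regex, category } of cfg.patterns) {
    out = out.replace(new RegExp(regex, "g"), toPlaceholder(category));
  }
  return out;
}
```

A caveat of this simple ordering is that a later pattern could in principle match text inside a placeholder produced by an earlier rule; a real implementation would likely collect all match spans first and replace them in a single pass.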

5) Non-goals (for v1, to keep PR small)

  • Not trying to auto-detect every secret perfectly; start with configurable keywords/regex + a few built-ins
  • No on-disk modifications; redaction is done ephemerally in memory at request time
  • Further redaction of session recordings/logs can be discussed as a follow-up

6) Prior art / reference implementation

7) Questions for maintainers

  1. Prefer implementing this in core (provider-agnostic), or wait for a hooks/plugin mechanism and implement as a plugin?
  2. Should the configuration live under security.redaction or privacy.redaction?
  3. Is the __VG_{CATEGORY}_{hash12}__ placeholder + session HMAC mapping approach acceptable?
