feat: mask sensitive infrastructure identifiers before model calls (#… by hamzzaaamalik · Pull Request #634 · Tracer-Cloud/opensre

hamzzaaamalik · 2026-04-17T14:28:27Z

Adds a reversible masking layer that swaps pod/cluster/host/account/IP/
email identifiers with stable placeholders before sending prompts to the
LLM, and restores the originals in the final Slack report.

Configurable via OPENSRE_MASK_ENABLED and OPENSRE_MASK_KINDS env vars.
Off by default - no behavior change for existing users.
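
To make the idea of a reversible masking layer concrete, here is a minimal sketch of stable placeholder substitution. It is not the code from this PR: the `SketchMasker` class, the two regexes, and the `<KIND_N>` placeholder format are all assumptions chosen for illustration, and the `OPENSRE_MASK_ENABLED` / `OPENSRE_MASK_KINDS` handling is left out.

```python
import re

# Hypothetical patterns for illustration only; the PR's real detectors differ.
PATTERNS = {
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


class SketchMasker:
    """Maps each distinct identifier to one stable placeholder and back."""

    def __init__(self):
        self.forward = {}   # original -> placeholder
        self.reverse = {}   # placeholder -> original
        self.counters = {}  # kind -> last index handed out

    def _placeholder(self, kind, original):
        # Reuse the existing placeholder so repeated identifiers stay stable.
        if original not in self.forward:
            self.counters[kind] = self.counters.get(kind, 0) + 1
            token = f"<{kind.upper()}_{self.counters[kind]}>"
            self.forward[original] = token
            self.reverse[token] = original
        return self.forward[original]

    def mask(self, text):
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, k=kind: self._placeholder(k, m.group(0)), text
            )
        return text

    def unmask(self, text):
        for token, original in self.reverse.items():
            text = text.replace(token, original)
        return text


masker = SketchMasker()
original = "pod at 10.0.0.5 paged alice@example.com; 10.0.0.5 again"
masked = masker.mask(original)
restored = masker.unmask(masked)
```

Because the same identifier always maps to the same placeholder, the model can still reason about "the same host" across log lines, and the mapping can be inverted once the response comes back.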

Closes #478


greptile-apps · 2026-04-17T14:45:05Z

Greptile Summary

This PR adds an opt-in, reversible masking layer that replaces infrastructure identifiers (pods, namespaces, clusters, IPs, emails, etc.) with stable placeholders before LLM calls and restores them in user-facing Slack output. All previously flagged issues — private import of _compile_extra_patterns, per-call regex compilation, partial-overlap corruption, and counter inflation — are correctly resolved in the follow-up commit. The feature is off by default, integrates cleanly with AgentState, and ships with solid unit and integration test coverage.

Confidence Score: 5/5

Safe to merge — feature is off by default, all prior P0/P1 issues resolved, remaining findings are minor P2 suggestions.

All previously flagged blocking issues (private import, per-call compilation, partial-overlap corruption, counter inflation) are properly addressed. No new P0 or P1 defects found. The two remaining observations — unmasked dict keys and the silent ALL_KINDS fallback — are P2 quality improvements that don't affect correctness for the common case.

app/masking/context.py (dict-key masking gap) and app/masking/policy.py (silent ALL_KINDS fallback) warrant a second look before the feature is widely enabled.

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/masking/policy.py | Pydantic-based MaskingPolicy with env-var loading; compile_extra_patterns promoted to public API; kind validation and bool parsing are clean. |
| app/masking/detectors.py | Regex-based identifier detection; partial-overlap guard added in _resolve_overlaps; compiled_extras parameter added to find_identifiers for one-time compilation. |
| app/masking/context.py | MaskingContext with stable placeholder map; counter inflation bug fixed by accumulating max-index first; _compiled_extras compiled once in init; mask_value masks dict values but not keys. |
| app/nodes/investigate/node.py | Masking applied to evidence before downstream LLM nodes; masking_map conditionally written to state only when non-empty. |
| app/nodes/root_cause_diagnosis/node.py | LLM response fields (root_cause, causal_chain, claims) are unmasked before writing back to state; correct integration with MaskingContext.from_state. |
| app/nodes/publish_findings/node.py | slack_message, short_summary, and all_blocks unmasked before delivery; send_ingest receives unmasked report with full state as intended. |
| app/state/agent_state.py | masking_map field added to both AgentState TypedDict and AgentStateModel Pydantic model; kept in sync as required. |
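
The partial-overlap guard mentioned for `_resolve_overlaps` is not shown in this thread. As a rough sketch of one common way to handle it, the snippet below greedily keeps the earliest span and, among spans starting at the same offset, the longest one, and drops anything that overlaps a span already kept. The `Span` type and the `resolve_overlaps` function are hypothetical and may differ from what the PR does.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int
    end: int  # exclusive
    kind: str


def resolve_overlaps(spans):
    """Keep a non-overlapping subset, preferring earlier and then longer spans."""
    kept = []
    ordered = sorted(spans, key=lambda s: (s.start, -(s.end - s.start)))
    for span in ordered:
        # Kept spans are sorted and disjoint, so only the last one can overlap.
        # This drops nested and partially overlapping spans alike.
        if kept and span.start < kept[-1].end:
            continue
        kept.append(span)
    return kept


candidates = [
    Span(0, 10, "pod"),
    Span(5, 15, "host"),   # partially overlaps the pod span
    Span(2, 4, "ip"),      # nested inside the pod span
    Span(20, 25, "email"),
]
resolved = resolve_overlaps(candidates)
```

Dropping partial overlaps matters because replacing two intersecting ranges in the same string would corrupt the text between them.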

Sequence Diagram

sequenceDiagram
    participant Inv as node_investigate
    participant State as AgentState
    participant RCA as node_root_cause_diagnosis
    participant Pub as node_publish_findings
    participant LLM as External LLM

    Inv->>Inv: Execute tool actions → raw evidence
    Inv->>Inv: MaskingContext.from_state(state) mask_value(evidence)
    Inv->>State: evidence = masked_evidence, masking_map = {placeholder→original}

    RCA->>State: read masked evidence + masking_map
    RCA->>LLM: build_diagnosis_prompt(state, masked_evidence)
    LLM-->>RCA: response (may contain placeholders)
    RCA->>RCA: MaskingContext.from_state(state) unmask(root_cause, causal_chain, claims)
    RCA->>State: root_cause = unmasked value

    Pub->>State: read root_cause (unmasked) + masking_map
    Pub->>Pub: format_slack_message(ctx) → slack_message
    Pub->>Pub: MaskingContext.from_state(state) unmask(slack_message, problem_md, all_blocks)
    Pub->>Pub: send_slack_report(unmasked message)
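
The flow in the diagram can be sketched with plain dictionaries standing in for `AgentState`. Everything here is illustrative: `fake_llm`, the hard-coded mapping, and the `investigate` / `diagnose` functions are assumptions, not the node implementations from the PR.

```python
def mask(text, mapping):
    """Replace each original with its placeholder (mapping: placeholder -> original)."""
    for placeholder, original in mapping.items():
        text = text.replace(original, placeholder)
    return text


def unmask(text, mapping):
    """Inverse of mask: restore originals wherever a placeholder appears."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def fake_llm(prompt):
    # Stand-in for the external model; it answers using the placeholder it saw.
    return "Root cause: <POD_1> was OOM-killed"


def investigate(state):
    raw = "pod etl-worker-7d9f8b-xkp2q restarted 5 times"
    mapping = {"<POD_1>": "etl-worker-7d9f8b-xkp2q"}  # hypothetical detector output
    return {**state, "evidence": mask(raw, mapping), "masking_map": mapping}


def diagnose(state):
    response = fake_llm(state["evidence"])  # the model only sees placeholders
    mapping = state.get("masking_map", {})
    return {**state, "root_cause": unmask(response, mapping)}


final_state = diagnose(investigate({}))
```

Carrying the map in state is what lets a later node rebuild the context and restore the originals without re-running detection.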

Prompt To Fix All With AI

This is a comment left during a code review.
Path: app/masking/context.py
Line: 120-122

Comment:
**Dict keys not masked in `mask_value`**

`mask_value` recurses into dict *values* only — dict *keys* are passed through unchanged. This is a gap when Kubernetes evidence stores identifiers as keys, e.g. `{"etl-worker-7d9f8b-xkp2q": {"status": "Failed"}}`. The integration fixture test (`test_integration_with_k8s_fixture.py`) uses `json.dumps(masked)` to scan for the namespace value, which wouldn't catch a pod name that survived as a key.

```suggestion
        if isinstance(value, dict):
            return {self.mask(k) if isinstance(k, str) else k: self.mask_value(v) for k, v in value.items()}
```

How can I resolve this? If you propose a fix, please make it concise.
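
A standalone sketch of the key-masking behaviour described above, for anyone who wants to reproduce the gap outside the repository. The free function `mask_value` and the `toy_mask` helper are hypothetical stand-ins; the real method lives on `MaskingContext` and uses its own detectors.

```python
import json


def mask_value(value, mask):
    """Recursively apply `mask` to every string in a JSON-like value, keys included."""
    if isinstance(value, str):
        return mask(value)
    if isinstance(value, dict):
        return {
            (mask(k) if isinstance(k, str) else k): mask_value(v, mask)
            for k, v in value.items()
        }
    if isinstance(value, (list, tuple)):
        return type(value)(mask_value(v, mask) for v in value)
    return value  # numbers, booleans, None pass through unchanged


def toy_mask(text):
    # Hypothetical stand-in for the real masker: replaces one known pod name.
    return text.replace("etl-worker-7d9f8b-xkp2q", "<POD_1>")


evidence = {"etl-worker-7d9f8b-xkp2q": {"status": "Failed", "restarts": 3}}
masked = mask_value(evidence, toy_mask)
leaked = "etl-worker-7d9f8b-xkp2q" in json.dumps(masked)
```

Scanning `json.dumps(masked)` for the pod name, as in the last line, is the kind of check that would catch an identifier surviving as a key.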

---

This is a comment left during a code review.
Path: app/masking/policy.py
Line: 71-78

Comment:
**Silent ALL_KINDS fallback when every specified kind is invalid**

When all entries in `OPENSRE_MASK_KINDS` are unrecognised, `_filter_valid_kinds` silently falls back to masking *all* identifier kinds. An operator who sets `OPENSRE_MASK_KINDS=internal_only` expecting restricted masking would instead get every built-in detector active — the opposite of their intent — with only per-kind `ignoring unknown identifier kind` warnings but no indication that the fallback occurred.

```suggestion
        if valid:
            return tuple(valid)
        logger.warning(
            "[masking] all specified kinds were invalid; falling back to all defaults: %s",
            ", ".join(ALL_KINDS),
        )
        return ALL_KINDS
```

How can I resolve this? If you propose a fix, please make it concise.
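
For context, the suggested warning could sit in a filter function shaped roughly like the sketch below. The body of `_filter_valid_kinds` is not shown in this thread, so the loop, the logger name, and the contents of `ALL_KINDS` are assumptions; only the two log messages come from the review comment.

```python
import logging

logger = logging.getLogger("masking_sketch")

# Hypothetical kind names; the PR's actual ALL_KINDS tuple is not shown here.
ALL_KINDS = ("pod", "namespace", "cluster", "ip", "email")


def filter_valid_kinds(requested):
    """Return the recognised kinds, or ALL_KINDS (with a warning) if none are valid."""
    valid = []
    for kind in requested:
        if kind in ALL_KINDS:
            valid.append(kind)
        else:
            logger.warning("[masking] ignoring unknown identifier kind: %s", kind)
    if valid:
        return tuple(valid)
    # Make the fallback visible so operators notice the widened scope.
    logger.warning(
        "[masking] all specified kinds were invalid; falling back to all defaults: %s",
        ", ".join(ALL_KINDS),
    )
    return ALL_KINDS
```

With this shape, `filter_valid_kinds(["internal_only"])` still returns every kind, but the log now states that the fallback happened.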

Reviews (2): Last reviewed commit: "fix: CodeQL ReDoS + Greptile review (par..."

feat: mask sensitive infrastructure identifiers before model calls (T…

510eb60

…racer-Cloud#478)

github-advanced-security AI found potential problems Apr 17, 2026

Comment thread app/masking/detectors.py Fixed

greptile-apps Bot reviewed Apr 17, 2026

Comment thread app/masking/detectors.py Outdated

Comment thread app/masking/detectors.py Outdated

Comment thread app/masking/detectors.py

Comment thread app/masking/context.py Outdated

fix: CodeQL ReDoS + Greptile review (partial overlap, counter, compil…

a1bec3a

…e once)

Devesh36 merged commit c5db908 into Tracer-Cloud:main Apr 17, 2026
11 checks passed

davincios mentioned this pull request Apr 19, 2026

feat: restore readable investigation output after masking #496

Closed

Devesh36 merged 2 commits into Tracer-Cloud:main from hamzzaaamalik:issue/478-mask-sensitive-identifiers

3 participants
