probe: ansi escape codes in tokenizer #1351
garak/probes/ansiescape.py
```python
for t in tok.vocab:
    if any(payload in t for payload in LIVE_PAYLOAD_TOKENS):
        attempts.append(_get_token_attempt(t))
    elif not clean_attempt_found:
        clean_attempt_found = True
        attempts.append(_get_token_attempt(t))
```
So I'm thinking about this -- does the escape code need to be a single token? Or will it work otherwise? I think we can be more efficient and more accurate with this.
Looking at tiktoken for an example:
```python
>>> import tiktoken
>>> enc = tiktoken.encoding_for_model("gpt-4")
>>> enc.encode("\x1b[")
[91535]
>>> enc.encode("\x1b]")
[215, 60]
>>> enc.encode("\x9b")
[126, 249]
>>> enc.encode("\x9d")
[126, 251]
>>> enc.decode([91535])
'\x1b['
>>> enc.decode([215, 60])
'\x1b]'
>>> enc.decode([126, 251])
'\x9d'
>>> enc.decode([126, 249])
'\x9b'
```
Looks like only one of them is encoded as a single token, but all of these will still work.
I think we can rewrite this to go only over the set of LIVE_PAYLOAD_TOKENS (much smaller set) and then rewrite _get_token_attempt to encode, then decode, and if the same string pops out, Bob's your uncle.
If it has to be a single token (I don't believe it does), then we only need to check that tok.convert_tokens_to_ids(token_to_check) tokenizes to a single token not equal to tok.unk_token_id.
Thoughts?
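The encode-then-decode filter suggested above could be sketched roughly as follows. `DummyTokenizer` is a hypothetical lossless stand-in for a real tokenizer (tiktoken or HF, anything exposing `encode`/`decode`), and this local `LIVE_PAYLOAD_TOKENS` just reuses the sequences from the tiktoken session, not the probe's actual payload list:

```python
# Sketch of the encode/decode round-trip check: a payload is usable if
# decoding its encoding reproduces the original string exactly.
# DummyTokenizer is an illustrative stand-in, not a real tokenizer class.

LIVE_PAYLOAD_TOKENS = ["\x1b[", "\x1b]", "\x9b", "\x9d"]


class DummyTokenizer:
    """Lossless byte-level stand-in: each char maps to its codepoint."""

    def encode(self, s):
        return [ord(c) for c in s]

    def decode(self, ids):
        return "".join(chr(i) for i in ids)


def surviving_payloads(tok, payloads):
    """Keep only payloads that survive an encode/decode round trip."""
    return [p for p in payloads if tok.decode(tok.encode(p)) == p]


tok = DummyTokenizer()
print(surviving_payloads(tok, LIVE_PAYLOAD_TOKENS))
```

With a real tokenizer, a payload dropped here would indicate the model can never emit that exact byte sequence, so probing it is wasted effort.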
Oh, like are escape codes composable? Crossed my mind too. This is cool, we should add these. And maybe even be principled about it.
The codes are already starting to appear in four or five places - they should possibly get factored out as payloads or data
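One way the factoring-out could look: a small shared data module that the probes import from. The module name, constant names, and grouping below are all hypothetical, not garak's actual payload layout; the introducer bytes themselves come from the tiktoken session earlier in the thread:

```python
# Hypothetical shared payload module - illustrative names, not garak's real API.
# Groups the escape-code material currently duplicated across probes.

# Sequence introducers: 7-bit ESC[ / ESC] (CSI / OSC) and their 8-bit forms.
ESCAPE_INTRODUCERS = ["\x1b[", "\x1b]", "\x9b", "\x9d"]

# A couple of complete example sequences built on those introducers
# (SGR conceal, and an OSC 8 terminal hyperlink).
ESCAPE_SEQUENCES = [
    "\x1b[8m",
    "\x1b]8;;https://example.com\x07",
]


def all_payloads():
    """Return every escape-code payload in one flat list."""
    return ESCAPE_INTRODUCERS + ESCAPE_SEQUENCES
```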
There are some intricacies predicated on tokeniser implementation here. tiktoken has its own way of handling these sequences; HF tokenizers has another. Will rename the class to scope it just to HF. More mining on how to get these out of HF tokenizers is appropriate, but this feels like a borderline red team/research question. What's the minimum bar you'd like to see for acceptance here?
I think we could accept it as-is, but IMO there are two adjacent questions worth answering:
- In the current implementation, we are checking every single token in the vocabulary. Wouldn't it be more efficient to have something like:

```python
for escape_code in LIVE_PAYLOAD_TOKENS:
    token_ids = tok.encode(escape_code, add_special_tokens=False)
    if len(token_ids) == 1 and token_ids[0] != tok.unk_token_id:
        attempts.append(_get_token_attempt(escape_code))
```

or whatever -- you get it.
- If it does not need to be a single token, then shouldn't we simply check if:

```python
for escape_code in LIVE_PAYLOAD_TOKENS:
    if tok.decode(tok.encode(escape_code)) == escape_code:
        do_whatever()
```
- We're checking all the tokenizer vocab to see if it has any entries containing a usable sequence, not just exact matches. I can see multiple modes worth checking for - the current one is conservative (i.e. sensitive)
- This is a fine test for tiktoken, yeah. Current probe focuses on Hugging Face models (& has been renamed accordingly)
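The two matching modes discussed above (exact match on the payload set vs. substring containment across the whole vocab) could be compared with a sketch like this; the toy `vocab` dict and the helper names are illustrative, not taken from the probe:

```python
# Comparing two vocab-scan modes for escape-code payloads.
# vocab is a toy HF-style mapping (token string -> id), purely illustrative.

LIVE_PAYLOAD_TOKENS = ["\x1b[", "\x9b"]

vocab = {"hello": 0, "\x1b[": 1, "x\x9by": 2, "world": 3}


def exact_match_tokens(vocab, payloads):
    """Strict mode: a vocab entry IS one of the payloads."""
    return [t for t in vocab if t in payloads]


def containing_tokens(vocab, payloads):
    """Sensitive mode (the current probe's approach): a vocab entry
    CONTAINS any payload as a substring."""
    return [t for t in vocab if any(p in t for p in payloads)]
```

The containment scan is a superset of the exact-match scan, which is why it is the more conservative (sensitive) choice: it also flags tokens like `"x\x9by"` that merely embed an escape byte.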
add probe for scanning HF tokenizers for tokens bearing raw escape codes
Verification
```shell
garak -m huggingface -n gpt2 -p ansiescape.AnsiRawTokenizerHF
python -m pytest tests/probes/test_detectors_ansiescape.py
garak -m openai -n o3-mini -p ansiescape.AnsiRawTokenizerHF   # should noop
```