feat: language-orthography plugin — enforce diacritical marks for non-English languages by alissonlinneker · Pull Request #32894 · anthropics/claude-code

alissonlinneker · 2026-03-10T17:02:52Z

Summary

Adds a new language-orthography plugin that enforces full orthographic correctness (accents, cedillas, umlauts, etc.) when the language setting targets a language whose orthography needs characters outside ASCII.

SessionStart hook reads the user's language from settings and injects an orthographic enforcement instruction
No-op for English or when no language is configured
Follows the same pattern as explanatory-output-style and learning-output-style plugins

Closes #32886

Context

The built-in language instruction template says "Always respond in pt-BR" but never mentions diacritical marks. The model interprets this loosely and frequently produces accent-less text — e.g., informacao instead of informação, voce instead of você. This affects every language with diacritics: Portuguese, French, Vietnamese, Czech, Turkish, Spanish, German, etc.
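To make the failure mode concrete, here is a minimal sketch of how accent-less output relates to the correct text. It is not part of the plugin; `strip_diacritics` and `looks_accent_stripped` are hypothetical helper names, and the approach (Unicode NFD decomposition, then dropping combining marks) is just one way to reproduce what the model does to words such as informação.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def looks_accent_stripped(actual: str, expected: str) -> bool:
    """True when `actual` is exactly `expected` with its diacritics removed."""
    expected_nfc = unicodedata.normalize("NFC", expected)
    actual_nfc = unicodedata.normalize("NFC", actual)
    return actual_nfc != expected_nfc and actual_nfc == strip_diacritics(expected_nfc)


print(strip_diacritics("informação"))              # informacao
print(looks_accent_stripped("voce", "você"))       # True
print(looks_accent_stripped("você", "você"))       # False
```

Letters that have no combining-mark decomposition (for example ß or đ) pass through unchanged, so this only models the accent and cedilla cases described above.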

The problem gets worse after context compaction because:

The compaction/summarization step doesn't receive language rules, so the summary can lose proper diacritics
CLAUDE.md instructions that reinforce accents are wrapped in a "may or may not be relevant" disclaimer, which the model uses to deprioritize them

This plugin works around the issue at the prompt level. The long-term fix would involve strengthening the core r1z() language template and passing language rules to the compaction step — details in #32886.

What the plugin does

The SessionStart hook:

Reads language from ~/.claude/settings.json or settings.local.json
Skips if no language is set or if the language is English
Injects an instruction that frames diacritic omission as an orthographic error (equivalent to a typo in English), not a style preference
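The three steps above can be sketched roughly as follows. This is an illustration, not the plugin's actual code: the function names, the instruction wording, and the lookup order of the settings files are assumptions, and the `hookSpecificOutput` / `additionalContext` envelope is my understanding of the SessionStart hook output shape, which should be checked against the hooks documentation.

```python
import json
import sys
from pathlib import Path
from typing import List, Optional


def read_language(paths: List[Path]) -> Optional[str]:
    """Return the first non-empty "language" value found in the settings files."""
    for path in paths:
        try:
            settings = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # missing or malformed file: try the next candidate
        language = settings.get("language") if isinstance(settings, dict) else None
        if isinstance(language, str) and language.strip():
            return language.strip()
    return None


def is_english(language: str) -> bool:
    """Treat "en", "en-US", "en_GB", and "English" as English."""
    tag = language.strip().lower().replace("_", "-")
    return tag == "english" or tag.split("-")[0] == "en"


def build_hook_output(language: Optional[str]) -> Optional[dict]:
    """Return the hook payload, or None when the hook should be a no-op."""
    if language is None or is_english(language):
        return None
    # Hypothetical wording; the real instruction text lives in the plugin.
    instruction = (
        f"Respond in {language} with complete, correct orthography. "
        "Every accent, cedilla, umlaut, and other diacritical mark is mandatory. "
        "Omitting one is a spelling error, equivalent to a typo in English, "
        "not a style preference."
    )
    return {
        "hookSpecificOutput": {
            "hookEventName": "SessionStart",
            "additionalContext": instruction,
        }
    }


def main() -> int:
    """Entry point a hook script could call; always exits 0."""
    base = Path.home() / ".claude"
    # Assumed order: local overrides first, then the shared settings file.
    language = read_language([base / "settings.local.json", base / "settings.json"])
    output = build_hook_output(language)
    if output is not None:
        json.dump(output, sys.stdout, ensure_ascii=False)
    return 0
```

Keeping the decision logic in `build_hook_output` and the file access in `read_language` makes the no-op cases easy to test without touching a real settings file.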

Test plan

Tested with language: "pt-BR" — outputs correct enforcement JSON with accented examples
Tested with language: "fr" — outputs enforcement for French
Tested with language: "en-US" — silent no-op (exit 0, no output)
Tested with no settings file — silent no-op
JSON output validated — proper structure, accented characters preserved
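The last check in the test plan can be illustrated with a short sketch. It assumes the same hypothetical payload shape (`hookSpecificOutput.additionalContext`) and a made-up sample instruction; `validate_hook_json` is an illustrative name. It shows that accented characters survive a JSON round trip whether they are written literally or as `\u` escapes.

```python
import json


def validate_hook_json(raw: str) -> str:
    """Parse hook output and return the injected context, or raise ValueError."""
    payload = json.loads(raw)
    try:
        context = payload["hookSpecificOutput"]["additionalContext"]
    except (KeyError, TypeError) as exc:
        raise ValueError("unexpected hook output structure") from exc
    if not isinstance(context, str) or not context:
        raise ValueError("additionalContext must be a non-empty string")
    return context


# Hypothetical sample payload, not the plugin's real instruction text.
sample = {
    "hookSpecificOutput": {
        "hookEventName": "SessionStart",
        "additionalContext": "Write informação, never informacao.",
    }
}

literal = json.dumps(sample, ensure_ascii=False)  # keeps ç and ã as-is
escaped = json.dumps(sample)                      # writes \u00e7 and \u00e3

# Both encodings decode to the same accented text.
assert validate_hook_json(literal) == validate_hook_json(escaped)
assert "informação" in validate_hook_json(escaped)
```

Emitting with `ensure_ascii=False` keeps the raw output readable when inspecting it by hand, but either form is valid JSON.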

The built-in language instruction ("Always respond in X") doesn't mention diacritical marks, so the model frequently drops accents, cedillas, and other characters required by non-ASCII languages like Portuguese, French, Vietnamese, Czech, etc. This plugin adds a SessionStart hook that reads the user's language setting and injects an explicit orthographic enforcement instruction, framing diacritic omission as an error rather than a style choice. No-op for English or when no language is configured. Closes anthropics#32886

alissonlinneker mentioned this pull request Mar 10, 2026

[BUG] Language setting does not enforce diacritical marks — accents/cedillas dropped in non-English output #32886

alissonlinneker wants to merge 1 commit into anthropics:main from alissonlinneker:fix/language-diacritics-enforcement
