Fix skill-creator UTF-8 panic on multi-byte characters by Mr-Neutr0n · Pull Request #362 · anthropics/skills

Mr-Neutr0n · 2026-02-09T21:55:54Z

Summary

Replace character-based length checks with UTF-8 byte-length validation in quick_validate.py to prevent Rust panics when the CLI processes multi-byte characters
Add _utf8_byte_len() and _truncate_utf8_safe() helpers for byte-aware string operations
Switch name (64), description (1024), and compatibility (500) field validation from character counts to UTF-8 byte counts

Problem

When a skill description contains multi-byte UTF-8 characters (such as Chinese text), Python's len() counts characters (code points) rather than bytes. A 350-character Chinese description has a len() of 350 but occupies 1050 bytes in UTF-8, at 3 bytes per character. The previous validation accepted it because 350 < 1024. When the downstream Rust CLI then sliced the string at a byte offset, the cut could land in the middle of a multi-byte character, causing a panic.
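The arithmetic above can be checked directly in Python. The snippet below is only an illustration: the sample character is arbitrary, and the failed decode is a Python analogue of cutting at a non-boundary byte offset, not the Rust CLI's actual code.

```python
# A 350-character description made of a 3-byte Chinese character (arbitrary sample).
description = "技" * 350

char_count = len(description)                   # what the old check measured
byte_count = len(description.encode("utf-8"))   # what the byte limit is about

# Cutting the encoded bytes at offset 1024 lands inside a character,
# because 1024 is not a multiple of 3.
raw = description.encode("utf-8")
try:
    raw[:1024].decode("utf-8")
    cut_is_clean = True
except UnicodeDecodeError:
    cut_is_clean = False

print(char_count, byte_count, cut_is_clean)  # 350 1050 False
```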

Test plan

Existing ASCII-only skills still pass validation
Chinese descriptions under 1024 bytes pass validation
Chinese descriptions over 1024 bytes are correctly rejected
_truncate_utf8_safe() correctly avoids splitting multi-byte characters
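The first three cases in the test plan could be exercised roughly as follows. This is a sketch under stated assumptions: `check_field_lengths` and `LIMITS` are hypothetical names, the limits are taken from the summary, and treating a value of exactly the limit as valid is an assumption.

```python
# Byte limits per field, as listed in the PR summary.
LIMITS = {"name": 64, "description": 1024, "compatibility": 500}


def check_field_lengths(fields: dict[str, str]) -> list[str]:
    """Return one error message per field whose UTF-8 size exceeds its limit."""
    errors = []
    for key, limit in LIMITS.items():
        value = fields.get(key)
        if value is None:
            continue
        size = len(value.encode("utf-8"))
        if size > limit:
            errors.append(f"{key} is {size} bytes in UTF-8; the limit is {limit}")
    return errors


# ASCII-only skill: passes.
ascii_errors = check_field_lengths({"name": "pdf-tools", "description": "Work with PDFs"})
# 341 Chinese characters = 1023 bytes: passes.
under_errors = check_field_lengths({"description": "技" * 341})
# 342 Chinese characters = 1026 bytes: rejected.
over_errors = check_field_lengths({"description": "技" * 342})
```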

Fixes #263

Add a pre-parse check in quick_validate.py that scans raw frontmatter text for unquoted description and compatibility values containing special YAML characters (: # { } [ ]). These characters cause yaml.safe_load() to silently misparse values into unexpected types (e.g., dicts instead of strings), making skills fail to load with no clear error message. The check runs before yaml.safe_load() and provides an actionable error message telling the user to wrap their value in quotes.

Fixes anthropics#338
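A pre-parse check of the kind this commit message describes might look like the sketch below. The function name `find_unquoted_special_values`, the regular expression, and the message wording are all hypothetical. The check is also deliberately conservative: it flags any listed character in an unquoted value, even in positions where YAML would accept it.

```python
import re

SPECIAL_CHARS = set(":#{}[]")
KEY_PATTERN = re.compile(r"^(description|compatibility):\s*(.*)$")


def find_unquoted_special_values(frontmatter: str) -> list[str]:
    """Scan raw frontmatter lines and report unquoted values with special characters."""
    errors = []
    for line in frontmatter.splitlines():
        match = KEY_PATTERN.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        # Skip empty values, quoted values, and block scalars (| or >).
        if not value or value[0] in "\"'|>":
            continue
        found = sorted(SPECIAL_CHARS.intersection(value))
        if found:
            errors.append(
                f"{key} contains special YAML characters ({' '.join(found)}); "
                "wrap the value in quotes"
            )
    return errors
```

Running this on the raw text before `yaml.safe_load()` means the user sees a message naming the field, instead of a later type error caused by the value having been parsed as a mapping.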

Replace character-based length checks with UTF-8 byte-length validation in quick_validate.py. The previous code used Python's len() which counts characters, allowing strings that fit within character limits but exceed byte limits to pass validation. When the downstream Rust CLI attempted to truncate these strings at byte boundaries, it could slice in the middle of multi-byte UTF-8 characters (e.g., Chinese full-stop U+3002), causing a panic: "byte index 2 is not a char boundary".

Changes:
- Add _utf8_byte_len() helper for byte-aware length checking
- Add _truncate_utf8_safe() helper that respects character boundaries
- Switch name (64), description (1024), and compatibility (500) field validation from character counts to UTF-8 byte counts

Fixes anthropics#263

Mr-Neutr0n · 2026-02-12T18:11:34Z

Friendly bump! Let me know if there's anything I should update or improve to help move this forward.

Mr-Neutr0n added 2 commits February 10, 2026 03:20

Mr-Neutr0n wants to merge 2 commits into anthropics:main from Mr-Neutr0n:fix/skill-creator-utf8-panic
