Skip to content

Comments

fix: preserve unicode in tagged template literals and regex source#27274

Open
robobun wants to merge 2 commits intomainfrom
claude/fix-tagged-template-unicode
Open

fix: preserve unicode in tagged template literals and regex source#27274
robobun wants to merge 2 commits intomainfrom
claude/fix-tagged-template-unicode

Conversation

@robobun
Copy link
Collaborator

@robobun robobun commented Feb 20, 2026

Summary

Rebased version of #15047 by @pfgithub.

  • Changes ascii_only to prefers_ascii as a runtime parameter (instead of comptime) on the printer, so when a tagged template literal or regex contains non-ASCII characters, the printer can disable ASCII-only mode and emit them as-is rather than escaping them
  • Changes cloneLatin1 to cloneUTF8 in module loading paths so JSC correctly interprets the UTF-8 output
  • Bumps the runtime transpiler cache version to invalidate stale Latin-1 cached entries

Fixes #8745, fixes #15492, fixes #15929, fixes #16763, fixes #18115, fixes #8207

console.log(String.raw`æ™`); // -> "æ™" instead of "\u00E6\u2122"
console.log(/æ/.source); // -> "æ™" instead of "\u00E6\u2122"

Test plan

  • test/regression/issue/8745/8745.test.ts - tagged template unicode (run directly, build, build+minify, with RTC)
  • test/regression/issue/8207/8207.test.ts - regex source string replacement with UTF-16 character
  • test/regression/issue/14976/14976.test.ts - template literal raw property with unicode (unskipped)
  • test/bundler/bundler_string.test.ts - added tagged template and regex source tests
  • test/bundler/transpiler/transpiler.test.js - added tagged template literal test
  • test/js/bun/shell/bunshell.test.ts - escape unicode test now expects correct output

🤖 Generated with Claude Code

Changes `ascii_only` to `prefers_ascii` as a runtime parameter instead of
comptime. When a tagged template literal or regex contains non-ASCII
characters, the printer disables ASCII-only mode so the characters are
emitted as-is rather than escaped. Also changes `cloneLatin1` to
`cloneUTF8` in module loading so JSC correctly interprets the UTF-8 output.

Fixes #8745, #15492, #15929, #16763, #18115, #8207

Co-Authored-By: pfg <pfg@pfg.pw>
Co-Authored-By: Claude <noreply@anthropic.com>
@robobun
Copy link
Collaborator Author

robobun commented Feb 20, 2026

Updated 9:09 AM PT - Feb 20th, 2026

❌ Your commit ca7ab71f has 6 failures in Build #37730 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 27274

That installs a local version of the PR into your bun-27274 executable, so you can run:

bun-27274 --bun

@alii alii requested review from Jarred-Sumner and alii February 20, 2026 08:53
@coderabbitai

This comment was marked as spam.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/8207/8207.test.ts`:
- Line 1: The regression test currently containing the import line "import {
expect, test } from \"bun:test\";" must be moved into the repository's canonical
regression issue folder and renamed to use the real issue number (8207.test.ts);
relocate the file accordingly, ensure its filename is exactly "8207.test.ts",
update any relative imports or references if they break after the move, and run
the test suite to verify it still executes.

In `@test/regression/issue/8745/8745.test.ts`:
- Line 1: This test file imports harness helpers but is missing the required
test runner imports; add an explicit import of test and expect from "bun:test"
at the top of the file (alongside the existing imports such as bunEnv, bunExe,
tempDirWithFiles) so the file uses bun:test's test() and expect() functions as
per the project guideline for *.test.* files.
- Line 9: Replace the large string created with `" ".repeat(16 * 1024 * 1024)`
in the test entry for "requires_rtc_fixture.ts" with a Buffer-based allocation
to avoid heavy string-repeat usage; use Buffer.alloc(16 * 1024 * 1024, '
').toString() (or equivalent Buffer.alloc(count, fill).toString()) so the test
creates the same repeated-space string but follows the guideline and is more
memory/CPU friendly.

@claude
Copy link
Contributor

claude bot commented Feb 20, 2026

Newest first

ca7ab — Looks good!

Reviewed 11 files across src/ and test/: Preserves Unicode characters in tagged template literals and regex source strings by converting the printer's ascii_only flag to a runtime prefers_ascii parameter and switching module loaders from Latin-1 to UTF-8 encoding.

Previous reviews

35a03 — Looks good!

Reviewed 11 files across src/ and test/: This PR changes the JavaScript printer to preserve Unicode characters in tagged template literals and regex source strings by converting ascii_only from compile-time to runtime and switching module loaders from Latin-1 to UTF-8 encoding.

- Add explicit `import { expect, test } from "bun:test"` to 8745.test.ts

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/regression/issue/8745/8745.test.ts`:
- Line 10: Replace the use of string repetition for the large fixture with a
Buffer allocation: instead of `"requires_rtc_fixture.ts": fixture + "
".repeat(16 * 1024 * 1024)`, build the repeated-space string via Buffer.alloc(16
* 1024 * 1024, ' ').toString() and concatenate to fixture; update the entry for
"requires_rtc_fixture.ts" to use this Buffer.alloc-based string creation to
comply with the guideline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

1 participant