fix(printer): avoid double backslash in regex with escaped non-ASCII #26786
base: main
When the regex printer encounters a non-ASCII character that was preceded by a backslash escape, it was incorrectly adding another backslash before the unicode escape sequence, resulting in `\\uXXXX` instead of `\uXXXX`.

For example, `/[\⁄]/` (backslash + U+2044 fraction slash) was being printed as `/[\\u2044]/`, which changes the regex semantics from matching the fraction slash to matching `\`, `u`, `2`, `0`, `4`, `4` as separate characters.

This fix tracks whether the previous character was a backslash and omits the leading backslash when generating unicode escape sequences for escaped non-ASCII characters.

Closes #26785

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updated 10:49 PM PT - Feb 6th, 2026
❌ Your commit
🧪 To try this PR locally, run `bunx bun-pr 26786`. That installs a local version of the PR, which can then be invoked as `bun-26786 --bun`.
Walkthrough
Adds tracking of whether the previous character was a backslash during bun_platform non-ASCII escaping, and uses that to choose between emitting full Unicode escapes (`\uXXXX`) or partial escapes (`uXXXX`) in template literals and RegExp literals. Also adds regression tests for the behavior.
🚥 Pre-merge checks: ✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/regression/issue/26785.test.ts`:
- Around line 11-33: Replace the tempDir + file I/O pattern used in the test
(the using dir = tempDir("issue-26785", { "test.js": ... }) and the subsequent
Bun.spawn with cwd) with a single spawn that runs inline source via
bunExe() -e. Stop creating "test.js" on disk and pass the script string as an
argument to bun, i.e. [bunExe(), "-e", "<script source>"] in the Bun.spawn call.
Remove the cwd/file-related setup and teardown. Apply the same replacement to
the other two test cases (the blocks around lines 47–62 and 78–97) so that each
test uses bunExe() with -e and no tempDir.
Summary
Problem
When the regex printer encountered a non-ASCII character that was preceded by a backslash escape, it incorrectly added another backslash before the unicode escape sequence, resulting in `\\uXXXX` instead of `\uXXXX`.
For example, `/[\⁄]/` (backslash + U+2044 fraction slash) was being printed as `/[\\u2044]/`, which changes the regex semantics from matching the fraction slash to matching `\`, `u`, `2`, `0`, `4`, `4` as separate characters.
This caused the regex from the issue to fail to match fractions.
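The regex from the issue is not reproduced here. As a minimal illustration of the semantic difference described above, the following TypeScript snippet compares the intended output with the buggy output; the variable names are illustrative and do not come from the PR.

```typescript
// Intended printer output: a character class containing only U+2044 (fraction slash).
const intended = /[\u2044]/;

// Buggy printer output: `\\` is a literal backslash, so the class contains
// the separate characters "\", "u", "2", "0" and "4".
const buggy = /[\\u2044]/;

const fractionSlash = "\u2044";

console.log(intended.test(fractionSlash)); // true
console.log(buggy.test(fractionSlash));    // false
console.log(buggy.test("u"));              // true
console.log(buggy.test("\\"));             // true
```

The buggy pattern is still a valid regex, which is why the problem shows up as a silent change in matching rather than as a syntax error.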
Solution
Track whether the previous character was a backslash when iterating through the regex literal. When generating unicode escape sequences for non-ASCII characters, omit the leading backslash if the previous character was already a backslash (which was escaping the non-ASCII character).
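The approach can be sketched in TypeScript as follows. This is not the PR's printer code, which is not shown on this page; the function name `escapeNonAscii` and its string-in, string-out shape are hypothetical. The sketch also assumes that two consecutive backslashes form an escaped backslash and therefore do not escape the character that follows, a detail the description above does not spell out.

```typescript
// Hypothetical helper: rewrites non-ASCII UTF-16 code units in a regex body
// as \uXXXX escapes without doubling a backslash that is already present.
function escapeNonAscii(source: string): string {
  let out = "";
  // True when the previous character was a backslash that begins an escape.
  let pendingBackslash = false;

  for (let i = 0; i < source.length; i++) {
    const ch = source.charAt(i);
    const unit = source.charCodeAt(i);

    if (unit < 0x80) {
      out += ch;
      // A backslash starts an escape unless it is itself being escaped.
      pendingBackslash = ch === "\\" && !pendingBackslash;
      continue;
    }

    const hex = ("0000" + unit.toString(16).toUpperCase()).slice(-4);
    // Reuse the existing backslash if there is one; otherwise emit our own.
    out += pendingBackslash ? "u" + hex : "\\u" + hex;
    pendingBackslash = false;
  }
  return out;
}

// Escaped fraction slash: the existing backslash is reused.
console.log(escapeNonAscii("[\\\u2044]")); // [\u2044]
// Unescaped fraction slash: a backslash is added.
console.log(escapeNonAscii("[\u2044]"));   // [\u2044]
```

Both inputs produce the same escaped class, so a regex built from either output still matches the fraction slash.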
Test plan
test/regression/issue/26785.test.ts
Closes #26785
🤖 Generated with Claude Code