Conversation
ea67f97 to
66eae13
Compare
|
Maybe related: #17657 |
7f7102e to
eb11922
Compare
|
@claude Please revive this PR and rebase it to the current main. Make sure to build and run the changed tests to make sure everything works. |
This comment was marked as off-topic.
This comment was marked as off-topic.
eb11922 to
7cdb0bb
Compare
|
Is there something blocking this PR? I just did a protracted debugging session where I looked through Markdown source files, a markdown to HTML pipeline, a template tag implementation, ESBuild, and finally Bun. It’s 2025 please use Unicode 🙃🥰😝 |
WalkthroughSwitches Latin‑1 cloning to UTF‑8 in module loader/cache and Postgres message handling; refactors js_printer to use a new prefers_ascii field and updated APIs; and adds/adjusts tests for String.raw/tagged templates, RegExp sources, Unicode handling, shell behavior, and several regressions. Changes
Suggested reviewers
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/js_printer.zig (1)
4948-4989: Bug: invalid escape for non‑BMP identifiers inprintIdentifierUTF16For code points > 0xFFFF, this prints
\ufollowed by raw UTF‑8 bytes, which is invalid JS. Use\u{...}or surrogate pairs.Apply:
@@ - if (p.prefers_ascii and c > last_ascii) { - switch (c) { - 0...0xFFFF => { - p.print( - [_]u8{ - '\\', - 'u', - hex_chars[c >> 12], - hex_chars[(c >> 8) & 15], - hex_chars[(c >> 4) & 15], - hex_chars[c & 15], - }, - ); - }, - else => { - p.print("\\u"); - var buf_ptr = p.writer.reserve(4) catch unreachable; - p.writer.advance(strings.encodeWTF8RuneT(buf_ptr[0..4], CodeUnitType, c)); - }, - } - continue; - } + if (p.prefers_ascii and c > last_ascii) { + if (c <= 0xFFFF) { + p.print([_]u8{ + '\\','u', + hex_chars[(c >> 12) & 0xF], + hex_chars[(c >> 8) & 0xF], + hex_chars[(c >> 4) & 0xF], + hex_chars[c & 0xF], + }); + } else { + // Use ES2015 identifier escape + p.print("\\u{"); + std.fmt.formatInt(c, 16, .lower, .{}, p) catch unreachable; + p.print("}"); + } + continue; + }
🧹 Nitpick comments (1)
src/sql/postgres/protocol/FieldMessage.zig (1)
58-59: Droppinglocalized_severity— confirm intentThe initializer no longer constructs
.localized_severity; such fields are now skipped. If intentional, add a brief comment explaining why it’s ignored and confirm downstream doesn’t display it.Optionally keep the case (using
String.cloneUTF8(message)) to preserve parity withseverityif there’s no downside.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (10)
src/bun.js/ModuleLoader.zig(5 hunks)src/bun.js/RuntimeTranspilerCache.zig(2 hunks)src/js_printer.zig(24 hunks)src/sql/postgres/protocol/FieldMessage.zig(1 hunks)test/bundler/bundler_string.test.ts(1 hunks)test/bundler/transpiler/transpiler.test.js(1 hunks)test/js/bun/shell/bunshell.test.ts(1 hunks)test/regression/issue/14976/14976.test.ts(1 hunks)test/regression/issue/8207/8207.test.ts(1 hunks)test/regression/issue/8745/8745.test.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (17)
test/**
📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)
Place all tests under the test/ directory
Files:
test/regression/issue/8207/8207.test.tstest/bundler/transpiler/transpiler.test.jstest/js/bun/shell/bunshell.test.tstest/regression/issue/14976/14976.test.tstest/bundler/bundler_string.test.tstest/regression/issue/8745/8745.test.ts
test/regression/issue/+([0-9])/**
📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)
Place regression tests under test/regression/issue/[number] (a numeric directory per issue)
Files:
test/regression/issue/8207/8207.test.tstest/regression/issue/14976/14976.test.tstest/regression/issue/8745/8745.test.ts
test/**/*.{js,ts}
📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)
test/**/*.{js,ts}: Write tests in JavaScript or TypeScript using Bun’s Jest-style APIs (test, describe, expect) and run with bun test
Prefer data-driven tests (e.g., test.each) to reduce boilerplate
Use shared utilities from test/harness.ts where applicable
Files:
test/regression/issue/8207/8207.test.tstest/bundler/transpiler/transpiler.test.jstest/js/bun/shell/bunshell.test.tstest/regression/issue/14976/14976.test.tstest/bundler/bundler_string.test.tstest/regression/issue/8745/8745.test.ts
test/**/*.test.ts
📄 CodeRabbit inference engine (test/CLAUDE.md)
test/**/*.test.ts: Name test files*.test.tsand usebun:test
Do not write flaky tests: never wait for arbitrary time; wait for conditions instead
Never hardcode port numbers in tests; useport: 0to get a random port
When spawning Bun in tests, usebunExe()andbunEnvfromharness
Preferasync/awaitin tests; for a single callback, usePromise.withResolvers()
Do not set explicit test timeouts; rely on Bun’s built-in timeouts
UsetempDir/tempDirWithFilesfromharnessfor temporary files and directories in tests
For large/repetitive strings in tests, preferBuffer.alloc(count, fill).toString()over"A".repeat(count)
Import common test utilities fromharness(e.g.,bunExe,bunEnv,tempDirWithFiles,tmpdirSync, platform checks, GC helpers)
In error tests, assert non-zero exit codes for failing processes and usetoThrowfor synchronous errors
Usedescribeblocks for grouping,describe.eachfor parameterized tests, snapshots withtoMatchSnapshot, and lifecycle hooks (beforeAll,beforeEach,afterEach); track resources for cleanup inafterEach
Useusing/await usingwith Bun resources (e.g., Bun.listen/connect/spawn/serve) to ensure cleanup in tests
Files:
test/regression/issue/8207/8207.test.tstest/js/bun/shell/bunshell.test.tstest/regression/issue/14976/14976.test.tstest/bundler/bundler_string.test.tstest/regression/issue/8745/8745.test.ts
test/regression/issue/**
📄 CodeRabbit inference engine (test/CLAUDE.md)
Place regression tests under
test/regression/issue/and organize by issue numberPlace regression tests under test/regression/issue/ (one per bug fix)
Files:
test/regression/issue/8207/8207.test.tstest/regression/issue/14976/14976.test.tstest/regression/issue/8745/8745.test.ts
test/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
test/**/*.test.{ts,tsx}: Test files must live under test/ and end with .test.ts or .test.tsx
In tests, always use port: 0; do not hardcode ports or roll your own random port
Prefer normalizeBunSnapshot for snapshotting test output instead of asserting raw strings
Do not write tests that assert absence of crashes (e.g., 'no panic' or 'no uncaught exception')
Use Bun’s Jest-compatible runner (import { test, expect } from "bun:test") for tests
Avoid shell commands like find or grep in tests; use Bun’s Glob and built-in tools instead
Prefer running tests via bun bd test and use provided harness utilities (bunEnv, bunExe, tempDir)
Use Bun.spawn with proper stdio handling and await proc.exited in process-spawning tests
Files:
test/regression/issue/8207/8207.test.tstest/js/bun/shell/bunshell.test.tstest/regression/issue/14976/14976.test.tstest/bundler/bundler_string.test.tstest/regression/issue/8745/8745.test.ts
test/bundler/**/*
📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)
Place bundler/transpiler/CSS/bun build tests under test/bundler/
Files:
test/bundler/transpiler/transpiler.test.jstest/bundler/bundler_string.test.ts
test/bundler/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place bundler/transpiler tests under test/bundler/
Files:
test/bundler/transpiler/transpiler.test.jstest/bundler/bundler_string.test.ts
test/js/**/*.{js,ts}
📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)
Place JavaScript and TypeScript tests under test/js/
Files:
test/js/bun/shell/bunshell.test.ts
test/js/bun/**/*.{js,ts}
📄 CodeRabbit inference engine (.cursor/rules/writing-tests.mdc)
Place Bun API tests under test/js/bun/, separated by category (e.g., test/js/bun/glob/)
Files:
test/js/bun/shell/bunshell.test.ts
test/js/**
📄 CodeRabbit inference engine (test/CLAUDE.md)
Organize unit tests for specific features under
test/js/by module
Files:
test/js/bun/shell/bunshell.test.ts
test/js/bun/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place Bun-specific API tests under test/js/bun/
Files:
test/js/bun/shell/bunshell.test.ts
src/**/*.zig
📄 CodeRabbit inference engine (.cursor/rules/building-bun.mdc)
Implement debug logs in Zig using
const log = bun.Output.scoped(.${SCOPE}, false);and invokinglog("...", .{})In Zig code, manage memory carefully and use defer for cleanup of allocations/resources
src/**/*.zig: In Zig source files, place all @import statements at the bottom of the file
Use @import("bun") instead of @import("root").bun when importing Bun in Zig
Files:
src/bun.js/RuntimeTranspilerCache.zigsrc/bun.js/ModuleLoader.zigsrc/sql/postgres/protocol/FieldMessage.zigsrc/js_printer.zig
**/*.zig
📄 CodeRabbit inference engine (.cursor/rules/javascriptcore-class.mdc)
**/*.zig: Declare the extern C symbol in Zig and export a Zig-friendly alias for use
Wrap the Bun____toJS extern in a Zig method that takes a JSGlobalObject and returns JSC.JSValue
Files:
src/bun.js/RuntimeTranspilerCache.zigsrc/bun.js/ModuleLoader.zigsrc/sql/postgres/protocol/FieldMessage.zigsrc/js_printer.zig
src/bun.js/**/*.zig
📄 CodeRabbit inference engine (.cursor/rules/zig-javascriptcore-classes.mdc)
src/bun.js/**/*.zig: In Zig binding structs, expose generated bindings via pub const js = JSC.Codegen.JS and re-export toJS/fromJS/fromJSDirect
Constructors and prototype methods should return bun.JSError!JSC.JSValue to integrate Zig error handling with JS exceptions
Use parameter name globalObject (not ctx) and accept (*JSC.JSGlobalObject, *JSC.CallFrame) in binding methods/constructors
Implement getters as get(this, globalObject) returning JSC.JSValue and matching the .classes.ts interface
Provide deinit() for resource cleanup and finalize() that calls deinit(); use bun.destroy(this) or appropriate destroy pattern
Access JS call data via CallFrame (argument(i), argumentCount(), thisValue()) and throw errors with globalObject.throw(...)
For properties marked cache: true, use the generated Zig accessors (NameSetCached/GetCached) to work with GC-owned values
In finalize() for objects holding JS references, release them using .deref() before destroy
Files:
src/bun.js/RuntimeTranspilerCache.zigsrc/bun.js/ModuleLoader.zig
src/**/js_*.zig
📄 CodeRabbit inference engine (.cursor/rules/registering-bun-modules.mdc)
src/**/js_*.zig: Implement JavaScript bindings in a Zig file named with a js_ prefix (e.g., js_smtp.zig, js_your_feature.zig)
Handle reference counting correctly with ref()/deref() in JS-facing Zig code
Always implement proper cleanup in deinit() and finalize() for JS-exposed types
Files:
src/js_printer.zig
src/{**/js_*.zig,bun.js/api/**/*.zig}
📄 CodeRabbit inference engine (.cursor/rules/registering-bun-modules.mdc)
Use bun.JSError!JSValue for proper error propagation in JS-exposed Zig functions
Files:
src/js_printer.zig
🧠 Learnings (16)
📚 Learning: 2025-10-04T09:51:30.294Z
Learnt from: CR
PR: oven-sh/bun#0
File: CLAUDE.md:0-0
Timestamp: 2025-10-04T09:51:30.294Z
Learning: Applies to test/**/*.test.{ts,tsx} : Prefer normalizeBunSnapshot for snapshotting test output instead of asserting raw strings
Applied to files:
test/js/bun/shell/bunshell.test.ts
📚 Learning: 2025-08-30T00:13:36.815Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/zig-javascriptcore-classes.mdc:0-0
Timestamp: 2025-08-30T00:13:36.815Z
Learning: Applies to src/bun.js/bindings/generated_classes_list.zig : Update src/bun.js/bindings/generated_classes_list.zig to include new classes
Applied to files:
src/bun.js/ModuleLoader.zig
📚 Learning: 2025-10-04T09:51:30.294Z
Learnt from: CR
PR: oven-sh/bun#0
File: CLAUDE.md:0-0
Timestamp: 2025-10-04T09:51:30.294Z
Learning: Applies to test/**/*.test.{ts,tsx} : Prefer running tests via bun bd test <file> and use provided harness utilities (bunEnv, bunExe, tempDir)
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-10-04T09:51:30.294Z
Learnt from: CR
PR: oven-sh/bun#0
File: CLAUDE.md:0-0
Timestamp: 2025-10-04T09:51:30.294Z
Learning: Applies to test/**/*.test.{ts,tsx} : Use Bun.spawn with proper stdio handling and await proc.exited in process-spawning tests
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:12:56.803Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/writing-tests.mdc:0-0
Timestamp: 2025-08-30T00:12:56.803Z
Learning: Applies to test/**/*.{js,ts} : Write tests in JavaScript or TypeScript using Bun’s Jest-style APIs (test, describe, expect) and run with bun test
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-09-03T17:10:13.486Z
Learnt from: CR
PR: oven-sh/bun#0
File: test/CLAUDE.md:0-0
Timestamp: 2025-09-03T17:10:13.486Z
Learning: Applies to test/**/*.test.ts : When spawning Bun in tests, use `bunExe()` and `bunEnv` from `harness`
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-10-04T09:51:30.294Z
Learnt from: CR
PR: oven-sh/bun#0
File: CLAUDE.md:0-0
Timestamp: 2025-10-04T09:51:30.294Z
Learning: Applies to test/**/*.test.{ts,tsx} : Use Bun’s Jest-compatible runner (import { test, expect } from "bun:test") for tests
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-09-03T17:10:13.486Z
Learnt from: CR
PR: oven-sh/bun#0
File: test/CLAUDE.md:0-0
Timestamp: 2025-09-03T17:10:13.486Z
Learning: Applies to test/**/*.test.ts : Name test files `*.test.ts` and use `bun:test`
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:12:56.803Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/writing-tests.mdc:0-0
Timestamp: 2025-08-30T00:12:56.803Z
Learning: Applies to test/js/bun/**/*.{js,ts} : Place Bun API tests under test/js/bun/, separated by category (e.g., test/js/bun/glob/)
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-09-03T17:10:13.486Z
Learnt from: CR
PR: oven-sh/bun#0
File: test/CLAUDE.md:0-0
Timestamp: 2025-09-03T17:10:13.486Z
Learning: Applies to test/**/*-fixture.ts : Name test fixture files that are spawned by tests with the suffix `-fixture.ts`
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:12:56.803Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/writing-tests.mdc:0-0
Timestamp: 2025-08-30T00:12:56.803Z
Learning: Applies to test/cli/**/*.{js,ts} : When testing Bun as a CLI, use spawn with bunExe() and bunEnv from harness, and capture stdout/stderr via pipes
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:09:39.100Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/dev-server-tests.mdc:0-0
Timestamp: 2025-08-30T00:09:39.100Z
Learning: Applies to test/bake/dev/bundle.test.ts : bundle.test.ts should contain DevServer-specific bundling tests
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-09-03T17:10:13.486Z
Learnt from: CR
PR: oven-sh/bun#0
File: test/CLAUDE.md:0-0
Timestamp: 2025-09-03T17:10:13.486Z
Learning: Applies to test/**/*.test.ts : Import common test utilities from `harness` (e.g., `bunExe`, `bunEnv`, `tempDirWithFiles`, `tmpdirSync`, platform checks, GC helpers)
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:09:39.100Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/dev-server-tests.mdc:0-0
Timestamp: 2025-08-30T00:09:39.100Z
Learning: Applies to test/bake/bake-harness.ts : Provide and maintain shared test utilities: devTest, prodTest, devAndProductionTest, Dev, Client, and helpers in the harness
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:09:39.100Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/dev-server-tests.mdc:0-0
Timestamp: 2025-08-30T00:09:39.100Z
Learning: Applies to test/bake/{dev/*.test.ts,dev-and-prod.ts} : Import testing utilities (devTest, prodTest, devAndProductionTest, Dev, Client) from test/bake/bake-harness.ts
Applied to files:
test/regression/issue/8745/8745.test.ts
📚 Learning: 2025-08-30T00:09:39.100Z
Learnt from: CR
PR: oven-sh/bun#0
File: .cursor/rules/dev-server-tests.mdc:0-0
Timestamp: 2025-08-30T00:09:39.100Z
Learning: Applies to test/bake/dev/*.test.ts : Write Dev Server and HMR tests in test/bake/dev/*.test.ts using devTest from the shared harness
Applied to files:
test/regression/issue/8745/8745.test.ts
🧬 Code graph analysis (1)
test/regression/issue/8745/8745.test.ts (2)
test/harness.ts (2)
tempDirWithFiles(260-267)bunExe(103-106)test/js/bun/shell/bunshell.test.ts (1)
bunEnv(15-28)
🪛 ast-grep (0.39.5)
test/regression/issue/8207/8207.test.ts
[warning] 17-17: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(replaced, "gu")
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Format
🔇 Additional comments (19)
src/bun.js/ModuleLoader.zig (1)
781-781: LGTM: Systematic encoding upgrade from Latin-1 to UTF-8.The change from
cloneLatin1tocloneUTF8across all source code string operations correctly enables preservation of non-ASCII characters in templates and regex sources, aligning with the PR objectives to fix Unicode handling in tagged template literals and RegExp sources.Also applies to: 1079-1079, 1123-1123, 1292-1292, 1425-1425, 1863-1863, 1897-1897, 2519-2519, 2599-2599
src/bun.js/RuntimeTranspilerCache.zig (2)
16-17: LGTM: Cache version bump for UTF-8 encoding change.The version increment and comment accurately reflect the encoding change. Bumping the cache version ensures cached Latin-1 content won't be incorrectly interpreted as UTF-8.
635-635: LGTM: Consistent UTF-8 encoding changes.The switch from
cloneLatin1tocloneUTF8and the debug logging adjustment from.latin1().lento.length()correctly implement UTF-8 handling in the transpiler cache, consistent with the ModuleLoader changes.Also applies to: 643-643
test/regression/issue/8207/8207.test.ts (2)
3-20: LGTM: Comprehensive test for regex source UTF-16 preservation.The test correctly validates that non-ASCII characters (¶, U+00B6) are preserved in regex sources and can be manipulated via string operations, addressing the parsel-js compatibility issue mentioned in #8207.
Note: The static analysis warning about ReDoS is a false positive—this is a test file using a known safe pattern, not production code processing untrusted input.
22-34: LGTM: Test validates regex matching behavior.The test correctly validates that the regex with UTF-16 characters works at runtime, complementing the source string manipulation test.
test/regression/issue/8745/8745.test.ts (4)
1-10: LGTM: Well-structured test setup.The test correctly:
- Imports harness utilities as per coding guidelines
- Uses UTF-8 byte arrays for expected output to validate Unicode handling
- Creates fixtures including a large file to trigger runtime transpiler cache (RTC)
12-28: LGTM: Direct run test follows best practices.The test correctly uses
Bun.spawnSyncwith harness utilities (bunExe(),bunEnv) and proper stdio handling to validate that non-ASCII characters in template literals are preserved when running TypeScript directly.
30-80: LGTM: Build-then-run tests provide comprehensive coverage.The tests validate that non-ASCII characters are correctly preserved through the build pipeline for both normal and minified builds, ensuring the UTF-8 handling works across all transpilation paths.
82-169: LGTM: Runtime transpiler cache tests validate caching path.The tests correctly exercise the runtime transpiler cache (RTC) code path with large files, validating that UTF-8 handling works consistently whether cached or not. The comment honestly notes the uncertainty about the exact threshold.
The RTC threshold is defined in
RuntimeTranspilerCache.zig:20asMINIMUM_CACHE_SIZE = 50 * 1024, so the 16 MB fixture definitely triggers it.src/js_printer.zig (9)
26-32: Escape policy switch matches objectivesSwitching to
prefers_asciiand allowing direct print of non‑ASCII when ASCII is not preferred while still escaping BOM/LS/PS is sound.Please confirm BOM handling: raw template parts bypass
canPrintWithoutEscapeand print as-is. Is that intentional for Node/Deno parity for String.raw? Based on objectives.
109-143: String emission path updated coherently
estimateLengthForUTF8andwritePreQuotedStringnow consultprefers_ascii, preserving ASCII bias yet emitting UTF‑8 when allowed. Control characters,${in templates, and LS/PS are handled.Also applies to: 145-156, 171-183, 183-207, 291-335
338-345: JSON quoting path OK
quoteForJSONpropagatesprefers_asciiand uses JSON‑safe escaping. No issues.
1537-1543: Propagation to UTF8/UTF16 string printersPassing
e.prefers_asciiintowritePreQuotedStringis correct and consistent.Also applies to: 1548-1553
1909-1916: Identifier classification respects ASCII preferenceUsing
p.prefers_asciito choose Latin‑1 vs UTF‑16 identifier checks aligns with the “ASCII unless needed” goal.
2742-2750: Tagged template raw: flip to UTF‑8 emission when neededDetecting non‑ASCII in
.rawparts and flippingp.prefers_ascii = falseensures the file can be loaded as UTF‑8 so raw chars are preserved (fixes String.raw/tagged template issues).This flip is global for the remainder of the file. Confirm this is intended (subsequent identifiers/strings may now print with Unicode instead of escapes), matching the loader’s “scan for non‑ASCII, load as UTF‑8” behavior described in the PR.
Also applies to: 2758-2766
3125-3130: RegExp.sourcecorrectnessFor non‑ASCII regex, switching off ASCII preference and printing the raw literal preserves
.source. Matches Node/Deno behavior.
5085-5094: Printer initialization change LGTMPlumbing
prefers_asciiintoPrinter.initis correct; no lifetime/alloc issues spotted.
5579-5673: Constructor sites consistent; one nuance in CJS path
printAst/printJSONpropagate flags correctly.- In
printCommonJS,NewPrinter(... is_bun_platform=false, ...)with a runtimeprefers_asciiis fine (avoids bun‑specific branches) while still honoring ASCII preference.Double‑check any CJS callers pass
prefers_ascii=truewhen targeting Bun’s ASCII‑preferred output to retain historical behavior.Also applies to: 5748-5768, 5933-5956
test/regression/issue/14976/14976.test.ts (1)
73-76: Good coverage for Unicode in tagged templatesValidates String.raw with non‑ASCII and ensures shell echo returns Unicode unchanged.
| TaggedTemplate7: { expr: 'String.raw`\xE6${"one"}`', print: true }, | ||
| TaggedTemplate8: { expr: 'String.raw`\u{10334}${"two"}→`', print: true }, |
There was a problem hiding this comment.
Fix the \xE6 escape so the test exercises the raw escape path
Line 95 currently reads 'String.raw\xE6${"one"}', but inside a JavaScript string literal \xE6 is parsed immediately into the character æ. That means this entry never emits a backslash in the generated source and ends up testing the same “literal æ” case covered by TaggedTemplate5, instead of the intended “written as \x… escape” scenario. Please escape the backslash so the bundler sees the actual \xE6 sequence.
- TaggedTemplate7: { expr: 'String.raw`\xE6${"one"}`', print: true },
+ TaggedTemplate7: { expr: 'String.raw`\\xE6${"one"}`', print: true },🤖 Prompt for AI Agents
In test/bundler/bundler_string.test.ts around lines 95-96, the test entry uses
the JavaScript string literal 'String.raw`\xE6${"one"}`' which causes \xE6 to be
interpreted as the character æ at parse time; change the literal to contain an
escaped backslash so the bundler sees the actual backslash escape sequence
(e.g., use '\\xE6' inside the string) so the test exercises the raw-escape path
rather than the pre-parsed æ character.
|
this is a blocker for me, I use emojis in sql syntax with drizzle :) |
markovejnovic
left a comment
There was a problem hiding this comment.
Looks good to me, I guess, but I don't know enough to be comfortable being the sole stamp on this PR.
| } | ||
|
|
||
| pub fn canPrintWithoutEscape(comptime CodePointType: type, c: CodePointType, comptime ascii_only: bool) bool { | ||
| pub fn canPrintWithoutEscape(comptime CodePointType: type, c: CodePointType, prefers_ascii: bool) bool { |
There was a problem hiding this comment.
Any reason to drop the comptime?
| .severity => FieldMessage{ .severity = String.cloneUTF8(message) }, | ||
| // Ignore this one for now. | ||
| // .localized_severity => FieldMessage{ .localized_severity = String.createUTF8(message) }, | ||
| // .localized_severity => FieldMessage{ .localized_severity = String.cloneUTF8(message) }, |
There was a problem hiding this comment.
Is this a correct/necessary change?
|
@markovejnovic Any chance it could get pushed for review internally? 🙏 This issue is such a pain for non-english people. I've given more details here: #15929 (comment) |
|
Is there any why we can help with getting this PR merged? It would solve quite a few problems for me and my team. |
What does this PR do?
Fixes #8745, fixes #15492, fixes #15929, fixes #16763, fixes #18115, fixes #8207
Changes
ascii_onlytoprefers_ascii. It will try to emit mostly ascii, but if a non-ascii character is encountered in a tagged template, it will emit it. Updates execution to scan the file to see if it contains any non-ascii and if it does, load it as utf-8 instead.Performance impact:
String.rawwith unicode)Benchmark
mkdir -p benches && bun ./bench.js// @bunfile, don't do this. in the case of loading a// @bunfile, use cloneUTF8