fix(posix): set IUTF8 in prompt() so backspace erases whole UTF-8 characters #27284
…racters

On POSIX systems, the kernel's canonical-mode line editing only erases whole UTF-8 characters on backspace when the IUTF8 termios flag is set. Without it, each backspace removes a single byte, corrupting multi-byte characters like CJK or emoji.

Set IUTF8 on stdin before reading prompt input and restore the original termios afterward, mirroring the existing Windows terminal mode handling.

Closes #27283

Co-Authored-By: Claude <noreply@anthropic.com>
Walkthrough
The PR enables the IUTF8 termios flag on POSIX systems in prompt handling, so that a backspace deletes a multi-byte UTF-8 character as a single unit. It adds regression tests covering multi-byte characters, mixed ASCII and UTF-8 content, and emoji input.
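The set-then-restore pattern described above can be sketched in TypeScript. This is only an illustration, not the PR's implementation: `TermiosPort`, `withIutf8`, and `FakeTty` are hypothetical names standing in for the real `tcgetattr`/`tcsetattr` calls, and skipping the write when the flag is already present is an assumption made here.

```typescript
// Hypothetical abstraction over tcgetattr/tcsetattr; not a real Bun or Node API.
interface TermiosPort {
  getInputFlags(): number;
  setInputFlags(flags: number): void;
}

// IUTF8 is 0040000 (octal) in Linux's termbits.h; treat the value as platform-specific.
const IUTF8 = 0x4000;

// Enable IUTF8 for the duration of `read`, then put the original flags back.
function withIutf8<T>(tty: TermiosPort, read: () => T): T {
  const original = tty.getInputFlags();
  const needsChange = (original & IUTF8) === 0;
  if (needsChange) tty.setInputFlags(original | IUTF8);
  try {
    return read();
  } finally {
    // Restore only if we changed something, even when `read` throws.
    if (needsChange) tty.setInputFlags(original);
  }
}

// In-memory stand-in for a terminal, used only to exercise the sketch.
class FakeTty implements TermiosPort {
  flags: number;
  writes: number[] = [];
  constructor(flags: number) {
    this.flags = flags;
  }
  getInputFlags(): number {
    return this.flags;
  }
  setInputFlags(flags: number): void {
    this.flags = flags;
    this.writes.push(flags);
  }
}
```

Putting the restore in `finally` means the caller's terminal settings come back even if the read fails, which is the property the commit message is after when it mentions mirroring the Windows terminal mode handling.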
🚥 Pre-merge checks: ✅ 4 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/regression/issue/27283.test.ts`:
- Around line 19-22: The destructured variable stdout from the Promise.all call
is unused in the assertions; update the three test instances to remove or ignore
it (e.g., replace stdout with an underscore or omit it) so the destructuring
becomes [stderr, exitCode] or [_stdout, stderr, exitCode], and keep the rest of
the call to Promise.all using proc.stdout.text(), proc.stderr.text(),
proc.exited to preserve behavior; update the matching assertions that reference
stderr and exitCode (symbols: stdout, stderr, exitCode, proc.stdout.text(),
proc.stderr.text(), proc.exited).
    const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);

    expect(stderr).toBe("笨蛋");
    expect(exitCode).toBe(0);
🧹 Nitpick | 🔵 Trivial
Consider removing unused stdout variable.
The stdout variable is destructured but never used in the assertions. This pattern repeats in all three tests (lines 19, 38, 57).
♻️ Suggested cleanup
- const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+ const [stderr, exitCode] = await Promise.all([proc.stderr.text(), proc.exited]);
✅ 1edaf — Looks good! Reviewed 2 files across |
Summary
- Set the `IUTF8` termios flag on stdin before reading `prompt()` input on POSIX systems, so the kernel's canonical-mode line editing erases whole UTF-8 characters on backspace instead of single bytes
- Restore the original termios afterward, mirroring the existing Windows `ENABLE_VIRTUAL_TERMINAL_INPUT` pattern
- … `IUTF8` is already set

Root Cause
Linux's canonical-mode line discipline (`n_tty.c`) handles backspace byte-by-byte by default. It only respects UTF-8 multi-byte character boundaries when the `IUTF8` termios input flag is set. Many environments (SSH sessions, containers, some terminal configs) don't set this flag automatically, causing `prompt()` to corrupt multi-byte characters (CJK, emoji, etc.) on backspace.

Closes #27283
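To make the root cause concrete, here is a simplified TypeScript model of the erase step. It is not the kernel's code: `eraseOne` and `isContinuationByte` are hypothetical names, and the model ignores echo handling and everything else the line discipline does. It relies only on the fact that UTF-8 continuation bytes have the bit pattern `10xxxxxx`.

```typescript
// UTF-8 continuation bytes look like 0b10xxxxxx.
function isContinuationByte(b: number): boolean {
  return (b & 0xc0) === 0x80;
}

// Simplified model of one backspace against the pending input line.
// With iutf8 = false, exactly one byte is dropped.
// With iutf8 = true, trailing continuation bytes and their lead byte are dropped together.
function eraseOne(line: Uint8Array, iutf8: boolean): Uint8Array {
  if (line.length === 0) return line;
  let end = line.length - 1;
  if (iutf8) {
    while (end > 0 && isContinuationByte(line[end]!)) end--;
  }
  return line.slice(0, end);
}

const typed = new TextEncoder().encode("笨蛋"); // 6 bytes, 3 per character

// Flag set: the whole second character goes away.
const withFlag = eraseOne(typed, true);
console.log(new TextDecoder().decode(withFlag)); // prints "笨"

// Flag unset: one byte is removed, leaving a truncated sequence behind.
const withoutFlag = eraseOne(typed, false);
console.log(withoutFlag.length); // prints 5
```

The second case is the corruption reported in the issue: the five remaining bytes end in an incomplete UTF-8 sequence, so whatever reads the line afterwards receives invalid text.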
Test plan
- … `prompt()`
- `bun bd test test/regression/issue/27283.test.ts` passes
- … `prompt()` and press backspace: characters are now erased whole

🤖 Generated with Claude Code