fix(posix): set IUTF8 in prompt() so backspace erases whole UTF-8 characters #27284

Open
robobun wants to merge 1 commit into main from claude/fix-prompt-iutf8-backspace
@robobun robobun commented Feb 20, 2026

Summary

  • Set the IUTF8 termios flag on stdin before reading prompt() input on POSIX systems, so the kernel's canonical-mode line editing erases whole UTF-8 characters on backspace instead of single bytes
  • Restore original termios settings after reading, mirroring the existing Windows ENABLE_VIRTUAL_TERMINAL_INPUT pattern
  • Gracefully skip the fix when stdin is not a TTY (piped input) or IUTF8 is already set
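
The save, set, and restore flow described in the bullets above can be sketched as follows. This is a minimal model, not the code in prompt.zig: `TermiosPort`, `FakeTty`, and `withIutf8` are hypothetical names introduced for illustration, standing in for `tcgetattr`/`tcsetattr` on stdin. The `IUTF8` value shown is the Linux one.

```typescript
// IUTF8 as defined in Linux's termios input flags (octal 0040000).
const IUTF8 = 0x4000;

// Hypothetical abstraction over tcgetattr/tcsetattr; not a Bun or Node API.
interface TermiosPort {
  isTty(): boolean;
  getInputFlags(): number;
  setInputFlags(flags: number): void;
}

// Runs `read` with IUTF8 enabled, then restores the original input flags.
// Does nothing extra when the port is not a TTY or the flag is already set.
function withIutf8<T>(port: TermiosPort, read: () => T): T {
  if (!port.isTty()) return read();
  const original = port.getInputFlags();
  if ((original & IUTF8) !== 0) return read();
  port.setInputFlags(original | IUTF8);
  try {
    return read();
  } finally {
    // Restore even if `read` throws.
    port.setInputFlags(original);
  }
}

// In-memory fake used only to exercise the sketch.
class FakeTty implements TermiosPort {
  flags: number;
  tty: boolean;
  writes: number[] = [];
  constructor(flags: number, tty: boolean) {
    this.flags = flags;
    this.tty = tty;
  }
  isTty(): boolean {
    return this.tty;
  }
  getInputFlags(): number {
    return this.flags;
  }
  setInputFlags(flags: number): void {
    this.flags = flags;
    this.writes.push(flags);
  }
}

const demoTty = new FakeTty(0, true);
const answer = withIutf8(demoTty, () => "typed line");
console.log(answer, demoTty.getInputFlags() === 0);
```

Restoring in a `finally` block plays the same role as the deferred restore mentioned in the review walkthrough: the terminal is left as it was found regardless of how the read ends.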

Root Cause

Linux's canonical-mode line discipline (n_tty.c) handles backspace byte-by-byte by default. It only respects UTF-8 multi-byte character boundaries when the IUTF8 termios input flag is set. Many environments (SSH sessions, containers, some terminal configs) don't set this flag automatically, causing prompt() to corrupt multi-byte characters (CJK, emoji, etc.) on backspace.
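
To make the byte-versus-character difference concrete, here is a simplified model of a canonical-mode erase. It is an illustration only, not the kernel's code: `eraseLast` and `isUtf8Continuation` are hypothetical helpers, and the line buffer is modeled as a plain array of byte values.

```typescript
// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
function isUtf8Continuation(byte: number): boolean {
  return (byte & 0xc0) === 0x80;
}

// Returns a copy of `line` with one erase applied at the end.
// iutf8 = false: exactly one byte is removed.
// iutf8 = true: trailing continuation bytes are removed together with
// the lead byte they belong to.
function eraseLast(line: readonly number[], iutf8: boolean): number[] {
  if (line.length === 0) return [];
  let end = line.length - 1;
  if (iutf8) {
    while (end > 0 && isUtf8Continuation(line[end] ?? 0)) end -= 1;
  }
  return line.slice(0, end);
}

// "笨蛋" encoded as UTF-8: two characters, three bytes each.
const typed = [0xe7, 0xac, 0xa8, 0xe8, 0x9b, 0x8b];

// With the flag: the whole second character goes, leaving "笨".
console.log(eraseLast(typed, true).length); // 3
// Without the flag: one byte goes, leaving a truncated sequence.
console.log(eraseLast(typed, false).length); // 5
```

The five-byte result in the second case ends in the middle of a character, which is the corruption described above.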

Closes #27283

Test plan

  • Added regression tests for multi-byte UTF-8, mixed ASCII/UTF-8, and emoji input via prompt()
  • bun bd test test/regression/issue/27283.test.ts passes
  • Manual verification: type multi-byte characters at a prompt() and press backspace — characters are now erased whole

🤖 Generated with Claude Code

…racters

On POSIX systems, the kernel's canonical-mode line editing only erases
whole UTF-8 characters on backspace when the IUTF8 termios flag is set.
Without it, each backspace removes a single byte, corrupting multi-byte
characters like CJK or emoji.

Set IUTF8 on stdin before reading prompt input and restore the original
termios afterward, mirroring the existing Windows terminal mode handling.

Closes #27283

Co-Authored-By: Claude <noreply@anthropic.com>

robobun commented Feb 20, 2026

Updated 10:06 AM PT - Feb 20th, 2026

❌ Your commit 1edafaaa has 5 failures in Build #37735 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 27284

That installs a local version of the PR into your bun-27284 executable, so you can run:

bun-27284 --bun


coderabbitai bot commented Feb 20, 2026

Walkthrough

The PR sets the IUTF8 termios flag on POSIX systems before prompt() reads input, so that canonical-mode backspace erases a multi-byte UTF-8 character as a single unit. It adds regression tests covering multi-byte characters, mixed ASCII/UTF-8 content, and emoji input.

Changes

| Cohort / File(s) | Summary |
|---|---|
| POSIX Terminal Configuration: src/bun.js/webcore/prompt.zig | Enables IUTF8 termios flag on POSIX systems to handle multi-byte UTF-8 character deletion as complete units. Retrieves current termios settings, enables flag if not already set, applies changes, and restores original configuration on function exit via deferred block. |
| Regression Tests: test/regression/issue/27283.test.ts | Adds three regression tests validating prompt() correctly processes multi-byte UTF-8 input, mixed ASCII-UTF-8 input, and emoji input with proper stdin handling, output verification, and process exit code validation. |

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and clearly describes the main change: setting IUTF8 in prompt() to fix backspace behavior for UTF-8 characters on POSIX systems. |
| Description check | ✅ Passed | The description comprehensively covers both required sections: it clearly explains what the PR does (setting IUTF8 flag, restoring termios, graceful handling) and provides thorough verification methods (regression tests, manual testing). |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #27283 by implementing the required fix: setting IUTF8 on POSIX to make backspace erase whole UTF-8 characters instead of bytes, with comprehensive regression tests validating the behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to fixing the UTF-8 backspace issue: termios configuration in prompt.zig and regression tests covering multi-byte UTF-8, mixed ASCII/UTF-8, and emoji inputs. No out-of-scope modifications detected. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/27283.test.ts`:
- Around line 19-22: The variable stdout destructured from the Promise.all call
is never used in the assertions. In all three tests, either drop both the name
and proc.stdout.text() so the destructuring becomes [stderr, exitCode], or keep
all three promises and ignore the first slot with [_stdout, stderr, exitCode].
Do not drop the name while keeping proc.stdout.text() in the array, because
stderr would then receive stdout's value. The assertions on stderr and exitCode
stay unchanged (symbols: stdout, stderr, exitCode, proc.stdout.text(),
proc.stderr.text(), proc.exited).

Comment on lines +19 to +22
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);

expect(stderr).toBe("笨蛋");
expect(exitCode).toBe(0);

🧹 Nitpick | 🔵 Trivial

Consider removing unused stdout variable.

The stdout variable is destructured but never used in the assertions. This pattern repeats in all three tests (lines 19, 38, 57).

♻️ Suggested cleanup
-  const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+  const [stderr, exitCode] = await Promise.all([proc.stderr.text(), proc.exited]);
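
Because array destructuring is positional, the name and the promise have to be removed together. The sketch below illustrates this with resolved promises standing in for proc.stdout.text(), proc.stderr.text(), and proc.exited; `fakeStdout`, `fakeStderr`, `fakeExited`, and `demo` are hypothetical names, and no process is actually spawned.

```typescript
// Stand-ins for proc.stdout.text(), proc.stderr.text(), and proc.exited.
const fakeStdout: Promise<string> = Promise.resolve("");
const fakeStderr: Promise<string> = Promise.resolve("笨蛋");
const fakeExited: Promise<number> = Promise.resolve(0);

async function demo() {
  // Option A: drop both the name and the promise.
  const [stderrA, exitCodeA] = await Promise.all([fakeStderr, fakeExited]);

  // Option B: keep all three promises and skip the first slot.
  const [, stderrB, exitCodeB] = await Promise.all([fakeStdout, fakeStderr, fakeExited]);

  // Pitfall: three promises but only two names. The names bind to the
  // first two results, so misStderr holds stdout and misExit holds stderr.
  const [misStderr, misExit] = await Promise.all([fakeStdout, fakeStderr, fakeExited]);

  return { stderrA, exitCodeA, stderrB, exitCodeB, misStderr, misExit };
}

demo().then((r) => console.log(r));
```

Option B still reads stdout to completion, which may be preferable if a test should keep draining that pipe; Option A is the shorter form shown in the suggested cleanup.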


claude bot commented Feb 20, 2026

1edaf — Looks good!

Reviewed 2 files across src/bun.js/webcore/ and test/regression/issue/: Sets the IUTF8 termios flag on stdin before reading prompt() input on POSIX systems so the kernel's canonical-mode line editing erases whole UTF-8 characters on backspace instead of individual bytes.


Development

Successfully merging this pull request may close these issues.

prompt() failing to handle deleting multi-byte characters
