[Windows] N-API vtable corruption causes segfault, TUI escape, and orphaned processes in Bun v1.3.10 #27471

@ThatDragonOverThere

Summary

Bun v1.3.10 (bundled with Claude Code / Anthropic CLI) has a critical N-API race condition on Windows that causes:

  1. Segfaults at addresses 0x18, 0x7, 0x113, 0xFFFF (vtable corruption pattern)
  2. TUI escape: the terminal state machine dumps raw CSI focus reporting sequences (ESC [ I and ESC [ O, which show up as [I[O[) and internal message buffers into the host shell, which then tries to execute them as commands
  3. Orphaned processes: crashed Bun processes stay alive, leak memory (1GB+ each), and accumulate until the system runs out of RAM
  4. Full computer lockups: 5 hard reboots in 8 days from orphaned-process memory exhaustion
  5. Config file corruption: non-atomic writes during a crash leave truncated JSON (280+ corruption events in one day)

Reproduction

  • Application: Claude Code (Anthropic CLI), which bundles Bun v1.3.10
  • Platform: Windows 11, x86_64 (MSYS2/Git Bash terminal)
  • Trigger: Extended sessions (30min+) with sub-agent spawning (multiple concurrent Bun child processes). Also occurs in single-process sessions after ~4-5 hours.
  • Frequency: 42 crashes in 33 days (about 1.3 per day), peaking at 22+ crashes in a single 24-hour period (Feb 25-26, 2026).

Steps to Reproduce

  1. Run Claude Code on Windows with multiple concurrent sub-agent tasks
  2. Wait 30-90 minutes
  3. TUI escape occurs: [I[O[I[O[ dumped to terminal, session dies
  4. Bun process stays alive with no window handle, consuming 400MB-1GB+ RAM
  5. Repeat until system runs out of memory

Crash Variants Observed

  • panic(main thread): Segmentation fault at address 0x18 (the most common panic)
  • panic: switch on corrupt value (vtable corruption)
  • SIGILL: illegal instruction (less common)
  • Silent TUI escape with no panic output (the most frequent variant on Windows)

Root Cause Analysis

WinDbg analysis (posted on anthropics/claude-code#21875) points to N-API callback vtable corruption when TUI rendering and child process I/O run concurrently. The crash site is in the N-API bridge between Bun's native runtime and the JavaScript TUI layer.

Community member @balandari identified the pattern: the vtable pointer is corrupted to point at an invalid address, causing the "switch on corrupt value" panic when the runtime tries to dispatch through it. This is consistent with a use-after-free or double-free in the N-API reference tracking.

Terminal State Corruption

When Bun crashes, it does NOT:

  • Exit the alternate screen buffer
  • Disable focus reporting (CSI ?1004l)
  • Restore the terminal from raw mode to cooked mode
  • Kill child processes / process group
  • Clean up ConPTY handles (Windows)

This means:

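  • The terminal keeps emitting focus reporting events (ESC [ I / ESC [ O, visible as [I / [O), which now land in the host shell's input because the process that requested them is dead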
  • Internal message buffers get flushed to stdout as raw text
  • On Windows, these strings are interpreted as PowerShell commands: The : The term 'The' is not recognized...
  • The terminal remains corrupted until manually reset (stty sane / restart)
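
As an illustration of the cleanup that is missing, here is a minimal Python sketch of the escape sequences a crash handler would need to emit. The function names are hypothetical, and the choice of modes is an assumption based on the list above: `CSI ?1004l` is the focus reporting reset named in this report, and `CSI ?1049l` is the standard xterm sequence for leaving the alternate screen buffer. It does not cover switching from raw mode back to cooked mode, which needs a platform call rather than an escape sequence.

```python
import sys

ESC = "\x1b"

# Private-mode resets to send before the process goes away.
# Which modes a given TUI actually enabled is an assumption here.
RESET_SEQUENCES = (
    f"{ESC}[?1004l",  # disable focus reporting (stops ESC [ I / ESC [ O events)
    f"{ESC}[?1049l",  # leave the alternate screen buffer
)


def terminal_reset_sequence() -> str:
    """Return the concatenated reset sequences as one string."""
    return "".join(RESET_SEQUENCES)


def restore_terminal(stream=None) -> None:
    """Write the reset sequences to the given stream (stdout by default)."""
    stream = stream if stream is not None else sys.stdout
    stream.write(terminal_reset_sequence())
    stream.flush()
```

Running `restore_terminal()` by hand in a corrupted terminal should at least stop the focus events, assuming the terminal honors these sequences.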

Orphaned Process Impact

After each crash, 1-3 Bun processes remain alive with:

  • No window handle (headless)
  • Increasing memory consumption (400MB → 1GB+ over 1-3 hours)
  • Still receiving terminal focus events via the broken ConPTY

Kill log from our custom orphan killer script (last 24 hours):

PID=32164 Mem=1059.1MB Runtime=00:33:24
PID=44712 Mem=887.2MB  Runtime=03:00:31
PID=42640 Mem=1032.4MB Runtime=01:35:53
PID=39384 Mem=1045.1MB Runtime=01:54:50
PID=57052 Mem=972.5MB  Runtime=03:04:45
PID=24036 Mem=1028.8MB Runtime=02:28:29
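
For anyone who wants to aggregate a log like this, the following Python sketch parses the lines above and sums the memory held by the killed orphans. The log format is taken from the excerpt; the helper name `parse_kill_log` is hypothetical and this is not the orphan killer script itself.

```python
import re

KILL_LOG = """\
PID=32164 Mem=1059.1MB Runtime=00:33:24
PID=44712 Mem=887.2MB  Runtime=03:00:31
PID=42640 Mem=1032.4MB Runtime=01:35:53
PID=39384 Mem=1045.1MB Runtime=01:54:50
PID=57052 Mem=972.5MB  Runtime=03:04:45
PID=24036 Mem=1028.8MB Runtime=02:28:29
"""

# One entry per line: PID, memory in MB, runtime as HH:MM:SS.
LINE_RE = re.compile(r"PID=(\d+)\s+Mem=([\d.]+)MB\s+Runtime=(\d+):(\d+):(\d+)")


def parse_kill_log(text):
    """Return a list of dicts with pid, mem_mb and runtime_s for each log line."""
    entries = []
    for match in LINE_RE.finditer(text):
        pid, mem, hours, minutes, seconds = match.groups()
        runtime_s = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        entries.append({"pid": int(pid), "mem_mb": float(mem), "runtime_s": runtime_s})
    return entries


entries = parse_kill_log(KILL_LOG)
total_mb = sum(entry["mem_mb"] for entry in entries)
longest = max(entries, key=lambda entry: entry["runtime_s"])
print(f"{len(entries)} orphans, {total_mb:.1f} MB total, longest-lived PID {longest['pid']}")
```

For the six entries shown, the total comes to roughly 6 GB of memory held by processes that no longer had a window.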

Downstream Impact

Environment

  • Bun: v1.3.10 (bundled with Claude Code v2.1.50 through v2.1.59; every version in that range crashes)
  • OS: Windows 11 (MSYS_NT-10.0-26200, x86_64)
  • Terminal: Windows Terminal + Git Bash / MSYS2
  • RAM: 32GB (orphaned processes exhaust this in 2-4 hours)
  • Crash rate: ~1.3 per day average, up to 22 per day

Expected Behavior

  • Bun should not segfault during normal N-API callback dispatch
  • On crash, terminal state should be restored (alternate screen, focus reporting, raw mode)
  • On crash, child processes should be killed (process group cleanup)
  • File writes should be atomic (temp + rename) to prevent corruption on crash
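
The "temp + rename" pattern from the last point can be sketched in Python as follows. The function name `atomic_write_json` is hypothetical, and this is only an illustration of the technique, not how Bun or Claude Code writes its config. It relies on `os.replace`, which swaps the file in a single rename step, so a crash mid-write should leave either the old file or the new one rather than a truncated mix; how strong that guarantee is depends on the filesystem.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Write data as JSON to path via a temp file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same volume so the rename stays a rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the complete file into place
    except BaseException:
        # Leave the original file untouched and drop the partial temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader of the config file then never sees a half-written document, because the target path only ever points at a fully written file.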

Requested Fixes

  1. N-API thread safety audit — the vtable corruption pattern suggests a concurrency issue in reference tracking
  2. Crash cleanup handler — atexit/signal handler to restore terminal state and kill child processes
  3. Process group management — child processes should die with parent on Windows (Job Objects)
  4. ConPTY cleanup — proper handle lifecycle on crash
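
To make requests 2 and 3 concrete, here is a small Python sketch of a parent that tracks its children and kills them on exit. The class name `ChildRegistry` is hypothetical. One important caveat: an `atexit` hook only runs on a normal interpreter exit, so it would not fire after a segfault like the ones in this report. Surviving a hard crash needs an OS-level mechanism, which on Windows means a Job Object created with `JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`; that part is not shown here.

```python
import atexit
import subprocess
import sys


class ChildRegistry:
    """Track spawned child processes and kill any survivors on exit."""

    def __init__(self):
        self._children = []
        # Runs on normal interpreter exit only, not after a hard crash.
        atexit.register(self.kill_all)

    def spawn(self, args):
        """Start a child process and remember it for later cleanup."""
        child = subprocess.Popen(args)
        self._children.append(child)
        return child

    def kill_all(self, timeout=5.0):
        """Kill every child that is still running, then reap them."""
        for child in self._children:
            if child.poll() is None:
                child.kill()
        for child in self._children:
            try:
                child.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                pass  # give up on this child; nothing more to do here
        self._children.clear()


if __name__ == "__main__":
    registry = ChildRegistry()
    worker = registry.spawn([sys.executable, "-c", "import time; time.sleep(60)"])
    registry.kill_all()
    print("worker exit code:", worker.returncode)
```

The design point is that cleanup is tied to the parent's lifetime rather than left to each child, which is the same idea a Job Object enforces at the kernel level.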
