Allow JIT and CPU thread to work together #1806

Merged: midwan merged 4 commits into master from claude/fix-cpu-threads-jit-UJ2Q6 on Feb 22, 2026
midwan (Collaborator) commented Feb 21, 2026

The code already supports JIT + CPU thread via cpu_thread_run_jit(), which
runs JIT-compiled code on a separate thread while run_cpu_thread() manages
cycle timing on the main thread. The JIT helper functions (do_nothing,
exec_nostats, execute_normal) already skip do_cycles() when cpu_thread is
enabled, and do_specialties_thread() handles SPCFLAG_END_COMPILE.
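
As a rough illustration of the "skip do_cycles() when cpu_thread is enabled" pattern described above, here is a minimal sketch. The `EmuState` struct, `do_cycles_stub()` and `jit_block_epilogue()` are hypothetical stand-ins invented for this example; they are not the real Amiberry types or signatures.

```cpp
#include <cstdint>

// Hypothetical, simplified state; not Amiberry's real structures.
struct EmuState {
    bool cpu_thread = false;            // true when the CPU runs on its own thread
    std::int64_t cycles_accounted = 0;  // cycles charged by the calling thread
};

// Stand-in for do_cycles(): advances emulated time by n cycles.
void do_cycles_stub(EmuState& s, std::int64_t n) {
    s.cycles_accounted += n;
}

// Assumed shape of a JIT helper epilogue: it only charges cycles itself
// when there is no separate CPU thread.
void jit_block_epilogue(EmuState& s, std::int64_t block_cycles) {
    if (!s.cpu_thread) {
        do_cycles_stub(s, block_cycles);
    }
    // With cpu_thread enabled, cycle timing is left to the main thread.
}
```

In this sketch the threaded configuration leaves `cycles_accounted` untouched, which mirrors the division of labour the description attributes to run_cpu_thread().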

Remove the artificial restrictions that prevented enabling both:

  • amiberry.cpp: Remove target_fixup_options() force-disable of cpu_thread
    when cachesize > 0
  • cpu.cpp: Allow JIT checkbox when cpu_thread is enabled
  • cpu.cpp: Allow cpu_thread checkbox when JIT cache is enabled

https://claude.ai/code/session_01V9mxWsSPRWrdiYzAfEuCng

claude and others added 4 commits February 21, 2026 07:10
…ocol

The cross-thread indirect memory access path was using semaphores
(cpu_out_sema/cpu_in_sema) for every IO request, which added significant
kernel overhead per custom register/CIA access. With JIT enabled, the CPU
thread runs much faster and hits these IO paths more frequently, making
the synchronization cost the primary bottleneck.

Changes:
- Replace semaphore-based signaling with pure atomic spin-wait protocol
  on the hot path. The cpu_thread_indirect_mode atomic variable is now
  the sole synchronization mechanism for IO requests.
- Extract service_cpu_indirect_request() helper that the main thread
  calls to poll and complete one pending request.
- Main thread now checks for IO requests EVERY cycle inside the cycle
  loop (not just at the top of the outer loop), reducing worst-case
  latency from 128-256 CCKs to 1 CCK.
- Flush register batch on main thread at every IO service point.
- Service IO requests during frame limiter sleep instead of sleeping
  through them.
- Add SPCFLAG_MODE_CHANGE escape from spin-wait to prevent deadlock
  during shutdown.

Semaphores are retained only for rare operations (CPU reset, CPU stop
wakeup, shutdown drain).
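
To see why polling inside the cycle loop cuts the worst-case latency mentioned in the list above, consider this toy model. It is an illustration only: `service_latency()` and its parameters are hypothetical, and the 128-cycle interval is taken from the lower end of the range quoted in the commit message.

```cpp
// Toy model: a request is raised during cycle `request_cycle`, and the main
// loop polls for requests at the top of every `poll_interval`-th cycle
// (poll_interval must be > 0). Returns how many cycles the request waits,
// or -1 if it is not serviced within `total_cycles`.
int service_latency(int total_cycles, int poll_interval, int request_cycle) {
    bool pending = false;
    for (int cycle = 0; cycle < total_cycles; ++cycle) {
        if (pending && cycle % poll_interval == 0) {
            return cycle - request_cycle;  // serviced at this poll point
        }
        // ... one cycle of chipset emulation would run here ...
        if (cycle == request_cycle) {
            pending = true;  // the CPU thread raises its request
        }
    }
    return -1;
}
```

In this model a request raised just after a poll waits a full interval, so `service_latency(512, 128, 0)` yields 128, while polling every cycle, `service_latency(512, 1, 0)`, yields 1.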

https://claude.ai/code/session_01V9mxWsSPRWrdiYzAfEuCng
The lockless spin optimization calls cpu_thread_flush_register_batch()
from run_cpu_thread(), which requires the declaration from cpu_thread.h.

https://claude.ai/code/session_01V9mxWsSPRWrdiYzAfEuCng
midwan merged commit e2388e1 into master on Feb 22, 2026
22 checks passed
midwan deleted the claude/fix-cpu-threads-jit-UJ2Q6 branch on February 22, 2026 18:22