SpawnDev.ILGPU extends ILGPU with three browser GPU backends. It transpiles .NET IL into GPU shader languages (WGSL, GLSL, Wasm binary) at runtime.
dotnet build SpawnDev.ILGPU/SpawnDev.ILGPU.csproj # Main library (~2s)
dotnet build SpawnDev.ILGPU.slnx # Full solution
dotnet run --project SpawnDev.ILGPU.DemoConsole # Desktop tests (CUDA, OpenCL, CPU)
dotnet run --project SpawnDev.ILGPU.Demo # Browser tests (Blazor WASM → /tests)

Target: net10.0. PublishTrimmed and RunAOTCompilation must remain false because ILGPU relies on IL reflection at runtime.
Detailed constraints live in each directory's own CLAUDE.md. Read the relevant one when working in that area.
| Directory | What | Context File |
|---|---|---|
| `SpawnDev.ILGPU/WebGPU/` | WGSL transpiler, dispatch, buffers | `WebGPU/CLAUDE.md` |
| `SpawnDev.ILGPU/Wasm/` | Wasm binary compiler, worker dispatch | `Wasm/CLAUDE.md` |
| `SpawnDev.ILGPU/WebGL/` | GLSL transpiler, Transform Feedback | `WebGL/CLAUDE.md` |
| `ILGPU/` | Forked ILGPU core (IR, types, runtime) | `ILGPU/CLAUDE.md` |
| `ILGPU.Algorithms/` | Forked algorithms (Scan, RadixSort) | `ILGPU.Algorithms/CLAUDE.md` |
| `PlaywrightMultiTest/` | Unified test runner | `PlaywrightMultiTest/CLAUDE.md` |
| `.claude/skills/ilgpu_transpiler/` | Hard-won transpiler mapping rules | `SKILL.md` |
| Backend | Target | Shader Language | Key Constraint |
|---|---|---|---|
| WebGPU | Browser | WGSL | 4-byte alignment, uniformity analysis |
| WebGL | Browser | GLSL ES 3.0 | No shared memory/atomics/barriers |
| Wasm | Browser | WebAssembly binary | SharedArrayBuffer + multi-worker dispatch |
| CUDA | Desktop | PTX | Via upstream ILGPU |
| OpenCL | Desktop | OpenCL C | Via upstream ILGPU |
| CPU | Desktop | .NET | Via upstream ILGPU |
Tests in SpawnDev.ILGPU.Demo.Shared/UnitTests/BackendTestBase*.cs (~211 tests, Tests1-10). Backend-specific classes inherit and override unsupported tests. See PlaywrightMultiTest/CLAUDE.md for running tests.
Current results (March 2026): Wasm: 249 pass / 0 fail / 3 skip (v4.6.0).

Full hardwareConcurrency multi-worker barrier dispatch with pure spin barriers (i32.atomic.load loops) and in-Wasm phase dispatcher (no JS-Wasm boundary crossings between phases). All large sort tests (260K-4M) pass, including SpawnSceneSimulation (1.4M elements, multi-frame).

Key fixes: pure spin barrier (V8 wait32 visibility gap), shared memory alloca overlap (same-size dedup), IR address space aliasing (InferAddressSpaces guards), struct/scratch overlap, multi-pass scan, Float16, unsigned ops, 256 threads, memory.grow(), ViewSourceSequencer, subViewByteOffset, atomic RMW opcode table, CopyFromBuffer, onesComplementMask .tt template, per-worker scratch, atomic.fence at 3 sync points, float atomic stores, broadcast atomic store/load, barrier counter zeroing between groups.
Every kernel compilation auto-dumps generated code to a local folder via ShaderDebugService (registered in the demo's Program.cs). Use this — do NOT ask TJ to manually run tests or capture output.
- Run the demo, go to `/tests`
- Click "Set Debug Folder" → pick a local folder (e.g., `_debugdump`)
- Folder persists in IndexedDB across sessions: set once, works forever
debugfolder/
├── _DEBUG_README.md
├── latest.json ← live test results (updated each test)
├── test-run-YYYY-MM-DD_HH-mm-ss.json ← permanent test run history
├── wgsl/ ← WebGPU shaders with metadata headers
│ └── NNN_KernelName.wgsl
├── glsl/ ← WebGL shaders with metadata headers
│ └── NNN_KernelName.glsl
└── wasm/ ← Wasm binaries + compilation info
    ├── NNN_KernelName.wasm ← disassemble: wasm2wat --enable-threads
    └── NNN_KernelName.txt ← params, locals, barriers, shared mem size
- Find a kernel: Grep the `.txt` files for `hasBarriers=True`, `helpers=1`, etc.
- Disassemble Wasm: `wasm2wat --enable-threads NNN_kernel.wasm > kernel.wat`
- Read WGSL/GLSL: Files include metadata headers (kernel name, workgroup size, shared mem, bindings, timestamp)
- Track test results: `latest.json` updates after every test. Compare `test-run-*.json` across runs.
- The files are on disk. Do NOT ask TJ to capture output or run tests manually. Read the dump folder.
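The "find a kernel" step can also be scripted. The following is a minimal sketch, not part of the repository: `findKernels` is a hypothetical helper that scans the `wasm/` subfolder of the dump folder and returns the `.txt` info files containing every marker string you pass. It assumes only that the markers appear verbatim in those files, as `hasBarriers=True` and `helpers=1` do in the list above.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: list the kernel info files in <dumpDir>/wasm whose
// text contains every marker (plain substring match, e.g. 'hasBarriers=True').
function findKernels(dumpDir, markers) {
  const wasmDir = path.join(dumpDir, 'wasm');
  return fs.readdirSync(wasmDir)
    .filter((name) => name.endsWith('.txt'))
    .filter((name) => {
      const text = fs.readFileSync(path.join(wasmDir, name), 'utf8');
      return markers.every((marker) => text.includes(marker));
    })
    .sort();
}

// Example call (the folder name is whatever you picked in "Set Debug Folder"):
//   findKernels('_debugdump', ['hasBarriers=True', 'helpers=1'])
```

Each returned name shares its `NNN_KernelName` prefix with the matching `.wasm` file, so the result can be fed straight into the `wasm2wat` step.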
UnitTestsView writes results to the same debug folder via the ResultsDirectory parameter. latest.json is overwritten after EVERY test completion — it contains the full test suite state in real-time: pass/fail/skip/pending counts and per-test details (class, method, result, error, duration, stack trace). A timestamped test-run-*.json is written when the full run finishes.
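Comparing two runs can be sketched the same way. The helper below is hypothetical and assumes that `test-run-*.json` files have the same shape as `latest.json`: a `tests` array whose entries carry `className`, `method`, `result`, and `error`, with `'Error'` marking a failure. It reports tests that fail in the current run but did not fail in the previous one, including tests that are new in the current run.

```javascript
// Identify a test by class and method name.
function testKey(t) {
  return `${t.className}.${t.method}`;
}

// Hypothetical helper: tests that are 'Error' now but were not 'Error' before.
function findRegressions(previousRun, currentRun) {
  const failedBefore = new Set(
    previousRun.tests.filter((t) => t.result === 'Error').map(testKey)
  );
  return currentRun.tests
    .filter((t) => t.result === 'Error' && !failedBefore.has(testKey(t)))
    .map((t) => ({
      test: testKey(t),
      error: (t.error || '?').substring(0, 200), // keep long messages readable
    }));
}

// Example call, with both files already read and parsed:
//   findRegressions(JSON.parse(previousText), JSON.parse(currentText))
```

The comparison deliberately checks only for `'Error'`, so it does not depend on how passing or skipped tests are labeled.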
During test runs, read latest.json to see results as they happen. Don't wait for the run to finish. Parse it with node -e to find failures:
node -e "const d=JSON.parse(require('fs').readFileSync('path/to/latest.json','utf8')); console.log('Pass:',d.passed,'Fail:',d.failed,'Skip:',d.skipped,'Pending:',d.pending); d.tests.filter(t=>t.result==='Error').forEach(t=>console.log('FAIL:',t.className+'.'+t.method,'-',(t.error||'?').substring(0,200)));"

- Bugs found here are HIGHEST PRIORITY. SpawnDev.ILGPU is the foundation for SpawnDev.ILGPU.ML, SpawnScene, and every project that uses GPU compute. A bug here is a bug in everything. When a consuming project discovers a SpawnDev.ILGPU bug, stop all other work and fix it here first, with unit tests. No workarounds in consumers. No "fix it later." Treat every release as the final release.
- Correctness is non-negotiable. Performance is a close second. Kernels dispatch thousands of times/sec.
- No workarounds that mask problems. Fix root causes.
- Cross-backend impact: changes to `ILGPU/` affect all 6 backends. Consider all of them.
- No quick fixes: plan before implementing complex changes.
- Do not hardcode evolving hardware limits — preserve full i64 index paths.
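As one illustration of why the last rule matters (a sketch only, not code from the transpilers), the snippet below computes a byte offset twice: once with 32-bit wraparound via `Math.imul`, which behaves like an i32 multiply, and once as a 64-bit value via `BigInt`. The function names are hypothetical.

```javascript
// 32-bit path: the product wraps to a signed 32-bit integer, like an i32 multiply.
function byteOffset32(index, elementSize) {
  return Math.imul(index, elementSize);
}

// 64-bit path: BigInt arithmetic clamped to the i64 range keeps the full value.
function byteOffset64(index, elementSize) {
  return BigInt.asIntN(64, BigInt(index) * BigInt(elementSize));
}

// 600 million 8-byte elements need an offset of 4.8 billion bytes.
console.log(byteOffset32(600000000, 8)); // 505032704 (wrapped past 2^32)
console.log(byteOffset64(600000000, 8)); // 4800000000n
```

The 32-bit result is a plausible-looking but wrong offset, which is the kind of silent corruption a hardcoded size limit would only hide.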
These apply everywhere, not just one directory:
- No backend-specific kernel variants. NEVER create backend-specific copies of algorithm kernels (e.g., `WasmRadixSortKernel1`) to work around bugs. The same kernel must work on all 6 backends. Fix bugs in the codegen, dispatch, or memory management, not by duplicating the algorithm. Only acceptable if it is absolutely IMPOSSIBLE to fix any other way.
- Blazor WASM is single-threaded: all async, no blocking calls.
- T4 Templates in `ILGPU/`: check for `.tt` before editing `.cs`. Generated files are silently overwritten.
- Device loss detection. WebGPU: `device.lost` promise. WebGL: `webglcontextlost` event. Guards on dispatch/synchronize. Intentional disposal filtered out.
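On the browser side, the two device-loss signals named above can be sketched as follows. This is an assumption-labeled illustration, not the library's implementation: `watchDeviceLoss`, `onLost`, and `isDisposing` are hypothetical names. The browser APIs it relies on are the standard `GPUDevice.lost` promise (whose result has `reason` and `message`) and the `webglcontextlost` canvas event.

```javascript
// Hypothetical helper: report unexpected device loss for either backend.
// device      - a GPUDevice (optional)
// canvas      - the canvas owning the WebGL context (optional)
// onLost      - callback invoked with { backend, message }
// isDisposing - returns true while the app is tearing the context down itself
function watchDeviceLoss({ device, canvas, onLost, isDisposing = () => false }) {
  if (device) {
    device.lost.then((info) => {
      // reason === 'destroyed' means device.destroy() was called on purpose.
      if (info.reason === 'destroyed') return;
      onLost({ backend: 'WebGPU', message: info.message });
    });
  }
  if (canvas) {
    canvas.addEventListener('webglcontextlost', (event) => {
      event.preventDefault(); // required if the context should be restorable
      if (isDisposing()) return; // filter out intentional disposal
      onLost({ backend: 'WebGL', message: 'webglcontextlost' });
    });
  }
}
```

A dispatch or synchronize guard could then check a flag set by `onLost` before touching the device.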