Performance — Tier 2 (Medium Impact)
File: src/server.ts:346
Problem
getContextWindow() is called on every throttled WS emit (~4 calls/sec). It does a linear prefix scan on the model name to determine the context window size. The model does not change mid-stream, so this is wasted work.
Fix
Cache contextWindow once when actualModel is resolved (start of transform), store on the transform instance, and reuse for all subsequent emits.
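A minimal sketch of the proposed caching, not the actual code in src/server.ts. The prefix table, the fallback value, the StreamTransform class, its buildEmitPayload method, and the scanCount counter are all hypothetical names introduced for illustration. Only getContextWindow, actualModel, and contextWindow come from the note above, and this getContextWindow body is a stand-in for the real one.

```typescript
// Hypothetical prefix table; the real entries are not shown in this note.
const CONTEXT_WINDOWS: ReadonlyArray<readonly [string, number]> = [
  ["example-large", 200_000],
  ["example-small", 32_000],
];
const DEFAULT_CONTEXT_WINDOW = 8_000; // assumed fallback for unknown models

let scanCount = 0; // instrumentation for this sketch only

// Stand-in for getContextWindow(): a linear prefix scan on the model name.
function getContextWindow(model: string): number {
  scanCount++;
  for (const [prefix, tokens] of CONTEXT_WINDOWS) {
    if (model.startsWith(prefix)) return tokens;
  }
  return DEFAULT_CONTEXT_WINDOW;
}

// Hypothetical transform: the window is resolved once and stored on the instance.
class StreamTransform {
  private readonly actualModel: string;
  private readonly contextWindow: number;

  constructor(actualModel: string) {
    this.actualModel = actualModel;
    // The only scan for this stream happens here, when the model is resolved.
    this.contextWindow = getContextWindow(actualModel);
  }

  // Called on each throttled emit; reads the cached value, no scan.
  buildEmitPayload(tokensUsed: number): {
    model: string;
    contextWindow: number;
    usedFraction: number;
  } {
    return {
      model: this.actualModel,
      contextWindow: this.contextWindow,
      usedFraction: tokensUsed / this.contextWindow,
    };
  }
}

// Usage: four emits after construction trigger no further scans.
const transform = new StreamTransform("example-large-v1");
for (let i = 1; i <= 4; i++) {
  transform.buildEmitPayload(i * 1_000);
}
console.log(scanCount); // 1
```

Storing the value on the instance ties its lifetime to the stream, so no invalidation logic is needed as long as the model really is fixed for the duration of the transform.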
Impact
Eliminates ~4 linear scans/sec during active streaming; the lookup runs once per transform instead of once per emit.
Source: Performance review (2026-04-25)