fix: correct context usage calculation to include cacheRead tokens and handle accumulated values #280

The context formula was excluding cacheReadInputTokens, causing the gauge to drastically underestimate usage (e.g., 3% when reality was 23%). During multi-tool turns, accumulated token totals could exceed the context window, producing false 100% readings and premature compact warnings.

- Include cacheReadInputTokens in the formula (input + cacheRead + cacheCreation)
- Detect accumulated values (total > window) and return null to preserve last valid %
- Skip context updates during accumulated turns instead of displaying inflated values
- Fix MainPanel tooltip deriving from raw tab stats instead of preserved session percentage
- Handle group chat participant/moderator accumulated values with -1 sentinel
## Pull Request Review: Context Usage Calculation Fix

### Summary

This PR fixes critical bugs in the context window usage gauge that were causing both underestimation and false 100% readings. The solution is well-designed and thoroughly tested.

### Code Quality: ✅ Excellent

**Strengths**
### Potential Issues: Minor Observations

#### 1. Group Chat Sentinel Value (-1)

Location: […]

The use of a `-1` sentinel:

```typescript
// Current approach:
contextUsage: -1 // Magic number
```

Suggestion: Consider using a discriminated union or explicit type:

```typescript
type UsageUpdate =
  | { contextUsage: number; tokenCount: number; totalCost: number }
  | { totalCost: number }; // Omit context fields when accumulated
```

However, this is a minor point and the current approach works fine with clear comments.

#### 2. MainPanel Fallback Calculation

Location: the fallback logic derives tokens from the preserved percentage:

```typescript
const preservedPercentage = activeSession?.contextUsage ?? 0;
return Math.round((preservedPercentage / 100) * activeTabContextWindow);
```

This is clever but introduces a circular dependency: you're deriving tokens from a percentage that was originally derived from tokens. While this works for display purposes, consider whether this could cause any edge cases where the tooltip shows a value that doesn't quite match reality.

Minor risk: If […]

Suggestion: Could add a check to see if […]

### Performance: ✅ Excellent
### Security: ✅ No Concerns

### Architecture: ✅ Well-Designed

### Testing: ✅ Comprehensive

### Best Practices: ✅ Followed

- ✅ DRY principle - shared utility functions

### Recommendations

**Must-Have (None)**

All critical issues are resolved. This PR is ready to merge.

**Nice-to-Have (Optional)**
### Conclusion

Recommendation: ✅ APPROVE

This is an excellent PR. The fix properly addresses both the underestimation bug (missing `cacheReadInputTokens` in the formula) and the false 100% readings caused by accumulated values.

Great work on the thorough problem analysis and implementation! 🎉 Minor suggestions above are optional improvements and should not block merging.
## Problem
The context window usage gauge had several bugs causing inaccurate readings:
1. **Underestimation** — `cacheReadInputTokens` was excluded from the formula. For a single Anthropic API call, the total input context is `inputTokens + cacheReadInputTokens + cacheCreationInputTokens` (three partitions of the same input: uncached, cache-hit, newly-cached). Excluding cache-read meant the gauge showed ~3% when reality was ~23%.
2. **False 100%** — During complex multi-tool turns, Claude Code accumulates token values across all internal API calls in a single `result` JSONL message. With cache-read included, these accumulated totals easily exceed the 200K context window (e.g., 2.5M cache-read from 40 internal calls each reading ~62K). The old code capped at 100%, triggering false compact warnings.
3. **Tooltip inconsistency** — `MainPanel.tsx` recalculated context tokens independently from raw tab `usageStats`, bypassing the fix in `App.tsx`. The hover tooltip still showed 100% even when the gauge was correct.
4. **Group chat affected** — The usage-listener for group chat participants and moderators had the same formula bug, passing through inflated percentages.
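To make the first bug concrete, here is a minimal sketch of the corrected token sum. The function and field names are modeled on the PR's description, not copied from its source:

```typescript
// Shape of a single Anthropic API call's usage report (assumed field names).
interface UsageStats {
  inputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

// All three input partitions occupy the context window together.
function calculateContextTokens(u: UsageStats): number {
  return u.inputTokens + u.cacheReadInputTokens + u.cacheCreationInputTokens;
}

// Example: mostly cache-hit input against a 200K window. Counting only
// inputTokens reports ~3%; the full sum reports ~23%.
const usage: UsageStats = {
  inputTokens: 6_000,
  cacheReadInputTokens: 39_000,
  cacheCreationInputTokens: 1_000,
};
const contextWindow = 200_000;

console.log(Math.round((usage.inputTokens / contextWindow) * 100));             // 3
console.log(Math.round((calculateContextTokens(usage) / contextWindow) * 100)); // 23
```

The illustrative numbers mirror the 3%-vs-23% discrepancy reported above.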
## Root cause
Claude Code reports a single `result` JSONL event per turn with token values summed across all internal API calls. For a turn with N tool uses, each internal call reads the full conversation from cache, so:

- `cacheReadInputTokens` ≈ context_size × N (e.g., 62K × 40 = 2.5M)
- `inputTokens` ≈ uncached_portion × N
- `outputTokens` ≈ per_call_output × N

These sums are useful for billing but meaningless for measuring context window fill. A single API call's total input can never exceed the context window, so `total > window` definitively indicates accumulated values.

## Fix
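The core of the fix is this accumulated-value detection. A minimal sketch, with the exact signature assumed rather than taken from the PR:

```typescript
// Returns the context usage percentage, or null when the reported total
// can only be an accumulation over multiple internal API calls.
function estimateContextUsage(
  totalInputTokens: number, // input + cacheRead + cacheCreation
  contextWindow: number,
): number | null {
  // A single API call's input can never exceed the context window,
  // so total > window definitively indicates accumulated values.
  if (totalInputTokens > contextWindow) return null;
  return (totalInputTokens / contextWindow) * 100;
}

console.log(estimateContextUsage(46_000, 200_000));    // 23 (normal turn)
console.log(estimateContextUsage(2_500_000, 200_000)); // null (accumulated turn)
```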
- `calculateContextTokens` now includes all three input token types: `input + cacheRead + cacheCreation`.
- `estimateContextUsage` returns `null` when `total > contextWindow` (impossible for a single call → must be accumulated).
- When the estimate is `null`, the `App.tsx` onUsage handler skips the update entirely — the last valid percentage is preserved. No estimates, no high-water mark, just real measurements.
- The `MainPanel.tsx` tooltip now derives from `session.contextUsage` (the preserved percentage).
- Group chat passes a `-1` sentinel so the UI handler preserves previous values.
- `Math.min(100, ...)`: no longer needed — the `null` return handles overflow, and valid measurements are mathematically bounded to 0-100%.

## Behavior
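The resulting update behavior — the last valid reading survives an accumulated turn — can be sketched as follows. The handler and state names here are illustrative, not the PR's actual `App.tsx` code:

```typescript
function estimateContextUsage(total: number, window: number): number | null {
  return total > window ? null : (total / window) * 100;
}

let contextUsage: number | null = null; // last valid gauge reading

function onUsage(totalInputTokens: number, contextWindow = 200_000): void {
  const pct = estimateContextUsage(totalInputTokens, contextWindow);
  if (pct === null) return; // accumulated turn: keep the previous reading
  contextUsage = pct;
}

onUsage(46_000);    // normal turn  -> gauge reads 23%
onUsage(2_500_000); // accumulated  -> update skipped, still 23%
console.log(contextUsage); // 23
```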
## Files changed (11)
Source (5):
- `src/renderer/utils/contextUsage.ts` — Formula fix + accumulated detection
- `src/main/parsers/usage-aggregator.ts` — Same formula fix (main process copy)
- `src/renderer/App.tsx` — Simplified onUsage handler, skip update when null
- `src/renderer/components/MainPanel.tsx` — Derive from preserved percentage when accumulated
- `src/main/process-listeners/usage-listener.ts` — Group chat accumulated value handling

Tests (6):
- `contextUsage.test.ts` — Updated assertions for new formula, added accumulated detection tests
- `usage-aggregator.test.ts` — Same pattern, added accumulated detection tests
- `usage-listener.test.ts` — Updated fallback test for 200K default window
- `MainPanel.test.tsx` — Updated cap test to use preserved `session.contextUsage`
- `contextExtractor.test.ts` — Updated token totals (450→575, 350→475)
- `HistoryDetailModal.test.tsx` — Updated display percentage (10%→12%)

## Testing
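Assuming the detection behaves as described above, the accumulated-detection cases can be exercised in plain TypeScript. This is a test-style sketch with illustrative names and values, not the repository's actual test suite:

```typescript
function estimateContextUsage(total: number, window: number): number | null {
  return total > window ? null : (total / window) * 100;
}

// Tiny assertion helper so the sketch runs without a test framework.
function expect(label: string, cond: boolean): void {
  if (!cond) throw new Error(`FAIL: ${label}`);
  console.log(`ok - ${label}`);
}

expect("includes cache-read in a normal turn",
  estimateContextUsage(46_000, 200_000) === 23);
expect("returns null for accumulated totals",
  estimateContextUsage(2_500_000, 200_000) === null);
expect("boundary: exactly full window is a valid 100%",
  estimateContextUsage(200_000, 200_000) === 100);
```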