
fix: correct context usage calculation to include cacheRead tokens an… #280

Merged
reachraza merged 1 commit into main from code-refactor on Feb 2, 2026

Conversation

@reachraza
Contributor

Problem

The context window usage gauge had several bugs causing inaccurate readings:

  1. Underestimation — cacheReadInputTokens was excluded from the formula. For a single Anthropic API call, the total input context is inputTokens + cacheReadInputTokens + cacheCreationInputTokens (three partitions of the same input: uncached, cache-hit, newly-cached). Excluding cache-read meant the gauge showed ~3% when reality was ~23%.

  2. False 100% — During complex multi-tool turns, Claude Code accumulates token values across all internal API calls in a single result JSONL message. With cache-read included, these accumulated totals easily exceed the 200K context window (e.g., 2.5M cache-read from 40 internal calls each reading ~62K). The old code capped at 100%, triggering false compact warnings.

  3. Tooltip inconsistency — MainPanel.tsx recalculated context tokens independently from the raw tab usageStats, bypassing the fix in App.tsx. The hover tooltip still showed 100% even when the gauge was correct.

  4. Group chat affected — The usage-listener for group chat participants and moderators had the same formula bug, passing through inflated percentages.
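The underestimation in item 1 can be reproduced with illustrative numbers. The token counts below are hypothetical, chosen only to match the ~3%/~23% figures above:

```typescript
// Hypothetical single-call token counts (not taken from the PR):
const inputTokens = 6_000;            // uncached portion
const cacheReadInputTokens = 40_000;  // cache-hit portion
const cacheCreationInputTokens = 0;   // newly-cached portion
const contextWindow = 200_000;

// Old (buggy) formula: cache-read excluded
const oldTotal = inputTokens + cacheCreationInputTokens;
// Fixed formula: all three input partitions
const fixedTotal = inputTokens + cacheReadInputTokens + cacheCreationInputTokens;

console.log(Math.round((oldTotal / contextWindow) * 100));   // 3
console.log(Math.round((fixedTotal / contextWindow) * 100)); // 23
```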

Root cause

Claude Code reports a single result JSONL event per turn with token values summed across all internal API calls. For a turn with N tool uses, each internal call reads the full conversation from cache, so:

  • cacheReadInputTokens ≈ context_size × N (e.g., 62K × 40 = 2.5M)
  • inputTokens ≈ uncached_portion × N
  • outputTokens ≈ per_call_output × N

These sums are useful for billing but meaningless for measuring context window fill. A single API call's total input can never exceed the context window, so total > window definitively indicates accumulated values.
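The accumulation arithmetic above can be checked directly, using the example figures from the bullets (variable names are illustrative):

```typescript
// Example multi-tool turn from the description: 40 internal calls,
// each reading the full ~62K conversation from cache.
const internalCalls = 40;
const contextSize = 62_000;
const contextWindow = 200_000;

const accumulatedCacheRead = contextSize * internalCalls;
console.log(accumulatedCacheRead); // 2480000 (~2.5M)

// A single call's input can never exceed the window, so this total
// definitively indicates accumulated (per-call summed) values.
console.log(accumulatedCacheRead > contextWindow); // true
```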

Fix

  • Formula: calculateContextTokens now includes all three input token types: input + cacheRead + cacheCreation
  • Accumulated detection: estimateContextUsage returns null when total > contextWindow (impossible for a single call → must be accumulated)
  • App.tsx: When null, skips the update entirely — the last valid percentage is preserved. No estimates, no high-water mark, just real measurements.
  • MainPanel.tsx: When raw tokens exceed window, back-derives display tokens from session.contextUsage (the preserved percentage)
  • usage-listener.ts: Group chat participant uses conditional update (omits context fields when accumulated). Moderator emits -1 sentinel so the UI handler preserves previous values.
  • Removed Math.min(100, ...): No longer needed — the null return handles overflow, and valid measurements are mathematically bounded to 0-100%.
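A minimal sketch of the fixed calculation and accumulated-value detection, assuming the function names from the bullets above (the actual implementations live in contextUsage.ts and usage-aggregator.ts and may differ in detail):

```typescript
const DEFAULT_CONTEXT_WINDOW = 200_000;

interface UsageStats {
  inputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

// All three input partitions count toward context window fill.
function calculateContextTokens(u: UsageStats): number {
  return u.inputTokens + u.cacheReadInputTokens + u.cacheCreationInputTokens;
}

// Returns a 0-100 percentage, or null when the totals must be accumulated.
function estimateContextUsage(
  usage: UsageStats,
  contextWindow: number = DEFAULT_CONTEXT_WINDOW,
): number | null {
  const total = calculateContextTokens(usage);
  // Impossible for a single call, so the values were summed across
  // internal API calls; the caller keeps the last valid percentage.
  if (total > contextWindow) return null;
  return (total / contextWindow) * 100;
}

// Normal turn: valid percentage
console.log(estimateContextUsage({
  inputTokens: 6_000, cacheReadInputTokens: 40_000, cacheCreationInputTokens: 0,
})); // 23
// Multi-tool turn: accumulated totals return null
console.log(estimateContextUsage({
  inputTokens: 0, cacheReadInputTokens: 2_500_000, cacheCreationInputTokens: 0,
})); // null
```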

Behavior

Scenario                        Before                  After
Normal turn                     underestimated          accurate
Multi-tool turn                 100% (false positive)   Holds last valid %
Hover tooltip during tool turn  100%                    Same as gauge
Compact warning                 Premature               Only at real thresholds

Files changed (11)

Source (5):

  • src/renderer/utils/contextUsage.ts — Formula fix + accumulated detection
  • src/main/parsers/usage-aggregator.ts — Same formula fix (main process copy)
  • src/renderer/App.tsx — Simplified onUsage handler, skip update when null
  • src/renderer/components/MainPanel.tsx — Derive from preserved percentage when accumulated
  • src/main/process-listeners/usage-listener.ts — Group chat accumulated value handling

Tests (6):

  • contextUsage.test.ts — Updated assertions for new formula, added accumulated detection tests
  • usage-aggregator.test.ts — Same pattern, added accumulated detection tests
  • usage-listener.test.ts — Updated fallback test for 200K default window
  • MainPanel.test.tsx — Updated cap test to use preserved session.contextUsage
  • contextExtractor.test.ts — Updated token totals (450→575, 350→475)
  • HistoryDetailModal.test.tsx — Updated display percentage (10%→12%)

Testing

  • 16,316 tests passing, 0 failures
  • TypeScript clean (all 3 configs)
  • ESLint: 0 errors (37 pre-existing warnings, none from this change)
  • Full production build passes (prompts, main, preload, renderer, web, cli)

fix: correct context usage calculation to include cacheRead tokens and handle accumulated values

The context formula was excluding cacheReadInputTokens, causing the gauge to
drastically underestimate usage (e.g., 3% when reality was 23%). During
multi-tool turns, accumulated token totals could exceed the context window,
producing false 100% readings and premature compact warnings.

- Include cacheReadInputTokens in the formula (input + cacheRead + cacheCreation)
- Detect accumulated values (total > window) and return null to preserve last valid %
- Skip context updates during accumulated turns instead of displaying inflated values
- Fix MainPanel tooltip deriving from raw tab stats instead of preserved session percentage
- Handle group chat participant/moderator accumulated values with -1 sentinel
@claude

claude bot commented Feb 2, 2026

Pull Request Review: Context Usage Calculation Fix

Summary

This PR fixes critical bugs in the context window usage gauge that were causing both underestimation and false 100% readings. The solution is well-designed and thoroughly tested.

Code Quality: ✅ Excellent

Strengths

  1. Excellent Problem Analysis

    • The PR description provides outstanding clarity on the root cause
    • Clear explanation of why cacheReadInputTokens should be included (it's part of the total input partition, not cumulative)
    • Well-documented understanding of Claude Code's multi-tool turn behavior
  2. Clean Implementation

    • The fix correctly includes all three input token types: input + cacheRead + cacheCreation
    • Smart detection of accumulated values using total > contextWindow (impossible for a single API call)
    • Returns null to signal callers to preserve previous valid percentage instead of guessing
    • Removed unnecessary Math.min(100, ...) caps since the null return handles overflow
  3. Consistent Patterns

    • Same formula applied across renderer (contextUsage.ts) and main process (usage-aggregator.ts)
    • Coordinated handling in App.tsx, MainPanel.tsx, and usage-listener.ts
    • All callers properly handle the null return by preserving previous state
  4. Comprehensive Test Coverage

    • 16,316 tests passing with 0 failures
    • Added specific tests for accumulated value detection
    • Updated all existing tests to reflect the corrected formula
    • Real-world scenario tests (e.g., first turn with system prompt cache)

Potential Issues: Minor Observations

1. Group Chat Sentinel Value (-1)

Location: usage-listener.ts:566, App.tsx:3270

The use of -1 as a sentinel value is functional but could be more type-safe:

// Current approach:
contextUsage: -1  // Magic number

Suggestion: Consider using a discriminated union or explicit undefined:

type UsageUpdate = 
  | { contextUsage: number; tokenCount: number; totalCost: number }
  | { totalCost: number }; // Omit context fields when accumulated

However, this is a minor point and the current approach works fine with clear comments.

2. MainPanel Fallback Calculation

Location: MainPanel.tsx:732-735

The fallback logic derives tokens from the preserved percentage:

const preservedPercentage = activeSession?.contextUsage ?? 0;
return Math.round((preservedPercentage / 100) * activeTabContextWindow);

This is clever but introduces a circular dependency: you're deriving tokens from a percentage that was originally derived from tokens. While this works for display purposes, consider whether this could cause any edge cases where the tooltip shows a value that doesn't quite match reality.

Minor risk: If session.contextUsage was never initialized (null/undefined), this falls back to 0%, which may be misleading during the first accumulated turn.

Suggestion: Could add a check to see if activeSession?.contextUsage is actually defined before using it as a fallback, or at least document this assumption.

Performance: ✅ Excellent

  • No performance concerns identified
  • The totalContextTokens > effectiveContextWindow check is O(1)
  • Skipping updates during accumulated turns actually reduces UI churn

Security: ✅ No Concerns

  • No security implications
  • All calculations are done on trusted internal data
  • No user input validation required

Architecture: ✅ Well-Designed

  1. Separation of Concerns

    • Core calculation logic in utility functions (calculateContextTokens, estimateContextUsage)
    • State management in App.tsx
    • Display logic in MainPanel.tsx
  2. Error Handling

    • Graceful fallback when context window is not provided (uses default 200K)
    • Proper handling of edge cases (zero tokens, missing window)
  3. Maintainability

    • Excellent comments explaining the accumulated values scenario
    • Clear function names and parameter types
    • Consistent pattern across codebase

Testing: ✅ Comprehensive

  1. Unit Tests

    • Tests for normal calculations
    • Tests for accumulated value detection
    • Edge case tests (zero window, very large values)
    • Real-world scenario tests
  2. Integration

    • All 16,316 tests passing
    • TypeScript clean across all configs
    • ESLint clean (37 pre-existing warnings unchanged)

Best Practices: ✅ Followed

✅ DRY principle - shared utility functions
✅ Type safety - proper TypeScript types
✅ Documentation - comprehensive comments
✅ Testing - thorough test coverage
✅ No breaking changes - backward compatible
✅ Performance - no regressions

Recommendations

Must-Have (None)

All critical issues are resolved. This PR is ready to merge.

Nice-to-Have (Optional)

  1. Consider adding a debug log when accumulated values are detected, to help troubleshoot any future issues:

    if (totalContextTokens > effectiveContextWindow) {
      console.debug(`Skipping context update: accumulated values detected (${totalContextTokens} > ${effectiveContextWindow})`);
      return null;
    }
  2. Document the behavior in MainPanel where it derives from session.contextUsage:

    // When accumulated from multi-tool turns, derive tokens from session's preserved percentage.
    // App.tsx skips updating session.contextUsage when accumulated, so it holds
    // the last valid percentage from estimateContextUsage.
    // CAVEAT: If session.contextUsage was never initialized, this falls back to 0.
  3. Consider type-safe sentinel instead of -1 (as mentioned above), but this is purely optional.

Conclusion

Recommendation: ✅ APPROVE

This is an excellent PR that:

  • Correctly identifies and fixes the root cause
  • Implements a clean, maintainable solution
  • Includes comprehensive test coverage
  • Maintains backward compatibility
  • Has clear documentation

The fix properly addresses both the underestimation bug (missing cacheReadInputTokens) and the false 100% bug (accumulated values) with a mathematically sound approach. The "preserve last valid percentage" strategy is much better than trying to estimate or cap accumulated values.

Great work on the thorough problem analysis and implementation! 🎉


Minor suggestions above are optional improvements and should not block merging.

@reachraza reachraza merged commit 5373a5f into main Feb 2, 2026
2 checks passed