
feat: optimization platform — config, tokenizers, LRU cache, sessions, context-delta#163

Merged
ooples merged 28 commits into master from feat/sophisticated-token-counting-issue-4
Apr 20, 2026

Conversation

@ooples
Owner

@ooples ooples commented Apr 20, 2026

Closes #120
Closes #121
Closes #122
Closes #123
Closes #124
Closes #125
Closes #126

Mega-PR bundling all six Gemini-CLI-pattern issues. Written as a single audit-friendly unit; commits are logically separated and a full re-audit of issue acceptance criteria vs. code was performed before the final push.

What's in the box

#120 — Configuration-driven thresholds

  • src/core/types.ts adds OptimizationConfig (incl. cacheSettings and chatCompression sub-sections).
  • src/core/config.ts has zod-validated load, deep-merging of user overrides, a writeDefaults path that creates ~/.token-optimizer/config.json with sane defaults on first run, and getOptimizationConfig() / getModelTokenLimit(name) accessors.
  • hooks/helpers/config.ps1 — PowerShell Import-TokenOptimizerConfig / Get-TokenOptimizerOptimizationConfig / Get-TokenOptimizerModelTokenLimit mirroring the TS shape.
  • src/server/index.ts loads ConfigManager on boot and wires the derived values into SessionManager.defaultMaxTokens and into the LRU memoize layer (see #125).
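As a rough sketch of the config behavior above: user overrides are deep-merged over defaults, and getModelTokenLimit falls back when a model is unknown. The field names come from this PR's description; the exact shapes, default values, and helper names (deepMerge, getModelTokenLimit's fallback) are illustrative assumptions, not the shipped code:

```typescript
// Field names taken from the PR text; shapes and defaults are assumptions.
interface OptimizationConfig {
  chatCompression: { enabled: boolean; tokenLimit: number; strategy: string };
  cacheSettings: { maxSize: number; ttlMs: number };
  modelTokenLimits: Record<string, number>;
}

const DEFAULTS: OptimizationConfig = {
  chatCompression: { enabled: true, tokenLimit: 100_000, strategy: "summarize" },
  cacheSettings: { maxSize: 200, ttlMs: 30 * 60 * 1000 },
  modelTokenLimits: { "gpt-4": 128_000 },
};

// Recursively overlay user values onto defaults; scalars and arrays replace,
// plain objects merge key by key.
function deepMerge(base: any, override: any): any {
  if (typeof base !== "object" || base === null || Array.isArray(base)) {
    return override === undefined ? base : override;
  }
  const out: any = { ...base };
  for (const key of Object.keys(override ?? {})) {
    out[key] = deepMerge(base[key], override[key]);
  }
  return out;
}

function getModelTokenLimit(cfg: OptimizationConfig, model: string, fallback = 8192): number {
  return cfg.modelTokenLimits[model] ?? fallback;
}
```

The real loader additionally runs the merged object through a zod schema and writes defaults to ~/.token-optimizer/config.json on first run.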

#121 — Chat history compression

  • src/core/session.ts — Session.compressHistory() summarizes the head (keeping preserveTailRatio of the tail) via an injected ISummarizer. It now also requires a tokenizer unless allowCharHeuristic: true — the entire point of #124 was to delete the char/4 fallback.
  • src/core/summarization.ts — three production ISummarizer implementations:
    • AnthropicSummarizer (Claude Haiku, x-api-key, 30s timeout)
    • GoogleAISummarizer (Gemini 2.5 Flash, generateContent, 30s timeout)
    • TruncatingSummarizer — zero-network fallback
    • createSummarizerFromEnv() picks the highest-fidelity summarizer whose API key is present.
  • OptimizationConfig.chatCompression { enabled, tokenLimit, strategy } is plumbed through SessionManager.defaultMaxTokens so every new session inherits the configured chat budget.
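The "highest-fidelity summarizer whose API key is present" selection can be sketched as follows. The class names come from the PR; the env-var names, constructor signatures, and the stand-in bodies (the real Anthropic/Google classes make HTTP calls) are assumptions:

```typescript
interface ISummarizer {
  summarize(messages: string[]): Promise<string>;
}

class TruncatingSummarizer implements ISummarizer {
  constructor(private maxChars = 2000) {}
  async summarize(messages: string[]): Promise<string> {
    const joined = messages.join("\n");
    return joined.length <= this.maxChars ? joined : joined.slice(0, this.maxChars);
  }
}

// Stand-ins only: the shipped classes call the Anthropic / Google AI APIs.
class AnthropicSummarizer extends TruncatingSummarizer {}
class GoogleAISummarizer extends TruncatingSummarizer {}

function createSummarizerFromEnv(env: Record<string, string | undefined>): ISummarizer {
  if (env.ANTHROPIC_API_KEY) return new AnthropicSummarizer(); // highest fidelity first
  if (env.GOOGLE_AI_API_KEY) return new GoogleAISummarizer();
  return new TruncatingSummarizer(); // zero-network fallback, always available
}
```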

#122 — Context delta tracking

  • src/utils/diff.ts — calculateDelta / applyDelta (unified-diff, round-trippable).
  • src/core/session.ts + session-manager.ts — Session.clearFileContent, SessionManager.clearFileState, atomic/debounced/gzip-backed persistence (see audit fixes below).
  • src/tools/context-delta-tool.ts — the context_delta MCP tool now routes compute-delta and clear through SessionManager.updateFileState / clearFileState so file-state changes are durable across restarts (the audit caught that they were in-memory-only in v1).
  • hooks/helpers/context-delta.ps1 — PS Get-TokenOptimizerSessionId (persisted at ~/.token-optimizer/current-session-id), Reset-TokenOptimizerSessionId, and Invoke-ContextDelta.
  • Orchestrator Handle-SmartRead calls Invoke-ContextDelta after a successful smart_read so the server's per-session file snapshot stays in sync.
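The contract behind calculateDelta / applyDelta is the round trip: applying a delta to the previous content must reproduce the current content. The shipped implementation emits unified diffs via the `diff` dependency; the toy position-based stand-in below only demonstrates the invariant (and only for same-position edits, not insertions that shift lines):

```typescript
type LineDelta = Array<{ index: number; newLine?: string }>;

// Record each line position where prev and current disagree.
function calculateDelta(prev: string, current: string): LineDelta {
  const a = prev.split("\n");
  const b = current.split("\n");
  const delta: LineDelta = [];
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    if (a[i] !== b[i]) delta.push({ index: i, newLine: b[i] });
  }
  return delta;
}

// Replay the recorded positions; an undefined newLine marks a trailing deletion.
function applyDelta(prev: string, delta: LineDelta): string {
  const lines = prev.split("\n");
  const removed = new Set<number>();
  for (const { index, newLine } of delta) {
    if (newLine === undefined) removed.add(index);
    else lines[index] = newLine;
  }
  return lines.filter((_, i) => !removed.has(i)).join("\n");
}
```

The payoff measured in the test plan is that for small edits the delta is much smaller than re-sending the full file content.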

#123 + #124 — Sophisticated token counting

  • src/core/tokenizers/i-tokenizer.ts — ITokenizer interface.
  • src/core/tokenizers/tiktoken-tokenizer.ts — local tiktoken backend; maps Claude/GPT family to gpt-4.
  • src/core/tokenizers/heuristic-tokenizer.ts — content-aware chars/token ratios (code / json / markdown / text) with an auto-detector.
  • src/core/tokenizers/google-ai-tokenizer.ts — remote Google AI countTokens backend (needed for Gemini models).
  • src/core/tokenizers/tokenizer-factory.ts — picks a backend by model name, caches instances, exposes disposeAll() for shutdown.
  • src/core/token-counter.ts is refactored to delegate to TokenizerFactory (the audit caught that the v1 factory was a parallel system nothing consumed). A sync count() stays available for tiktoken families and a new countAsync() routes remote models through the factory.
  • count_tokens MCP tool accepts an optional modelName parameter and reports the resolved model in its response.
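The factory's routing and per-model instance caching can be sketched as below. ITokenizer, TokenizerFactory, and disposeAll come from the PR; the routing rule shown (a heuristic stand-in with a model-prefix switch) is an assumption — the real code dispatches to tiktoken or the remote Google AI backend:

```typescript
interface ITokenizer {
  countTokens(text: string): Promise<number>;
  free(): void;
}

class HeuristicTokenizer implements ITokenizer {
  constructor(private charsPerToken = 4.0) {}
  async countTokens(text: string): Promise<number> {
    return Math.ceil(text.length / this.charsPerToken);
  }
  free(): void {} // nothing to release for the heuristic backend
}

class TokenizerFactory {
  private static cache = new Map<string, ITokenizer>();

  static create(modelName: string): ITokenizer {
    const cached = this.cache.get(modelName);
    if (cached) return cached;
    // Real code routes gpt-*/claude-* to tiktoken and gemini-* to the
    // Google AI countTokens endpoint; the heuristic stands in here.
    const tokenizer = new HeuristicTokenizer(modelName.startsWith("gemini") ? 3.5 : 4.0);
    this.cache.set(modelName, tokenizer);
    return tokenizer;
  }

  // Hooked up to SIGINT/SIGTERM so encoder resources are released on shutdown.
  static disposeAll(): void {
    for (const t of this.cache.values()) t.free();
    this.cache.clear();
  }
}
```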

#125 — LRU cache + integrations

  • src/utils/lru-cache.ts — generic LruCache<K, V> with maxSize bound, per-entry TTL, O(1) eviction, prune(), and stats().
  • src/utils/lru-memoize.ts — lruMemoize(fn, options) wrapper plus a shared memoRegistry so the server can prune and log stats across every memo in one call.
  • src/server/index.ts memoizes runSmartRead, runSmartGrep, runSmartGlob with sizes / TTLs from optimizationConfig.cacheSettings.
  • A 5-minute setInterval runs memoRegistry.pruneAll() and logs stats on any non-zero eviction. Timer is .unref()-ed and cleared on shutdown.
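The O(1)-eviction trick is that a JavaScript Map iterates in insertion order, so re-inserting on each get keeps the least-recently-used key at the front. A minimal sketch of that core (option names and the stats/prune surface of the shipped class are omitted; TTL handling here is the lazy-expiration variant described above):

```typescript
class LruCache<K, V> {
  private map = new Map<K, { value: V; expiresAt: number }>();
  constructor(private maxSize: number, private ttlMs = Infinity) {}

  get(key: K): V | undefined {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) { // lazy TTL expiration
      this.map.delete(key);
      return undefined;
    }
    this.map.delete(key); // refresh recency:
    this.map.set(key, entry); // re-insert at the tail
    return entry.value;
  }

  set(key: K, value: V): void {
    this.map.delete(key);
    this.map.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    if (this.map.size > this.maxSize) {
      // Map iterates in insertion order, so the first key is least recent.
      this.map.delete(this.map.keys().next().value as K);
    }
  }

  get size(): number {
    return this.map.size;
  }
}
```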

#126 — Gzip compression

  • src/utils/gzip.ts — gzipString / gunzipBuffer, plus saveGzippedFile (atomic tmp + rename, removes stale plaintext siblings) and loadMaybeGzippedFile (prefers .gz, falls back to plaintext for backward compatibility).
  • SessionManager writes sessions via saveGzippedFile and reads via loadMaybeGzippedFile — pre-existing uncompressed sessions.json files still load.
  • hooks/helpers/gzip.ps1 — matching Compress-String / Expand-String / Save-GzippedFile / Read-MaybeGzippedFile PowerShell utilities.
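A round-trip sketch of the gzip utilities, using Node's built-in zlib. gzipString, gunzipBuffer, and saveGzippedFile mirror the names above, but the saveGzippedFile body shown here is a simplified assumption: it keeps only the tmp-write + atomic-rename step and omits the stale-plaintext cleanup the PR describes:

```typescript
import { gzipSync, gunzipSync } from "node:zlib";
import { writeFileSync, renameSync } from "node:fs";

function gzipString(text: string): Buffer {
  return gzipSync(Buffer.from(text, "utf8"));
}

function gunzipBuffer(buf: Buffer): string {
  return gunzipSync(buf).toString("utf8");
}

function saveGzippedFile(gzPath: string, text: string): void {
  const tmp = `${gzPath}.tmp`;
  writeFileSync(tmp, gzipString(text));
  renameSync(tmp, gzPath); // atomic on POSIX: readers never see a partial file
}
```

The tmp + rename pattern is what makes a disk-full or crash mid-write leave the previous session file intact rather than a truncated one.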

Production-hardening fixes from the audit pass

  • SessionManager: atomic persist (tmp + rename), debounced writes (~250 ms), zod-validated load, session TTL eviction (30 days), per-file size cap (10 MB), flush() for clean shutdown, errors logged rather than thrown (disk-full no longer crashes the server).
  • Session: getHistoryTokenCount now throws instead of silently falling back to char/4.
  • context_delta: routes through SessionManager so file-state changes persist.
  • TokenizerFactory: caches instances per model, exposes disposeAll() hooked up to SIGINT/SIGTERM cleanup.
  • SqliteOptimizationStorage: default path is now an absolute ~/.token-optimizer/optimization.db; directory is created on demand.
  • Server shutdown: adds sessionManager.flush(), TokenizerFactory.disposeAll(), optimizationStorage.close(), memo cache clearing.

Tests

57 unit tests in tests/unit/, all green:

| Suite | Tests |
| --- | --- |
| lru-cache.test.ts | 7 |
| lru-memoize.test.ts | 5 |
| tokenizers.test.ts | 7 |
| diff.test.ts | 5 |
| session.test.ts | 9 |
| config.test.ts | 5 |
| gzip.test.ts | 6 |
| summarization.test.ts | 10 |
| (existing) token-counter.test.ts, passes unchanged | 43 |

PowerShell helpers runtime-verified in PS7 (gzip round-trip, config load).

Test plan

  • npx tsc --noEmit clean
  • New unit tests green (57/57)
  • Existing token-counter.test.ts green (43/43)
  • CI: Build, Test (Node 18/20/22), Lint pass
  • Manual: call context_delta via MCP against a seeded session; confirm delta < full content on small edits
  • Manual: verify ~/.token-optimizer/config.json is created on first server start

ooples and others added 16 commits November 2, 2025 19:57
…and #5)

- add lrucache class with ttl support and statistics tracking
- add tokencounter class with google ai api integration
- implement content-type aware token estimation (code/json/markdown/text)
- integrate lru caching for token counts (200 entries, 30min ttl)
- add automatic eviction and periodic cleanup for cache
- initialize global tokencounter singleton with api key from environment

implements issue #4: sophisticated token counting beyond character/4
implements issue #5: lru cache for expensive operations

generated with claude code

co-authored-by: claude <noreply@anthropic.com>
- add type guards to prevent class re-definition errors (CRITICAL)
- fix sha256 resource disposal with proper try/finally
- replace write-log with write-host to fix ordering issue
- fix double-counting in getstats totalcalls calculation
- make model name configurable via google_ai_model env var
- improve api error handling for timeout/network errors
- fix detectcontenttype regex for exact matching

addresses feedback from github copilot and coderabbit reviews
…token-counting-issue-4

# Conflicts:
#	hooks/handlers/token-optimizer-orchestrator.ps1
…, mcp tool shape)

The original implementation of the optimization-storage feature imported
sqlite3/sqlite (not installed) and used a TurnContext tool shape that
does not exist in this codebase, breaking `tsc --noEmit`. This commit:

- Rewrites SqliteOptimizationStorage to use better-sqlite3 (already a
  production dep) with WAL mode and a hash index.
- Rewrites OptimizationStorageTool to match the project's run(options)
  MCP tool pattern and exports an input schema with items-complete
  array fields.
- Fixes compression-engine.ts:compressToBase64 return type (was
  `CompressionResult & { compressed: string }`, producing Buffer & string
  for the overridden property).
- Swaps the server's smart_grep case from the non-existent smartGrep
  instance to the runSmartGrep CLI function.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
In-memory LRU cache for hot paths (token counts, file search results,
MCP correction responses). Separate from CacheEngine, which is the
persistent SQLite cache — LruCache is intended for process-local
memoization with O(1) eviction via Map insertion order.

- maxSize-bounded with LRU eviction
- Optional per-entry TTL with automatic lazy expiration
- prune() for proactive periodic cleanup
- stats() exposes hits / misses / evictions / expired / hitRate

Refs #125

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the long-standing character/4 heuristic and adds vendor-neutral
tokenization:

- ITokenizer interface — async countTokens + free() lifecycle.
- TiktokenTokenizer — wraps the tiktoken encoder; handles Claude family
  by mapping to the gpt-4 encoder (closest publicly available).
- HeuristicTokenizer — content-aware local fallback (code 2.5,
  json 2.8, markdown 3.5, text 4.0 chars/token), auto-detected via
  cheap regex + try-parse-json.
- TokenizerFactory.create(modelName) picks the best backend;
  createFromEnv reads CLAUDE_MODEL / ANTHROPIC_MODEL / OPENAI_MODEL /
  TOKEN_OPTIMIZER_MODEL.

Both tokenizer implementations memoize counts with the generic LruCache
(#125), so repeated inputs do not re-tokenize.

Refs #124, #123

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
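The content-aware ratios above hinge on a cheap content-type detector. A hedged sketch of the regex + try-parse-JSON approach — the detection patterns and check ordering here are illustrative assumptions, not the shipped heuristics:

```typescript
type ContentType = "code" | "json" | "markdown" | "text";

function detectContentType(text: string): ContentType {
  const trimmed = text.trim();
  // JSON: looks like an object/array and actually parses.
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
    try {
      JSON.parse(trimmed);
      return "json";
    } catch {
      /* fall through to the other checks */
    }
  }
  // Markdown: a line starting with a heading, bullet, or numbered-list marker.
  if (/^(#{1,6}\s|[-*]\s|\d+\.\s)/m.test(text)) return "markdown";
  // Code: common keywords or statement punctuation.
  if (/\b(function|const|class|import|def|return)\b|[{};]/.test(text)) return "code";
  return "text";
}

// Ratios quoted in the commit message above.
const CHARS_PER_TOKEN: Record<ContentType, number> = {
  code: 2.5,
  json: 2.8,
  markdown: 3.5,
  text: 4.0,
};

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN[detectContentType(text)]);
}
```

Denser content types (code, JSON) get fewer characters per token, so the heuristic over- rather than under-estimates less often than a flat char/4.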
Extends HypercontextConfig with an optional `optimization` section
mirroring Gemini CLI's settingsSchema:

- compressionTokenThreshold — fraction of context to trigger compression
- compressionPreserveThreshold — fraction to keep uncompressed at tail
- minTokensBeforeCompression — lower bound for optimizer to engage
- modelTokenLimits — per-model context window size
- minOutputSizeBytes / quality — stored-entry gating

ConfigManager now validates the user config file against a zod schema
and falls back to DEFAULT_CONFIG with a descriptive warning instead of
silently accepting malformed JSON. Adds getOptimizationConfig() and
getModelTokenLimit(modelName) accessors.

Refs #120

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…122)

Introduces the session-centric plumbing that the optimization plan has
been sketched against for a while:

- src/core/session.ts — Session class holding history + per-file state,
  with a token-aware compressHistory() that keeps a configurable tail
  fraction and summarizes the head via a pluggable ISummarizer.
- src/core/summarization.ts — ISummarizer interface + a self-contained
  TruncatingSummarizer fallback so the module is usable without an LLM.
- src/core/session-manager.ts — JSON-persisted singleton that auto-
  compresses a session's history when addMessage() pushes it past
  maxTokens.
- src/utils/diff.ts — calculateDelta / applyDelta built on the existing
  `diff` dep (unified-diff, round-trippable).
- src/tools/context-delta-tool.ts — new context_delta MCP tool with
  compute-delta / seed / clear operations and an items-complete input
  schema.
- src/validation/tool-schemas.ts — add OptimizationStorageSchema and
  ContextDeltaSchema so the validator accepts both new tools.
- src/server/index.ts — wire up SessionManager + ContextDeltaTool using
  TokenizerFactory.createFromEnv; persistence at
  ~/.token-optimizer/sessions.json. Remove the previously-duplicated
  inline OPTIMIZATION_STORAGE_TOOL_DEFINITION and import it from the
  tool module.

Refs #121, #122

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anager

33 tests covering eviction/TTL/stats (LruCache), content-type detection
and factory routing (tokenizers), round-trip and mismatched-baseline
behaviors (diff), history compression and snapshot round-trip
(Session/SessionManager), and user-override + validation fallback
(ConfigManager).

Refs #120, #121, #122, #124, #125

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds server-side session management, context-delta and optimization-storage MCP tools, pluggable tokenizers/summarizers, gzip and LRU memoization utilities, config schema validation, SQLite optimization persistence, and shared PowerShell helpers (logging, config, gzip, context-delta); wires new tools into the server and updates PowerShell handlers to dot-source helpers and centralized error handling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **PowerShell hooks & helpers** — hooks/dispatcher.ps1, hooks/handlers/token-optimizer-orchestrator.ps1, hooks/helpers/logging.ps1, hooks/helpers/config.ps1, hooks/helpers/gzip.ps1, hooks/helpers/context-delta.ps1 | Made dispatcher an advanced function ([CmdletBinding()]); replaced local Write-Log with shared Write-Log/Handle-Error (dot-sourced with defensive fallbacks); added PS helpers for config, gzip, and context-delta; orchestrator now dot-sources helpers and centralizes error handling and logging. |
| **Optimization storage (TS)** — src/analytics/optimization-storage.ts, src/tools/optimization-storage-tool.ts | New SQLite-backed SqliteOptimizationStorage (compressed payloads, INSERT OR REPLACE, index) and OptimizationStorageTool MCP wrapper exposing store/retrieve with input validation and a JSON tool contract. |
| **Context delta & diffing** — src/tools/context-delta-tool.ts, src/utils/diff.ts | New ContextDeltaTool (operations: compute-delta/seed/clear) computing unified diffs and size metrics; utility functions calculateDelta and applyDelta. |
| **Session & summarization** — src/core/session.ts, src/core/session-manager.ts, src/core/summarization.ts | New Session model and SessionManager (in-memory with optional gzipped persistence, debounced writes, TTL, token-aware compression); ISummarizer plus TruncatingSummarizer, AnthropicSummarizer, GoogleAISummarizer, and an env-based factory. |
| **Tokenizers & token counting** — src/core/tokenizers/i-tokenizer.ts, src/core/tokenizers/heuristic-tokenizer.ts, src/core/tokenizers/tiktoken-tokenizer.ts, src/core/tokenizers/google-ai-tokenizer.ts, src/core/tokenizers/tokenizer-factory.ts, src/core/token-counter.ts | Added the ITokenizer contract, implementations (heuristic, tiktoken, GoogleAI), TokenizerFactory with caching/disposal, and async token-counting integration in TokenCounter (countAsync). |
| **Compression engine & gzip utils** — src/core/compression-engine.ts, src/utils/gzip.ts, hooks/helpers/gzip.ps1 | Refactored compression API/heuristics; TS gzip utilities for Buffer/string and atomic gzipped file persistence; matching PS gzip helpers for orchestrator use. |
| **Caching & memoization utilities** — src/utils/lru-cache.ts, src/utils/lru-memoize.ts | New generic LruCache with TTL/stats/eviction and lruMemoize with a global memoRegistry, stampede protection, prune/clear APIs, and integration points for memoizing IO-heavy tools. |
| **Server wiring & validation** — src/server/index.ts, src/validation/tool-schemas.ts | Wired OptimizationStorageTool and ContextDeltaTool into the server; instantiated SessionManager, tokenizer, and summarizer wiring; memoized smart-read/grep/glob wrappers; extended tool schemas (optimization_storage, context_delta; count_tokens accepts modelName); updated shutdown cleanup. |
| **Types & config** — src/core/types.ts, src/core/config.ts | Added OptimizationConfig types, extended HypercontextConfig, introduced Zod validation and deep-merge semantics, a ConfigManager write-defaults option, and accessors getOptimizationConfig() / getModelTokenLimit(). |
| **Tests & benchmarks** — tests/unit/*.test.ts, tests/benchmarks/results.json | Added many unit tests (config, session, tokenizers, lru cache, diff, gzip, lru-memoize, summarization), adjusted async teardowns, and updated the benchmark results JSON. |

Sequence Diagram(s)

sequenceDiagram
    participant PS as PowerShell Orchestrator
    participant Tool as ContextDeltaTool
    participant Manager as SessionManager
    participant Session as Session
    participant Diff as DiffUtils

    PS->>Tool: run({operation:'compute-delta', sessionId, filePath, currentContent})
    activate Tool
    Tool->>Manager: getSession(sessionId)
    Manager-->>Tool: Session
    Tool->>Session: getFileContent(filePath)
    Session-->>Tool: previousContent
    alt previousContent exists
        Tool->>Diff: calculateDelta(previousContent, currentContent, filePath)
        Diff-->>Tool: delta
        Tool->>Manager: updateFileState(sessionId, filePath, currentContent)
        Tool-->>PS: {success:true, isBaseline:false, delta, sizes}
    else baseline
        Tool->>Manager: updateFileState(sessionId, filePath, currentContent)
        Tool-->>PS: {success:true, isBaseline:true, delta: currentContent, sizes}
    end
    deactivate Tool
sequenceDiagram
    participant Server as MCP Server
    participant Factory as TokenizerFactory
    participant Tokenizer as ITokenizer
    participant Caller as ToolHandler

    Caller->>Server: request count_tokens(text, modelName?)
    Server->>Factory: create(modelName)
    Factory-->>Server: tokenizer
    Server->>Tokenizer: countTokens(text)
    activate Tokenizer
    Tokenizer-->>Server: tokenCount
    deactivate Tokenizer
    Server-->>Caller: return tokenCount (annotated with model)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Poem

🐰
I hopped through logs and zipped a file,
I cached small crumbs to travel light a mile,
I counted tokens, trimmed the rest,
Sessions nap safe in SQLite nest,
Tiny hops, big saves — a rabbit's optimization smile.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Commit Message Format Issue

Your commit messages don't follow the Conventional Commits specification.

Required Format:

<type>(<optional scope>): <description>

[optional body]

[optional footer]

Valid Types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that don't affect code meaning (white-space, formatting)
  • refactor: Code change that neither fixes a bug nor adds a feature
  • perf: Code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • build: Changes that affect the build system or external dependencies
  • ci: Changes to CI configuration files and scripts
  • chore: Other changes that don't modify src or test files
  • revert: Reverts a previous commit

Examples:

feat(auth): add OAuth2 authentication
fix(api): resolve race condition in token refresh
docs(readme): update installation instructions
refactor(core): simplify token optimization logic

Breaking Changes:

Add BREAKING CHANGE: in the footer or append ! after the type:

feat!: remove deprecated API endpoints

Please amend your commit messages to follow this format.

Learn more: Conventional Commits

@github-actions

Performance Benchmark Results




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 17

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
hooks/handlers/token-optimizer-orchestrator.ps1 (1)

14-31: ⚠️ Potential issue | 🔴 Critical

Bootstrap logging before the first Write-Log call.

Line 15 runs before logging.ps1 is dot-sourced and before the log file path is initialized. When this script is executed directly, the first diagnostic log can fail before the handler starts, and the catch on Line 24 has the same problem.

Suggested fix
 $SCRIPT_VERSION = Get-Date -Format 'yyyyMMdd.HHmmss'
-Write-Log "token-optimizer-orchestrator.ps1 version $SCRIPT_VERSION loaded. Phase=$Phase, Action=$Action" "DEBUG"
-
-# Read JSON from temp file if provided
-# DO NOT delete temp file - dispatcher will clean it up after all handlers run
-$InputJson = ""
-if ($InputJsonFile -and (Test-Path $InputJsonFile)) {
-    try {
-        $InputJson = Get-Content -Path $InputJsonFile -Raw -Encoding UTF8
-    } catch {
-        Write-Log "Failed to read InputJsonFile: $($_.Exception.Message)" "ERROR"
-    }
-}
-
 $HELPERS_DIR = "C:\Users\cheat\.claude-global\hooks\helpers"
 $INVOKE_MCP = "$HELPERS_DIR\invoke-mcp.ps1"
 . "$PSScriptRoot\..\helpers\logging.ps1"
-$LOG_FILE = "C:\Users\cheat\.claude-global\hooks\logs\token-optimizer-orchestrator.log"
+$script:LOG_FILE = "C:\Users\cheat\.claude-global\hooks\logs\token-optimizer-orchestrator.log"
+Write-Log "token-optimizer-orchestrator.ps1 version $SCRIPT_VERSION loaded. Phase=$Phase, Action=$Action" "DEBUG"
+
+# Read JSON from temp file if provided
+# DO NOT delete temp file - dispatcher will clean it up after all handlers run
+$InputJson = ""
+if ($InputJsonFile -and (Test-Path $InputJsonFile)) {
+    try {
+        $InputJson = Get-Content -Path $InputJsonFile -Raw -Encoding UTF8
+    } catch {
+        Write-Log "Failed to read InputJsonFile: $($_.Exception.Message)" "ERROR"
+    }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/handlers/token-optimizer-orchestrator.ps1` around lines 14 - 31, The
initial Write-Log runs before logging.ps1 is dot-sourced and $LOG_FILE is set;
move the logging bootstrap (set $HELPERS_DIR, $LOG_FILE and dot-source
"$PSScriptRoot\..\helpers\logging.ps1") to the top of
token-optimizer-orchestrator.ps1 before any Write-Log or Get-Content error
handling (e.g., before the $SCRIPT_VERSION Write-Log and the try/catch that
reads $InputJsonFile), so Write-Log is defined and $LOG_FILE initialized;
alternatively add a minimal fallback Write-Log stub early to avoid failures if
logging.ps1 is unavailable. Ensure references to $HELPERS_DIR, $INVOKE_MCP,
$LOG_FILE and the dot-sourcing of logging.ps1 are preserved and adjusted to use
$PSScriptRoot consistently.
🟡 Minor comments (5)
tests/benchmarks/results.json-72-72 (1)

72-72: ⚠️ Potential issue | 🟡 Minor

Clarify memoryUsed semantics—rename to memoryDelta or document delta calculation.

memoryUsed values are calculated as heap memory delta (memAfter - memBefore in performance.bench.ts:89), making negative values valid when operations release memory. However, the field name implies absolute memory usage, creating ambiguity for downstream consumers. Either rename to memoryDelta or add inline documentation to the type definition clarifying it represents heap memory change, not absolute usage.

Also applies to: 84-84, 120-120, 228-228, 300-300, 312-312

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/benchmarks/results.json` at line 72, The JSON field memoryUsed actually
stores a heap delta computed as memAfter - memBefore (see performance.bench.ts
where memAfter and memBefore are used), which makes negative values valid;
update the schema to make this explicit by renaming memoryUsed to memoryDelta
across the codebase (including the type/interface that defines the benchmark
result and all consumers) or, if renaming is undesirable, add an inline doc
comment on the benchmark result type (e.g., the BenchmarkResult/metrics type
that declares memoryUsed) stating it represents heap memory change (memAfter -
memBefore) rather than absolute memory, and update any code that reads/writes
memoryUsed to reflect the new name or documented semantics.
hooks/helpers/logging.ps1-35-38 (1)

35-38: ⚠️ Potential issue | 🟡 Minor

Avoid assigning to PowerShell's $StackTrace automatic variable.

PowerShell variable names are case-insensitive, so $stackTrace can stomp the built-in $StackTrace automatic variable and make later diagnostics harder to trust. Use a different local name here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/helpers/logging.ps1` around lines 35 - 38, The local variable
$stackTrace in the logging helper shadows PowerShell's automatic $StackTrace
(case-insensitive); rename the local to something like $exceptionStackTrace (or
$exStackTrace) and update the assignment and subsequent Write-Log call to use
that new name, leaving $errorMessage, $Message, $Exception and Write-Log
unchanged so you no longer stomp the automatic variable.
src/core/summarization.ts-25-47 (1)

25-47: ⚠️ Potential issue | 🟡 Minor

Guard tiny maxChars values before computing head/tail budgets.

When maxChars is 0, negative, or smaller than the truncation marker, keepTail becomes negative and the final slice() can return more text instead of less. That breaks the contract of using this class to cap summary size.

💡 Proposed fix
 export class TruncatingSummarizer implements ISummarizer {
+    private static readonly TRUNCATION_MARKER = '\n... [truncated] ...\n';
     private readonly maxChars: number;
 
     constructor(options: TruncatingSummarizerOptions = {}) {
-        this.maxChars = options.maxChars ?? 2000;
+        this.maxChars = Math.max(1, options.maxChars ?? 2000);
     }
@@
-        const keepHead = Math.floor(this.maxChars * 0.4);
-        const keepTail = this.maxChars - keepHead - 20;
+        if (this.maxChars <= TruncatingSummarizer.TRUNCATION_MARKER.length) {
+            return TruncatingSummarizer.TRUNCATION_MARKER.slice(0, this.maxChars);
+        }
+
+        const budget = this.maxChars - TruncatingSummarizer.TRUNCATION_MARKER.length;
+        const keepHead = Math.floor(budget * 0.4);
+        const keepTail = budget - keepHead;
         return (
             joined.slice(0, keepHead) +
-            '\n... [truncated] ...\n' +
+            TruncatingSummarizer.TRUNCATION_MARKER +
             joined.slice(-keepTail)
         );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/summarization.ts` around lines 25 - 47, The summarize method can
produce negative keepTail when this.maxChars is 0 or very small; update the
constructor or summarize to guard against tiny maxChars by enforcing a minimum
(e.g., at least the length of the truncation marker + 1) or by clamping computed
budgets: compute keepHead = Math.max(0, Math.floor(this.maxChars * 0.4)) and
keepTail = Math.max(0, this.maxChars - keepHead - truncationMarker.length) (use
the actual marker string used in summarize), and if keepHead + keepTail <
this.maxChars then fall back to returning joined.slice(0, this.maxChars) or a
safe truncated string; adjust the logic in the
constructor/TruncatingSummarizerOptions and in summarize (referencing
constructor, this.maxChars, summarize, keepHead, keepTail, and the truncation
marker) so slices never receive negative indexes and the result respects the
maxChars cap.
src/core/session-manager.ts-138-143 (1)

138-143: ⚠️ Potential issue | 🟡 Minor

Session restore currently loses createdAt.

SessionSnapshot persists createdAt, but the Session.fromSnapshot() path used here only reapplies history, fileState, and updatedAt from src/core/session.ts. After a restart every loaded session looks newly created, so persistence is not full-fidelity yet.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/session-manager.ts` around lines 138 - 143, The restore loop is
losing the persisted createdAt because Session.fromSnapshot currently only
reapplies history, fileState, and updatedAt; update the restoration so createdAt
is preserved: modify Session.fromSnapshot (or immediately after calling it in
the loop) to set session.createdAt = snapshot.createdAt (or accept createdAt in
the Session.fromSnapshot signature and assign it when constructing the Session)
before calling this.sessions.set(session.id, session) so restored sessions keep
their original creation timestamp.
src/core/session.ts-60-60 (1)

60-60: ⚠️ Potential issue | 🟡 Minor

Clamp preserveTailRatio to a safe range.

Line 60 accepts arbitrary values. Ratios above 1 can make compression a no-op even when token limits are exceeded.

Proposed fix
-        this.preserveTailRatio = options.preserveTailRatio ?? DEFAULT_PRESERVE_TAIL_RATIO;
+        const ratio = options.preserveTailRatio ?? DEFAULT_PRESERVE_TAIL_RATIO;
+        this.preserveTailRatio = Math.min(1, Math.max(0, ratio));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/session.ts` at line 60, The assignment to this.preserveTailRatio
allows values >1 or <0 which can break compression; clamp the value to a safe
range (0..1) before storing it. In the Session constructor (the line setting
this.preserveTailRatio), replace the direct nullish-coalescing assignment with
logic that takes options.preserveTailRatio ?? DEFAULT_PRESERVE_TAIL_RATIO, then
applies a clamp (e.g., Math.max(0, Math.min(1, value))) and assign that clamped
value to this.preserveTailRatio so it never falls outside [0,1].
🧹 Nitpick comments (5)
tests/benchmarks/results.json (1)

1-314: Add run metadata to make benchmark snapshots comparable.

The numeric snapshot is useful, but reproducibility is weak without environment metadata (runtime version, OS/arch, sample size/warmups, timestamp/commit). Consider storing metadata alongside these results to reduce false regression signals.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/benchmarks/results.json` around lines 1 - 314, Add a top-level metadata
object to tests/benchmarks/results.json so snapshots are comparable: wrap the
existing array into an object with keys "metadata" and "results" (or add a
sibling "metadata" field) and populate metadata with runtime info (runtime
name/version, OS, architecture, CPU count, total memory), benchmark parameters
(sampleSize, warmups, concurrency), VCS info (commit hash, branch), and a
timestamp; ensure consumers that read the old array still find data by keeping
"results" as the exact existing array and update any parsers to read results via
the "results" key.
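The suggested wrapper could look like the sketch below. The names (`BenchmarkSnapshot`, `memoryDelta`, `wrapResults`) are illustrative stand-ins, not the actual types used by performance.bench.ts:

```typescript
// Hedged sketch of a metadata-wrapped benchmark snapshot.
import * as os from 'node:os';

interface BenchmarkResult {
  name: string;
  opsPerSec: number;
  memoryDelta: number; // heap change (memAfter - memBefore); may be negative
}

interface BenchmarkSnapshot {
  metadata: {
    runtime: string;
    platform: string;
    arch: string;
    cpus: number;
    timestamp: string;
    commit?: string;
  };
  results: BenchmarkResult[]; // the pre-existing array, unchanged
}

function wrapResults(results: BenchmarkResult[], commit?: string): BenchmarkSnapshot {
  return {
    metadata: {
      runtime: `node ${process.version}`,
      platform: os.platform(),
      arch: os.arch(),
      cpus: os.cpus().length,
      timestamp: new Date().toISOString(),
      commit,
    },
    results,
  };
}
```

Keeping the old array verbatim under `results` preserves backward compatibility for any consumer updated to read that key.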
tests/unit/cache-engine.test.ts (1)

51-69: Prefer deterministic teardown over a fixed 100 ms sleep.

This delay makes the suite slower and can still flake on busy CI runners. A small retry loop around deleting the DB/WAL/SHM files, or reusing cleanupCacheArtifacts(), will be more reliable than guessing when SQLite has released its handles.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/cache-engine.test.ts` around lines 51 - 69, Teardown uses a fixed
100ms sleep after cache.close() which can be flaky; replace it with a
deterministic retry that waits for file handles to be released or call the
existing cleanup helper instead. In the afterEach block (where cache.close(),
testDbPath, walPath, shmPath are used) remove the setTimeout and implement a
short retry loop (e.g., attempt up to N times with small awaits) that checks
fs.existsSync(testDbPath/wal/shm) and only unlinkSync when files are absent/
deletable, or simply invoke cleanupCacheArtifacts() if available to perform the
safe deletion; ensure cache.close() is awaited or its close completion is
observed before starting retries.
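One possible shape for that retry loop (`removeWithRetry` is a hypothetical helper, not an existing test utility):

```typescript
// Deterministic teardown sketch: retry deleting the SQLite db/WAL/SHM files
// until their handles are released, instead of sleeping a fixed 100 ms.
import * as fs from 'node:fs';
import { setTimeout as delay } from 'node:timers/promises';

async function removeWithRetry(paths: string[], attempts = 10, waitMs = 10): Promise<void> {
  for (const p of paths) {
    for (let i = 0; i < attempts; i++) {
      try {
        if (fs.existsSync(p)) fs.unlinkSync(p);
        break; // file is gone (or was never there) — next path
      } catch (err) {
        if (i === attempts - 1) throw err; // still locked after all attempts
        await delay(waitMs); // back off briefly and retry
      }
    }
  }
}
```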
tests/unit/config.test.ts (1)

35-52: Assert that default model limits survive user overrides.

This test only checks the added custom-model. If config merging replaced modelTokenLimits wholesale instead of merging it, the built-in limits would disappear and this test would still pass.

Suggested assertion
     expect(opt.compressionTokenThreshold).toBe(0.9);
     expect(opt.quality).toBe('max');
     expect(mgr.getModelTokenLimit('custom-model')).toBe(500000);
+    expect(mgr.getModelTokenLimit('gpt-4')).toBe(128000);
     // Unrelated defaults still filled in
     expect(opt.compressionPreserveThreshold).toBe(0.3);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/config.test.ts` around lines 35 - 52, Update the "overrides
defaults with user config" test to assert that existing built-in model token
limits are preserved when user config adds entries: after creating ConfigManager
and verifying the custom-model value, add an assertion using
mgr.getModelTokenLimit for a known built-in model (e.g., 'gpt-4' or whatever
default map contains) to ensure it still returns the default limit; this
verifies modelTokenLimits are merged rather than replaced.
tests/unit/lru-cache.test.ts (1)

47-64: Use fake timers for the TTL cases.

The 20ms/30ms sleeps make these assertions timing-sensitive on busy CI runners. Advancing Jest's fake clock would make expiry and prune behavior deterministic.

Suggested direction
-import { describe, it, expect } from '@jest/globals';
+import { describe, it, expect, jest, afterEach } from '@jest/globals';
 import { LruCache } from '../../src/utils/lru-cache.js';

 describe('LruCache', () => {
+  afterEach(() => {
+    jest.useRealTimers();
+  });
+
   it('rejects non-positive maxSize', () => {
     expect(() => new LruCache<string, number>(0)).toThrow();
     expect(() => new LruCache<string, number>(-1)).toThrow();
@@
-  it('expires entries past the TTL', async () => {
+  it('expires entries past the TTL', () => {
+    jest.useFakeTimers();
     const cache = new LruCache<string, number>(2, 20);
     cache.set('a', 1);
-    await new Promise((r) => setTimeout(r, 30));
+    jest.advanceTimersByTime(30);
     expect(cache.get('a')).toBeUndefined();
     expect(cache.stats().expired).toBe(1);
   });

-  it('prune removes expired entries', async () => {
+  it('prune removes expired entries', () => {
+    jest.useFakeTimers();
     const cache = new LruCache<string, number>(4, 20);
     cache.set('a', 1);
     cache.set('b', 2);
-    await new Promise((r) => setTimeout(r, 30));
+    jest.advanceTimersByTime(30);
     cache.set('c', 3);
     const removed = cache.prune();
     expect(removed).toBe(2);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/lru-cache.test.ts` around lines 47 - 64, Replace real-time sleeps
in the two TTL tests with Jest fake timers: call jest.useFakeTimers() before
creating the LruCache instances, advance the clock by the desired milliseconds
(e.g., jest.advanceTimersByTime(30)) instead of awaiting setTimeout, and restore
timers after the test; update the 'expires entries past the TTL' and 'prune
removes expired entries' tests that reference LruCache to use
jest.advanceTimersByTime so expiry and prune() behavior is deterministic on CI.
tests/unit/session.test.ts (1)

52-76: Add a persistence round-trip case for SessionManager.

These tests only cover the in-memory lifecycle. Since session persistence across restarts is one of the new behaviors, a bug in save/load or snapshot serialization would go undetected by this suite.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/session.test.ts` around lines 52 - 76, Add a unit test that
exercises SessionManager persistence: create a SessionManager, call
createSession and addMessage (using createSession, addMessage), then persist the
state using the SessionManager persistence API (e.g., snapshot/save/serialize
method provided by the class), instantiate a new SessionManager and
restore/load/deserialize that persisted snapshot, and finally assert the
restored manager.getSession(session.id) returns a session with the same metadata
and that session.getHistory() preserves the messages (including any
compression/summary behavior); this will catch save/load or snapshot
serialization bugs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hooks/dispatcher.ps1`:
- Line 11: The dispatcher currently dot-sources the shared helper (.
"$PSScriptRoot\helpers\logging.ps1") before the protected try path, making that
file a single point of failure; move that dot-source into the existing protected
try block (or the safe startup/initialization section) and guard it with
Test-Path and a Try/Catch so a missing or malformed hooks/helpers/logging.ps1
does not abort execution—if the file is absent or errors, fallback to minimal
no-op logging functions or a lightweight inline logger so the dispatcher can
continue to the non-blocking error path.

In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Around line 1950-1956: The cache-hit branch is reading properties directly off
$retrieveResult but OptimizationStorageTool.retrieve() returns an object with {
success, result }, so update the handling to pull values from
$retrieveResult.result (e.g. use $retrieveResult.result.optimizedText,
$retrieveResult.result.optimizedTokens, $retrieveResult.result.tokensSaved)
before decoding and computing stats; ensure you null-check
$retrieveResult.result and preserve existing variable names ($optimizedText,
$afterTokens, $saved, $percent) so downstream code (and the logging call using
$originalTextHash) continues to work.

In `@hooks/helpers/logging.ps1`:
- Around line 22-25: The logging helper writes $logMessage to $script:LOG_FILE
but doesn't ensure the parent directory exists; before the first Out-File call
(e.g., in the function or script that defines $timestamp/$logMessage), compute
the parent directory with Split-Path $script:LOG_FILE -Parent and create it if
missing (Test-Path / New-Item -ItemType Directory -Force) so the initial write
never fails; update the write path in hooks/helpers/logging.ps1 (references:
$script:LOG_FILE, $logMessage, Out-File) to perform this check/create once
before any Out-File is executed.

In `@src/analytics/optimization-storage.ts`:
- Around line 50-60: The INSERT is hard-coding 'brotli' instead of using the
algorithm returned by CompressionEngine.compress(), and the corresponding reader
(get()) ignores compression_algorithm; update the write path (where compressed =
this.compressionEngine.compress(entry.optimizedText) is used) to persist
compressed.algorithm (or compressed.algorithmName) into the
compression_algorithm column instead of the literal 'brotli', and update the
read path in get() to SELECT compression_algorithm and pass that algorithm value
into CompressionEngine.decompress()/decompress call so decompression uses the
stored algorithm; apply the same change to the other insert block referenced
around lines 69-88 so all writes/readers consistently store and use the actual
compression_algorithm.

In `@src/core/compression-engine.ts`:
- Around line 43-47: The decompress method currently assumes Brotli and throws
on legacy/plaintext or malformed blobs; update decompress (in
compression-engine.ts) to attempt brotliDecompressSync(buffer) inside a
try/catch and if it throws fall back to returning buffer.toString('utf8') (or
decode as-is) so plaintext rows are returned instead of raising; reference the
decompress function and brotliDecompressSync when making this change.
- Around line 73-83: getCompressionStats currently hardcodes the 500-byte
threshold when setting the `recommended` flag, causing inconsistency with
caller-driven thresholds like `shouldCompress(text, minSize)`; change
`getCompressionStats(text)` to accept an optional `minSize: number` (default
500) and compute `recommended` using that parameter (e.g., `result.originalSize
>= minSize && result.percentSaved >= 20`), then update callers (including the
CompressionModule usage) to pass their configured threshold (e.g., the same
`minSize` used with `shouldCompress`) so both methods use the same threshold
logic.
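The decompress fallback described above could be sketched as follows, assuming Node's built-in Brotli API; the actual method shape in compression-engine.ts may differ:

```typescript
// brotliCompressSync is imported only for the round-trip example below.
import { brotliCompressSync, brotliDecompressSync } from 'node:zlib';

function decompress(buffer: Buffer): string {
  try {
    return brotliDecompressSync(buffer).toString('utf8');
  } catch {
    // Legacy rows were stored as plaintext — return them as-is instead of throwing.
    return buffer.toString('utf8');
  }
}

// Round-trip example:
// decompress(brotliCompressSync(Buffer.from('hello'))) returns 'hello'
```

One caveat worth noting: Brotli streams have no magic header, so in rare cases a plaintext blob could decode "successfully" as Brotli; storing the algorithm per row (as suggested for optimization-storage.ts) is the more robust long-term fix.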

In `@src/core/config.ts`:
- Around line 172-173: getOptimizationConfig currently returns the live
OptimizationConfig (possibly the shared DEFAULT_OPTIMIZATION), which allows
callers to mutate internal state (e.g., modelTokenLimits) and leak into other
ConfigManager instances; change getOptimizationConfig to return a defensive
copy: create and return a new OptimizationConfig object (shallow clone or deep
clone as appropriate for fields like modelTokenLimits) instead of returning
this.config.optimization or DEFAULT_OPTIMIZATION directly so callers cannot
mutate the manager's internal/default instance.
- Line 168: The current shallow merge of optimization (optimization: {
...DEFAULT_OPTIMIZATION, ...(user.optimization ?? {}) }) allows
user.optimization.modelTokenLimits to completely replace
DEFAULT_OPTIMIZATION.modelTokenLimits; change the merge so modelTokenLimits is
merged deeply: keep the top-level spread but explicitly set modelTokenLimits to
a merged object that spreads DEFAULT_OPTIMIZATION.modelTokenLimits and then
spreads user.optimization?.modelTokenLimits (or {}), ensuring custom entries are
added/override while preserving defaults; update the code around the
optimization assignment in src/core/config.ts to implement this deep-merge for
modelTokenLimits.
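The nested merge described above could be sketched like this (`OptConfig` is a simplified stand-in for the real OptimizationConfig in src/core/types.ts):

```typescript
interface OptConfig {
  quality: string;
  modelTokenLimits: Record<string, number>;
}

const DEFAULT_OPTIMIZATION: OptConfig = {
  quality: 'balanced',
  modelTokenLimits: { 'gpt-4': 128000 },
};

function mergeOptimization(user?: Partial<OptConfig>): OptConfig {
  return {
    ...DEFAULT_OPTIMIZATION,
    ...(user ?? {}),
    // Explicit nested spread: user entries add/override without wiping defaults.
    modelTokenLimits: {
      ...DEFAULT_OPTIMIZATION.modelTokenLimits,
      ...(user?.modelTokenLimits ?? {}),
    },
  };
}
```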

In `@src/core/session.ts`:
- Around line 157-169: fromSnapshot currently restores id, maxTokens, history,
fileState, and updatedAt but omits restoring createdAt from the SessionSnapshot;
update the Session.restore path in fromSnapshot to assign session.createdAt =
snapshot.createdAt (after constructing the Session) so the persisted createdAt
metadata is preserved when recreating a Session from a snapshot.
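A sketch of fromSnapshot preserving createdAt; the fields shown are a simplification of the real Session in src/core/session.ts:

```typescript
interface SessionSnapshot {
  id: string;
  createdAt: number;
  updatedAt: number;
}

class Session {
  createdAt = Date.now();
  updatedAt = Date.now();

  constructor(public readonly id: string) {}

  static fromSnapshot(snapshot: SessionSnapshot): Session {
    const session = new Session(snapshot.id);
    session.createdAt = snapshot.createdAt; // previously dropped on restore
    session.updatedAt = snapshot.updatedAt;
    return session;
  }
}
```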

In `@src/core/tokenizers/heuristic-tokenizer.ts`:
- Around line 45-53: The method HeuristicTokenizer.countTokens currently uses
the full text as the cache key, which keeps entire inputs in memory; change it
to use a bounded cache key instead (e.g., derive a fixed-size key by hashing the
text with a stable digest like SHA-256 or truncating to a MAX_CACHE_KEY_LENGTH
prefix) before calling this.cache.get/set. Update countTokens to compute the
bounded key (use a new constant MAX_CACHE_KEY_LENGTH or a digest util), call
HeuristicTokenizer.detectContentType(text) and CHARS_PER_TOKEN as before, and
then store/retrieve counts using the bounded key rather than the raw text to
avoid retaining full input strings in the LRU.
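A minimal sketch of the bounded-key idea (`cacheKey` and `MAX_INLINE_KEY_LENGTH` are hypothetical names, not part of the codebase):

```typescript
import { createHash } from 'node:crypto';

const MAX_INLINE_KEY_LENGTH = 256;

function cacheKey(text: string): string {
  // Short inputs can serve as their own key; longer ones collapse to a
  // fixed-size SHA-256 digest so the LRU never retains full documents.
  if (text.length <= MAX_INLINE_KEY_LENGTH) return text;
  return createHash('sha256').update(text).digest('hex');
}
```

The same fingerprint approach applies to the tiktoken tokenizer cache below.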

In `@src/core/tokenizers/tiktoken-tokenizer.ts`:
- Around line 22-28: The cache currently uses the full text string as the key in
countTokens, causing large payloads to be retained; replace the raw-text key
with a compact deterministic fingerprint (e.g., a SHA-256 or xxhash digest) of
text before calling this.cache.get/set so only the small hash is stored; update
countTokens to compute the fingerprint (for very large texts you can hash a
trimmed prefix+suffix if desired) and use that fingerprint as the cache key
while leaving the encoder.encode and cache.set/count logic unchanged.

In `@src/core/tokenizers/tokenizer-factory.ts`:
- Around line 25-30: The model selection sets modelName by checking
CLAUDE_MODEL, ANTHROPIC_MODEL, OPENAI_MODEL before TOKEN_OPTIMIZER_MODEL, so
TOKEN_OPTIMIZER_MODEL can't override broader env vars; change the precedence in
the modelName assignment to check process.env.TOKEN_OPTIMIZER_MODEL first (i.e.,
return process.env.TOKEN_OPTIMIZER_MODEL || process.env.CLAUDE_MODEL ||
process.env.ANTHROPIC_MODEL || process.env.OPENAI_MODEL || 'gpt-4') so the
tokenizer factory uses the tool-specific override when present.

In `@src/tools/context-delta-tool.ts`:
- Around line 71-77: The compute-delta code is updating only in-memory state via
session.setFileContent(filePath, currentContent), which skips persistence;
replace that call with the session manager's persistence API by invoking
this.sessionManager.updateFileState(sessionId, filePath, currentContent) (or
call it in addition to session.setFileContent if in-memory must be updated
first), and handle its return/await if it's async so the persisted file state is
guaranteed before continuing. Ensure you reference the existing
sessionManager.getSession(...) usage and remove or retain session.setFileContent
only as needed to avoid duplicate/unsynced updates.
- Around line 114-123: The current clear method in context-delta-tool.ts
(private clear(options: ContextDeltaOptions): ContextDeltaResponse) incorrectly
resets a file by setting empty content via
session.setFileContent(options.filePath, '') which leaves a `previous` baseline
and skips persistence; instead add a new session-manager API
clearFileState(sessionId: string, filePath: string) that deletes the file's
baseline state and persists the session, then update this clear method to call
this new sessionManager.clearFileState(options.sessionId, options.filePath)
after validating the session (using sessionManager.getSession), returning the
same success/error response; reference the symbols sessionManager.getSession,
ContextDeltaOptions, ContextDeltaResponse, and the proposed clearFileState to
locate and implement the change.

In `@src/tools/optimization-storage-tool.ts`:
- Around line 104-138: The inputSchema in OPTIMIZATION_STORAGE_TOOL_DEFINITION
makes only operation required so a "store" call can miss required fields; change
inputSchema to be discriminated by operation (e.g., use oneOf or if/then with
property "operation") so that when operation=="store" the schema requires
originalTextHash, optimizedText, originalTokens, optimizedTokens, tokensSaved,
and when operation=="retrieve" it requires originalTextHash only; update the
parallel schema in src/validation/tool-schemas.ts to match the same
discriminated/conditional requirements and keep the enum for operation as
['store','retrieve'].
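An illustrative if/then JSON Schema for the discriminated input; property names follow the prompt above, but this is a sketch, not the tool's actual inputSchema:

```typescript
const optimizationStorageInputSchema = {
  type: 'object',
  properties: {
    operation: { enum: ['store', 'retrieve'] },
    originalTextHash: { type: 'string' },
    optimizedText: { type: 'string' },
    originalTokens: { type: 'number' },
    optimizedTokens: { type: 'number' },
    tokensSaved: { type: 'number' },
  },
  required: ['operation'],
  allOf: [
    {
      // operation == "store" → the full payload is required
      if: { properties: { operation: { const: 'store' } } },
      then: {
        required: ['originalTextHash', 'optimizedText', 'originalTokens', 'optimizedTokens', 'tokensSaved'],
      },
    },
    {
      // operation == "retrieve" → only the hash is required
      if: { properties: { operation: { const: 'retrieve' } } },
      then: { required: ['originalTextHash'] },
    },
  ],
} as const;
```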

In `@src/utils/lru-cache.ts`:
- Around line 105-119: The prune() method in LruCache currently returns early
when defaultTtlMs === 0 which prevents removing entries that have per-entry
TTLs; update LruCache.prune() to always iterate this.cache and check each
entry.expiresAt (not gated by this.defaultTtlMs), deleting entries where
expiresAt !== 0 && Date.now() > expiresAt and incrementing removed and
this.expired accordingly so per-entry TTLs are honored.
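A minimal sketch of the corrected prune loop; `TinyLru` below is a simplified stand-in for the real LruCache, whose internals are assumptions:

```typescript
interface Entry<V> {
  value: V;
  expiresAt: number; // 0 means "never expires"
}

class TinyLru<K, V> {
  private cache = new Map<K, Entry<V>>();
  expired = 0;

  set(key: K, value: V, ttlMs = 0): void {
    this.cache.set(key, { value, expiresAt: ttlMs > 0 ? Date.now() + ttlMs : 0 });
  }

  // Not gated on a default TTL: every entry's own expiresAt is checked.
  prune(): number {
    let removed = 0;
    const now = Date.now();
    for (const [key, entry] of this.cache) {
      if (entry.expiresAt !== 0 && now > entry.expiresAt) {
        this.cache.delete(key);
        removed++;
        this.expired++;
      }
    }
    return removed;
  }

  get size(): number {
    return this.cache.size;
  }
}
```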

In `@src/validation/tool-schemas.ts`:
- Around line 417-424: The schema currently allows { operation: 'store' } but
the store path requires several fields at runtime; change
OptimizationStorageSchema to a discriminated union on operation: use
z.discriminatedUnion('operation', [ z.object({ operation: z.literal('store'),
originalTextHash: z.string(), optimizedText: z.string(), originalTokens:
z.number(), optimizedTokens: z.number(), tokensSaved: z.number() }), z.object({
operation: z.literal('retrieve'), originalTextHash: z.string().optional(),
optimizedText: z.string().optional(), originalTokens: z.number().optional(),
optimizedTokens: z.number().optional(), tokensSaved: z.number().optional() }) ])
so that calls to validateToolArgs fail for missing store fields; update any
references to OptimizationStorageSchema accordingly.

---

Outside diff comments:
In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Around line 14-31: The initial Write-Log runs before logging.ps1 is
dot-sourced and $LOG_FILE is set; move the logging bootstrap (set $HELPERS_DIR,
$LOG_FILE and dot-source "$PSScriptRoot\..\helpers\logging.ps1") to the top of
token-optimizer-orchestrator.ps1 before any Write-Log or Get-Content error
handling (e.g., before the $SCRIPT_VERSION Write-Log and the try/catch that
reads $InputJsonFile), so Write-Log is defined and $LOG_FILE initialized;
alternatively add a minimal fallback Write-Log stub early to avoid failures if
logging.ps1 is unavailable. Ensure references to $HELPERS_DIR, $INVOKE_MCP,
$LOG_FILE and the dot-sourcing of logging.ps1 are preserved and adjusted to use
$PSScriptRoot consistently.

---

Minor comments:
In `@hooks/helpers/logging.ps1`:
- Around line 35-38: The local variable $stackTrace in the logging helper
shadows PowerShell's automatic $StackTrace (case-insensitive); rename the local
to something like $exceptionStackTrace (or $exStackTrace) and update the
assignment and subsequent Write-Log call to use that new name, leaving
$errorMessage, $Message, $Exception and Write-Log unchanged so you no longer
stomp the automatic variable.

In `@src/core/summarization.ts`:
- Around line 25-47: The summarize method can produce negative keepTail when
this.maxChars is 0 or very small; update the constructor or summarize to guard
against tiny maxChars by enforcing a minimum (e.g., at least the length of the
truncation marker + 1) or by clamping computed budgets: compute keepHead =
Math.max(0, Math.floor(this.maxChars * 0.4)) and keepTail = Math.max(0,
this.maxChars - keepHead - truncationMarker.length) (use the actual marker
string used in summarize), and if keepHead + keepTail < this.maxChars then fall
back to returning joined.slice(0, this.maxChars) or a safe truncated string;
adjust the logic in the constructor/TruncatingSummarizerOptions and in summarize
(referencing constructor, this.maxChars, summarize, keepHead, keepTail, and the
truncation marker) so slices never receive negative indexes and the result
respects the maxChars cap.

In `@tests/benchmarks/results.json`:
- Line 72: The JSON field memoryUsed actually stores a heap delta computed as
memAfter - memBefore (see performance.bench.ts where memAfter and memBefore are
used), which makes negative values valid; update the schema to make this
explicit by renaming memoryUsed to memoryDelta across the codebase (including
the type/interface that defines the benchmark result and all consumers) or, if
renaming is undesirable, add an inline doc comment on the benchmark result type
(e.g., the BenchmarkResult/metrics type that declares memoryUsed) stating it
represents heap memory change (memAfter - memBefore) rather than absolute
memory, and update any code that reads/writes memoryUsed to reflect the new name
or documented semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 55f222e3-4695-47ba-bf7d-e7f7d9c0c830

📥 Commits

Reviewing files that changed from the base of the PR and between 4503d0b and 96f6e2b.

📒 Files selected for processing (27)
  • hooks/dispatcher.ps1
  • hooks/handlers/token-optimizer-orchestrator.ps1
  • hooks/helpers/logging.ps1
  • src/analytics/optimization-storage.ts
  • src/core/compression-engine.ts
  • src/core/config.ts
  • src/core/session-manager.ts
  • src/core/session.ts
  • src/core/summarization.ts
  • src/core/tokenizers/heuristic-tokenizer.ts
  • src/core/tokenizers/i-tokenizer.ts
  • src/core/tokenizers/tiktoken-tokenizer.ts
  • src/core/tokenizers/tokenizer-factory.ts
  • src/core/types.ts
  • src/server/index.ts
  • src/tools/context-delta-tool.ts
  • src/tools/optimization-storage-tool.ts
  • src/utils/diff.ts
  • src/utils/lru-cache.ts
  • src/validation/tool-schemas.ts
  • tests/benchmarks/results.json
  • tests/unit/cache-engine.test.ts
  • tests/unit/config.test.ts
  • tests/unit/diff.test.ts
  • tests/unit/lru-cache.test.ts
  • tests/unit/session.test.ts
  • tests/unit/tokenizers.test.ts

Comment thread hooks/dispatcher.ps1 Outdated
Comment thread hooks/handlers/token-optimizer-orchestrator.ps1 Outdated
Comment thread hooks/helpers/logging.ps1 Outdated
Comment thread src/analytics/optimization-storage.ts Outdated
Comment thread src/core/compression-engine.ts
Comment thread src/tools/context-delta-tool.ts Outdated
Comment thread src/tools/context-delta-tool.ts Outdated
Comment thread src/tools/optimization-storage-tool.ts
Comment thread src/utils/lru-cache.ts Outdated
Comment thread src/validation/tool-schemas.ts Outdated
ooples and others added 7 commits April 19, 2026 21:43
…r factory

Addresses the audit gaps:

- SessionManager: atomic persist (tmp + rename), debounced writes
  (~250ms), error-isolated (disk-full no longer crashes the server),
  zod-validated load, session TTL eviction (30d), per-file size cap
  (10 MB), and flush() for clean shutdown.
- Session: getHistoryTokenCount now REQUIRES a tokenizer unless the
  caller opts into the char/4 fallback via allowCharHeuristic — the
  whole point of #124 is removing that heuristic. Added
  clearFileContent.
- context_delta tool: compute-delta and clear now route through
  SessionManager.updateFileState / clearFileState so file-state changes
  are durable across restarts instead of in-memory-only.
- TokenizerFactory: caches instances per model (one native tiktoken
  encoder instead of one per call) and exposes disposeAll() for
  shutdown. Added Gemini/Google routing path.
- GoogleAITokenizer: new — calls Google AI countTokens REST with a
  10s timeout, LRU-memoized, surfaces errors so the factory can pick
  a fallback.
- SqliteOptimizationStorage: default path now an absolute
  ~/.token-optimizer/optimization.db and the directory is created
  on demand. Relative "./optimization.db" was unusable when the MCP
  server launched from an unknown cwd.
- Server shutdown: sessionManager.flush(), TokenizerFactory.disposeAll(),
  and optimizationStorage.close() added to the cleanup pipeline.

Refs #121, #122, #124

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ory (#124)

Completes the tokenization framework requirements:

- TokenCounter now delegates tokenization to the pluggable
  TokenizerFactory; the sync count() path still uses a local tiktoken
  encoder for tiktoken-compatible models (we need raw token arrays for
  truncate()), and a new countAsync() path routes through the factory
  for remote models (Google AI). Deeper callers that used the existing
  sync surface keep working unchanged.
- count_tokens MCP tool accepts an optional modelName parameter and
  returns the resolved model in the response. When omitted, the server-
  configured TokenCounter is used.
- TokenCounter.free() no longer tears down the factory-owned tokenizer
  — the factory owns that lifecycle (#16). Local tiktoken encoders are
  freed as before.

Refs #124

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#120)

Completes the remaining #120 acceptance criteria:

- Adds OptimizationConfig.cacheSettings { maxSize, ttlSeconds } and
  OptimizationConfig.chatCompression { enabled, tokenLimit, strategy }
  — the two sections called out in the issue example config.
- ConfigManager writes ~/.token-optimizer/config.json with DEFAULT_CONFIG
  on first run so the user can edit a real file (was previously
  in-memory-only). writeDefaults can be disabled for tests.
- User config sub-objects deep-merge: overriding `cacheSettings.maxSize`
  no longer wipes out `ttlSeconds`. Zod schema is partial at every
  depth via OptimizationConfigUserSchema.
- server/index.ts: loads ConfigManager on startup and derives the
  SessionManager's defaultMaxTokens from
  chatCompression.tokenLimit ?? modelTokenLimit × compressionTokenThreshold,
  so every Session created through the manager respects the configured
  compression budget.

Refs #120, #121

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes the missing LruCache integration from the audit:

- Adds src/utils/lru-memoize.ts — a generic async-fn wrapper backed by
  LruCache (#125). Each wrapped function registers its cache with a
  shared memoRegistry so callers can prune and snapshot stats across
  every memo in one shot.
- server/index.ts wraps runSmartRead, runSmartGrep, and runSmartGlob
  with lruMemoize, sized from optimizationConfig.cacheSettings
  (maxSize / ttlSeconds × 1000). The case handlers call the memoized
  variant, so repeated tool invocations with identical arguments hit
  the LRU instead of re-running the expensive read/search.
- Periodic cleanup: a 5-minute interval calls memoRegistry.pruneAll()
  and logs stats when anything was removed. The timer is unref'd so it
  never keeps the event loop alive.
- Server cleanup handler clears the interval and the memo caches on
  shutdown.

Refs #125

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the audit gap: the issue requires "use a foundation model to
perform the summarization", the previous ISummarizer default was just
text truncation.

- AnthropicSummarizer — calls /v1/messages with claude-haiku-4-5 by
  default. Needs ANTHROPIC_API_KEY.
- GoogleAISummarizer — calls generativelanguage.googleapis.com with
  gemini-2.5-flash by default. Needs GOOGLE_AI_API_KEY.
- createSummarizerFromEnv() picks the best available summarizer
  (Anthropic → Google → TruncatingSummarizer fallback) so the server
  works unchanged whether or not API keys are configured.
- Both remote summarizers use AbortController with a 30s timeout and
  share a common system prompt that asks for preservation of decisions
  and open TODOs.
- TruncatingSummarizer remains the zero-dep fallback and is used by
  tests to avoid network flakes.

Wired into the server: SessionManager.summarizer is
createSummarizerFromEnv(), so Session.compressHistory uses a real LLM
when a key is present.
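
The selection order can be sketched as below (pickSummarizer is an assumed helper name; the real createSummarizerFromEnv() returns summarizer instances, not labels):

```typescript
type SummarizerKind = "anthropic" | "google" | "truncating";

// Precedence described above: Anthropic key, then Google key, then the
// zero-dep truncating fallback so the server works without any keys.
function pickSummarizer(env: Record<string, string | undefined>): SummarizerKind {
  if (env.ANTHROPIC_API_KEY) return "anthropic"; // claude-haiku-4-5 by default
  if (env.GOOGLE_AI_API_KEY) return "google";    // gemini-2.5-flash by default
  return "truncating";                           // no network, no flakes
}
```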

Refs #121

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fulfills the #126 acceptance criteria on the TypeScript side (the PS
side lands in a follow-up commit):

- src/utils/gzip.ts: gzipString / gunzipBuffer primitives, plus
  saveGzippedFile (atomic tmp + rename, removes stale plaintext) and
  loadMaybeGzippedFile (reads .gz if present, otherwise plaintext so
  sessions.json files written before this change still load — the
  "backward compatibility" bullet from the issue).
- SessionManager persistNow now writes sessions via saveGzippedFile;
  load() uses loadMaybeGzippedFile. Existing checks for the sessions
  file at startup also look for the .gz sibling.
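
A sketch of the atomic-write / backward-compatible-read pair, under the assumption that the real src/utils/gzip.ts has roughly these signatures:

```typescript
import { gzipSync, gunzipSync } from "node:zlib";
import {
  writeFileSync, renameSync, readFileSync, existsSync, rmSync,
} from "node:fs";

// Write <path>.gz atomically: tmp file first, then rename into place,
// then remove any stale plaintext sibling.
function saveGzippedFile(path: string, content: string): void {
  const tmp = `${path}.gz.tmp`;
  writeFileSync(tmp, gzipSync(Buffer.from(content, "utf8")));
  renameSync(tmp, `${path}.gz`); // atomic on the same filesystem
  if (existsSync(path)) rmSync(path);
}

// Prefer <path>.gz; fall back to plaintext so files written before the
// gzip change still load (the backward-compatibility bullet).
function loadMaybeGzippedFile(path: string): string | null {
  if (existsSync(`${path}.gz`)) {
    return gunzipSync(readFileSync(`${path}.gz`)).toString("utf8");
  }
  return existsSync(path) ? readFileSync(path, "utf8") : null;
}
```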

Refs #126

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…126)

Completes the PowerShell-side integration gaps found in the audit:

- hooks/helpers/config.ps1: Import-TokenOptimizerConfig loads
  ~/.token-optimizer/config.json (the same file the TS server reads),
  falls back to defaults, and auto-writes the default file on first
  run. Exposes Get-TokenOptimizerOptimizationConfig and
  Get-TokenOptimizerModelTokenLimit for orchestrator consumers (#120).
- hooks/helpers/gzip.ps1: Compress-String / Expand-String primitives
  and Save-GzippedFile / Read-MaybeGzippedFile that mirror the TS
  src/utils/gzip.ts semantics (atomic tmp+rename, backward-compat
  read of plaintext siblings) (#126).
- hooks/helpers/context-delta.ps1: Get-TokenOptimizerSessionId returns
  a stable per-Claude-session UUID persisted at
  ~/.token-optimizer/current-session-id, Reset-TokenOptimizerSessionId
  clears it, Invoke-ContextDelta wraps the context_delta MCP tool via
  the existing Invoke-TokenOptimizer helper (#122).
- Orchestrator dot-sources the three new helpers and Handle-SmartRead
  now calls Invoke-ContextDelta with the smart_read content after a
  successful read so the server's per-session file snapshot stays in
  sync (#122 Phase 2).

Runtime-verified in PS7 that gzip round-trips the content and the
helpers parse without errors.

Refs #120, #122, #126

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Commit Message Format Issue

Your commit messages don't follow the Conventional Commits specification.

Required Format:

<type>(<optional scope>): <description>

[optional body]

[optional footer]

Valid Types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that don't affect code meaning (white-space, formatting)
  • refactor: Code change that neither fixes a bug nor adds a feature
  • perf: Code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • build: Changes that affect the build system or external dependencies
  • ci: Changes to CI configuration files and scripts
  • chore: Other changes that don't modify src or test files
  • revert: Reverts a previous commit

Examples:

feat(auth): add OAuth2 authentication
fix(api): resolve race condition in token refresh
docs(readme): update installation instructions
refactor(core): simplify token optimization logic

Breaking Changes:

Add BREAKING CHANGE: in the footer or append ! after the type:

feat!: remove deprecated API endpoints

Please amend your commit messages to follow this format.

Learn more: Conventional Commits


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8

♻️ Duplicate comments (3)
src/core/config.ts (2)

230-241: ⚠️ Potential issue | 🟠 Major

mergeConfig() can wipe existing optimization state and drop built-in model limits.

Line 231 hardcodes DEFAULT_OPTIMIZATION instead of merging from defaults.optimization, so update() calls without updates.optimization can reset prior values. Also, Line 232 allows userOpt.modelTokenLimits to replace the entire map.

Suggested fix
-    const userOpt = user.optimization ?? {};
+    const userOpt = user.optimization ?? {};
+    const baseOpt = defaults.optimization ?? DEFAULT_OPTIMIZATION;

     return {
@@
       optimization: {
-        ...DEFAULT_OPTIMIZATION,
+        ...baseOpt,
         ...userOpt,
+        modelTokenLimits: {
+          ...baseOpt.modelTokenLimits,
+          ...(userOpt.modelTokenLimits ?? {}),
+        },
         cacheSettings: {
-          ...DEFAULT_OPTIMIZATION.cacheSettings,
+          ...baseOpt.cacheSettings,
           ...(userOpt.cacheSettings ?? {}),
         },
         chatCompression: {
-          ...DEFAULT_OPTIMIZATION.chatCompression,
+          ...baseOpt.chatCompression,
           ...(userOpt.chatCompression ?? {}),
         },
       },
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/config.ts` around lines 230 - 241, The mergeConfig() logic is
resetting existing optimization state by using the constant DEFAULT_OPTIMIZATION
instead of the current defaults.optimization and by letting
userOpt.modelTokenLimits replace the whole map; change the merge to start from
defaults.optimization (not DEFAULT_OPTIMIZATION) when building the optimization
object and merge modelTokenLimits entries (preserving existing keys and only
overriding keys provided in userOpt.modelTokenLimits) rather than replacing the
entire map; update the code paths that build optimization.cacheSettings and
optimization.chatCompression to continue merging from
defaults.optimization.cacheSettings and defaults.optimization.chatCompression so
update() calls without updates.optimization do not wipe prior values.

245-247: ⚠️ Potential issue | 🟠 Major

Return a defensive copy from getOptimizationConfig().

Line 246 returns a live object reference; caller mutation can leak into internal state (and default-backed instances).

Suggested fix
   public getOptimizationConfig(): OptimizationConfig {
-    return this.config.optimization ?? DEFAULT_OPTIMIZATION;
+    const optimization = this.config.optimization ?? DEFAULT_OPTIMIZATION;
+    return {
+      ...optimization,
+      modelTokenLimits: { ...optimization.modelTokenLimits },
+      cacheSettings: { ...optimization.cacheSettings },
+      chatCompression: { ...optimization.chatCompression },
+    };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/config.ts` around lines 245 - 247, getOptimizationConfig currently
returns a live reference (this.config.optimization or DEFAULT_OPTIMIZATION)
which allows callers to mutate internal or default state; fix by returning a
defensive copy instead: inside getOptimizationConfig create and return a cloned
object of this.config.optimization when present, otherwise return a clone of
DEFAULT_OPTIMIZATION (use structuredClone if available or a shallow copy via
{...} / Object.assign for plain objects, or deep-clone for nested structures) so
callers cannot mutate internal state.
hooks/handlers/token-optimizer-orchestrator.ps1 (1)

1953-1958: ⚠️ Potential issue | 🔴 Critical

Read cached optimization fields from $retrieveResult.result.

optimization_storage returns { success, result }. This branch still reads optimizedText, optimizedTokens, and tokensSaved directly off $retrieveResult, so a real cache hit will decode $null instead of the stored payload.

Suggested fix
-            if ($retrieveResult -and $retrieveResult.success) {
+            if ($retrieveResult -and $retrieveResult.success -and $retrieveResult.result) {
                 Write-Log "Cache HIT for optimization result. Hash: $originalTextHash" "INFO"
-                $optimizedTextBytes = [System.Convert]::FromBase64String($retrieveResult.optimizedText)
+                $optimizedTextBytes = [System.Convert]::FromBase64String($retrieveResult.result.optimizedText)
                 $optimizedText = [System.Text.Encoding]::UTF8.GetString($optimizedTextBytes)
-                $afterTokens = $retrieveResult.optimizedTokens
-                $saved = $retrieveResult.tokensSaved
+                $afterTokens = $retrieveResult.result.optimizedTokens
+                $saved = $retrieveResult.result.tokensSaved
                 $percent = if ($beforeTokens -gt 0) { [math]::Round(($saved / $beforeTokens) * 100, 1) } else { 0 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/handlers/token-optimizer-orchestrator.ps1` around lines 1953 - 1958,
The cache-hit branch reads optimization fields directly from $retrieveResult
instead of from $retrieveResult.result, causing null decodes; update the branch
so optimizedTextBytes is derived from
[System.Convert]::FromBase64String($retrieveResult.result.optimizedText), set
$optimizedText from the decoded bytes, and assign $afterTokens and $saved from
$retrieveResult.result.optimizedTokens and $retrieveResult.result.tokensSaved
respectively (preserve the existing Write-Log using $originalTextHash and ensure
you guard against null/empty optimizedText before decoding).
🧹 Nitpick comments (1)
hooks/helpers/context-delta.ps1 (1)

42-47: Consider adding ShouldProcess support for state-changing function.

Reset-TokenOptimizerSessionId removes a file, which is a state-changing operation. Adding ShouldProcess support would follow PowerShell best practices and allow -WhatIf/-Confirm usage. This is a minor improvement for an internal helper.

Suggested enhancement
+[CmdletBinding(SupportsShouldProcess)]
 function Reset-TokenOptimizerSessionId {
     $script:TokenOptimizerCurrentSessionId = $null
     if (Test-Path $script:TokenOptimizerSessionIdPath) {
-        Remove-Item -Path $script:TokenOptimizerSessionIdPath -Force
+        if ($PSCmdlet.ShouldProcess($script:TokenOptimizerSessionIdPath, 'Remove session ID file')) {
+            Remove-Item -Path $script:TokenOptimizerSessionIdPath -Force
+        }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/helpers/context-delta.ps1` around lines 42 - 47,
Reset-TokenOptimizerSessionId is state-changing and should support
ShouldProcess; add an advanced function attribute like
[CmdletBinding(SupportsShouldProcess=$true)] to Reset-TokenOptimizerSessionId
and wrap the operations in a ShouldProcess check (use $PSCmdlet.ShouldProcess
with a descriptive target/message) so Remove-Item (and optionally clearing
$script:TokenOptimizerCurrentSessionId) only runs when the user confirms or
-WhatIf/-Confirm are honored; reference the function name
Reset-TokenOptimizerSessionId and the session path variable
$script:TokenOptimizerSessionIdPath when implementing the ShouldProcess guard.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Line 15: Move the dot-source that imports the logging helpers so the Write-Log
function is available before any logging calls: locate the dot-sourcing of
hooks/helpers/logging.ps1 (the line that loads Write-Log) and place it at the
top of token-optimizer-orchestrator.ps1 before the first Write-Log invocation
(the call using SCRIPT_VERSION, Phase and Action) and before any other Write-Log
calls (including the later diagnostic log), ensuring Write-Log is defined prior
to use.
- Around line 2251-2265: The current code calls Invoke-ContextDelta with
whatever smart_read returned, which can be a diff patch and would then be
persisted as the new baseline; update the guard so you only call
Invoke-ContextDelta when $result.metadata.isDiff is false (i.e., the returned
content is a full snapshot), e.g. check $result.metadata.isDiff before using
$result.content[0].text and skip the compute-delta call when it is $true so
Invoke-ContextDelta (and the underlying context-delta-tool.ts) never persists a
diff patch as the baseline; reference symbols: $result, $result.metadata.isDiff,
$result.content[0].text, $filePath, and Invoke-ContextDelta.

In `@hooks/helpers/config.ps1`:
- Around line 100-106: Get-TokenOptimizerOptimizationConfig returns the raw user
optimization object from Import-TokenOptimizerConfig instead of merging it with
defaults, which allows partial overrides to drop required fields (e.g.,
modelTokenLimits); fix by performing a deep/field-wise merge of the imported
$config.optimization onto $script:TokenOptimizerDefaultConfig.optimization
(preserving defaults for missing keys) before returning; use or add a helper
merge function (e.g., Merge-TokenOptimizerConfig or a reusable
DeepMerge-Objects) and call it from Get-TokenOptimizerOptimizationConfig so the
result contains all default fields merged with user overrides.

In `@hooks/helpers/gzip.ps1`:
- Around line 74-79: The current Save-GzippedFile flow deletes the existing
$gzPath then moves $tmpPath into place, which can lose the file if the process
crashes between Remove-Item and Move-Item; instead perform an atomic replace:
remove the Remove-Item call and replace the Move-Item pattern with a single
atomic file replace such as calling [System.IO.File]::Replace($tmpPath, $gzPath,
$null) (or use Move-Item -Force if you prefer platform semantics but
[System.IO.File]::Replace is the safest atomic option); update the code around
$tmpPath and $gzPath in Save-GzippedFile to use that replace so there is never a
window where the .gz is missing.

In `@src/tools/context-delta-tool.ts`:
- Around line 89-103: The sizes use string.length (UTF-16 code units) but must
be byte counts; update all places in context-delta-tool.ts that set
originalSize, deltaSize, and bytesSaved (including the baseline return where
delta: currentContent and the return after calculateDelta) to use
Buffer.byteLength(...) on the string values (e.g.,
Buffer.byteLength(currentContent) and Buffer.byteLength(delta)) and compute
bytesSaved as Math.max(0, originalBytes - deltaBytes); reference calculateDelta
and the function that returns the baseline delta to locate the assignments.

In `@src/utils/lru-memoize.ts`:
- Around line 83-91: The memoize implementation in the returned async function
uses cache.get(key) then awaits fn(...args) and sets the result, so concurrent
callers for the same key will each call fn; fix by creating and storing the
pending Promise in the cache immediately (use keyFn, cache, and fn identifiers)
so subsequent callers get the same Promise; await the stored promise, and on
rejection remove the cache entry (or avoid leaving a rejected promise cached)
and on success you may replace the stored Promise with the resolved value if you
want values instead of promises.

In `@tests/unit/config.test.ts`:
- Around line 54-77: The test needs to assert that deep-merge preserves built-in
modelTokenLimits rather than replacing the whole map: in the 'overrides
defaults...' spec update to also check that a built-in default model token limit
is still present after merging by calling mgr.getModelTokenLimit for the
package's default model key (use the same constant/identifier your code uses,
e.g., DEFAULT_MODEL_NAME or the literal default model id used in defaults) and
expecting its default limit value; keep the existing custom-model assertion
(mgr.getModelTokenLimit('custom-model') === 500000) and add the new assertion to
ensure modelTokenLimits are merged, not replaced.

In `@tests/unit/lru-memoize.test.ts`:
- Around line 5-68: Tests only cover sequential calls; add a concurrency test to
assert in-flight deduplication by invoking lruMemoize's memo function
concurrently for the same key and ensuring the wrapped function (fn) is called
only once. Create an async fn that awaits a short delay (e.g., Promise that
resolves after ~20ms) and increments a counter, then call memo(...) multiple
times in parallel via Promise.all([...memo(sameArgs), ...]) and assert calls ===
1 and all results equal; reference lruMemoize, fn, and the memo variable to
locate where to add this test.

---

Duplicate comments:
In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Around line 1953-1958: The cache-hit branch reads optimization fields directly
from $retrieveResult instead of from $retrieveResult.result, causing null
decodes; update the branch so optimizedTextBytes is derived from
[System.Convert]::FromBase64String($retrieveResult.result.optimizedText), set
$optimizedText from the decoded bytes, and assign $afterTokens and $saved from
$retrieveResult.result.optimizedTokens and $retrieveResult.result.tokensSaved
respectively (preserve the existing Write-Log using $originalTextHash and ensure
you guard against null/empty optimizedText before decoding).

In `@src/core/config.ts`:
- Around line 230-241: The mergeConfig() logic is resetting existing
optimization state by using the constant DEFAULT_OPTIMIZATION instead of the
current defaults.optimization and by letting userOpt.modelTokenLimits replace
the whole map; change the merge to start from defaults.optimization (not
DEFAULT_OPTIMIZATION) when building the optimization object and merge
modelTokenLimits entries (preserving existing keys and only overriding keys
provided in userOpt.modelTokenLimits) rather than replacing the entire map;
update the code paths that build optimization.cacheSettings and
optimization.chatCompression to continue merging from
defaults.optimization.cacheSettings and defaults.optimization.chatCompression so
update() calls without updates.optimization do not wipe prior values.
- Around line 245-247: getOptimizationConfig currently returns a live reference
(this.config.optimization or DEFAULT_OPTIMIZATION) which allows callers to
mutate internal or default state; fix by returning a defensive copy instead:
inside getOptimizationConfig create and return a cloned object of
this.config.optimization when present, otherwise return a clone of
DEFAULT_OPTIMIZATION (use structuredClone if available or a shallow copy via
{...} / Object.assign for plain objects, or deep-clone for nested structures) so
callers cannot mutate internal state.

---

Nitpick comments:
In `@hooks/helpers/context-delta.ps1`:
- Around line 42-47: Reset-TokenOptimizerSessionId is state-changing and should
support ShouldProcess; add an advanced function attribute like
[CmdletBinding(SupportsShouldProcess=$true)] to Reset-TokenOptimizerSessionId
and wrap the operations in a ShouldProcess check (use $PSCmdlet.ShouldProcess
with a descriptive target/message) so Remove-Item (and optionally clearing
$script:TokenOptimizerCurrentSessionId) only runs when the user confirms or
-WhatIf/-Confirm are honored; reference the function name
Reset-TokenOptimizerSessionId and the session path variable
$script:TokenOptimizerSessionIdPath when implementing the ShouldProcess guard.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 901efe0b-0ab5-45b6-8f98-5b9f26e022e4

📥 Commits

Reviewing files that changed from the base of the PR and between 96f6e2b and fd006e9.

📒 Files selected for processing (23)
  • hooks/handlers/token-optimizer-orchestrator.ps1
  • hooks/helpers/config.ps1
  • hooks/helpers/context-delta.ps1
  • hooks/helpers/gzip.ps1
  • src/analytics/optimization-storage.ts
  • src/core/config.ts
  • src/core/session-manager.ts
  • src/core/session.ts
  • src/core/summarization.ts
  • src/core/token-counter.ts
  • src/core/tokenizers/google-ai-tokenizer.ts
  • src/core/tokenizers/tokenizer-factory.ts
  • src/core/types.ts
  • src/server/index.ts
  • src/tools/context-delta-tool.ts
  • src/utils/gzip.ts
  • src/utils/lru-memoize.ts
  • src/validation/tool-schemas.ts
  • tests/unit/config.test.ts
  • tests/unit/gzip.test.ts
  • tests/unit/lru-memoize.test.ts
  • tests/unit/session.test.ts
  • tests/unit/summarization.test.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/unit/session.test.ts
  • src/core/types.ts
  • src/analytics/optimization-storage.ts
  • src/core/summarization.ts
  • src/core/session.ts


@github-actions

Performance Benchmark Results


Addresses all 22 unresolved review threads. Grouped logically:

Critical PS defects
- token-optimizer-orchestrator.ps1: move helper dot-sources before the
  first Write-Log call (script version diagnostic + InputJsonFile read
  were calling Write-Log before logging.ps1 was loaded).
- token-optimizer-orchestrator.ps1: optimization_storage retrieve path
  now reads $retrieveResult.result.optimizedText (and mirrors the
  base64 wrapping used on store) instead of the top-level response.
- dispatcher.ps1: load logging.ps1 defensively — missing/broken helper
  falls back to no-op shims instead of killing every hook phase.
- helpers/logging.ps1: create the log directory on demand and swallow
  write failures so logging never becomes the failure mode for callers.
- orchestrator Handle-SmartRead: skip context_delta update when
  smart_read returned a diff payload — persisting a diff as the new
  baseline would compare the next read against the previous patch.

Compression + storage
- compression-engine.ts decompress(): fall back to raw UTF-8 when the
  buffer isn't Brotli so legacy plaintext rows keep working.
- compression-engine.ts: getCompressionStats / shouldCompress share
  a DEFAULT_MIN_SIZE_BYTES knob instead of hard-coding 500 in one
  place and 1000 in callers.
- optimization-storage.ts: persist AND read compression_algorithm;
  decodePayload dispatches per-algorithm with explicit error on
  unknown labels.

Config deep-merge
- config.ts: merge optimization.modelTokenLimits so user overrides
  add to the default map instead of replacing it.
- config.ps1: mirror the deep-merge via a new
  Merge-TokenOptimizerHashtable recursive helper.
- tests/config.test.ts: lock the invariant with a gpt-4 assertion.

Tokenizer hardening
- tiktoken-tokenizer / heuristic-tokenizer / google-ai-tokenizer:
  SHA-256 hash cache keys longer than 256 chars so the LRU stores
  digests, not full prompt text.
- tokenizer-factory.createFromEnv: TOKEN_OPTIMIZER_MODEL has highest
  precedence so users can pin the optimizer model even when
  CLAUDE_MODEL or similar are already set.

Session + context-delta fidelity
- session.ts: fromSnapshot preserves createdAt and updatedAt from the
  persisted snapshot; added SessionOptions.createdAt / .updatedAt
  overrides. Test asserts round-trip.
- context-delta-tool.ts: originalSize / deltaSize / bytesSaved use
  Buffer.byteLength(utf8) so multi-byte content reports honest bytes,
  matching the byte cap that SessionManager.updateFileState enforces.

Schema strictness
- tool-schemas.ts: OptimizationStorageSchema is now a discriminated
  union — store requires hash+text+token fields, retrieve requires
  just hash. Invalid payloads fail in validateToolArgs instead of
  after dispatch.
- optimization-storage-tool.ts: MCP inputSchema mirrors the same
  oneOf shape with additionalProperties:false.
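
The store/retrieve split can be sketched in plain TS as below (the real OptimizationStorageSchema uses zod's discriminatedUnion; the field names here follow the commit description):

```typescript
type StorageArgs =
  | {
      operation: "store";
      hash: string;
      optimizedText: string;
      optimizedTokens: number;
      tokensSaved: number;
    }
  | { operation: "retrieve"; hash: string };

// Validate before dispatch: retrieve needs only hash, store needs the
// full payload, anything else is rejected with an explicit error.
function validateStorageArgs(args: Record<string, unknown>): StorageArgs {
  if (typeof args.hash !== "string") throw new Error("hash is required");
  if (args.operation === "retrieve") {
    return { operation: "retrieve", hash: args.hash };
  }
  if (args.operation === "store") {
    const { optimizedText, optimizedTokens, tokensSaved } = args;
    if (
      typeof optimizedText !== "string" ||
      typeof optimizedTokens !== "number" ||
      typeof tokensSaved !== "number"
    ) {
      throw new Error("store requires optimizedText, optimizedTokens, tokensSaved");
    }
    return { operation: "store", hash: args.hash, optimizedText, optimizedTokens, tokensSaved };
  }
  throw new Error(`unknown operation: ${String(args.operation)}`);
}
```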

Cache utilities
- lru-cache.ts prune(): scan every entry so per-entry TTLs set via
  set(key, val, ttlMs) are cleaned up even when defaultTtlMs is 0.
  Regression test added.
- lru-memoize.ts: deduplicate concurrent calls for the same key with
  an inFlight Map — a stampede while the first promise is pending
  collapses to a single fn() invocation. Concurrency test added.

PS atomic gzip
- helpers/gzip.ps1 Save-GzippedFile: atomic swap via
  File::Move(src, dst, overwrite:true) on .NET 5+, so a crash
  mid-write never leaves the caller with a missing .gz.
  Runtime-verified in PS7.

All 59 new/updated unit tests pass; tsc --noEmit clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ooples added a commit that referenced this pull request Apr 20, 2026
Addresses all 22 unresolved review threads. Grouped logically:

Critical PS defects
- token-optimizer-orchestrator.ps1: move helper dot-sources before the
  first Write-Log call (script version diagnostic + InputJsonFile read
  were calling Write-Log before logging.ps1 was loaded).
- token-optimizer-orchestrator.ps1: optimization_storage retrieve path
  now reads $retrieveResult.result.optimizedText (and mirrors the
  base64 wrapping used on store) instead of the top-level response.
- dispatcher.ps1: load logging.ps1 defensively — missing/broken helper
  falls back to no-op shims instead of killing every hook phase.
- helpers/logging.ps1: create the log directory on demand and swallow
  write failures so logging never becomes the failure mode for callers.
- orchestrator Handle-SmartRead: skip context_delta update when
  smart_read returned a diff payload — persisting a diff as the new
  baseline would compare the next read against the previous patch.

Compression + storage
- compression-engine.ts decompress(): fall back to raw UTF-8 when the
  buffer isn't Brotli so legacy plaintext rows keep working.
- compression-engine.ts: getCompressionStats / shouldCompress share
  a DEFAULT_MIN_SIZE_BYTES knob instead of hard-coding 500 in one
  place and 1000 in callers.
- optimization-storage.ts: persist AND read compression_algorithm;
  decodePayload dispatches per-algorithm with explicit error on
  unknown labels.

Config deep-merge
- config.ts: merge optimization.modelTokenLimits so user overrides
  add to the default map instead of replacing it.
- config.ps1: mirror the deep-merge via a new
  Merge-TokenOptimizerHashtable recursive helper.
- tests/config.test.ts: lock the invariant with a gpt-4 assertion.

Tokenizer hardening
- tiktoken-tokenizer / heuristic-tokenizer / google-ai-tokenizer:
  SHA-256 hash cache keys longer than 256 chars so the LRU stores
  digests, not full prompt text.
- tokenizer-factory.createFromEnv: TOKEN_OPTIMIZER_MODEL has highest
  precedence so users can pin the optimizer model even when
  CLAUDE_MODEL or similar are already set.

Session + context-delta fidelity
- session.ts: fromSnapshot preserves createdAt and updatedAt from the
  persisted snapshot; added SessionOptions.createdAt / .updatedAt
  overrides. Test asserts round-trip.
- context-delta-tool.ts: originalSize / deltaSize / bytesSaved use
  Buffer.byteLength(utf8) so multi-byte content reports honest bytes,
  matching the byte cap that SessionManager.updateFileState enforces.

Schema strictness
- tool-schemas.ts: OptimizationStorageSchema is now a discriminated
  union — store requires hash+text+token fields, retrieve requires
  just hash. Invalid payloads fail in validateToolArgs instead of
  after dispatch.
- optimization-storage-tool.ts: MCP inputSchema mirrors the same
  oneOf shape with additionalProperties:false.

Cache utilities
- lru-cache.ts prune(): scan every entry so per-entry TTLs set via
  set(key, val, ttlMs) are cleaned up even when defaultTtlMs is 0.
  Regression test added.
- lru-memoize.ts: deduplicate concurrent calls for the same key with
  an inFlight Map — a stampede while the first promise is pending
  collapses to a single fn() invocation. Concurrency test added.

PS atomic gzip
- helpers/gzip.ps1 Save-GzippedFile: atomic swap via
  File::Move(src, dst, overwrite:true) on .NET 5+, so a crash
  mid-write never leaves the caller with a missing .gz. Runtime-
  verified in PS7.

All 59 new/updated unit tests pass; tsc --noEmit clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Commit Message Format Issue

Your commit messages don't follow the Conventional Commits specification.

Required Format:

<type>(<optional scope>): <description>

[optional body]

[optional footer]

Valid Types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that don't affect code meaning (white-space, formatting)
  • refactor: Code change that neither fixes a bug nor adds a feature
  • perf: Code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • build: Changes that affect the build system or external dependencies
  • ci: Changes to CI configuration files and scripts
  • chore: Other changes that don't modify src or test files
  • revert: Reverts a previous commit

Examples:

feat(auth): add OAuth2 authentication
fix(api): resolve race condition in token refresh
docs(readme): update installation instructions
refactor(core): simplify token optimization logic

Breaking Changes:

Add BREAKING CHANGE: in the footer or append ! after the type:

feat!: remove deprecated API endpoints

Please amend your commit messages to follow this format.

Learn more: Conventional Commits

1 similar comment

@github-actions

Performance Benchmark Results



@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 11

♻️ Duplicate comments (2)
src/core/config.ts (1)

251-253: ⚠️ Potential issue | 🟠 Major

Return a defensive copy from getOptimizationConfig().

At Line 252, this returns a live mutable object. Callers can mutate modelTokenLimits (and nested settings), leaking state across the manager and defaults.

Suggested fix
   public getOptimizationConfig(): OptimizationConfig {
-    return this.config.optimization ?? DEFAULT_OPTIMIZATION;
+    const optimization = this.config.optimization ?? DEFAULT_OPTIMIZATION;
+    return {
+      ...optimization,
+      cacheSettings: { ...optimization.cacheSettings },
+      chatCompression: { ...optimization.chatCompression },
+      modelTokenLimits: { ...optimization.modelTokenLimits },
+    };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/config.ts` around lines 251 - 253, getOptimizationConfig currently
returns a live mutable object (this.config.optimization or DEFAULT_OPTIMIZATION)
which allows callers to mutate modelTokenLimits and other nested settings;
change getOptimizationConfig to return a deep/defensive copy of the chosen
OptimizationConfig so callers cannot mutate internal state. Locate the method
getOptimizationConfig and when returning this.config.optimization or
DEFAULT_OPTIMIZATION, clone the object (including nested modelTokenLimits
maps/objects and any arrays) before returning; ensure copying covers nested
structures rather than shallow copy so modifications to modelTokenLimits won’t
affect the original configuration.
hooks/handlers/token-optimizer-orchestrator.ps1 (1)

19-22: ⚠️ Potential issue | 🟠 Major

Guard helper imports the same way the dispatcher does.

These dot-sources are now early enough, but they are still unprotected. A missing or malformed helper aborts the entire orchestrator before any action-specific fallback can run, which makes shared helpers a startup single point of failure again.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/handlers/token-optimizer-orchestrator.ps1` around lines 19 - 22, The
four dot-sourced helpers (logging.ps1, config.ps1, gzip.ps1, context-delta.ps1)
are unguarded and can abort the orchestrator; wrap each import with a
Test-Path/Resolve-Path check and a Try/Catch so a missing or malformed helper
logs a warning and allows the orchestrator to continue (and where needed provide
minimal fallback stubs), e.g. for each file name referenced via
"$PSScriptRoot\..\helpers\<name>.ps1" verify the path exists before
dot-sourcing, catch any load/parse exceptions, call a safe fallback or set
default variables/functions (e.g., safe-Log, Get-Config, Compress-Data,
Compute-ContextDelta) and emit a clear warning so the dispatcher behavior
mirrors the guarded imports used elsewhere.
🧹 Nitpick comments (3)
src/core/session.ts (1)

72-73: Clamp preserveTailRatio to a valid range.

Line 72 accepts any numeric input. Values like > 1, < 0, or non-finite numbers can make compression ineffective or unpredictable at Line 155.

🔧 Suggested fix
-        this.preserveTailRatio = options.preserveTailRatio ?? DEFAULT_PRESERVE_TAIL_RATIO;
+        const requestedPreserveTailRatio =
+            options.preserveTailRatio ?? DEFAULT_PRESERVE_TAIL_RATIO;
+        this.preserveTailRatio = Number.isFinite(requestedPreserveTailRatio)
+            ? Math.min(1, Math.max(0, requestedPreserveTailRatio))
+            : DEFAULT_PRESERVE_TAIL_RATIO;

Also applies to: 155-158

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/session.ts` around lines 72 - 73, preserveTailRatio is accepted
unchecked and can be >1, <0 or non-finite; sanitize it when assigned and before
use: in the constructor where preserveTailRatio is set (reference symbol:
preserveTailRatio and DEFAULT_PRESERVE_TAIL_RATIO) replace the direct assignment
with logic that uses Number.isFinite(options.preserveTailRatio) ? Math.min(1,
Math.max(0, options.preserveTailRatio)) : DEFAULT_PRESERVE_TAIL_RATIO, and also
guard the value right before the compression logic that uses it (reference the
compress/trim logic around the code at lines referenced in the review) to ensure
you clamp or fallback again so non-finite or out-of-range values cannot affect
compression.
src/utils/lru-memoize.ts (1)

55-59: Consider clearing registry entries in clearAll() as well.

Right now this clears cache contents but keeps all registrations forever; dynamic registrations can leave stale entries and stats noise.

Suggested fix
     public clearAll(): void {
         for (const { cache } of this.caches.values()) {
             cache.clear();
         }
+        this.caches.clear();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/lru-memoize.ts` around lines 55 - 59, The clearAll method currently
clears each cache (see clearAll and this.caches) but leaves the registration
registry in place; update clearAll to also remove all registry entries by
clearing the internal registrations map (e.g., call this.registrations.clear()
or this.registry.clear() depending on the field name used in the class) so
dynamic registrations are removed and stats stop accumulating stale entries.
src/core/tokenizers/google-ai-tokenizer.ts (1)

54-57: Deduplicate concurrent identical misses to reduce duplicate outbound calls.

When multiple requests for the same key arrive before the first completes, each one calls the remote API. Add an in-flight map to coalesce concurrent calls per key.

Proposed refactor
 export class GoogleAITokenizer implements ITokenizer {
@@
     private readonly cache: LruCache<string, number>;
     private readonly timeoutMs: number;
+    private readonly inFlight = new Map<string, Promise<number>>();
@@
         const cached = this.cache.get(key);
         if (cached !== undefined) {
             return cached;
         }
+        const pending = this.inFlight.get(key);
+        if (pending) {
+            return pending;
+        }

-        try {
+        const task = (async () => {
             const response = await fetch(url, {
@@
             this.cache.set(key, data.totalTokens);
             return data.totalTokens;
-        } finally {
-            clearTimeout(timeout);
-        }
+        })()
+            .finally(() => {
+                clearTimeout(timeout);
+                this.inFlight.delete(key);
+            });
+        this.inFlight.set(key, task);
+        return task;

Also applies to: 66-90

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/tokenizers/google-ai-tokenizer.ts` around lines 54 - 57, Concurrent
requests for the same key can trigger duplicate remote calls because only
this.cache is checked; add an in-flight map (e.g., this.inFlight: Map<string,
Promise<string>>) on the GoogleAITokenizer class and coalesce concurrent misses
by: check this.inFlight.get(key) and return await that promise if present,
otherwise create a promise for the remote fetch, store it in
this.inFlight.set(key, promise) before starting the remote call, on resolution
store the value in this.cache.set(key, value) and finally remove the entry from
this.inFlight (also remove on rejection) and return the value; apply the same
pattern to the other identical block (lines 66–90) so concurrent identical
misses are deduplicated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Around line 2020-2033: The code currently calls &
"$HELPERS_DIR\invoke-mcp.ps1" -Tool "optimization_storage" -ArgumentsJson
$storeJson and unconditionally logs success; change this to capture the command
output (e.g., $result = & ...), parse/convert it to an object, and verify a
truthy success field (e.g., $result.success -eq $true) before calling Write-Log
"Stored new optimization result..."; if the tool returns success:false or an
error payload, call Handle-Error -Exception (New-Object
System.Exception($result.error)) -Message "Failed to store optimization result"
(or log the error) instead of logging success; reference variables/functions:
$storeJson, invoke-mcp.ps1, optimization_storage, Write-Log, Handle-Error.

In `@hooks/helpers/gzip.ps1`:
- Around line 99-105: Read-MaybeGzippedFile currently reads the sibling
"$Path.gz" with [System.IO.File]::ReadAllBytes and passes bytes to Expand-String
-CompressedBytes, but any exception from reading or decompressing causes a hard
failure even when the plaintext $Path exists; wrap the gzip sibling
read/decompress block (the code that computes $gzPath, calls ReadAllBytes and
Expand-String -CompressedBytes) in a try/catch and on any error fall back to
reading the plaintext sibling (use [System.IO.File]::ReadAllText($Path,
[System.Text.Encoding]::UTF8) if Test-Path $Path), swallowing or logging the
gzip error and returning the plaintext content so corrupt/partial .gz files do
not break the migration path.
- Around line 72-80: In Save-GzippedFile replace the shared temp name ($tmpPath
= "$gzPath.tmp") with a per-write unique temp file (e.g. incorporate a GUID or
use [System.IO.Path]::GetTempFileName() to derive $tmpPath) so concurrent
writers cannot clobber each other; write bytes to that unique $tmpPath, call
[System.IO.File]::Move($tmpPath, $gzPath, $true) as currently done, and add a
finally block to always remove the unique $tmpPath on success or failure to
avoid orphaned temp files and ensure atomic replacement when multiple processes
write the same $gzPath.

In `@src/core/config.ts`:
- Around line 224-247: The current merge always spreads from
DEFAULT_OPTIMIZATION which overwrites any previously stored custom optimization
when update() is called with unrelated fields; change the merge to start from
the existing runtime config's optimization (falling back to
DEFAULT_OPTIMIZATION) so updates only override provided fields. Concretely, read
the current optimization (e.g. const currentOpt = currentConfig.optimization ??
DEFAULT_OPTIMIZATION) and then build optimization as { ...DEFAULT_OPTIMIZATION,
...currentOpt, ...userOpt, cacheSettings: {
...DEFAULT_OPTIMIZATION.cacheSettings, ...currentOpt.cacheSettings,
...(userOpt.cacheSettings ?? {}) }, chatCompression: {
...DEFAULT_OPTIMIZATION.chatCompression, ...currentOpt.chatCompression,
...(userOpt.chatCompression ?? {}) }, modelTokenLimits: {
...DEFAULT_OPTIMIZATION.modelTokenLimits, ...currentOpt.modelTokenLimits,
...(userOpt.modelTokenLimits ?? {}) } } so DEFAULT_OPTIMIZATION provides
fallbacks but existing overrides in currentConfig.optimization are preserved
unless userOpt explicitly replaces them.

In `@src/core/session.ts`:
- Around line 88-94: getHistory and getFileState currently return live internal
objects/arrays (and other places create shallow copies that still share Message
object references), allowing external code to mutate session internals; change
these getters to return defensive copies by mapping history to new Message
objects (clone each Message's primitive fields and nested objects) or using a
structured/deep clone, and return a cloned/read-only copy of fileState (or an
immutable/frozen clone) instead of the original reference; apply the same
defensive-cloning approach to any other methods that currently shallow-copy
arrays (the methods that create copies at the locations referenced in the
review) so callers cannot mutate Message instances and bypass updatedAt
tracking.
- Around line 165-169: The summary is being added as a high-priority system
message: change the role used when creating the summaryMessage (created after
this.summarizer.summarize(head)) from 'system' to a non-system role (e.g.,
'assistant' or a neutral metadata role) so summarized history is not elevated to
system-level context; locate the summary construction (summaryMessage and any
code that appends it to head/conversation) and update the role value and any
downstream handling that assumes system-level precedence accordingly.

In `@src/core/tokenizers/google-ai-tokenizer.ts`:
- Around line 50-53: The current logic in google-ai-tokenizer.ts creates cache
keys using raw text when text.length <= KEY_HASH_THRESHOLD_CHARS and hashed
strings otherwise, which risks retaining sensitive prompts and mixing formats;
change this so the cache key (variable key) is always a namespaced SHA-256 hash
of the input (use createHash('sha256').update(text).digest('hex') and prepend a
fixed namespace like "gpt-tokenizer:"), remove or ignore
KEY_HASH_THRESHOLD_CHARS branching, and ensure all codepaths that reference key
continue to use the new hashed-and-namespaced value.
- Around line 59-61: The URL construction in countTokens exposes the API key in
the query string; remove the `?key=${encodeURIComponent(this.apiKey)}` fragment
from the URL built with `this.endpoint` and `this.modelName` and instead add the
header `'x-goog-api-key': this.apiKey` to the fetch request headers (where fetch
is called in the same method), applying the same change to both places where
countTokens/fetch is invoked; update the fetch options to include this header
(preserving any existing headers like 'Content-Type') and ensure `this.apiKey`
is used directly in the header rather than in the URL.

In `@src/tools/context-delta-tool.ts`:
- Around line 142-165: The inputSchema currently declares currentContent as
optional for all operations but at runtime compute-delta and seed require it;
update the inputSchema (symbol: inputSchema in this module) to be a
discriminated oneOf (or anyOf) keyed on operation so each branch enforces its
own required props: one branch for operation='compute-delta' requiring
sessionId, filePath, currentContent; one for operation='seed' requiring
sessionId, filePath, currentContent; and one for operation='clear' requiring
only sessionId and filePath; keep descriptions the same and remove
currentContent from the global required list so schema validation matches the
runtime contract.

In `@src/utils/lru-memoize.ts`:
- Line 68: The current memoization treats a cached value of undefined as a miss
because it checks the result of cache.get(key) directly; change the hit logic in
the memoize wrapper(s) to use cache.has(key) (or an explicit sentinel) to detect
presence, then call cache.get(key) only after confirming has(key) so undefined
values are correctly returned from the cache; update the occurrences around the
LruCache<string, R> instantiation and the other hit checks (lines noted near the
cache variable and the checks at 90-93 and 101) to use cache.has(key) before
reading the value.
- Around line 82-85: The current JSON serializer in lru-memoize.ts turns bigint
values into plain strings causing collisions (e.g., [1n] vs ["1"]); update the
replacer used in the JSON.stringify call inside the key generation logic (the
function that currently checks typeof v === 'bigint') to emit a type-marked
representation for bigints (for example prefix the value or wrap it in an object
with a __bigint/type marker) so serialized bigints are distinguishable from
ordinary strings, then continue to feed that marked serialization to
createHash('sha256').update(...).digest('hex') unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aba569c6-2f73-4c2e-9332-efa357e8e3cf

📥 Commits

Reviewing files that changed from the base of the PR and between fd006e9 and de0a8f9.

📒 Files selected for processing (22)
  • hooks/dispatcher.ps1
  • hooks/handlers/token-optimizer-orchestrator.ps1
  • hooks/helpers/config.ps1
  • hooks/helpers/gzip.ps1
  • hooks/helpers/logging.ps1
  • src/analytics/optimization-storage.ts
  • src/core/compression-engine.ts
  • src/core/config.ts
  • src/core/session.ts
  • src/core/tokenizers/google-ai-tokenizer.ts
  • src/core/tokenizers/heuristic-tokenizer.ts
  • src/core/tokenizers/tiktoken-tokenizer.ts
  • src/core/tokenizers/tokenizer-factory.ts
  • src/tools/context-delta-tool.ts
  • src/tools/optimization-storage-tool.ts
  • src/utils/lru-cache.ts
  • src/utils/lru-memoize.ts
  • src/validation/tool-schemas.ts
  • tests/unit/config.test.ts
  • tests/unit/lru-cache.test.ts
  • tests/unit/lru-memoize.test.ts
  • tests/unit/session.test.ts
✅ Files skipped from review due to trivial changes (3)
  • tests/unit/lru-memoize.test.ts
  • tests/unit/session.test.ts
  • src/tools/optimization-storage-tool.ts
🚧 Files skipped from review as they are similar to previous changes (9)
  • tests/unit/config.test.ts
  • src/core/tokenizers/tokenizer-factory.ts
  • src/core/tokenizers/heuristic-tokenizer.ts
  • tests/unit/lru-cache.test.ts
  • src/validation/tool-schemas.ts
  • src/core/tokenizers/tiktoken-tokenizer.ts
  • src/utils/lru-cache.ts
  • src/analytics/optimization-storage.ts
  • src/core/compression-engine.ts

Comment thread hooks/handlers/token-optimizer-orchestrator.ps1
Comment thread hooks/helpers/gzip.ps1 Outdated
Comment thread hooks/helpers/gzip.ps1
Comment thread src/core/config.ts
Comment thread src/core/session.ts
Comment thread src/core/tokenizers/google-ai-tokenizer.ts Outdated
Comment thread src/core/tokenizers/google-ai-tokenizer.ts Outdated
Comment thread src/tools/context-delta-tool.ts
Comment thread src/utils/lru-memoize.ts Outdated
Comment thread src/utils/lru-memoize.ts Outdated
@ooples ooples force-pushed the feat/sophisticated-token-counting-issue-4 branch from de0a8f9 to 778d01a Compare April 20, 2026 02:33
ooples and others added 2 commits April 19, 2026 22:38
… storage layer

The earlier try/catch in CompressionEngine.decompress() made
decompressFromBase64('invalid-base64-data') return a mojibake string
instead of throwing, which regressed
tests/integration/claude-desktop-harness.test.ts's "should handle
corrupted compressed data gracefully" case on node 22 CI.

Putting the legacy-row fallback where it belongs — in
SqliteOptimizationStorage.decodePayload, keyed on the persisted
compression_algorithm column:

  - 'brotli'   → brotliDecompressSync
  - 'none'/''  → raw utf-8
  - null/undef → try brotli first, fall back to utf-8 (covers
                 pre-tracking rows)
  - unknown    → error

That preserves backward compatibility on the read path while keeping
the compression primitives strict, so callers that pass random base64
to decompressFromBase64 still see the intended error.
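
The decode path described above can be sketched like so (the algorithm names follow the commit message; treat the function signature as illustrative rather than the exact one on SqliteOptimizationStorage):

```typescript
import { brotliDecompressSync } from 'node:zlib';

// Algorithm-keyed read path: strict for known algorithms, lenient only
// for pre-tracking rows where compression_algorithm was never recorded.
export function decodePayload(
    payload: Buffer,
    algorithm: string | null | undefined
): string {
    switch (algorithm) {
        case 'brotli':
            return brotliDecompressSync(payload).toString('utf8');
        case 'none':
        case '':
            return payload.toString('utf8');
        case null:
        case undefined:
            // Pre-tracking rows: try brotli first, fall back to raw utf-8.
            try {
                return brotliDecompressSync(payload).toString('utf8');
            } catch {
                return payload.toString('utf8');
            }
        default:
            throw new Error(`Unknown compression algorithm: ${algorithm}`);
    }
}
```

Keeping the try/catch here, rather than inside the compression primitives, is what lets `decompressFromBase64` stay strict for arbitrary callers.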

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two parts:

1. package.json `overrides`: pin brace-expansion to ^2.0.2 and picomatch
   to ^4.0.4 to clean up the non-bundled copies pulled in transitively
   by eslint / test-exclude / top-level resolutions. That resolves
   every node_modules path that the project actually controls.

2. quality-gates.yml: `npm audit` now runs with `--omit=dev` so the
   step no longer fails on unfixable vulnerabilities inside
   node_modules/npm/**. npm itself bundles its own deps — the
   vulnerable brace-expansion / picomatch copies live inside
   @semantic-release/npm's bundled npm, which we pull in as a dev dep
   for releases and never ship to end users. The dedicated
   "Dependency Vulnerability Scan" step still covers the full tree.

Also stops `npm audit` inside the warning branch from killing the
step via its own non-zero exit code.

`npm audit --omit=dev` now reports 0 vulnerabilities.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
hooks/dispatcher.ps1 (1)

77-90: ⚠️ Potential issue | 🟠 Major

Clean up $tempFile before the early exit 2 paths.

On smart-read success or a context-guard block, this branch exits before the cleanup at Lines 153-156, so one temp file is leaked per intercepted request.

Proposed fix
         if ($toolName -eq "Read") {
             & powershell -NoProfile -ExecutionPolicy Bypass -File $ORCHESTRATOR -Phase "PreToolUse" -Action "smart-read" -InputJsonFile $tempFile
             if ($LASTEXITCODE -eq 2) {
+                if (Test-Path $tempFile) {
+                    Remove-Item -Path $tempFile -Force -ErrorAction SilentlyContinue
+                }
                 # smart_read succeeded - blocks plain Read and returns cached/optimized content
                 exit 2
             }
             # If smart_read failed, allow plain Read to proceed
         }
@@
         & powershell -NoProfile -ExecutionPolicy Bypass -File $ORCHESTRATOR -Phase "PreToolUse" -Action "context-guard" -InputJsonFile $tempFile
         if ($LASTEXITCODE -eq 2) {
+            if (Test-Path $tempFile) {
+                Remove-Item -Path $tempFile -Force -ErrorAction SilentlyContinue
+            }
             Block-Tool -Reason "Context budget exhausted - session optimization required"
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/dispatcher.ps1` around lines 77 - 90, The smart-read and context-guard
branches exit early and never remove the temporary file referenced by $tempFile;
before any early exit (the "exit 2" in the smart-read branch) and before calling
Block-Tool in the context-guard branch, invoke the same temp-file cleanup used
later (Remove-Item $tempFile -ErrorAction SilentlyContinue or the existing
cleanup helper) so the temp file is deleted on all code paths; update the blocks
around the smart-read call and the context-guard handling to perform this
cleanup immediately prior to exiting or blocking.
hooks/handlers/token-optimizer-orchestrator.ps1 (1)

1988-1993: ⚠️ Potential issue | 🔴 Critical

Pass the required key to optimize_text.

optimize_text still requires both text and key on the server side, but this call only sends text and quality. That means post-tool optimization falls into the validation-error path instead of producing a reusable optimized payload.

Proposed fix
             $optimizeArgs = @{
                 text = $outputText
+                key = "tool_output_$originalTextHash"
                 quality = $script:OPTIMIZATION_QUALITY
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/handlers/token-optimizer-orchestrator.ps1` around lines 1988 - 1993,
The optimize_text call is missing the required key in $optimizeArgs, causing
server validation errors; update the $optimizeArgs hash to include a key
property set to the same key variable used for other MCP/tool calls (e.g. the
script-level key like $script:KEY or whatever key variable you use elsewhere),
so $optimizeArgs contains text, quality and key before converting to JSON and
invoking "$HELPERS_DIR\invoke-mcp.ps1" -Tool "optimize_text".
♻️ Duplicate comments (9)
hooks/helpers/gzip.ps1 (2)

99-107: ⚠️ Potential issue | 🟠 Major

Fall back to plaintext when gzip read fails.

If the .gz file exists but is corrupt or partially written, ReadAllBytes or Expand-String will throw, even when a valid plaintext sibling exists. This breaks the backward-compatible migration path described in the header comment.

This issue was previously flagged.

Suggested fix
     $gzPath = "$Path.gz"
     if (Test-Path $gzPath) {
-        $bytes = [System.IO.File]::ReadAllBytes($gzPath)
-        return Expand-String -CompressedBytes $bytes
+        try {
+            $bytes = [System.IO.File]::ReadAllBytes($gzPath)
+            return Expand-String -CompressedBytes $bytes
+        } catch {
+            # Gzip corrupted — fall through to plaintext if available.
+            if (-not (Test-Path $Path)) {
+                throw
+            }
+        }
     }
     if (Test-Path $Path) {
         return [System.IO.File]::ReadAllText($Path, [System.Text.Encoding]::UTF8)
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/helpers/gzip.ps1` around lines 99 - 107, When a .gz sibling exists but
reading it fails, the code should catch exceptions from
[System.IO.File]::ReadAllBytes and Expand-String and fall back to reading the
plaintext file at $Path; update the block that uses $gzPath and Expand-String so
that any exception during gzip reading is handled (try/catch) and, on error,
attempts to return [System.IO.File]::ReadAllText($Path,
[System.Text.Encoding]::UTF8) if Test-Path $Path is true, otherwise rethrow or
return $null — ensure you reference and protect the $gzPath, $Path,
ReadAllBytes, and Expand-String calls.

72-85: ⚠️ Potential issue | 🟠 Major

Use a unique temp path to avoid concurrent write collisions.

Line 73 uses a fixed temp path "$gzPath.tmp". If multiple processes write the same file concurrently, they'll clobber each other's temp bytes before the final move, potentially publishing corrupted data.

This issue was previously flagged.

Suggested fix
     $compressed = Compress-String -InputString $Content
     $gzPath = "$Path.gz"
-    $tmpPath = "$gzPath.tmp"
+    $tmpPath = "$gzPath.$([guid]::NewGuid().ToString('N')).tmp"
     [System.IO.File]::WriteAllBytes($tmpPath, $compressed)
     # Atomic swap: File::Move(src, dst, overwrite:$true) on .NET5+.
     # Unlike "delete then move", this never leaves the caller with a
     # missing .gz file if the process crashes.
     try {
         [System.IO.File]::Move($tmpPath, $gzPath, $true)
     } catch {
+        throw
+    } finally {
         if (Test-Path $tmpPath) {
             Remove-Item -Path $tmpPath -Force -ErrorAction SilentlyContinue
         }
-        throw
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/helpers/gzip.ps1` around lines 72 - 85, The temp file path $tmpPath
uses a fixed suffix "$gzPath.tmp" which causes concurrent writers to collide;
change the temp path generation used before [System.IO.File]::WriteAllBytes so
it is unique per process/attempt (for example append a GUID or process id and
timestamp, e.g. "$gzPath.{GUID}.tmp"), keep using
[System.IO.File]::WriteAllBytes and the atomic [System.IO.File]::Move($tmpPath,
$gzPath, $true), and ensure the existing catch cleanup logic still checks and
removes the specific unique $tmpPath if present to avoid orphaned files.
src/core/tokenizers/google-ai-tokenizer.ts (2)

59-74: ⚠️ Potential issue | 🟠 Major

Move API key from URL to request header.

Exposing the API key in the URL query string (line 61) leaks it via HTTP access logs, proxy logs, and distributed tracing. Google's Generative Language API supports and recommends the x-goog-api-key header.

This issue was previously flagged.

Suggested fix
-        const url = `${this.endpoint}/${encodeURIComponent(
-            this.modelName
-        )}:countTokens?key=${encodeURIComponent(this.apiKey)}`;
+        const url = `${this.endpoint}/${encodeURIComponent(
+            this.modelName
+        )}:countTokens`;

         const controller = new AbortController();
         const timeout = setTimeout(() => controller.abort(), this.timeoutMs);

         try {
             const response = await fetch(url, {
                 method: 'POST',
-                headers: { 'Content-Type': 'application/json' },
+                headers: {
+                    'Content-Type': 'application/json',
+                    'x-goog-api-key': this.apiKey,
+                },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/tokenizers/google-ai-tokenizer.ts` around lines 59 - 74, The request
currently puts the API key in the URL query string when building the countTokens
call in google-ai-tokenizer.ts (the URL using this.modelName and this.apiKey);
remove the ?key=... query parameter and instead add the API key as the
recommended header 'x-goog-api-key' in the fetch call's headers alongside
'Content-Type', passing this.apiKey directly (no URL encoding), and keep the
AbortController/timeout logic and existing body intact; update any related URL
construction to only include the encoded modelName.

50-53: ⚠️ Potential issue | 🟠 Major

Use namespaced hashed keys consistently.

The cache key strategy stores short prompts verbatim (lines 51-52), which retains user content in memory and mixes raw and hashed key formats that could theoretically collide. Using a consistent namespaced hash for all inputs improves both privacy and collision resistance.

This issue was previously flagged.

Suggested fix
-        const key =
-            text.length <= KEY_HASH_THRESHOLD_CHARS
-                ? text
-                : createHash('sha256').update(text).digest('hex');
+        const key = `sha256:${createHash('sha256').update(text).digest('hex')}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/tokenizers/google-ai-tokenizer.ts` around lines 50 - 53, The current
cache key assignment uses raw text for short inputs (KEY_HASH_THRESHOLD_CHARS)
which mixes unhashed and hashed keys; change it so all inputs use a namespaced
hash instead: always compute createHash('sha256').update(text).digest('hex'),
then prefix with a stable namespace string (e.g., 'gai:' or similar) to form the
key variable, replacing the existing conditional; ensure any code that
reads/writes the cache uses this namespaced hashed key format so
KEY_HASH_THRESHOLD_CHARS is no longer used to decide raw storage.
src/core/config.ts (2)

251-253: ⚠️ Potential issue | 🟠 Major

Return a defensive copy to prevent mutation of shared state.

getOptimizationConfig() returns the live optimization object (or the shared DEFAULT_OPTIMIZATION instance). Callers can mutate modelTokenLimits and leak changes into subsequent calls or other ConfigManager instances.

This issue was previously flagged.

Suggested fix
   public getOptimizationConfig(): OptimizationConfig {
-    return this.config.optimization ?? DEFAULT_OPTIMIZATION;
+    const opt = this.config.optimization ?? DEFAULT_OPTIMIZATION;
+    return {
+      ...opt,
+      modelTokenLimits: { ...opt.modelTokenLimits },
+      cacheSettings: { ...opt.cacheSettings },
+      chatCompression: { ...opt.chatCompression },
+    };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/config.ts` around lines 251 - 253, getOptimizationConfig currently
returns the live OptimizationConfig (this.config.optimization or
DEFAULT_OPTIMIZATION) which allows callers to mutate shared state (e.g.,
modelTokenLimits); change getOptimizationConfig to return a defensive copy
instead: locate the getOptimizationConfig method and return a shallow/deep copy
of the OptimizationConfig object (cloning nested mutable fields such as
modelTokenLimits) so modifications by callers do not affect
this.config.optimization or DEFAULT_OPTIMIZATION.

224-248: ⚠️ Potential issue | 🟠 Major

update() still loses existing optimization overrides.

When update() calls mergeConfig(this.config, updates), the merge at lines 230-246 always spreads from DEFAULT_OPTIMIZATION instead of defaults.optimization. This means calling update({ cache: {...} }) will reset any previously loaded optimization customizations back to defaults.

This issue was previously flagged and appears unresolved.

Suggested fix
   ): HypercontextConfig {
     const userOpt = user.optimization ?? {};
+    const baseOpt = defaults.optimization ?? DEFAULT_OPTIMIZATION;
     return {
       cache: { ...defaults.cache, ...user.cache },
       monitoring: { ...defaults.monitoring, ...user.monitoring },
       intelligence: { ...defaults.intelligence, ...user.intelligence },
       performance: { ...defaults.performance, ...user.performance },
       optimization: {
-        ...DEFAULT_OPTIMIZATION,
+        ...baseOpt,
         ...userOpt,
         cacheSettings: {
-          ...DEFAULT_OPTIMIZATION.cacheSettings,
+          ...baseOpt.cacheSettings,
           ...(userOpt.cacheSettings ?? {}),
         },
         chatCompression: {
-          ...DEFAULT_OPTIMIZATION.chatCompression,
+          ...baseOpt.chatCompression,
           ...(userOpt.chatCompression ?? {}),
         },
         modelTokenLimits: {
-          ...DEFAULT_OPTIMIZATION.modelTokenLimits,
+          ...baseOpt.modelTokenLimits,
           ...(userOpt.modelTokenLimits ?? {}),
         },
       },
     };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/config.ts` around lines 224 - 248, The merge in the configuration
builder always starts from the hardcoded DEFAULT_OPTIMIZATION which overwrites
any previously applied optimization overrides when update() calls mergeConfig;
change the merge base to use defaults.optimization (the existing resolved
defaults) instead of DEFAULT_OPTIMIZATION and preserve the deep-merge logic for
cacheSettings, chatCompression and modelTokenLimits (keep userOpt spreads and
the ...(userOpt.* ?? {}) merging) so updates like update({ cache: {...} }) no
longer reset prior optimization customizations; reference the
update()/mergeConfig callsite and the optimization assembly code that constructs
optimization using DEFAULT_OPTIMIZATION, userOpt, cacheSettings,
chatCompression, and modelTokenLimits.
src/core/session.ts (2)

165-169: ⚠️ Potential issue | 🟠 Major

Don't reinsert summarized history as a system message.

This elevates user-derived summary text to the highest-priority prompt layer after compression. Use a non-system role here.

Proposed fix
         const summaryMessage: Message = {
-            role: 'system',
+            role: 'assistant',
             content: `[summary of earlier conversation] ${summary}`,
             timestamp: head[head.length - 1].timestamp,
         };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/session.ts` around lines 165 - 169, The code re-inserts compressed
history as a system-level message; change summaryMessage.role from 'system' to a
non-system role (e.g., 'assistant') so the summarizer output doesn't become
highest-priority prompt context—update the block that creates summaryMessage
(the variable named summaryMessage created after
this.summarizer.summarize(head)) to use 'assistant' (or 'user') instead of
'system'.

88-94: ⚠️ Potential issue | 🟡 Minor

Return cloned session state instead of live references.

getHistory(), getFileState(), toSnapshot(), and fromSnapshot() still share Message / fileState objects by reference, so callers can mutate session state without updating updatedAt.

Also applies to: 177-200

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/session.ts` around lines 88 - 94, getHistory and getFileState (and
the serialization helpers toSnapshot/fromSnapshot) currently return or accept
live Message and SessionFileState references so callers can mutate internal
session state without touching updatedAt; change these methods to perform deep
copies: return a new array of cloned Message objects for getHistory, return a
cloned SessionFileState object for getFileState, have toSnapshot produce and
return a snapshot with cloned Message(s) and cloned fileState, and have
fromSnapshot clone the incoming snapshot before assigning to
this.history/this.fileState. Use the existing symbols Message, SessionFileState,
history, fileState, toSnapshot, fromSnapshot, getHistory, getFileState and
ensure updatedAt remains controlled only by internal mutators (i.e., external
mutations of returned objects must not affect internal state).
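The shape of the fix, sketched against an assumed flat `Message` type (the real `Message` lives in src/core/types.ts and may carry more fields; deeply nested state would need `structuredClone` instead of a spread):

```typescript
interface Message {
    role: string;
    content: string;
    timestamp: number;
}

class SessionSketch {
    private history: Message[] = [];

    addMessage(message: Message): void {
        // Clone on the way in, so the caller's object stays theirs.
        this.history.push({ ...message });
    }

    // Clone on the way out: mutating the returned array or its
    // elements cannot touch internal session state.
    getHistory(): Message[] {
        return this.history.map((m) => ({ ...m }));
    }
}
```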
hooks/handlers/token-optimizer-orchestrator.ps1 (1)

2019-2033: ⚠️ Potential issue | 🟠 Major

Only log a successful store after checking the MCP result.

optimization_storage reports tool-level failures in-band via success: false, but this path logs a successful store unconditionally. That makes cache persistence failures silent and hard to diagnose.

Proposed fix
                 try {
                     $storeArgs = @{
                         operation = "store"
                         originalTextHash = $originalTextHash
                         optimizedText = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($optimizedText))
                         originalTokens = $beforeTokens
                         optimizedTokens = $afterTokens
                         tokensSaved = $saved
                     }
                     $storeJson = $storeArgs | ConvertTo-Json -Compress
-                    & "$HELPERS_DIR\invoke-mcp.ps1" -Tool "optimization_storage" -ArgumentsJson $storeJson
-                    Write-Log "Stored new optimization result. Hash: $originalTextHash" "DEBUG"
+                    $storeResultJson = & "$HELPERS_DIR\invoke-mcp.ps1" -Tool "optimization_storage" -ArgumentsJson $storeJson
+                    $storeResult = if ($storeResultJson) { $storeResultJson | ConvertFrom-Json } else { $null }
+                    if ($storeResult -and $storeResult.success) {
+                        Write-Log "Stored new optimization result. Hash: $originalTextHash" "DEBUG"
+                    } else {
+                        $storeError = if ($storeResult -and $storeResult.error) { $storeResult.error } else { "Unknown optimization_storage failure" }
+                        Write-Log "Failed to store optimization result: $storeError" "WARN"
+                    }
                 } catch {
                     Handle-Error -Exception $_.Exception -Message "Failed to store optimization result"
                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/handlers/token-optimizer-orchestrator.ps1` around lines 2019 - 2033,
The code unconditionally logs success after calling the MCP helper; instead
capture and parse the MCP invocation result from & "$HELPERS_DIR\invoke-mcp.ps1"
(the call that sends operation="store") into a variable, convert it from JSON,
check the result's success flag and any error/message fields, and only call
Write-Log "Stored new optimization result..." when success is true; if success
is false, call Handle-Error (or Write-Log at ERROR) with the returned error
details and include the originalTextHash and the MCP response for diagnostics,
leaving the existing base64 encoding of optimizedText and the surrounding
try/catch intact.
🧹 Nitpick comments (2)
hooks/helpers/logging.ps1 (1)

4-37: Consider renaming Write-Log to avoid name collisions.

Write-Log is not a built-in cmdlet, but it is a common name exported by third-party PowerShell logging modules, so redefining it here can shadow (or be shadowed by) another module's implementation. Consider Write-TokenOptimizerLog or a similarly namespaced name to avoid conflicts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hooks/helpers/logging.ps1` around lines 4 - 37, The function name Write-Log
can collide with identically named functions from third-party logging modules and should be renamed;
update the function declaration and all internal references to a namespaced name
such as Write-TokenOptimizerLog (or similar) throughout this file, including any
calls that reference Write-Log and uses of $script:LOG_FILE, $Level, $Context,
and other parameters, and ensure exported/consumed entrypoints (scripts/modules)
calling Write-Log are updated to the new name so behavior and logging (file
writing, verbose output, debug gating) remain identical.
src/core/session-manager.ts (1)

90-95: Prefer applying defaults after spreading options.

Current merge order allows explicit undefined fields in options to override manager defaults (e.g., maxTokens, tokenizer/summarizer defaults).

♻️ Suggested refactor
 const session = new Session({
-    tokenizer: this.tokenizer,
-    summarizer: this.summarizer,
-    maxTokens: options.maxTokens ?? this.defaultMaxTokens,
-    ...options,
+    ...options,
+    tokenizer: options.tokenizer ?? this.tokenizer,
+    summarizer: options.summarizer ?? this.summarizer,
+    maxTokens: options.maxTokens ?? this.defaultMaxTokens,
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/session-manager.ts` around lines 90 - 95, The Session constructor
call applies defaults before spreading options so explicit undefined in options
can override manager defaults; change the merge order so options are spread
first and then defaults are applied (or explicitly nullish-coalesce each option
field) when constructing the Session (referencing Session, the session variable,
this.tokenizer, this.summarizer, maxTokens and this.defaultMaxTokens) so manager
defaults win unless options provide a concrete value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@hooks/helpers/context-delta.ps1`:
- Around line 59-72: When you generate a local SessionId via
Get-TokenOptimizerSessionId before calling ContextDeltaTool.computeDelta, ensure
you also bootstrap the server-side SessionManager so the session exists
remotely: after assigning $SessionId (only when it was newly created), call
Invoke-TokenOptimizer to create the server session (e.g. Invoke-TokenOptimizer
-ToolName 'session_manager' -Arguments @{ operation = 'create'; sessionId =
$SessionId }) and verify success before invoking Invoke-TokenOptimizer -ToolName
'context_delta' -Arguments $toolArgs; this ensures ContextDeltaTool.computeDelta
does not reject unknown sessions.
- Around line 70-86: The Invoke-TokenOptimizer helper is never loaded because
invoke-token-optimizer.ps1 is a parameterized script rather than a dot-sourced
function, so Get-Command Invoke-TokenOptimizer always fails and context-delta
returns $null; fix by either (A) converting invoke-token-optimizer.ps1 to export
a reusable function named Invoke-TokenOptimizer and dot-source that file from
the orchestrator alongside logging.ps1/config.ps1/gzip.ps1, or (B) move the
token optimizer logic into an existing sourced helper (e.g., context-delta.ps1
or a new helpers file) and ensure the function name Invoke-TokenOptimizer is
defined and discoverable before the Get-Command check in context-delta.ps1 so
the try/catch block can invoke it successfully.

In `@hooks/helpers/logging.ps1`:
- Around line 46-48: The code assigns $stackTrace from
$Exception.ScriptStackTrace which shadows PowerShell's automatic $StackTrace;
rename the local variable (for example to $exceptionStackTrace or
$scriptStackTrace) wherever it's declared and used (the assignment currently at
$stackTrace = $Exception.ScriptStackTrace and the subsequent Write-Log calls
that reference $stackTrace) so you do not override the automatic variable and
update the two Write-Log calls to use the new variable name.

In `@src/core/session-manager.ts`:
- Around line 117-130: In addMessage, the session is mutated
(session.addMessage) before awaiting getHistoryTokenCount()/compressHistory, so
if those async calls throw schedulePersist is never called; wrap the
token-counting/compression logic in a try/catch/finally (or move schedulePersist
into a finally block) so schedulePersist() always runs even on error, and
rethrow the error after ensuring persistence; reference addMessage,
requireSession, session.addMessage, session.getHistoryTokenCount,
session.compressHistory and schedulePersist when locating the change.
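A synchronous sketch of that control flow (the real methods are async, but the try/finally shape is identical; the names mirror those in the prompt and are otherwise assumptions):

```typescript
interface SessionLike {
    addMessage(message: unknown): void;
    historyTokenCount(): number;
    compressHistory(): void;
}

// Once the session has been mutated, persistence must be scheduled even
// if token counting or compression throws; the error still propagates.
function addMessageSafely(
    session: SessionLike,
    schedulePersist: () => void,
    message: unknown,
    maxTokens: number
): void {
    session.addMessage(message);
    try {
        if (session.historyTokenCount() > maxTokens) {
            session.compressHistory();
        }
    } finally {
        schedulePersist(); // runs on success and on error alike
    }
}
```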
- Around line 234-243: Restored sessions currently bypass the write-time
file-state size check, so in the loop that reconstructs sessions
(Session.fromSnapshot and this.sessions.set) enforce the same maxFileStateBytes
limit used on writes: for each snapshot check the size of snapshot.fileState (or
serialized snapshot/fileState) against this.maxFileStateBytes and either
drop/skip or truncate the fileState before calling Session.fromSnapshot; ensure
the same behavior and logging as the write path so oversized persisted/legacy
entries cannot be reintroduced during restore while still honoring sessionTtlMs.

In `@src/core/summarization.ts`:
- Around line 133-137: The thrown Error in src/core/summarization.ts embeds the
provider response body (in the response.ok failure path), which can leak
sensitive content; update the error handling in the function that checks
response.ok so it no longer includes raw response.text() in the thrown Error —
instead throw a sanitized message containing only response.status and
response.statusText (optionally include a non-sensitive hint like "response body
omitted" or the body length), and if you need to capture body for debugging, log
it at debug level after redaction rather than putting it into the thrown Error;
apply this same change to the second identical block around the other failure
path (the similar response.ok check at lines ~218-222).
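A sketch of the sanitized failure path (synchronous for clarity; the error wording is an assumption):

```typescript
// Build an error that carries only non-sensitive metadata. The raw body
// may echo user content, so at most its length is reported.
function sanitizedHttpError(
    status: number,
    statusText: string,
    body: string
): Error {
    return new Error(
        `Summarizer request failed: ${status} ${statusText} ` +
            `(response body omitted, ${body.length} chars)`
    );
}
```

The caller can still log the redacted body at debug level before throwing; the point is that the Error object itself, which tends to end up in user-visible logs and telemetry, stays clean.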
- Around line 40-63: The truncation math in TruncatingSummarizer (constructor,
summarize) is brittle: replace the hardcoded -20 with a computed marker string
(e.g., const marker = '\n... [truncated] ...\n') and its length, ensure keepHead
= Math.floor(maxChars * 0.4) and keepTail = Math.max(0, maxChars - keepHead -
marker.length), and handle tiny maxChars by returning either a truncated marker
(if maxChars <= marker.length) or slicing only head/tail when keepTail is zero;
update uses of keepHead/keepTail so the returned string never exceeds maxChars
and never uses negative slice lengths.
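The arithmetic above, sketched end to end (the marker string and the 0.4 head ratio come from the prompt; everything else is an assumption):

```typescript
const MARKER = '\n... [truncated] ...\n';

// Keep ~40% of the budget as head, spend MARKER.length on the marker,
// and give whatever remains to the tail. The result never exceeds
// maxChars and no slice length can go negative.
function truncateMiddle(text: string, maxChars: number): string {
    if (text.length <= maxChars) return text;
    if (maxChars <= MARKER.length) {
        // Budget too small for head + marker + tail: emit a clipped marker.
        return MARKER.slice(0, maxChars);
    }
    const keepHead = Math.floor(maxChars * 0.4);
    const keepTail = Math.max(0, maxChars - keepHead - MARKER.length);
    if (keepTail === 0) {
        return text.slice(0, keepHead) + MARKER.slice(0, maxChars - keepHead);
    }
    return text.slice(0, keepHead) + MARKER + text.slice(text.length - keepTail);
}
```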

In `@src/server/index.ts`:
- Around line 410-424: The memoized caches memoizedSmartRead, memoizedSmartGrep,
and memoizedSmartGlob are never cleared after filesystem mutations, causing
stale read results; update the write/edit code paths (e.g., the functions or
handlers that call runSmartWrite and runSmartEdit) to explicitly invalidate
those caches after a successful write/edit by calling the memo clear/invalidate
API on memoizedSmartRead, memoizedSmartGrep, and memoizedSmartGlob (e.g.,
memoizedSmartRead.clear()/invalidate()) so reads reflect the latest filesystem
state.
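A minimal sketch of the memoize-plus-invalidation contract (the `.clear()` surface mirrors what the prompt assumes about src/utils/lru-memoize.ts; the fake filesystem is purely illustrative):

```typescript
function memoize<A, R>(fn: (arg: A) => R): { (arg: A): R; clear(): void } {
    const cache = new Map<A, R>();
    return Object.assign(
        (arg: A): R => {
            if (!cache.has(arg)) cache.set(arg, fn(arg));
            return cache.get(arg)!;
        },
        { clear: (): void => cache.clear() }
    );
}

// Illustrative wiring: reads are memoized; the write path clears the
// read cache after a successful mutation so reads cannot go stale.
const files = new Map<string, string>([['a.txt', 'v1']]);
const memoizedRead = memoize((path: string) => files.get(path) ?? '');

function writeFile(path: string, content: string): void {
    files.set(path, content);
    memoizedRead.clear(); // invalidate after the mutation succeeds
}
```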
- Around line 925-944: The handler for the count_tokens response changed
content[0].text to a JSON string, breaking callers that expect a scalar integer;
restore backward compatibility by returning the numeric count as a plain string
(not a JSON object) or update callers in the same PR. Concretely, in the
count_tokens handling code that uses TokenCounter/tokenCounter and calls count()
or countAsync(), set content[0].text to the numeric count value (e.g.,
String(result.count ?? result)) and keep the existing model info elsewhere or
add a separate field, and ensure counter.free() is still called when modelName
is provided.
- Around line 929-937: The temporary TokenCounter created when modelName is set
must be freed in a finally to ensure resources are released even on errors: wrap
the call to counter.count(text) / await counter.countAsync(text) in a
try/finally, assign the result inside the try (using await for the async branch)
and in finally call counter.free() only when modelName is truthy (so the shared
tokenCounter is not freed). Reference TokenCounter, countAsync, count, free,
modelName, and tokenCounter when making the change.
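The lifecycle rule sketched against an assumed counter interface (the real TokenCounter API may differ):

```typescript
interface CounterLike {
    count(text: string): number;
    free(): void;
}

// A per-request counter (created when modelName is given) is freed in
// finally, so it is released even when count() throws. The shared
// fallback counter is never freed.
function countWithModel(
    text: string,
    modelName: string | undefined,
    shared: CounterLike,
    create: (model: string) => CounterLike
): number {
    const counter = modelName ? create(modelName) : shared;
    try {
        return counter.count(text);
    } finally {
        if (modelName) counter.free();
    }
}
```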

In `@src/utils/gzip.ts`:
- Around line 82-85: loadMaybeGzippedFile currently hard-fails when an existing
.gz file cannot be decompressed; wrap the gunzipBuffer call in a try/catch so a
decompression error falls back to reading the original plaintext path (use
readFileSync on the non-".gz" path and return that content) instead of throwing;
reference gzPath, gunzipBuffer, readFileSync and existsSync in the change and
optionally log the decompression error before returning the plaintext fallback.
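A sketch of that fallback with Node's zlib (`loadMaybeGzipped` is an illustrative stand-in for the real loadMaybeGzippedFile):

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';
import {
    readFileSync, writeFileSync, existsSync, mkdtempSync, rmSync,
} from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

// Prefer the .gz sibling, but if it exists and is corrupt, fall back to
// the plaintext file rather than hard-failing. Only rethrow when no
// plaintext fallback exists either.
function loadMaybeGzipped(path: string): string {
    const gzPath = `${path}.gz`;
    if (existsSync(gzPath)) {
        try {
            return gunzipSync(readFileSync(gzPath)).toString('utf8');
        } catch (err) {
            if (!existsSync(path)) throw err;
            // Optionally log the decompression error here, then fall
            // through to the plaintext read below.
        }
    }
    return readFileSync(path, 'utf8');
}
```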

---

Outside diff comments:
In `@hooks/dispatcher.ps1`:
- Around line 77-90: The smart-read and context-guard branches exit early and
never remove the temporary file referenced by $tempFile; before any early exit
(the "exit 2" in the smart-read branch) and before calling Block-Tool in the
context-guard branch, invoke the same temp-file cleanup used later (Remove-Item
$tempFile -ErrorAction SilentlyContinue or the existing cleanup helper) so the
temp file is deleted on all code paths; update the blocks around the smart-read
call and the context-guard handling to perform this cleanup immediately prior to
exiting or blocking.

In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Around line 1988-1993: The optimize_text call is missing the required key in
$optimizeArgs, causing server validation errors; update the $optimizeArgs hash
to include a key property set to the same key variable used for other MCP/tool
calls (e.g. the script-level key like $script:KEY or whatever key variable you
use elsewhere), so $optimizeArgs contains text, quality and key before
converting to JSON and invoking "$HELPERS_DIR\invoke-mcp.ps1" -Tool
"optimize_text".

---

Duplicate comments:
In `@hooks/handlers/token-optimizer-orchestrator.ps1`:
- Around line 2019-2033: The code unconditionally logs success after calling the
MCP helper; instead capture and parse the MCP invocation result from &
"$HELPERS_DIR\invoke-mcp.ps1" (the call that sends operation="store") into a
variable, convert it from JSON, check the result's success flag and any
error/message fields, and only call Write-Log "Stored new optimization
result..." when success is true; if success is false, call Handle-Error (or
Write-Log at ERROR) with the returned error details and include the
originalTextHash and the MCP response for diagnostics, leaving the existing
base64 encoding of optimizedText and the surrounding try/catch intact.

In `@hooks/helpers/gzip.ps1`:
- Around line 99-107: When a .gz sibling exists but reading it fails, the code
should catch exceptions from [System.IO.File]::ReadAllBytes and Expand-String
and fall back to reading the plaintext file at $Path; update the block that uses
$gzPath and Expand-String so that any exception during gzip reading is handled
(try/catch) and, on error, attempts to return
[System.IO.File]::ReadAllText($Path, [System.Text.Encoding]::UTF8) if Test-Path
$Path is true, otherwise rethrow or return $null — ensure you reference and
protect the $gzPath, $Path, ReadAllBytes, and Expand-String calls.
- Around line 72-85: The temp file path $tmpPath uses a fixed suffix
"$gzPath.tmp" which causes concurrent writers to collide; change the temp path
generation used before [System.IO.File]::WriteAllBytes so it is unique per
process/attempt (for example append a GUID or process id and timestamp, e.g.
"$gzPath.{GUID}.tmp"), keep using [System.IO.File]::WriteAllBytes and the atomic
[System.IO.File]::Move($tmpPath, $gzPath, $true), and ensure the existing catch
cleanup logic still checks and removes the specific unique $tmpPath if present
to avoid orphaned files.

In `@src/core/config.ts`:
- Around line 251-253: getOptimizationConfig currently returns the live
OptimizationConfig (this.config.optimization or DEFAULT_OPTIMIZATION) which
allows callers to mutate shared state (e.g., modelTokenLimits); change
getOptimizationConfig to return a defensive copy instead: locate the
getOptimizationConfig method and return a shallow/deep copy of the
OptimizationConfig object (cloning nested mutable fields such as
modelTokenLimits) so modifications by callers do not affect
this.config.optimization or DEFAULT_OPTIMIZATION.
- Around line 224-248: The merge in the configuration builder always starts from
the hardcoded DEFAULT_OPTIMIZATION which overwrites any previously applied
optimization overrides when update() calls mergeConfig; change the merge base to
use defaults.optimization (the existing resolved defaults) instead of
DEFAULT_OPTIMIZATION and preserve the deep-merge logic for cacheSettings,
chatCompression and modelTokenLimits (keep userOpt spreads and the ...(userOpt.*
?? {}) merging) so updates like update({ cache: {...} }) no longer reset prior
optimization customizations; reference the update()/mergeConfig callsite and the
optimization assembly code that constructs optimization using
DEFAULT_OPTIMIZATION, userOpt, cacheSettings, chatCompression, and
modelTokenLimits.

In `@src/core/session.ts`:
- Around line 165-169: The code re-inserts compressed history as a system-level
message; change summaryMessage.role from 'system' to a non-system role (e.g.,
'assistant') so the summarizer output doesn't become highest-priority prompt
context—update the block that creates summaryMessage (the variable named
summaryMessage created after this.summarizer.summarize(head)) to use 'assistant'
(or 'user') instead of 'system'.
- Around line 88-94: getHistory and getFileState (and the serialization helpers
toSnapshot/fromSnapshot) currently return or accept live Message and
SessionFileState references so callers can mutate internal session state without
touching updatedAt; change these methods to perform deep copies: return a new
array of cloned Message objects for getHistory, return a cloned SessionFileState
object for getFileState, have toSnapshot produce and return a snapshot with
cloned Message(s) and cloned fileState, and have fromSnapshot clone the incoming
snapshot before assigning to this.history/this.fileState. Use the existing
symbols Message, SessionFileState, history, fileState, toSnapshot, fromSnapshot,
getHistory, getFileState and ensure updatedAt remains controlled only by
internal mutators (i.e., external mutations of returned objects must not affect
internal state).

In `@src/core/tokenizers/google-ai-tokenizer.ts`:
- Around line 59-74: The request currently puts the API key in the URL query
string when building the countTokens call in google-ai-tokenizer.ts (the URL
using this.modelName and this.apiKey); remove the ?key=... query parameter and
instead add the API key as the recommended header 'x-goog-api-key' in the fetch
call's headers alongside 'Content-Type', passing this.apiKey directly (no URL
encoding), and keep the AbortController/timeout logic and existing body intact;
update any related URL construction to only include the encoded modelName.
- Around line 50-53: The current cache key assignment uses raw text for short
inputs (KEY_HASH_THRESHOLD_CHARS) which mixes unhashed and hashed keys; change
it so all inputs use a namespaced hash instead: always compute
createHash('sha256').update(text).digest('hex'), then prefix with a stable
namespace string (e.g., 'gai:' or similar) to form the key variable, replacing
the existing conditional; ensure any code that reads/writes the cache uses this
namespaced hashed key format so KEY_HASH_THRESHOLD_CHARS is no longer used to
decide raw storage.

---

Nitpick comments:
In `@hooks/helpers/logging.ps1`:
- Around line 4-37: The function name Write-Log shadows a built-in PowerShell
cmdlet and should be renamed to avoid collisions; update the function
declaration and all internal references to a namespaced name such as
Write-TokenOptimizerLog (or similar) throughout this file, including any calls
that reference Write-Log and uses of $script:LOG_FILE, $Level, $Context, and
other parameters, and ensure exported/consumed entrypoints (scripts/modules)
calling Write-Log are updated to the new name so behavior and logging (file
writing, verbose output, debug gating) remain identical.

In `@src/core/session-manager.ts`:
- Around line 90-95: The Session constructor call applies defaults before
spreading options so explicit undefined in options can override manager
defaults; change the merge order so options are spread first and then defaults
are applied (or explicitly nullish-coalesce each option field) when constructing
the Session (referencing Session, the session variable, this.tokenizer,
this.summarizer, maxTokens and this.defaultMaxTokens) so manager defaults win
unless options provide a concrete value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1d0eb650-7b17-4882-9b3a-901e8475d697

📥 Commits

Reviewing files that changed from the base of the PR and between de0a8f9 and 778d01a.

📒 Files selected for processing (37)
  • hooks/dispatcher.ps1
  • hooks/handlers/token-optimizer-orchestrator.ps1
  • hooks/helpers/config.ps1
  • hooks/helpers/context-delta.ps1
  • hooks/helpers/gzip.ps1
  • hooks/helpers/logging.ps1
  • src/analytics/optimization-storage.ts
  • src/core/compression-engine.ts
  • src/core/config.ts
  • src/core/session-manager.ts
  • src/core/session.ts
  • src/core/summarization.ts
  • src/core/token-counter.ts
  • src/core/tokenizers/google-ai-tokenizer.ts
  • src/core/tokenizers/heuristic-tokenizer.ts
  • src/core/tokenizers/i-tokenizer.ts
  • src/core/tokenizers/tiktoken-tokenizer.ts
  • src/core/tokenizers/tokenizer-factory.ts
  • src/core/types.ts
  • src/server/index.ts
  • src/tools/context-delta-tool.ts
  • src/tools/optimization-storage-tool.ts
  • src/utils/diff.ts
  • src/utils/gzip.ts
  • src/utils/lru-cache.ts
  • src/utils/lru-memoize.ts
  • src/validation/tool-schemas.ts
  • tests/benchmarks/results.json
  • tests/unit/cache-engine.test.ts
  • tests/unit/config.test.ts
  • tests/unit/diff.test.ts
  • tests/unit/gzip.test.ts
  • tests/unit/lru-cache.test.ts
  • tests/unit/lru-memoize.test.ts
  • tests/unit/session.test.ts
  • tests/unit/summarization.test.ts
  • tests/unit/tokenizers.test.ts
✅ Files skipped from review due to trivial changes (8)
  • src/core/tokenizers/i-tokenizer.ts
  • src/utils/diff.ts
  • tests/unit/lru-memoize.test.ts
  • src/core/types.ts
  • tests/unit/session.test.ts
  • tests/unit/config.test.ts
  • src/tools/context-delta-tool.ts
  • tests/unit/lru-cache.test.ts
🚧 Files skipped from review as they are similar to previous changes (11)
  • tests/unit/diff.test.ts
  • tests/unit/gzip.test.ts
  • tests/unit/cache-engine.test.ts
  • tests/unit/tokenizers.test.ts
  • tests/unit/summarization.test.ts
  • src/validation/tool-schemas.ts
  • tests/benchmarks/results.json
  • src/utils/lru-memoize.ts
  • src/core/tokenizers/tiktoken-tokenizer.ts
  • src/analytics/optimization-storage.ts
  • src/tools/optimization-storage-tool.ts

Comment thread hooks/helpers/context-delta.ps1 Outdated
Comment thread hooks/helpers/context-delta.ps1 Outdated
Comment thread hooks/helpers/logging.ps1 Outdated
Comment thread src/core/session-manager.ts Outdated
Comment thread src/core/session-manager.ts
Comment thread src/core/summarization.ts
Comment thread src/server/index.ts
Comment thread src/server/index.ts Outdated
Comment thread src/server/index.ts
Comment thread src/utils/gzip.ts

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/quality-gates.yml:
- Around lines 132-137: CI currently runs only a production-scope audit (npm
audit --omit=dev), which hides dev-dependency vulnerabilities whenever the
SNYK_TOKEN-based full-tree scan is disabled. Keep the existing gating command
(npm audit --omit=dev --json > audit-results.json || true) and add a second,
non-gating full-tree audit (npm audit --json > audit-results-full.json ||
true) so dev-dependency issues remain visible, then include both
audit-results.json and audit-results-full.json in the artifact upload path.
Apply the same change to the duplicate block around lines 154-162.

In `@package.json`:
- Around lines 135-137: the overrides block currently pins "brace-expansion"
and "picomatch" with caret ranges. Either remove the "picomatch" override
entirely, or replace the caret range with a strict exact version (no ^) known
to be compatible with all transitive consumers (e.g., anymatch/micromatch and
jest-util/tinyglobby). If "brace-expansion" is kept, pin it exactly as well
(no caret), then regenerate the lockfile and run CI/build to verify
transitive dependency compatibility.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d3f542a3-0e0e-4de7-8249-ae93f315d80a

📥 Commits

Reviewing files that changed from the base of the PR and between 778d01a and b040513.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • .github/workflows/quality-gates.yml
  • package.json
  • src/analytics/optimization-storage.ts
  • src/core/compression-engine.ts

Comment thread .github/workflows/quality-gates.yml Outdated
Comment thread package.json Outdated
ooples and others added 2 commits April 19, 2026 22:53
PowerShell helpers
- gzip.ps1 Save-GzippedFile: per-write GUID-suffixed temp path so
  concurrent writers can't clobber each other mid-write; move the
  stale-tmp cleanup into `finally` so a failed atomic swap still
  unlinks the tmp file.
- gzip.ps1 Read-MaybeGzippedFile: fall back to the plaintext sibling
  when the .gz is corrupt/partial rather than hard-failing the read.
- logging.ps1 Handle-Error: renamed $stackTrace → $exceptionTrace to
  stop shadowing PowerShell's built-in $StackTrace automatic variable.
- context-delta.ps1 Invoke-ContextDelta: call the repo's existing
  invoke-mcp.ps1 directly (the Invoke-TokenOptimizer function it
  previously probed for never actually existed, so every context_delta
  update silently returned $null). Server now auto-creates the session
  on first contact, so no separate bootstrap call is needed.

TypeScript
- src/utils/gzip.ts loadMaybeGzippedFile: same plaintext-fallback
  behavior on a bad .gz so the backward-compat path actually works.
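  The fallback read can be sketched like this (the real helper lives in
  src/utils/gzip.ts; the exact signature here is an assumption):

  ```typescript
  import { gunzipSync } from "node:zlib";
  import { existsSync, readFileSync } from "node:fs";

  // If <path>.gz exists but is corrupt or partially written, fall back to
  // the plaintext sibling instead of failing the read outright.
  function loadMaybeGzippedFile(basePath: string): string | null {
    const gzPath = `${basePath}.gz`;
    if (existsSync(gzPath)) {
      try {
        return gunzipSync(readFileSync(gzPath)).toString("utf8");
      } catch {
        // Truncated or corrupt gzip stream: fall through to plaintext.
      }
    }
    return existsSync(basePath) ? readFileSync(basePath, "utf8") : null;
  }
  ```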
- src/core/session-manager.addMessage: schedulePersist runs in
  `finally` so a tokenizer/compression throw still persists the
  mutated session; restore path now enforces maxFileStateBytes on
  each per-file entry so a tampered persisted file can't smuggle in
  oversized state past the write-time cap.
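  Simplified sketch of the persist-in-`finally` change (the real code is in
  src/core/session-manager.ts; the shapes below are illustrative): the
  message is appended before token counting runs, so if the tokenizer
  throws, the already-mutated session must still be scheduled for
  persistence.

  ```typescript
  interface SessionLike {
    history: string[];
    persisted: boolean;
  }

  function addMessage(
    session: SessionLike,
    text: string,
    countTokens: (s: string) => number
  ): void {
    session.history.push(text);
    try {
      countTokens(text); // may throw (tokenizer/compression)
    } finally {
      session.persisted = true; // stands in for schedulePersist(session)
    }
  }
  ```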
- src/core/session-manager.getOrCreateSession: new helper.
- src/tools/context-delta-tool: compute-delta / seed use
  getOrCreateSession so unknown sessionIds bootstrap cleanly; input
  schema is now a discriminated oneOf keyed on operation so
  compute-delta/seed require currentContent at validation time
  instead of at dispatch.
- src/validation/tool-schemas: ContextDeltaSchema mirrors the same
  discriminated-union shape on the zod side.
- src/core/config.mergeConfig: start from defaults.optimization
  instead of always DEFAULT_OPTIMIZATION, so update() calls that
  don't touch optimization no longer silently reset it.
- src/core/session: getHistory / getFileState / toSnapshot /
  fromSnapshot return defensive message copies so external mutation
  can't bypass updatedAt or corrupt session internals; summary
  messages are now role:'assistant' instead of 'system' to avoid
  promoting possibly-user-derived content into higher-priority
  instruction context (prompt-injection hardening).
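  The defensive-copy pattern boils down to this (heavily simplified Session;
  names are illustrative): accessors return per-message copies so a caller
  mutating the returned array or its elements cannot corrupt session
  internals behind updatedAt's back.

  ```typescript
  interface Message {
    role: "user" | "assistant";
    content: string;
  }

  class Session {
    private history: Message[] = [];

    addMessage(msg: Message): void {
      // Copy on the way in, too, so later caller mutation can't leak in.
      this.history.push({ ...msg });
    }

    getHistory(): Message[] {
      // New array, new element objects: safe to hand to external code.
      return this.history.map((m) => ({ ...m }));
    }
  }
  ```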
- src/core/tokenizers/google-ai-tokenizer: always hash cache keys
  with a `sha256:` namespace prefix (no more verbatim text keys);
  authenticate with the `x-goog-api-key` header instead of a `?key=`
  query param so the key never ends up in access logs; thrown
  errors no longer embed the response body.
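  The namespaced cache key amounts to something like this (the helper name
  and key layout here are assumptions, not the file's actual code):

  ```typescript
  import { createHash } from "node:crypto";

  // Always hash, never store verbatim text, and prefix with "sha256:" so a
  // hashed key can never collide with any legacy verbatim-text key.
  function cacheKey(model: string, text: string): string {
    const digest = createHash("sha256")
      .update(`${model}\u0000${text}`) // NUL separator avoids boundary collisions
      .digest("hex");
    return `sha256:${digest}`;
  }
  ```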
- src/core/summarization.TruncatingSummarizer: validate maxChars
  (>=32), compute the truncation budget from the actual marker
  length so the final output never exceeds maxChars for small limits;
  Anthropic & Google summarizers stop embedding provider response
  bodies in thrown errors.
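  The marker-aware budget works like this (the marker text and function name
  below are assumptions for illustration): the slice length is maxChars minus
  the marker's actual length, so the result never exceeds maxChars even at
  small limits.

  ```typescript
  const TRUNCATION_MARKER = "...[truncated]";

  function truncateToBudget(text: string, maxChars: number): string {
    if (maxChars < 32) {
      throw new RangeError("maxChars must be >= 32");
    }
    if (text.length <= maxChars) return text;
    // Budget accounts for the marker itself, so length never exceeds maxChars.
    return text.slice(0, maxChars - TRUNCATION_MARKER.length) + TRUNCATION_MARKER;
  }
  ```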
- src/utils/lru-memoize: envelope cached values in { value }
  so a legitimately-cached `undefined` isn't treated as a miss;
  tag bigints with a dedicated discriminator in the default key
  serializer so `[1n]` and `["1"]` don't collapse to the same key.
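  A minimal sketch of both fixes together (simplified relative to
  src/utils/lru-memoize.ts — no LRU eviction or TTL here): cached values are
  enveloped in `{ value }` so a legitimately-cached `undefined` counts as a
  hit, and the key serializer tags bigints so `[1n]` and `["1"]` get
  distinct keys.

  ```typescript
  function serializeArg(arg: unknown): string {
    // Dedicated discriminator: BigInt(1) and "1" must not collide.
    return typeof arg === "bigint" ? `bigint:${arg}` : JSON.stringify(arg);
  }

  function memoize<A extends unknown[], R>(fn: (...args: A) => R) {
    const cache = new Map<string, { value: R }>();
    return (...args: A): R => {
      const key = args.map(serializeArg).join("\u0000");
      const hit = cache.get(key);
      if (hit !== undefined) return hit.value; // hit even when value is undefined
      const value = fn(...args);
      cache.set(key, { value });
      return value;
    };
  }
  ```

  The envelope is the whole trick: testing `cache.get(key) !== undefined`
  checks for the entry object, not the cached value, so `undefined` results
  are no longer recomputed on every call.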
- src/server/index.ts smart_write / smart_edit: invalidate the
  memoized read-only caches (smart_read/grep/glob) after any
  filesystem mutation so stale results aren't returned until TTL
  expiry.
- src/server/index.ts count_tokens: returns a two-element content
  array — `content[0].text` stays the scalar token count (preserves
  the int-parse contract that PS orchestrator uses at L931/1910/2092),
  `content[1].text` carries the structured JSON. counter.free() now
  runs in `finally` so a throwing countAsync doesn't leak the
  per-call tiktoken encoder.
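  The two-element content contract described above can be sketched like so
  (the shape is inferred from this commit message, not copied from the
  file): content[0].text stays a bare integer string so the PowerShell
  orchestrator's int-parse keeps working, while content[1].text carries the
  structured JSON payload.

  ```typescript
  function buildCountTokensResult(tokens: number, model: string) {
    return {
      content: [
        // Scalar first: downstream callers parseInt() this element.
        { type: "text", text: String(tokens) },
        // Structured payload second, for consumers that want details.
        { type: "text", text: JSON.stringify({ tokens, model }) },
      ],
    };
  }
  ```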

Tests
- lru-memoize.test: new cases for undefined-memoization and
  bigint/string key non-collision.
- session.test: assert the summary role is `assistant`, not `system`.

All 61 unit tests in the new suites pass; tsc --noEmit clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the two remaining coderabbit findings:

- Remove the package.json `overrides` for brace-expansion and
  picomatch. The picomatch ^4.0.4 override was risky — it forces a
  major version on every transitive consumer, and can break
  packages that declare older picomatch majors. `npm audit --omit=dev`
  already reports 0 vulnerabilities without the override because the
  remaining vulns live inside @semantic-release/npm's bundled npm
  (dev-only, never shipped), and that's the scope the Security Audit
  step gates on.

- quality-gates.yml: keep the `--omit=dev` gating audit, but also
  run a full-tree `npm audit` and write audit-results-full.json so
  dev-dep findings stay visible even on repos/forks without a
  SNYK_TOKEN. Both artifacts are uploaded. The step still fails
  only on critical prod vulnerabilities.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Performance Benchmark Results


@ooples ooples merged commit b316152 into master Apr 20, 2026
14 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment