Feature Request: Microcompact Context Compression Strategy #2817

@chinesepowered

Description

What would you like to be added?

Problem

Qwen Code's chatCompressionService.ts has a single compression strategy: summarize the conversation via an LLM API call when context usage exceeds 70% of the window, keeping the last 30%. This means every compression event incurs the latency and cost of an additional LLM call, even when much of the bloat comes from large, stale tool results that could be cheaply trimmed without any LLM involvement.
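
For context, the current single-strategy behavior can be sketched roughly as follows. All identifiers here (`COMPRESSION_TRIGGER`, `KEEP_FRACTION`, `splitForSummarization`) are illustrative placeholders, not the actual names in `chatCompressionService.ts`:

```typescript
// Illustrative sketch of the existing behavior: trigger at 70% usage,
// keep the most recent 30% of the window verbatim, summarize the rest via LLM.
const COMPRESSION_TRIGGER = 0.7; // compress when usage exceeds 70% of the window
const KEEP_FRACTION = 0.3;       // keep roughly the last 30% of tokens as-is

interface Message {
  role: string;
  tokens: number;
  text: string;
}

function shouldCompress(usedTokens: number, windowTokens: number): boolean {
  return usedTokens / windowTokens > COMPRESSION_TRIGGER;
}

// Split history into the older portion (sent to the LLM for summarization)
// and the recent portion that stays verbatim.
function splitForSummarization(
  history: Message[],
  windowTokens: number,
): { toSummarize: Message[]; toKeep: Message[] } {
  const keepBudget = windowTokens * KEEP_FRACTION;
  let kept = 0;
  let splitIndex = history.length;
  // Walk backwards, keeping recent messages until the keep budget is spent.
  for (let i = history.length - 1; i >= 0; i--) {
    if (kept + history[i].tokens > keepBudget) break;
    kept += history[i].tokens;
    splitIndex = i;
  }
  return {
    toSummarize: history.slice(0, splitIndex),
    toKeep: history.slice(splitIndex),
  };
}
```

The key point is that every trip through this path pays for one extra LLM call, regardless of what is actually filling the window.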

Proposed Solution

Add a microcompact pre-pass that runs before the LLM summarization step:

  1. Scan chat history for old tool results from tools known to produce large outputs (read_file, run_shell_command, grep_search, glob, etc.)
  2. Skip the N most recent tool results (default: 5) to preserve the active working context
  3. Replace older, large tool results (> 500 chars) with a short cleared message
  4. If microcompact alone brings context usage below the threshold, skip the LLM summarization entirely
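
The steps above could be sketched like this. Everything here is a hypothetical shape for the proposed `packages/core/src/services/microcompact.ts`, not existing Qwen Code code; the constants and `HistoryEntry` type are assumptions for illustration:

```typescript
// Sketch of the proposed microcompact pre-pass: clear old, large tool
// results without any LLM call. Names and types are illustrative.
const LARGE_OUTPUT_TOOLS = new Set([
  'read_file',
  'run_shell_command',
  'grep_search',
  'glob',
]);
const RECENT_RESULTS_TO_SKIP = 5; // N most recent tool results stay intact
const LARGE_RESULT_CHARS = 500;   // only results larger than this are cleared
const CLEARED_MESSAGE = '[tool result cleared by microcompact]';

interface HistoryEntry {
  kind: 'text' | 'toolResult';
  toolName?: string;
  content: string;
}

function microcompact(
  history: HistoryEntry[],
): { history: HistoryEntry[]; clearedCount: number } {
  // Step 1: find tool results from tools known to produce large outputs,
  // in chronological (oldest-first) order.
  const candidateIndices = history
    .map((entry, i) => ({ entry, i }))
    .filter(
      ({ entry }) =>
        entry.kind === 'toolResult' &&
        entry.toolName !== undefined &&
        LARGE_OUTPUT_TOOLS.has(entry.toolName),
    )
    .map(({ i }) => i);

  // Step 2: skip the N most recent results to preserve active context.
  const clearable = candidateIndices.slice(
    0,
    Math.max(0, candidateIndices.length - RECENT_RESULTS_TO_SKIP),
  );

  // Step 3: replace older, large results with a short cleared message.
  let clearedCount = 0;
  const next = history.map((entry, i) => {
    if (clearable.includes(i) && entry.content.length > LARGE_RESULT_CHARS) {
      clearedCount++;
      return { ...entry, content: CLEARED_MESSAGE };
    }
    return entry;
  });
  return { history: next, clearedCount };
}
```

Step 4 (re-checking the threshold and skipping LLM summarization) would live in the caller, so `microcompact` itself stays a pure, cheap transformation over the history array.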

This is a zero-LLM-call compression strategy that reduces API costs and latency. The LLM summarization remains as a Phase 2 fallback when microcompact alone isn't sufficient.
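
The two-phase orchestration might look like the sketch below. The dependency names (`estimateTokens`, `summarizeViaLlm`) and the status strings are placeholders, not the actual `chatCompressionService.ts` API:

```typescript
// Illustrative two-phase flow: try the cheap microcompact pass first,
// and only fall back to LLM summarization if usage is still too high.
type CompressionResult = {
  history: string[];
  status: 'NOOP' | 'MICROCOMPACTED' | 'COMPRESSED';
};

interface CompressionDeps {
  estimateTokens: (history: string[]) => number;
  microcompact: (history: string[]) => string[];
  summarizeViaLlm: (history: string[]) => Promise<string[]>;
}

const THRESHOLD = 0.7;

async function compress(
  history: string[],
  windowTokens: number,
  deps: CompressionDeps,
): Promise<CompressionResult> {
  if (deps.estimateTokens(history) / windowTokens <= THRESHOLD) {
    return { history, status: 'NOOP' };
  }
  // Phase 1: zero-LLM-call trimming of stale tool results.
  const trimmed = deps.microcompact(history);
  if (deps.estimateTokens(trimmed) / windowTokens <= THRESHOLD) {
    return { history: trimmed, status: 'MICROCOMPACTED' };
  }
  // Phase 2: fall back to LLM summarization on the already-trimmed history.
  return { history: await deps.summarizeViaLlm(trimmed), status: 'COMPRESSED' };
}
```

Injecting the dependencies keeps the phase logic trivially unit-testable, and running Phase 2 on the already-trimmed history means the summarization call itself is cheaper when it does fire.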

Scope

  • New module: packages/core/src/services/microcompact.ts
  • Changes to packages/core/src/services/chatCompressionService.ts (add Phase 1 microcompact before Phase 2 LLM)
  • New MICROCOMPACTED status in CompressionStatus enum (turn.ts)
  • Handle MICROCOMPACTED status in client.ts and CompressionMessage.tsx
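
The enum change in `turn.ts` would be a small addition along these lines; the existing member names and string values shown are placeholders, since only `MICROCOMPACTED` is new:

```typescript
// Hypothetical shape of the CompressionStatus change; existing members
// are illustrative, MICROCOMPACTED is the proposed addition.
export enum CompressionStatus {
  NOOP = 'noop',
  COMPRESSED = 'compressed',
  // New: Phase 1 microcompact alone brought usage below the threshold,
  // so no LLM summarization was performed.
  MICROCOMPACTED = 'microcompacted',
}
```

Consumers such as `client.ts` and `CompressionMessage.tsx` would then branch on this value to render a distinct "cleared N stale tool results" message instead of the usual summarization notice.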

Impact

High (reduces API costs for compression). Medium effort.

Why is this needed?

As described above, every compression event currently pays for a full LLM summarization call, even when the bloat is mostly large, stale tool results that can be trimmed for free. A microcompact pre-pass eliminates that cost and latency for the common case.

Additional context

No response
