feat(core): add microcompaction for idle context cleanup #3006
tanzhenxin wants to merge 6 commits into main
Clear old tool result content from chat history when the user returns after an idle period (default 60 min). Replaces functionResponse output with a sentinel string for compactable tools (read_file, shell, grep, glob, web_fetch, web_search, edit, write_file), keeping the N most recent results intact (default 5). Runs before full compression so it can shed tokens cheaply without an API call.
- Time-based trigger reuses lastApiCompletionTimestamp from thinking cleanup
- Per-part counting so keepRecent applies to individual tool results even when batched in parallel
- Preserves tool error responses (only clears successful outputs)
- Configurable via settings.json (context.microcompaction) with env var overrides for E2E testing
- Enabled by default
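The clearing pass described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the `Message` and `Part` shapes, `isCompactable`, and `clearOldToolResults` are hypothetical names, and the tool list uses the names given in the PR description rather than the shorter names in this commit message.

```typescript
// Hypothetical, simplified history shapes; the real types in the PR may differ.
interface FunctionResponsePart {
  functionResponse: {
    name: string;
    response: { output?: string; error?: string };
  };
}
interface TextPart {
  text: string;
}
type Part = FunctionResponsePart | TextPart;
interface Message {
  role: 'user' | 'model';
  parts: Part[];
}

// Sentinel text as quoted in the PR's reviewer test plan.
const CLEARED_SENTINEL = '[Old tool result content cleared]';

// Tool names as listed in the PR description.
const COMPACTABLE_TOOLS = new Set([
  'read_file', 'run_shell_command', 'grep_search', 'glob',
  'web_fetch', 'web_search', 'edit', 'write_file',
]);

// A part is compactable only if it is a successful result of a listed tool.
// Assumption: an error response is marked by a defined `error` field.
function isCompactable(part: Part): part is FunctionResponsePart {
  return (
    'functionResponse' in part &&
    COMPACTABLE_TOOLS.has(part.functionResponse.name) &&
    part.functionResponse.response.error === undefined &&
    part.functionResponse.response.output !== undefined
  );
}

// Returns a new history in which all but the `keepRecent` newest compactable
// tool results have their output replaced by the sentinel.
function clearOldToolResults(history: Message[], keepRecent: number): Message[] {
  let remainingToKeep = keepRecent;
  const result: Message[] = [];
  // Walk newest to oldest and count per part, so results batched in one
  // message (parallel tool calls) are counted individually.
  for (let i = history.length - 1; i >= 0; i--) {
    const message = history[i];
    const parts: Part[] = [];
    for (let j = message.parts.length - 1; j >= 0; j--) {
      const part = message.parts[j];
      if (!isCompactable(part)) {
        parts.unshift(part); // text, errors, and other tools stay untouched
      } else if (remainingToKeep > 0) {
        remainingToKeep--;
        parts.unshift(part);
      } else {
        // Keep the name so functionCall/functionResponse pairing stays intact.
        parts.unshift({
          functionResponse: {
            name: part.functionResponse.name,
            response: { output: CLEARED_SENTINEL },
          },
        });
      }
    }
    result.unshift({ ...message, parts });
  }
  return result;
}
```

The sketch builds a new array instead of mutating the input, which is a design choice of this illustration only; the PR may clear content in place.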
📋 Review Summary
This PR introduces a lightweight microcompaction pre-pass that clears old tool result content from chat history after an idle period (default 60 minutes). The implementation is well-structured, thoroughly tested with 18 unit tests, and integrates cleanly with the existing compression system. The code quality is high overall, with good separation of concerns and thoughtful edge case handling.
Consolidate thinking block cleanup and tool results microcompaction
config into a single `context.clearContextOnIdle` settings group:
{
  "context": {
    "clearContextOnIdle": {
      "thinkingThresholdMinutes": 5,
      "toolResultsThresholdMinutes": 60,
      "toolResultsNumToKeep": 5
    }
  }
}
- Use -1 on either threshold to disable that cleanup (no enabled bool)
- Remove separate `microcompaction` and `gapThresholdMinutes` settings
- Thinking cleanup: 5 min default (unchanged)
- Tool results cleanup: 60 min default
- Preserve tool error responses (only clear successful outputs)
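As a rough illustration of how the defaults and the `-1` convention above might be resolved in code, here is a small sketch. The field names and defaults come from the commit message; `resolveClearContextOnIdle` and `thresholdToMs` are hypothetical helpers, and treating every negative value (not only `-1`) as "disabled" is an assumption.

```typescript
// Field names mirror the settings group shown in the commit message.
interface ClearContextOnIdleSettings {
  thinkingThresholdMinutes?: number;
  toolResultsThresholdMinutes?: number;
  toolResultsNumToKeep?: number;
}

// Defaults as stated in the commit message (5 min, 60 min, keep 5).
const DEFAULTS: Required<ClearContextOnIdleSettings> = {
  thinkingThresholdMinutes: 5,
  toolResultsThresholdMinutes: 60,
  toolResultsNumToKeep: 5,
};

// Hypothetical helper: fill in any missing field with its default.
function resolveClearContextOnIdle(
  settings: ClearContextOnIdleSettings = {},
): Required<ClearContextOnIdleSettings> {
  return {
    thinkingThresholdMinutes:
      settings.thinkingThresholdMinutes ?? DEFAULTS.thinkingThresholdMinutes,
    toolResultsThresholdMinutes:
      settings.toolResultsThresholdMinutes ?? DEFAULTS.toolResultsThresholdMinutes,
    toolResultsNumToKeep:
      settings.toolResultsNumToKeep ?? DEFAULTS.toolResultsNumToKeep,
  };
}

// Hypothetical helper: convert a threshold to milliseconds, or return
// undefined when the cleanup is disabled. Assumption: any negative value
// disables the cleanup, with -1 being the documented one.
function thresholdToMs(minutes: number): number | undefined {
  return minutes < 0 ? undefined : minutes * 60_000;
}
```

For example, `resolveClearContextOnIdle({ toolResultsThresholdMinutes: -1 })` keeps the thinking defaults while `thresholdToMs` reports the tool results cleanup as disabled.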
…tion
- Add gapThresholdMinutes settings for thinking blocks, tool results, and retention count
- Remove deprecated gapThresholdMinutes from root settings level
This reorganizes the context clearing settings into a dedicated clearContextOnIdle object with configurable thresholds for thinking blocks and tool results.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Code Coverage Summary
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
Known limitations for followup:
Move microcompactHistory() inside the UserQuery/Cron guard so model latency during tool-call loops doesn't count as user idle time.
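The effect of this fix can be sketched as a guard on the kind of message being sent. The names below (`MessageKind`, `beforeModelRequest`, the `'ToolResult'` kind) are hypothetical; the commit only names the UserQuery and Cron cases.

```typescript
// Hypothetical message kinds; only UserQuery and Cron are named in the commit.
type MessageKind = 'UserQuery' | 'Cron' | 'ToolResult';

// Idle cleanup should only be considered for turns that start from the user
// (or a cron trigger), not for follow-up requests inside a tool-call loop.
function isIdleCleanupTurn(kind: MessageKind): boolean {
  return kind === 'UserQuery' || kind === 'Cron';
}

// Hypothetical pre-request hook: `runIdleCleanup` stands in for the call to
// microcompactHistory() mentioned in the commit message.
function beforeModelRequest(kind: MessageKind, runIdleCleanup: () => void): void {
  if (isIdleCleanupTurn(kind)) {
    runIdleCleanup();
  }
}
```

With the call placed outside such a guard, a slow model response during a tool-call loop could look like a long gap since the last API completion, which is the situation the commit message describes.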
E2E Test Report: Microcompaction
Date: 2026-04-08 | Branch:
4/4 passed. Bug found and fixed (c61b0bf)
Fix: Moved the call inside the |
Replace stale `context.gapThresholdMinutes` entry with the new `context.clearContextOnIdle.*` settings group introduced in the microcompaction feature.
TLDR
Add a lightweight microcompaction pre-pass that clears old tool result content from chat history when the user returns after an idle period (default 60 min). Replaces `functionResponse.output` with a sentinel string for compactable tools (`read_file`, `run_shell_command`, `grep_search`, `glob`, `web_fetch`, `web_search`, `edit`, `write_file`), keeping the N most recent results intact (default 5). Runs before full LLM compression so it can shed tokens cheaply without an API call.
Also unifies the thinking block idle cleanup (from #2897) and tool results microcompaction into a single `context.clearContextOnIdle` settings group:
{
  "context": {
    "clearContextOnIdle": {
      "thinkingThresholdMinutes": 5,
      "toolResultsThresholdMinutes": 60,
      "toolResultsNumToKeep": 5
    }
  }
}
Use `-1` on either threshold to disable that cleanup.
Screenshots / Video Demo
N/A — no user-facing change. The feature operates transparently before each model request. Debug logging prints when it fires:
Dive Deeper
Design: Time-based trigger aligned with server prompt cache TTL. Only fires when cache is already cold, so clearing content is free in terms of cache cost. When cache is warm, tool results are kept intact.
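A time-based trigger of this kind might look like the sketch below. `lastApiCompletionTimestamp` is the name used in the PR; the function name, the millisecond epoch units, the `>=` comparison, and the handling of a missing timestamp are all assumptions of this illustration.

```typescript
// Hypothetical trigger check. Assumptions: timestamps are epoch milliseconds,
// a negative threshold disables the cleanup, and a session with no completed
// API call yet (null) never triggers.
function shouldClearToolResults(
  lastApiCompletionTimestamp: number | null,
  nowMs: number,
  thresholdMinutes: number,
): boolean {
  if (thresholdMinutes < 0) return false; // disabled via -1
  if (lastApiCompletionTimestamp === null) return false; // nothing to measure
  const idleMs = nowMs - lastApiCompletionTimestamp;
  // Only fire once the idle gap reaches the threshold, i.e. when the
  // server-side prompt cache is assumed to be cold already.
  return idleMs >= thresholdMinutes * 60_000;
}
```

Because the check only passes after a long gap, clearing content does not invalidate a warm cache; while the gap is shorter than the threshold, the history is left as it is.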
Key decisions:
- `toolResultsNumToKeep` applies to individual tool results even when batched in parallel
- `functionResponse.name` and message structure (functionCall → functionResponse pairing stays intact)
- `context.clearContextOnIdle`: thinking and tool results cleanup share one config group, each with its own threshold
- `-1` disables a threshold (no separate `enabled` boolean)
- `QWEN_MC_KEEP_RECENT` env var override for E2E testing
Files:
- `packages/core/src/services/microcompaction/microcompact.ts`: core logic (trigger evaluation, clearing)
- `packages/core/src/services/microcompaction/microcompact.test.ts`: 18 unit tests
- `packages/core/src/core/client.ts`: integration; calls `microcompactHistory()` before `tryCompressChat()`; refactored thinking cleanup to use shared config
- `packages/core/src/config/config.ts`: `ClearContextOnIdleSettings` interface + Config getter
- `packages/cli/src/config/settingsSchema.ts`: settings schema under `context.clearContextOnIdle`
- `packages/cli/src/config/config.ts`: maps settings to Config
Reviewer Test Plan
- `cd packages/core && npx vitest run src/services/microcompaction/microcompact.test.ts` (18 tests)
- `cd packages/core && npx vitest run src/core/client.test.ts` (63 tests)
- `npm run build && npm run bundle`
- `cd /tmp/mc-test && node dist/cli.js --approval-mode yolo --openai-logging --openai-logging-dir /tmp/mc-logs`
- `[Old tool result content cleared]`
Testing Matrix
Linked issues / bugs
Resolves #2817