feat(core): dynamic tool output truncation based on context pressure#2572
Open
netbrah wants to merge 1 commit into QwenLM:main from
Conversation
Force-pushed from 3c42d68 to 5296c1a
Increase default truncation thresholds from 25K to 80K chars and 1000 to 2000 lines for better data availability in early sessions. Thresholds now scale dynamically as the context window fills: char limit caps at min(4 * remainingTokens, base), line limit scales as max(500, baseLines * (1 - usageRatio)).

Fixes QwenLM#2566
Made-with: Cursor
Force-pushed from 5296c1a to 32a0dd5
What
Make tool output truncation thresholds context-aware. Bump defaults from 25K → 80K chars / 1000 → 2000 lines, with dynamic scaling as context fills up.
Why
The old 25K/1000 defaults were too aggressive early in a session — cutting off useful tool output that the model needs to work with. But keeping a flat 80K budget would be reckless late in a session when context pressure is high.
The fix: scale dynamically based on how full the context window is. Early session → full budget. Late session → shrink proportionally. The model always gets the maximum data it can safely handle at each point in the conversation.
Scaling formulas
Char limit: min(4 × remainingTokens, baseThreshold)
Line limit: max(500, floor(baseLines × (1 − usageRatio)))

Both read from uiTelemetryService.getLastPromptTokenCount() (already populated by the existing telemetry pipeline) and the model's context window size.
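As a rough sketch of the two formulas above (identifiers and the 4-chars-per-token assumption are illustrative, not the PR's actual code):

```typescript
// Illustrative constants matching the PR description.
const BASE_CHAR_THRESHOLD = 80_000; // new default (was 25_000)
const BASE_LINE_THRESHOLD = 2_000;  // new default (was 1_000)
const MIN_LINE_THRESHOLD = 500;
const CHARS_PER_TOKEN = 4;          // assumed rough chars-per-token ratio

// Compute context-aware truncation limits from the last prompt's token
// count (e.g. uiTelemetryService.getLastPromptTokenCount()) and the
// model's context window size.
function dynamicThresholds(
  lastPromptTokenCount: number,
  contextWindowTokens: number,
): { charLimit: number; lineLimit: number } {
  const remainingTokens = Math.max(0, contextWindowTokens - lastPromptTokenCount);
  const usageRatio = Math.min(1, lastPromptTokenCount / contextWindowTokens);

  // Char limit caps at min(4 * remainingTokens, base).
  const charLimit = Math.min(CHARS_PER_TOKEN * remainingTokens, BASE_CHAR_THRESHOLD);

  // Line limit scales as max(500, floor(baseLines * (1 - usageRatio))).
  const lineLimit = Math.max(
    MIN_LINE_THRESHOLD,
    Math.floor(BASE_LINE_THRESHOLD * (1 - usageRatio)),
  );

  return { charLimit, lineLimit };
}
```

Early in a session (low usageRatio) both limits sit at or near the full budget; late in a session the char limit shrinks with remaining tokens and the line limit floors at 500.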
Changes

config.ts (core)
config.test.ts
settings.schema.json
chatCompressionService.ts

Test Plan
Existing config.test.ts updated — 79/79 passing. Dynamic scaling verified via getter behavior, with uiTelemetryService already wired.

Demo

N/A — internal optimization. Observable via reduced late-session "context too long" errors and richer tool output in early sessions.
Fixes #2566