
feat(providers): configurable context windows + oldest-first compaction #737

Merged
penso merged 6 commits into moltis-org:main from Cstewart-HC:feat/combined-context-window-overrides
Apr 16, 2026


@Cstewart-HC
Contributor

Summary

Combined replacement for #723, #726, #727 — merges all three PRs into a single clean branch to eliminate cross-PR merge-order conflicts.

Commit 1 — fix(config): correct model_overrides doc references

Fixes stale comments referencing [providers.<name>.models.<id>] — the actual TOML key is model_overrides. Also updates WhatsApp channel template defaults.
Replaces: #723

Commit 2 — fix(agents): oldest-first tool result compaction with configurable ratios

Three changes to per-iteration tool-result compaction:

  • A) Reverse compaction direction: compact oldest results first instead of newest. The model needs its most recent tool outputs for coherent operation in long agent loops.
  • B) Make compaction/overflow ratios configurable via ToolsConfig:
    • tool_result_compaction_ratio (default 75, set to 0 to disable)
    • preemptive_overflow_ratio (default 90)
  • C) Add compaction_min_iterations (default 3): skip compaction for the first N iterations.

All defaults match the previous hardcoded values, so the change is backward compatible.
Replaces: #726
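
The oldest-first direction in (A) can be sketched as follows. This is a minimal illustration, not the code from the PR: it assumes tool results are stored in chronological order and that compacting a result means replacing its body with a short placeholder. The `ToolResult` type, the byte-length size measure, and the placeholder text are all hypothetical.

```rust
/// Hypothetical stand-in for a tool result kept in the conversation history.
#[derive(Debug, Clone, PartialEq)]
struct ToolResult {
    content: String,
    compacted: bool,
}

/// Placeholder that replaces a compacted result body (an assumption).
const PLACEHOLDER: &str = "[tool result compacted]";

/// Total size of all results, using byte length as a crude token proxy.
fn total_size(results: &[ToolResult]) -> usize {
    results.iter().map(|r| r.content.len()).sum()
}

/// Compact results starting at the front (oldest) of the slice until the
/// total size fits within `budget`. Returns how many results were compacted.
fn compact_oldest_first(results: &mut [ToolResult], budget: usize) -> usize {
    let mut compacted = 0;
    let mut size = total_size(results);
    for r in results.iter_mut() {
        if size <= budget {
            break; // newer results stay untouched once the budget is met
        }
        if r.compacted || r.content.len() <= PLACEHOLDER.len() {
            continue; // nothing to gain from compacting this entry
        }
        size -= r.content.len() - PLACEHOLDER.len();
        r.content = PLACEHOLDER.to_string();
        r.compacted = true;
        compacted += 1;
    }
    compacted
}
```

Because the loop stops as soon as the budget is met, the most recent results are the last to be touched, which is the property the commit is after.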

Commit 3 — feat(providers): wire config overrides into provider context_window()

Wire per-model context window overrides from MoltisConfig.models into AnthropicProvider and OpenAiProvider so context_window() returns config-aware values at runtime. Provider-scope overrides win over global overrides, which win over the hardcoded heuristic.
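
The precedence rule (provider-scoped, then global, then heuristic) could look roughly like this. It is a sketch under assumptions: `resolve_context_window` and `heuristic_context_window` are hypothetical names, the fallback value is a placeholder, and the real `context_window_for_model_with_config()` may differ in signature and may normalize model IDs before lookup.

```rust
use std::collections::HashMap;

/// Hypothetical fallback; the real heuristic is not shown in this PR text.
fn heuristic_context_window(_model_id: &str) -> u32 {
    128_000
}

/// Resolve a context window with provider > global > heuristic precedence.
fn resolve_context_window(
    model_id: &str,
    provider_overrides: &HashMap<String, u32>,
    global_overrides: &HashMap<String, u32>,
) -> u32 {
    provider_overrides
        .get(model_id)
        .or_else(|| global_overrides.get(model_id))
        .copied()
        .unwrap_or_else(|| heuristic_context_window(model_id))
}
```

With both maps empty, every lookup falls through to the heuristic, which matches the behavior before this change.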

Validation

Config Shape

# Global context window override
[models.claude-opus-4-6]
context_window = 1_000_000

# Provider-scoped (takes precedence)
[providers.zai-code.model_overrides.glm-5-turbo]
context_window = 200_000

# Compaction tuning
[tools]
tool_result_compaction_ratio = 75
preemptive_overflow_ratio = 90
compaction_min_iterations = 3
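
On the Rust side, the `[tools]` keys above might map onto a struct like the one below. This is only a sketch: the field names and default values come from this PR description, while the `u32` field types and the plain `Default` impl (without any deserialization attributes) are assumptions.

```rust
/// Sketch of the compaction-related part of `ToolsConfig` (field types assumed).
#[derive(Debug, Clone, PartialEq)]
struct ToolsConfig {
    /// Percentage of the context window at which compaction starts; 0 disables it.
    tool_result_compaction_ratio: u32,
    /// Percentage of the context window at which the loop fails preemptively.
    preemptive_overflow_ratio: u32,
    /// Number of initial iterations during which compaction is skipped.
    compaction_min_iterations: u32,
}

impl Default for ToolsConfig {
    /// Defaults as stated in the PR description.
    fn default() -> Self {
        Self {
            tool_result_compaction_ratio: 75,
            preemptive_overflow_ratio: 90,
            compaction_min_iterations: 3,
        }
    }
}
```

A deployment that omits the `[tools]` keys would then behave as if it had written the three values shown in the config block.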

Closes #723, closes #726, closes #727

Fix stale comments referencing [providers.<name>.models.<id>] — the
actual TOML key is model_overrides. Also update WhatsApp channel
template defaults.

Refs: moltis-org#723
…tios

Three changes to per-iteration tool-result compaction:

A) Reverse compaction direction: compact oldest results first instead of
   newest. The model needs its most recent tool outputs to maintain
   coherent operation in long agent loops.

B) Make compaction/overflow ratios configurable via ToolsConfig:
   - tool_result_compaction_ratio (default 75, set to 0 to disable)
   - preemptive_overflow_ratio (default 90)

C) Add compaction_min_iterations (default 3): skip compaction for the
   first N iterations to prevent premature context destruction in
   short loops.

All defaults match the previous hardcoded values for backward
compatibility. No moltis.toml changes required for existing deploys.

Refs: moltis-org#726
Wire per-model context window overrides from MoltisConfig.models into
provider trait implementations so context_window() returns config-aware
values at runtime.

Changes:
- Add context_window_global / context_window_provider override map fields
  to OpenAiProvider and AnthropicProvider, initialized from config
- Switch context_window() to call context_window_for_model_with_config()
  instead of the hardcoded heuristic
- Thread global overrides through ProviderRegistry and all registration
  paths (catalog, config-defined, OpenAI-compat)
- Extract overrides from MoltisConfig.models at gateway startup in both
  initial and post-discovery registry rebuild paths
- Add extract_cw_overrides() helper to convert ModelOverride maps to
  plain HashMap<String, u32>

5 new unit tests: provider-wins, global-wins, empty-maps-fallthrough,
extract-filters-none, extract-empty.

Refs: moltis-org#727
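
For readers unfamiliar with the helper, the conversion described above might look like this. The function name `extract_cw_overrides` is from the commit message; the `ModelOverride` struct shown here is a reduced, hypothetical version with only the one field that matters, and the signature is an assumption.

```rust
use std::collections::HashMap;

/// Reduced, hypothetical version of a per-model override entry.
#[derive(Debug, Clone, Default)]
struct ModelOverride {
    context_window: Option<u32>,
}

/// Convert a map of overrides into a plain model-id -> context-window map,
/// dropping entries that do not set `context_window`.
fn extract_cw_overrides(models: &HashMap<String, ModelOverride>) -> HashMap<String, u32> {
    models
        .iter()
        .filter_map(|(id, ov)| ov.context_window.map(|cw| (id.clone(), cw)))
        .collect()
}
```

The two `extract-*` tests listed above correspond to the cases where some entries have `context_window = None` and where the input map is empty.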
@greptile-apps
Contributor

greptile-apps bot commented Apr 16, 2026

Greptile Summary

This PR consolidates three improvements: it fixes stale doc-comment references (models.<id> → model_overrides.<id>), reverses tool-result compaction to oldest-first while making the compaction/overflow ratios and a minimum-iterations guard configurable, and wires per-model context-window overrides from moltis.toml into both AnthropicProvider and OpenAiProvider with full precedence (provider > global > heuristic). Both previously flagged regressions (global CW overrides silently dropped on runtime registry rebuilds, and preemptive_overflow_ratio = 0 causing immediate loop failure) are correctly addressed.

Confidence Score: 5/5

Safe to merge; previous P1 regressions are fully addressed and remaining findings are P2 style/validation suggestions.

All registry rebuild paths (service.rs, oauth.rs, post_state.rs, swift bridge) now correctly thread global_cw_overrides. The overflow_ratio=0 warning is added. Compaction direction, configurable ratios, and min-iterations guard are logically correct with good test coverage. Only two P2 findings remain: an indentation inconsistency that will fail just format-check, and a missing upper-bound warning for compaction_min_iterations.

crates/providers/src/registry/core.rs (indentation); crates/config/src/validate/semantic.rs (missing compaction_min_iterations upper-bound warning)

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/agents/src/runner/helpers.rs | Renames compaction to oldest-first, removes hardcoded ratio constants, adds configurable compaction_ratio/overflow_ratio params to enforce_tool_result_context_budget; logic is correct. |
| crates/config/src/validate/semantic.rs | Adds warnings for overflow_ratio=0, ratios >100, and overflow<=compaction; missing upper-bound guard for compaction_min_iterations that could silently disable compaction indefinitely. |
| crates/providers/src/model_capabilities.rs | Adds extract_cw_overrides helper and context_window_for_model_with_config override precedence; normalizes via capability_model_id; 5 new tests cover all override priority combinations. |
| crates/providers/src/registry/core.rs | Adds global_cw_overrides to registry; all from_* constructor paths updated; has a method-chain indentation inconsistency in replace_openai_compat_catalog that will fail just format-check. |
| crates/providers/src/registry/registration.rs | Wires extract_cw_overrides into all provider registration paths (Anthropic, OpenAI, COMPAT, custom); both replace_ and register_ paths consistently updated. |
| crates/provider-setup/src/service/implementation/service.rs | Adds global_cw_overrides field and with_global_cw_overrides builder; all registry-rebuild sites in service.rs and oauth.rs correctly thread the overrides through. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[context_window called] --> B{provider_overrides contains model_id?}
    B -->|Yes| C[Return provider-scoped value - highest precedence]
    B -->|No| D{global_overrides contains model_id?}
    D -->|Yes| E[Return global value from models.* config]
    D -->|No| F[Return heuristic context_window_for_model]

    G[Agent loop iteration N] --> H[iterations += 1]
    H --> I{iterations > compaction_min_iterations?}
    I -->|No| J[effective_ratio = 0 - overflow-check only]
    I -->|Yes| K{user compaction_ratio == 0?}
    K -->|Yes| J
    K -->|No| L[effective_ratio = compaction_ratio]
    J --> M{tokens > overflow_budget?}
    L --> N{tokens > compaction_budget?}
    N -->|Yes| O[compact_tool_results_oldest_first]
    O --> M
    N -->|No| P[OK]
    M -->|Yes| Q[ContextWindowExceeded error]
    M -->|No| P
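
The lower half of the flowchart can be read as the following decision logic. This is a sketch that mirrors the diagram rather than the actual `enforce_tool_result_context_budget` implementation: the function names, the `u64` token counts, and the `compact` callback (which receives the compaction budget and returns the token count after compaction) are assumptions made for illustration.

```rust
/// Error returned when the overflow budget is exceeded (fields are illustrative).
#[derive(Debug, PartialEq)]
struct ContextWindowExceeded {
    tokens: u64,
    budget: u64,
}

/// Budget for a ratio expressed as a percentage of the context window.
fn budget(context_window: u64, ratio_percent: u64) -> u64 {
    context_window * ratio_percent / 100
}

/// Compaction is off (ratio 0) until more than `min_iterations` iterations have run.
fn effective_compaction_ratio(iterations: u32, min_iterations: u32, compaction_ratio: u64) -> u64 {
    if iterations > min_iterations { compaction_ratio } else { 0 }
}

/// Follow the flowchart: optionally compact, then apply the overflow check.
fn enforce_budget(
    tokens: u64,
    context_window: u64,
    effective_ratio: u64,
    overflow_ratio: u64,
    compact: impl FnOnce(u64) -> u64,
) -> Result<u64, ContextWindowExceeded> {
    let mut tokens = tokens;
    if effective_ratio > 0 {
        let compaction_budget = budget(context_window, effective_ratio);
        if tokens <= compaction_budget {
            return Ok(tokens); // under the compaction budget: nothing to do
        }
        tokens = compact(compaction_budget);
    }
    let overflow_budget = budget(context_window, overflow_ratio);
    if tokens > overflow_budget {
        Err(ContextWindowExceeded { tokens, budget: overflow_budget })
    } else {
        Ok(tokens)
    }
}
```

Note that with `overflow_ratio = 0` the overflow budget is 0, so any non-empty context fails on the first check; this is the failure mode the later validation warning targets.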

Reviews (4): Last reviewed commit: "fix(config): warn on preemptive_overflow..."

Comment thread crates/providers/src/registry/core.rs Outdated
Comment thread crates/config/src/validate/semantic.rs
…ry rebuilds

P1 fix for PR moltis-org#737 review: from_env_with_config_and_overrides was
hardcoding HashMap::new() for global_cw_overrides, causing
[models.*].context_window overrides to be silently dropped on any
runtime registry rebuild (UI key save, OAuth completion).

- Add global_cw_overrides param to from_env_with_config and
  from_env_with_config_and_overrides in ProviderRegistry
- Add global_cw_overrides field + builder to LiveProviderSetupService
- Thread overrides through all 4 rebuild sites in service.rs/oauth.rs
- Pass extract_cw_overrides at gateway startup and swift-bridge init
- Update 38 test call sites with HashMap::new()
@Cstewart-HC
Contributor Author

@greptileai review

… remove unnecessary Mutex

P2 fixes from PR moltis-org#737 Greptile re-review:

- Warn when tool_result_compaction_ratio or preemptive_overflow_ratio
  exceeds 100 (silently defeats protection)
- Replace Arc<Mutex<HashMap>> with plain HashMap for global_cw_overrides
  (write-once at construction, only cloned after)
A zero overflow_ratio makes the budget always 0, causing every
agent loop iteration to fail with ContextWindowExceeded immediately.
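
The warnings described in these two commits, plus the overflow<=compaction check mentioned in the review summary, could be expressed along these lines. This is a hedged sketch, not the contents of `semantic.rs`: the `RatioWarning` enum and `check_ratios` function are hypothetical, and skipping the ordering check when either ratio is 0 is an assumption.

```rust
/// Hypothetical warning kinds for the two ratio settings.
#[derive(Debug, PartialEq)]
enum RatioWarning {
    /// Overflow budget is always 0, so every iteration fails immediately.
    OverflowRatioZero,
    /// A ratio above 100 silently defeats the protection; holds the offending key.
    RatioAboveHundred(&'static str),
    /// Overflow would trigger at or before the compaction threshold.
    OverflowNotAboveCompaction,
}

/// Collect warnings for a pair of percentage ratios.
fn check_ratios(compaction_ratio: u32, overflow_ratio: u32) -> Vec<RatioWarning> {
    let mut warnings = Vec::new();
    if overflow_ratio == 0 {
        warnings.push(RatioWarning::OverflowRatioZero);
    }
    if compaction_ratio > 100 {
        warnings.push(RatioWarning::RatioAboveHundred("tool_result_compaction_ratio"));
    }
    if overflow_ratio > 100 {
        warnings.push(RatioWarning::RatioAboveHundred("preemptive_overflow_ratio"));
    }
    // Assumption: a ratio of 0 is handled above (or means "disabled"),
    // so the ordering check only applies when both ratios are non-zero.
    if compaction_ratio > 0 && overflow_ratio > 0 && overflow_ratio <= compaction_ratio {
        warnings.push(RatioWarning::OverflowNotAboveCompaction);
    }
    warnings
}
```

The defaults (75 and 90) produce no warnings, and so does `tool_result_compaction_ratio = 0`, which is the documented way to disable compaction.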
@Cstewart-HC
Contributor Author

@greptileai review

@penso
Collaborator

penso commented Apr 16, 2026

@greptile review

@penso penso merged commit 24604ea into moltis-org:main Apr 16, 2026
1 check passed