
feat: add configurable fast_model for declarative providers #8194

Merged
DOsinga merged 3 commits into aaif-goose:main from dondetir:feat/configurable-fast-model
Apr 2, 2026

Conversation

@dondetir
Contributor

@dondetir commented Mar 29, 2026

Summary

Add a fast_model field to DeclarativeProviderConfig so custom/declarative providers can configure a fast model, instead of getting none and always falling back to the primary model.

Problem

When using Goose with a non-OpenAI server (LM Studio, Ollama, Groq, etc.) via custom/declarative providers, the fast model is not configurable. The built-in OpenAI provider hardcodes gpt-4o-mini as its fast model, but custom providers get None — meaning use_fast_model() always falls back to the primary model, which is wasteful for lightweight tasks.

Solution

Per @DOsinga's direction in #7934:

  • Add fast_model: Option<String> with #[serde(default)] to DeclarativeProviderConfig
  • Wire it into from_custom_config for all three provider engines (OpenAI, Anthropic, Ollama) via the existing model.with_fast() API
  • Preserve fast_model through create/update flows
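The wiring above can be sketched in miniature. The real DeclarativeProviderConfig and model types live in the goose crates; everything below is a simplified stand-in (the real with_fast also takes the provider name and returns a Result), kept only to show the shape of the change.

```rust
// Simplified stand-ins for the goose types; not the real definitions.
#[derive(Default)]
struct DeclarativeProviderConfig {
    name: String,
    model: String,
    fast_model: Option<String>, // new optional field; None when absent from JSON
}

struct ModelConfig {
    model_name: String,
    fast_model: Option<String>,
}

impl ModelConfig {
    fn new(model_name: &str) -> Self {
        Self { model_name: model_name.to_string(), fast_model: None }
    }

    // Stand-in for the existing `with_fast` API: record the fast model.
    fn with_fast(mut self, fast: &str) -> Self {
        self.fast_model = Some(fast.to_string());
        self
    }
}

// Mirrors the pattern in from_custom_config: only call with_fast when
// the declarative config actually supplies a fast model.
fn build_model(config: &DeclarativeProviderConfig) -> ModelConfig {
    let model = ModelConfig::new(&config.model);
    match &config.fast_model {
        Some(fast) => model.with_fast(fast),
        None => model,
    }
}
```

With this shape, a config that omits fast_model leaves the model untouched, which is the backward-compat guarantee the PR claims.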

What this enables

Declarative provider JSON files can now include:

{
  "name": "groq",
  "fast_model": "llama-3.1-8b-instant",
  ...
}

What is NOT changed

  • Built-in provider from_env paths are untouched
  • No env vars or ConfigKeys added to built-in provider metadata
  • Existing JSON files without fast_model continue to deserialize cleanly

Tests

  • All 447 existing tests pass (8 declarative + 439 provider)
  • Backward compat confirmed via test_existing_json_files_still_deserialize_without_new_fields
  • cargo fmt and cargo clippy -D warnings clean

Closes #7934

Add fast_model field to DeclarativeProviderConfig so custom/declarative
providers can configure their fast model. When set, the provider calls
model.with_fast() during from_custom_config initialization.

Wire the field through all three provider engines (openai, anthropic,
ollama) with the same pattern. Built-in providers keep their hardcoded
fast models unchanged.

The field is Optional with serde(default), so existing provider JSON
files without fast_model continue to deserialize cleanly.

Regenerate OpenAPI schema and frontend TypeScript types.

Closes aaif-goose#7934

Signed-off-by: rdondeti <ravitez.dondeti@gmail.com>
@dondetir force-pushed the feat/configurable-fast-model branch from fd961fb to 9a8ec49 on March 29, 2026 22:32

@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a8ec49e6e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +224 to +225
let model = if let Some(ref fast_model_name) = config.fast_model {
    model.with_fast(fast_model_name, &config.name)?
} else {
    model
};


P2: Ignore empty fast_model values

This branch treats any non-null fast_model as configured, including "" or whitespace-only strings from declarative JSON. In that case with_fast installs an empty model name, so complete_fast will first send a request with an invalid model and then fall back every time, adding avoidable latency and error noise on each fast-path call. Please normalize empty/blank fast_model to unset (or reject it during config load/update) before calling with_fast.


Douwe Osinga added 2 commits March 30, 2026 09:40
…lization

Addresses codex review comment: empty or whitespace-only fast_model strings
are now normalized to None at the serde boundary, preventing invalid model
names from being passed to with_fast().

Signed-off-by: Douwe Osinga <douwe@squareup.com>

@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb27f67d7d


fn deserialize_non_empty_string<'de, D>(deserializer: D) -> Result<Option<String>, D::Error>
where
    D: Deserializer<'de>,
{
    let opt: Option<String> = Option::deserialize(deserializer)?;
    Ok(opt.filter(|s| !s.trim().is_empty()))
}


P2: Normalize fast_model by trimming surrounding whitespace

The new deserialize_non_empty_string helper only checks s.trim().is_empty() but keeps the original string unchanged, so values like "gpt-4o-mini " deserialize as Some("gpt-4o-mini ") and are passed to with_fast as an invalid model name. In that case each fast-path request will fail and fall back to the primary model, adding avoidable latency and errors for a very common config-editing mistake (trailing spaces). Trim the value before storing it (or reject non-trimmed values) during deserialization.

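The trimming the reviewer asks for amounts to a small normalization step. A minimal sketch, assuming a hypothetical helper name (normalize_fast_model); in the PR itself the fix would live inside the serde deserializer:

```rust
// Hypothetical helper illustrating the suggested fix: trim surrounding
// whitespace and treat empty or blank values as unset, so neither ""
// nor "gpt-4o-mini " ever reaches with_fast().
fn normalize_fast_model(raw: Option<&str>) -> Option<String> {
    raw.map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_owned)
}
```

Storing the trimmed string (rather than merely checking trim().is_empty() and keeping the original) is what closes the trailing-space gap flagged above.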

Collaborator

@DOsinga left a comment


Looks good — small, well-scoped change that follows existing patterns. The deserialize_non_empty_string handles the empty-string edge case that Codex flagged. Thanks for the contribution!

@DOsinga added this pull request to the merge queue Apr 2, 2026
Merged via the queue into aaif-goose:main with commit 8d4835f Apr 2, 2026
22 checks passed
lifeizhou-ap added a commit that referenced this pull request Apr 7, 2026
* origin/main: (32 commits)
  docs: rework homepage and add aaif migration blog post (#8356)
  chore(aaif): rename a bunch of repository references (#8152)
  fix: use OPENAI_API_KEY secret for recipe security scanner (#8358)
  feat: configurable extension timeouts via ACP _meta and global default (#8295)
  fix: hide hidden extensions in UI (#8346)
  refactor: skills as its own platform ext (#8244)
  fix baseUrl (#8347)
  Fix desktop slash commands (#8341)
  fix(cli): display platform-correct secrets path in keyring config dialog (#8328)
  feat(acp): add reusable ACP provider controls (#8314)
  fix: resolve MDX compilation error in using-goosehints.md (#8332)
  fix: use v1beta1 API version for Google/MaaS models on GCP Vertex AI (#8278)
  docs: add MCP Roots guide (#8252)
  rust acp client for extension methods (#8227)
  fix: reconsolidate split tool-call messages to follow OpenAI format (#7921)
  fix: clean up MCP subprocesses after abrupt parent exit (#8242)
  build: raise default stack reserve to 8 MB (#8234)
  fix(config): honour GOOSE_DISABLE_KEYRING from config.yaml at startup (#8219)
  feat: add configurable fast_model for declarative providers (#8194)
  fix(authentication): Allow connecting to Oauth servers that use protected-resource fallback instead of the WWW-authenticate header (#8148)
  ...


Development

Successfully merging this pull request may close these issues.

Configurable OPENAI_FAST_MODEL

2 participants