feat(engine): LLM council via per-call model override in CodeAct#2320
serrrfirat merged 5 commits into staging from …
Conversation
Extend `llm_query()` and `llm_query_batched()` with a `model=` (and `models=` for the batched variant) keyword so CodeAct can route individual sub-queries to specific LLMs. The "LLM council" pattern becomes a skill — the agent broadcasts the same prompt across a parallel array of models and synthesizes the responses — with no new tool, dispatch path, or capability boundary.

- Add `model: Option<String>` to `LlmCallConfig`; thread it through `LlmBridgeAdapter` onto `CompletionRequest.model` / `ToolCompletionRequest.model` so providers that honor per-request overrides (NEAR AI, Anthropic OAuth, GitHub Copilot, Bedrock) pick it up. Other providers fall back to their configured model.
- `handle_llm_query` extracts a `model` arg; `handle_llm_query_batched` accepts either `model="..."` (broadcast) or `models=[...]` (parallel array, length-validated against `prompts`).
- `__llm_complete__` host fn extracts `model` from `explicit_config` so the Python orchestrator can also forward it.
- Update CodeAct preamble docs so the agent sees the new parameters.
- Add `skills/llm-council/SKILL.md` with the council pattern, recommended NEAR AI model line-ups, and a synthesis example.

Tests: 5 new scripting tests (model kwarg forwarding, default `None`, `models=` broadcast, single-`model=` broadcast, length-mismatch error) and 2 new bridge tests (`config.model` → `CompletionRequest.model` on both the no-tools and with-tools paths).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
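The council pattern this PR enables can be sketched end to end. Everything below is a hypothetical stand-in written for illustration: the real `llm_query_batched` is a CodeAct host function backed by LLM providers, and the stub here only mimics its documented signature and per-slot override behavior. The model IDs are the ones named in this PR, not a guaranteed lineup.

```python
from concurrent.futures import ThreadPoolExecutor

def llm_query_batched(prompts, context=None, model=None, models=None):
    """Stand-in for the CodeAct host function: one query per prompt,
    run in parallel, with an optional model override per slot."""
    if models is not None and len(models) != len(prompts):
        raise TypeError("llm_query_batched(): `models` must match `prompts` length")
    overrides = models if models is not None else [model] * len(prompts)

    def one(prompt, override):
        # The real host fn dispatches to the configured LLM provider;
        # we fake a response so the sketch is runnable.
        return f"[{override or 'default'}] answer to: {prompt}"

    with ThreadPoolExecutor() as pool:
        return list(pool.map(one, prompts, overrides))

# Council: broadcast one question across several models, then synthesize
# the answers with a follow-up single query.
question = "What are the trade-offs of per-call model overrides?"
council = ["anthropic/claude-opus-4-6", "google/gemini-3-pro", "openai/gpt-5.4"]
answers = llm_query_batched([question] * len(council), models=council)
synthesis_prompt = "Synthesize these answers:\n" + "\n---\n".join(answers)
```

In the real skill, `synthesis_prompt` would be passed to a final `llm_query()` call; the key point is that the whole pattern is expressible in agent-side code with no new tool.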
**Warning:** You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!
Address three review comments on the LLM council PR:

1. Test coverage for the orchestrator entry point. Add two tests driving `handle_llm_complete` directly with `explicit_config` containing `model` (and a control case without it). Closes the "test through the caller, not just the helper" gap — the previous tests only exercised `handle_llm_query`, leaving the parallel `__llm_complete__` host fn path unverified.
2. Loud failure on non-string entries in `models=[...]`. Previously `models=[1, 2]` was silently coerced via `monty_to_string` to `["1", "2"]`. Now returns `TypeError` with the offending value, matching the existing length-mismatch handling style.
3. `None` slots in `models=[...]` are no longer backfilled by the singular `model=` kwarg. Each slot is authoritative: a `None` means "no override for this prompt" (use the configured default). Mixing the two would have been surprising — the docs now spell out the contract explicitly. Add a regression test that passes both `models=[None, "gpt-4o"]` and `model="claude-..."` and asserts the `None` slot stays `None`.

Also update `skills/llm-council/SKILL.md` to default to a 4-model council of `anthropic/claude-opus-4-6`, `google/gemini-3-pro`, `zai-org/GLM-latest`, `openai/gpt-5.4`. Per-call provider errors already flow through the existing `Ok(Err(e))` arm as `"Error: ..."` strings — the batch never fails as a whole, so unavailable models just surface in their own slot.

Tests: 4 new (2 orchestrator, 2 scripting), all 346 engine unit tests pass, zero clippy warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
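The slot-resolution contract from points 2 and 3 can be written out as a short sketch. `resolve_overrides` is a hypothetical Python helper that mirrors the documented behavior (singular `model=` broadcasts only when `models=` is absent; a `None` slot is authoritative), not the Rust implementation:

```python
def resolve_overrides(prompts, model=None, models=None):
    """Mirror of the documented contract: each `models` slot is
    authoritative; a None slot means "use the configured default"
    and is NOT backfilled by the singular `model=` kwarg."""
    if models is None:
        return [model] * len(prompts)        # broadcast the singular kwarg
    if len(models) != len(prompts):
        raise TypeError("`models` length must match `prompts`")
    for m in models:
        if m is not None and not isinstance(m, str):
            raise TypeError(f"`models` must be a list of str or None, got {m!r}")
    return list(models)                      # None slots stay None

assert resolve_overrides(["a", "b"], model="claude-x") == ["claude-x", "claude-x"]
# A None slot is not backfilled by model=:
assert resolve_overrides(
    ["a", "b"], model="claude-x", models=[None, "gpt-4o"]
) == [None, "gpt-4o"]
```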
…odeact # Conflicts: # crates/ironclaw_engine/src/executor/orchestrator.rs
Pull request overview
Adds per-call model overrides to CodeAct’s existing llm_query() / llm_query_batched() host functions so a single CodeAct step can query a “council” of models, and includes documentation + a skill to teach the pattern.
Changes:

- Extended engine LLM call configuration with an optional per-request `model` override and forwarded it through the bridge adapter.
- Added `model=`/`models=` parsing + validation to CodeAct host functions, with unit tests covering the new behavior.
- Documented the new API in the CodeAct preamble and introduced an `llm-council` skill describing recommended usage.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `crates/ironclaw_engine/src/traits/llm.rs` | Adds `model: Option<String>` to `LlmCallConfig` for per-call model overrides. |
| `crates/ironclaw_engine/src/executor/scripting.rs` | Parses `model`/`models` kwargs for `llm_query*` and adds tests for plumbing and validation. |
| `crates/ironclaw_engine/src/executor/orchestrator.rs` | Forwards `explicit_config["model"]` into `LlmCallConfig` for `__llm_complete__`. |
| `crates/ironclaw_engine/prompts/codeact_preamble.md` | Documents new `model`/`models` parameters in the agent-visible preamble. |
| `src/bridge/llm_adapter.rs` | Forwards `config.model` into provider request model fields and adds coverage. |
| `skills/llm-council/SKILL.md` | New skill describing "LLM council" usage pattern and suggested lineups. |
```rust
) -> ExtFunctionResult {
    let prompt = extract_string_arg(args, kwargs, "prompt", 0);
    let context_arg = extract_string_arg(args, kwargs, "context", 1);
    let model_arg = extract_string_arg(args, kwargs, "model", 2);
```
`model` is extracted via `extract_string_arg`, which coerces `None` to the literal string `"None"` and also stringifies non-string values. That means `llm_query(..., model=None)` will incorrectly set a model override of `"None"` instead of leaving it unset. Consider parsing `model` as an optional string with explicit handling for `MontyObject::None` (=> `None`) and raising a `TypeError` for other non-string types.
```diff
-let model_arg = extract_string_arg(args, kwargs, "model", 2);
+let model_value = kwargs
+    .iter()
+    .find_map(|(key, value)| match key {
+        MontyObject::String(name) if name == "model" => Some(value),
+        _ => None,
+    })
+    .or_else(|| args.get(2));
+let model_arg = match model_value {
+    Some(MontyObject::None) => None,
+    Some(MontyObject::String(model)) => Some(model.clone()),
+    Some(_) => {
+        return ExtFunctionResult::Error(MontyException::new(
+            ExcType::TypeError,
+            Some("llm_query() 'model' argument must be a string or None".into()),
+        ));
+    }
+    None => None,
+};
```
Fixed in f47004c. Confirmed the bug: `extract_string_arg` delegates to `monty_to_string`, which converts `MontyObject::None` to the literal string `"None"`, so `llm_query(..., model=None)` would route to a bogus model ID. Replaced with a new `extract_optional_string_kwarg` helper that returns `Ok(None)` for missing/`None`, `Ok(Some(s))` for strings, and `TypeError` for anything else. Regression tests added: `llm_query_model_none_kwarg_is_no_override_not_literal_none_string` and `llm_query_rejects_non_string_model_kwarg`.
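The contract described in this fix can be mirrored in a few lines. `extract_optional_string_kwarg` here is a Python sketch of the behavior the comment describes (missing or `None` means no override, strings pass through, everything else is a loud `TypeError`), not the actual Rust helper:

```python
_MISSING = object()  # sentinel to distinguish "absent" from an explicit None

def extract_optional_string_kwarg(kwargs, name):
    """Missing or None -> None, str -> the string, anything else ->
    TypeError. No silent stringification of non-string values."""
    value = kwargs.get(name, _MISSING)
    if value is _MISSING or value is None:
        return None
    if isinstance(value, str):
        return value
    raise TypeError(
        f"'{name}' argument must be a string or None, got {type(value).__name__}"
    )

assert extract_optional_string_kwarg({}, "model") is None
assert extract_optional_string_kwarg({"model": None}, "model") is None  # not "None"!
assert extract_optional_string_kwarg({"model": "gpt-4o"}, "model") == "gpt-4o"
```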
```rust
// a `None` slot means "no override for this prompt" (the caller
// opted out of routing for that slot); the singular `model=` kwarg
// does NOT fill those slots, since mixing the two would be surprising.
let single_model = extract_string_arg(&[], kwargs, "model", usize::MAX);
```
Same issue for the batched API: `model` is extracted via `extract_string_arg`, so `model=None` becomes the string `"None"` and non-string values get stringified. This should behave like an optional string override: treat `None` as no override, and reject other non-string types with a `TypeError` to avoid silently routing to unintended model IDs.
```diff
-let single_model = extract_string_arg(&[], kwargs, "model", usize::MAX);
+let single_model = match kwargs.iter().find_map(|(k, v)| match k {
+    MontyObject::String(key) if key == "model" => Some(v),
+    _ => None,
+}) {
+    None | Some(MontyObject::None) => None,
+    Some(MontyObject::String(s)) => Some(s.clone()),
+    Some(other) => {
+        return ExtFunctionResult::Error(MontyException::new(
+            ExcType::TypeError,
+            Some(format!(
+                "llm_query_batched(): `model` must be a string or None, got {other:?}"
+            )),
+        ));
+    }
+};
```
Fixed in f47004c. Same root cause as comment #3067596620 — the singular `model=` kwarg on `llm_query_batched` used the same `extract_string_arg` helper with the same `None` → `"None"` coercion bug. Now uses the shared `extract_optional_string_kwarg`. Regression tests added: `llm_query_batched_single_model_none_kwarg_is_no_override` and `llm_query_batched_rejects_non_string_single_model`.
```rust
return ExtFunctionResult::Error(MontyException::new(
    ExcType::TypeError,
    Some(format!(
        "llm_query_batched(): `models` must be a list of strings, got {other:?}"
```
Error message says "models must be a list of strings", but the implementation explicitly allows None entries (and the message in the other branch even says "str or None"). Consider adjusting this message to reflect the actual accepted types (e.g., list of str or None).
```diff
-"llm_query_batched(): `models` must be a list of strings, got {other:?}"
+"llm_query_batched(): `models` must be a list of str or None, got {other:?}"
```
Fixed in f47004c. Message now reads `` `models` must be a list of str or None `` to match the actual accepted types.
```markdown
- `llm_query_batched()` runs all calls in parallel — total latency is roughly the slowest model.
- If `models=` is provided, it must be the same length as `prompts`.
- Use `model=` (singular) when you want one model applied to every prompt; use `models=` (plural list) for the council pattern.
- Errors from individual models surface as `"Error: ..."` strings in the result list — the batch never raises.
```
The note says the batched call "never raises", but llm_query_batched() does raise exceptions for invalid arguments (e.g., models length mismatch, non-list models, non-list prompts). Consider clarifying that per-model failures are returned as "Error: ..." strings, but argument validation errors still raise.
```diff
-- Errors from individual models surface as `"Error: ..."` strings in the result list — the batch never raises.
+- Errors from individual models surface as `"Error: ..."` strings in the result list, but invalid arguments (for example, wrong types or a `models`/`prompts` length mismatch) still raise exceptions.
```
Fixed in f47004c. Clarified the note: per-model runtime failures (unavailable model, provider timeout, etc.) surface as "Error: ..." strings so a single bad model doesn't fail the whole batch, but argument validation errors (wrong types, length mismatch) still raise.
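The two error channels clarified here can be sketched as follows. `run_batch` is a hypothetical illustration of the semantics (argument validation raises before any call; per-model runtime failures become `"Error: ..."` strings in their own slot), not the engine's implementation:

```python
def run_batch(prompts, models, call):
    """Argument validation raises up front; per-model runtime failures
    are caught and returned as "Error: ..." strings in their own slot,
    so one bad model never fails the whole batch."""
    if len(models) != len(prompts):
        raise TypeError("`models` length must match `prompts`")  # validation raises
    results = []
    for prompt, model in zip(prompts, models):
        try:
            results.append(call(prompt, model))
        except Exception as e:               # runtime failure -> string slot
            results.append(f"Error: {e}")
    return results

def flaky(prompt, model):
    # Stand-in provider call that fails for one model.
    if model == "bad/model":
        raise RuntimeError("model unavailable")
    return f"ok from {model}"

out = run_batch(["q", "q"], ["good/model", "bad/model"], flaky)
assert out == ["ok from good/model", "Error: model unavailable"]
```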
Address Copilot review on PR #2320. Four comments, all valid:

1 & 2. `model` and `single_model` were extracted via `extract_string_arg`, which calls `monty_to_string` — that coerces `MontyObject::None` to the literal string "None" and stringifies non-string values (ints become "1", etc.). So `llm_query(prompt="hi", model=None)` would silently route every call to a bogus model ID called "None". Add a strict `extract_optional_string_kwarg` helper that returns `Ok(None)` for missing/`None`, `Ok(Some(s))` for strings, and a `TypeError` for anything else. Use it in both `handle_llm_query` and `handle_llm_query_batched` for the `model=` kwarg. Regression tests cover: `model=None` → no override, `model=<int>` → TypeError, and the same two cases on the batched path.

3. The `models=` list-type error message said "list of strings" but we accept `None` entries. Updated to "list of str or None".

4. SKILL.md claimed the batched call "never raises". It does — for argument validation errors (wrong types, length mismatch). Clarified that per-model failures return as `"Error: ..."` strings, but argument validation still raises.

Tests: 4 new regression tests, all 4687 main-crate and 358 engine tests pass, zero clippy warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```markdown
using the built-in `llm_query()` and `llm_query_batched()` functions. Both
accept a `model=` (or `models=`) keyword that overrides the configured model
for that call. Providers that support per-request model overrides (NEAR AI,
Anthropic OAuth, GitHub Copilot, Bedrock) honor it; others fall back to their
```
Medium Severity
The skill says Bedrock honors per-request overrides and provides a cross-provider default council, but `src/llm/bedrock.rs` ignores `request.model` and explicitly reports the active model only. Also, non-aggregating providers like Anthropic OAuth/Copilot cannot switch to Google/OpenAI by provider-prefixed model name.
Please scope the default lineup to NEAR AI/aggregator backends, provide provider-specific lineups, and either implement Bedrock request-level model selection or remove Bedrock from the "honors it" list.
Fixed in 458f614. Verified both sub-points:

- Bedrock does not honor per-request overrides. `src/llm/bedrock.rs::complete()` unconditionally calls `self.current_model_id()` and never reads `request.model`. Dropped Bedrock from the "honors it" list.
- Cross-vendor routing requires an aggregator. Anthropic OAuth and GitHub Copilot do honor `set_model` (confirmed via `take_model_override` calls in both files), but they can only switch between models exposed by their own vendor — a direct Anthropic provider cannot route `openai/gpt-5.4` to OpenAI's API. Scoped the default prefixed 4-model lineup explicitly to NEAR AI.

SKILL.md now has a provider capability table with separate columns for "honors `model=`" and "cross-vendor routing", plus per-backend default lineups: NEAR AI (cross-vendor aggregator), Anthropic OAuth (Claude tiers only), Copilot (Copilot-exposed models). For backends that don't honor `model=` at all (Bedrock, raw OpenAI/Ollama/Tinfoil), the skill instructs the agent to tell the user and fall back to a single-model answer.

Implementing Bedrock request-level model selection would require changing `bedrock.rs` to build the Converse API call with `model_id(override.unwrap_or_else(...))`; I left that as out-of-scope for this PR since it's a separate provider change.
```rust
// does NOT fill those slots, since mixing the two would be surprising.
// See note in handle_llm_query: `model` must be parsed explicitly so that
// `model=None` doesn't become the literal string "None".
let single_model = match extract_optional_string_kwarg(&[], kwargs, "model", usize::MAX) {
```
Medium Severity
`llm_query_batched(prompts, context=None, model=None, models=None)` only reads `context`, `model`, and `models` from kwargs. Positional calls like `llm_query_batched(prompts, None, "gpt-4o")` silently ignore the model and route to the default model.

Please either extract `context` from position 1, `model` from position 2, and `models` from position 3, or document these as keyword-only and reject extra positional args. Add tests for positional `model`/`models` so the call signature in the preamble stays honest.
Fixed in 458f614. Confirmed the bug: the `extract_*_kwarg` calls all passed `&[]` for args, so `llm_query_batched(prompts, None, "gpt-4o")` silently dropped the model. Now threads the real `args` slice into each extractor with the documented positional indices (`context=1`, `model=2`, `models=3`). Positional `None` at any of those slots correctly means "no override". Three regression tests added: `llm_query_batched_honors_positional_context_and_model`, `llm_query_batched_honors_positional_models_list`, and `llm_query_batched_positional_none_for_models_is_no_override`. The preamble signature now matches actual behavior.
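The keyword-then-positional lookup rule this fix describes can be sketched in Python. `get_arg` is a hypothetical analogue of the Rust extractors, showing how the documented indices make positional calls behave like their keyword equivalents:

```python
def get_arg(args, kwargs, name, index):
    """Look up an argument by keyword first, then by its documented
    positional index; absent or an explicit None means "no override"."""
    if name in kwargs:
        return kwargs[name]
    return args[index] if index < len(args) else None

# Signature under test: llm_query_batched(prompts, context=None, model=None, models=None)
args = (["p1"], None, "gpt-4o")   # positional context=None, model="gpt-4o"
assert get_arg(args, {}, "context", 1) is None
assert get_arg(args, {}, "model", 2) == "gpt-4o"
assert get_arg(args, {}, "models", 3) is None   # not supplied -> default
```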
… docs

Address two review comments from serrrfirat on PR #2320.

1. `llm_query_batched` silently dropped positional `context`/`model`/`models` args. The documented signature is `llm_query_batched(prompts, context=None, model=None, models=None)`, but the extractors were hardcoded to kwargs only (`&[]` for args). A call like `llm_query_batched(prompts, None, "gpt-4o")` routed to the default model — silent contract violation. Thread the real `args` slice into each extractor with the documented positional indices: context=1, model=2, models=3. Positional `MontyObject::None` at any of those slots now correctly means "no override". Added 3 regression tests:
   - `llm_query_batched_honors_positional_context_and_model`
   - `llm_query_batched_honors_positional_models_list`
   - `llm_query_batched_positional_none_for_models_is_no_override`
2. SKILL.md claimed Bedrock honors per-request model overrides, but `bedrock.rs::complete()` unconditionally uses `self.current_model_id()` and ignores `request.model`. Also, the default 4-model prefixed lineup (`anthropic/...`, `google/...`, `openai/...`) only works on aggregator backends like NEAR AI — a direct Anthropic OAuth or Copilot provider honors `set_model` but can only switch between models within its own vendor. Rewrite the SKILL.md preamble with a provider capability table (dropping Bedrock from "honors it" and adding cross-vendor routing as a separate column), and add per-backend default lineups: NEAR AI (prefixed cross-vendor), Anthropic OAuth (Anthropic tiers only), Copilot (Copilot-exposed models). For backends that don't honor `model=` at all (Bedrock, raw OpenAI/Ollama/Tinfoil), the skill now instructs the agent to tell the user and fall back to a single-model answer.

Tests: 3 new regression tests, all 4687 main-crate and 361 engine tests pass, zero clippy warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
```markdown
| Backend | Honors `model=`? | Cross-vendor routing? |
|--------------------|------------------|-----------------------|
| NEAR AI | Yes | Yes (aggregator — hosts models from many vendors) |
| Anthropic OAuth | Yes | No (Anthropic models only) |
| GitHub Copilot | Yes | No (Copilot-exposed models only) |
| Bedrock | **No** | — (model fixed at construction) |
| OpenAI / Ollama / Tinfoil via rig | No (silent fallback with warning log) | — |
```
The Bedrock row here says per-request `model=` is not honored (and implies the model is "fixed at construction"), but the PR description claims Bedrock supports per-request model overrides. The current Bedrock provider explicitly ignores per-request overrides (see `src/llm/bedrock.rs` `effective_model_name`: it always returns the active model). Please reconcile the PR description and reword this row to avoid implying Bedrock can't switch models at runtime (it supports `set_model()`, just not per-request overrides).
```rust
/// Optional per-call model override. When set, the bridge adapter forwards
/// this to the underlying `LlmProvider` via `CompletionRequest::model`.
/// Providers that don't support per-request overrides will fall back to
/// their configured model and log a warning.
```
Doc comment for LlmCallConfig.model says providers that don’t support per-request overrides “will … log a warning”, but at least the Bedrock and rig adapters currently ignore overrides without emitting a warning. Please either soften this to “may log a warning” or add a warning at the bridge/provider layer when an override is provided but effective_model_name() resolves to the active/default model.
```diff
-/// their configured model and log a warning.
+/// their configured model and may log a warning.
```
```markdown
| Anthropic OAuth | Yes | No (Anthropic models only) |
| GitHub Copilot | Yes | No (Copilot-exposed models only) |
| Bedrock | **No** | — (model fixed at construction) |
| OpenAI / Ollama / Tinfoil via rig | No (silent fallback with warning log) | — |
```
This table says the rig-backed providers (OpenAI / Ollama / Tinfoil) “silently fallback with warning log” when model= is provided, but the current rig adapter path injects a model field without logging, and effective_model_name() ignores the override. Please either update this wording (no warning today) or add an explicit warning when a model override is provided but the adapter/provider won’t honor it.
```diff
-| OpenAI / Ollama / Tinfoil via rig | No (silent fallback with warning log) | — |
+| OpenAI / Ollama / Tinfoil via rig | No (override currently not honored; no warning log) | — |
```
**serrrfirat** left a comment
Thorough review complete. Clean design, excellent test coverage (18 new tests), well-motivated helper (extract_optional_string_kwarg correctly avoids the None → "None" coercion bug). No blocking issues found.
Minor suggestions for follow-up (non-blocking):

- PR body provider table lists Bedrock as `Yes` for model override, but SKILL.md correctly says `No` — consider fixing the PR body to avoid confusion
- `context` arg in `handle_llm_query_batched` switched from lenient coercion to strict str/None — creates a subtle inconsistency with `handle_llm_query`, where context still uses `extract_string_arg`. Consider aligning both to the strict extractor
- No test for `model=""` (empty string) — downstream `normalized_model_override()` handles it, but a unit test would document the expectation
LGTM 🚢
Summary
- Adds a `model=` (and `models=` for the batched variant) keyword to the existing `llm_query()` / `llm_query_batched()` host functions. No new tool, no new dispatch path.
- Threads the override through `LlmCallConfig` → `LlmBridgeAdapter` → `CompletionRequest.model`, which providers that support per-request switching (NEAR AI, Anthropic OAuth, GitHub Copilot, Bedrock) already honor.
- Adds an `llm-council` skill that activates on council/cross-reference language and teaches the pattern (broadcast same prompt across N models, then synthesize via a follow-up `llm_query`).

Why a skill, not a tool?
Originally drafted as a `LlmCouncilTool` built-in. Reverted in favor of extending CodeAct because:

- … `llm_query`.
- … `SKILL.md`. No Rust changes needed to add new council variants.
- … `llm_query(..., model="...")`.

API
Files changed
- `crates/ironclaw_engine/src/traits/llm.rs`: `model: Option<String>` to `LlmCallConfig`
- `crates/ironclaw_engine/src/executor/scripting.rs`: `handle_llm_query` and `handle_llm_query_batched` extract `model=`/`models=`; length-mismatch validation; 5 new unit tests
- `crates/ironclaw_engine/src/executor/orchestrator.rs`: `__llm_complete__` host fn extracts `model` from explicit_config
- `crates/ironclaw_engine/prompts/codeact_preamble.md`
- `src/bridge/llm_adapter.rs`: `config.model` onto `CompletionRequest.model` / `ToolCompletionRequest.model` for both paths; 2 new tests
- `skills/llm-council/SKILL.md`

Provider behavior
- … (`take_model_override()`)
- … (`set_model()` based)

Test plan
- `cargo test -p ironclaw_engine --lib`: 342 tests pass (337 + 5 new in `executor::scripting`)
- `cargo test --lib`: 4644 tests pass (incl. 2 new in `bridge::llm_adapter`)
- `cargo clippy --all --benches --tests --examples --all-features`: zero warnings
- … `model=` is passed

🤖 Generated with Claude Code