diff --git a/docs/design/multi-model-review.md b/docs/design/multi-model-review.md
new file mode 100644
index 0000000000..57d9d04830
--- /dev/null
+++ b/docs/design/multi-model-review.md
@@ -0,0 +1,622 @@
+# Multi-Model Code Review Design
+
+## Background
+
+Qwen Code's current `/review` feature is driven by a single model, which spawns 4 parallel sub-tasks (correctness, quality, performance, audit) to review code from different angles. Multi-dimensional review is effective, but all perspectives come from the same model, which has the following limitations:
+
+- A single model's blind spots cause certain issues to be consistently missed
+- Different models have different strengths (e.g. some are better at security review, others at performance analysis)
+- Users cannot leverage their already-configured model providers to get a more comprehensive review
+
+## Goal
+
+Allow users to configure multiple models to review the same code in parallel, then aggregate each model's independent review results into a single consolidated report.
+
+## Non-Goals
+
+- Do not change the behavior of the existing single-model `/review` (it remains the default mode)
+- Do not require all models to come from different providers (multiple models from one provider are fine)
+- Do not cover persistence of review results or historical comparison
+
+---
+
+## 1. User Configuration
+
+### 1.1 Design Principle: Progressive Complexity
+
+The configuration follows the principle of "maximum capability from minimum configuration", split into four levels that users adopt as needed:
+
+```
+Level 0: Zero config     /review --multi → lists available models from modelProviders and prompts the user to configure
+Level 1: Pick models     "review": { "models": ["gpt-4o", "deepseek-chat"] }
+Level 2: Set arbitrator  "review": { ..., "arbitratorModel": "claude-opus-4-6-20250725" }
+Level 3: Inline custom   "review": { "models": ["gpt-4o", { "id": "my-model", ... }] }
+```
+
+### 1.2 Configuration Format
+
+**Core simplification**: `review.models` and `review.arbitratorModel` accept plain model ID strings that are resolved automatically from `modelProviders`. Users no longer need to write `{ id, authType }` objects.
+
+```jsonc
+// ~/.qwen/settings.json
+{
+  // The user's existing model configuration (already set up; nothing new needed for review)
+ "modelProviders": {
+ "openai": [
+ {
+ "id": "gpt-4o",
+ "baseUrl": "https://api.openai.com/v1",
+ "envKey": "OPENAI_API_KEY",
+ },
+ ],
+ "anthropic": [
+ {
+ "id": "claude-sonnet-4-6-20250514",
+ "baseUrl": "https://api.anthropic.com",
+ "envKey": "ANTHROPIC_API_KEY",
+ },
+ {
+ "id": "claude-opus-4-6-20250725",
+ "baseUrl": "https://api.anthropic.com",
+ "envKey": "ANTHROPIC_API_KEY",
+ },
+ ],
+ },
+
+  // Multi-model review: the minimal configuration is just a list of model IDs
+ "review": {
+ "models": ["gpt-4o", "claude-sonnet-4-6-20250514"],
+ "arbitratorModel": "claude-opus-4-6-20250725",
+ },
+}
+```
+
+Model IDs are looked up automatically in `modelProviders`; users do not need to repeat `authType`, `baseUrl`, or `envKey`.
+
+**User experience, before vs. after**:
+
+```jsonc
+// ❌ Before: verbose; authType plus a full object had to be repeated for every model
+"review": {
+ "models": [
+ { "id": "gpt-4o", "authType": "openai" },
+ { "id": "claude-sonnet-4-6-20250514", "authType": "anthropic" }
+ ],
+ "arbitratorModel": {
+ "id": "claude-opus-4-6-20250725",
+ "authType": "anthropic"
+ }
+}
+
+// ✅ After: just the model names
+"review": {
+ "models": ["gpt-4o", "claude-sonnet-4-6-20250514"],
+ "arbitratorModel": "claude-opus-4-6-20250725"
+}
+```
+
+**Mixed mode**: when a model is not in `modelProviders`, strings and objects can be mixed:
+
+```jsonc
+"review": {
+ "models": [
+    "gpt-4o",                     // resolved from modelProviders
+    "claude-sonnet-4-6-20250514", // resolved from modelProviders
+    {                             // inline: a model not in modelProviders
+ "id": "deepseek-chat",
+ "authType": "openai",
+ "baseUrl": "https://api.deepseek.com/v1",
+ "envKey": "DEEPSEEK_API_KEY"
+ }
+ ]
+}
+```
+
+### 1.3 Model Resolution Rules
+
+```
+Each entry in review.models
+  │
+  ├─ string (e.g. "gpt-4o")
+  │   └─ look up by id across all authTypes in modelProviders
+  │        ├─ found once     → use the full config from modelProviders
+  │        ├─ found multiple → error: "Ambiguous model id 'xxx', found in openai and anthropic. Use object form to specify authType."
+  │        └─ not found      → error: "Model 'gpt-4o' not found in modelProviders. Add it to modelProviders or use object form with full config."
+  │
+  └─ object (e.g. { id, authType, ... })
+      └─ use as-is; modelProviders is not consulted
+```
+
+**Deduplication**: models are deduplicated by `id`. If the current session model and the `review.models` list contain duplicate IDs, only one copy is kept.
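+The resolution and deduplication rules above can be sketched as follows. This is a minimal sketch: `providers` is a flattened stand-in for the real `modelProviders` structure, and the returned shape omits fields such as `baseUrl` and `envKey`:
+
+```typescript
+type InlineModel = { id: string; authType: string; baseUrl?: string; envKey?: string };
+
+function resolveReviewModels(
+  entries: Array<string | InlineModel>,
+  providers: Record<string, Array<{ id: string }>>, // authType -> models
+): Array<{ id: string; authType: string }> {
+  const resolved: Array<{ id: string; authType: string }> = [];
+  const seen = new Set<string>();
+  for (const entry of entries) {
+    let model: { id: string; authType: string };
+    if (typeof entry === 'string') {
+      // Search every authType for the id; misses and ambiguity are errors.
+      const hits = Object.entries(providers).filter(([, ms]) =>
+        ms.some((m) => m.id === entry),
+      );
+      if (hits.length === 0) {
+        throw new Error(`Model '${entry}' not found in modelProviders.`);
+      }
+      if (hits.length > 1) {
+        throw new Error(`Ambiguous model id '${entry}'. Use object form to specify authType.`);
+      }
+      model = { id: entry, authType: hits[0][0] };
+    } else {
+      // Object form is used as-is; modelProviders is not consulted.
+      model = { id: entry.id, authType: entry.authType };
+    }
+    if (!seen.has(model.id)) {
+      // Deduplicate by id: the first occurrence wins.
+      seen.add(model.id);
+      resolved.push(model);
+    }
+  }
+  return resolved;
+}
+```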
+
+### 1.4 Levels in Detail
+
+#### Level 0: Zero Config (`/review --multi`)
+
+The user has already configured multiple models in `modelProviders` but has not yet configured `review.models`:
+
+```
+> /review --multi
+
+No review models configured. Available models from modelProviders:
+ ✓ gpt-4o (openai)
+ ✓ claude-sonnet-4-6-20250514 (anthropic)
+ ✓ claude-opus-4-6-20250725 (anthropic)
+ ✗ deepseek-chat (openai) — DEEPSEEK_API_KEY not set
+
+To enable multi-model review, add to ~/.qwen/settings.json:
+ "review": { "models": ["gpt-4o", "claude-sonnet-4-6-20250514"] }
+
+Proceeding with standard single-model review...
+```
+
+**No automatic selection**: list the candidate models and show a configuration example, letting the user choose explicitly.
+The implicit logic of automatic selection (how many models per authType? ordered how?) is opaque to users, so the MVP does not attempt it.
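+The ✓/✗ availability check in the Level 0 listing can be sketched as below. This assumes each candidate model carries an optional `envKey` naming the environment variable that must hold its API key; models without an `envKey` (e.g. OAuth-based auth) count as available:
+
+```typescript
+interface CandidateModel {
+  id: string;
+  authType: string;
+  envKey?: string; // env var expected to hold the API key, if any
+}
+
+function describeAvailability(model: CandidateModel): string {
+  // No envKey, or the env var is set → available.
+  if (!model.envKey || process.env[model.envKey]) {
+    return `✓ ${model.id} (${model.authType})`;
+  }
+  return `✗ ${model.id} (${model.authType}) — ${model.envKey} not set`;
+}
+```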
+
+#### Level 1: Specify Review Models
+
+```jsonc
+"review": {
+ "models": ["gpt-4o", "claude-sonnet-4-6-20250514", "deepseek-chat"]
+}
+```
+
+Once configured, `/review` uses multi-model review automatically; no `--multi` flag is needed.
+
+#### Level 2: Specify an Arbitrator Model
+
+```jsonc
+"review": {
+ "models": ["gpt-4o", "deepseek-chat"],
+ "arbitratorModel": "claude-opus-4-6-20250725"
+}
+```
+
+**The three roles are fully decoupled**:
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│ 角色 │ 典型选择 │ 核心需求 │
+├──────────────────┼─────────────────────┼───────────────────────────┤
+│ 会话模型 │ Qwen Coder Turbo │ 快速、低延迟、日常编码 │
+│ (Session Model) │ GPT-4o-mini │ │
+├──────────────────┼─────────────────────┼───────────────────────────┤
+│ 审查模型 │ GPT-4o │ 覆盖面广、各有所长 │
+│ (Review Models) │ Claude Sonnet │ 多视角、并行 │
+│ │ DeepSeek V3 │ │
+├──────────────────┼─────────────────────┼───────────────────────────┤
+│ 仲裁模型 │ Claude Opus │ 强推理、准确裁决 │
+│ (Arbitrator) │ o3 │ 可以慢,但要准 │
+└──────────────────┴─────────────────────┴───────────────────────────┘
+```
+
+#### Level 3: Inline Custom Models
+
+```jsonc
+"review": {
+ "models": [
+ "gpt-4o",
+ { "id": "my-internal-model", "authType": "openai", "baseUrl": "https://internal.corp/v1", "envKey": "INTERNAL_API_KEY" }
+ ]
+}
+```
+
+Useful for models not registered in `modelProviders` (e.g. internally deployed models).
+
+### 1.5 Configuration Schema
+
+```typescript
+review: {
+ type: 'object',
+ label: 'Code Review',
+ category: 'Tools',
+ requiresRestart: false,
+ default: {},
+ description: 'Multi-model code review configuration.',
+ showInDialog: false,
+ properties: {
+ models: {
+ type: 'array',
+ label: 'Review Models',
+ category: 'Tools',
+ requiresRestart: false,
+ default: [],
+ description: 'Models for multi-model review. Each entry can be a model ID string (resolved from modelProviders) or a full model config object.',
+ showInDialog: false,
+ },
+ arbitratorModel: {
+    type: 'string', // a string, not an object
+ label: 'Arbitrator Model',
+ category: 'Tools',
+ requiresRestart: false,
+ default: undefined,
+ description: 'Model ID for the arbitrator (resolved from modelProviders). Falls back to current session model if not set.',
+ showInDialog: false,
+ },
+ },
+}
+```
+
+Note: compared with earlier drafts, `includeCurrentModel`, `maxConcurrency`, and `skipArbitration` have been dropped. They either have sensible defaults that need not be exposed, or can be overridden temporarily via command-line flags, so they do not earn a settings entry.
+
+---
+
+## 2. Usage
+
+### 2.1 Invocation
+
+Extend the existing `/review` command; no new command is introduced:
+
+```bash
+/review              # review.models configured → multi-model; otherwise → single-model
+/review 123          # review PR #123 (mode selected as above)
+/review src/foo.ts   # review a single file (mode selected as above)
+/review --multi      # force multi-model (if unconfigured, list available models and prompt)
+/review --multi 123  # multi-model review of PR #123
+/review --single     # temporarily use single-model (ignore review.models config)
+/review --single 123 # single-model review of PR #123
+```
+
+**Decision logic:**
+
+```
+/review [args]
+  │
+  ├─ --single flag present? → single-model review (ignore review.models; existing 4-agent flow)
+  │
+  ├─ review.models configured with ≥ 2 usable models?
+  │   └─ yes → multi-model review
+  │
+  ├─ --multi flag present?
+  │   └─ list available models from modelProviders and prompt to configure review.models
+  │
+  └─ none of the above → single-model review (existing behavior, unchanged)
+```
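+The decision tree above is small enough to sketch directly. Flag parsing and the model-availability count are simplified placeholders here:
+
+```typescript
+type ReviewMode = 'single' | 'multi' | 'suggest-config';
+
+function decideReviewMode(
+  flags: { single?: boolean; multi?: boolean },
+  usableReviewModels: number, // entries in review.models that resolved and have credentials
+): ReviewMode {
+  if (flags.single) return 'single'; // explicit override: existing 4-agent flow
+  if (usableReviewModels >= 2) return 'multi'; // configured → multi-model review
+  if (flags.multi) return 'suggest-config'; // list available models, prompt to configure
+  return 'single'; // default: existing behavior, unchanged
+}
+```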
+
+### 2.2 User Journeys
+
+#### First Use (review.models not configured)
+
+```
+> /review --multi
+
+ No review models configured. Available models from modelProviders:
+ ✓ gpt-4o (openai)
+ ✓ claude-sonnet-4-6-20250514 (anthropic)
+ ✗ deepseek-chat (openai) — DEEPSEEK_API_KEY not set
+
+ To enable multi-model review, add to ~/.qwen/settings.json:
+ "review": { "models": ["gpt-4o", "claude-sonnet-4-6-20250514"] }
+
+ Proceeding with standard single-model review...
+  (... existing single-model review output ...)
+```
+
+#### First Use After Configuration
+
+```
+> /review 123
+
+ Reviewing PR #123 with 2 models + arbitrator...
+
+ gpt-4o ✓ done (12.3s)
+ claude-sonnet ✓ done (18.7s)
+ claude-opus (judge) ✓ done (8.2s)
+
+ ── Multi-Model Review: PR #123 ──────────────────────────
+
+ Review models: gpt-4o, claude-sonnet
+ Arbitrator: claude-opus
+ Files: 15 files, +342/-128 lines
+
+ Critical (1)
+
+ [gpt-4o, claude-sonnet] src/db.ts:42 — SQL injection
+ Query string built via concatenation without sanitization.
+ Fix: Use parameterized queries.
+
+ Suggestions (2)
+
+ [claude-sonnet] src/utils.ts:15 — Duplicated logic
+ Similar pattern exists in src/helpers.ts:30.
+
+ [gpt-4o] src/api.ts:8 — Missing input validation
+ User input passed directly to internal API.
+
+ Nice to have (1)
+
+ [gpt-4o] src/config.ts:22 — Unused import
+
+ Verdict: Request Changes
+ Both models identified critical SQL injection at src/db.ts:42.
+```
+
+#### Day-to-Day Use (configured)
+
+```
+> /review 123
+
+ Reviewing PR #123 with 3 models...
+
+ gpt-4o ✓ done (12.3s)
+ claude-sonnet ✓ done (18.7s)
+ deepseek-chat ✓ done (15.1s)
+ claude-opus (judge) ✓ done (8.2s, 2 disputes resolved)
+
+ ── Multi-Model Review: PR #123 ──────────────────────────
+  (... report output ...)
+```
+
+No extra steps; it works exactly like the single-model `/review`.
+
+#### Error Handling
+
+```
+# Some models fail → continue with the rest
+ deepseek-chat ✗ failed (timeout)
+ ⚠ 1/3 models failed. Proceeding with 2 results.
+
+# All models fail → automatic fallback
+ ✗ All models failed. Falling back to single-model review.
+
+# Missing API key → skip with a hint
+ ✗ gpt-4o: OPENAI_API_KEY not set, skipped
+ Tip: Set the env var or remove "gpt-4o" from review.models
+```
+
+---
+
+## 3. Technical Architecture
+
+### 3.1 System Architecture
+
+```
+┌────────────────────────────────────────────────────────────────┐
+│ /review Skill (SKILL.md)                                       │
+│                                                                │
+│ Step 1: obtain the diff                                        │
+│ Step 2: call the multi_model_review tool                       │
+│ Step 3: emit the final report                                  │
+└──────────────────────────┬─────────────────────────────────────┘
+                           │
+              ┌────────────▼────────────────┐
+              │ MultiModelReviewTool        │
+              │                             │
+              │ - parse review.models       │
+              │ - < 2 usable models →       │
+              │   return guidance           │
+              │   (SKILL.md falls back to   │
+              │   the existing 4-agent      │
+              │   flow)                     │
+              │ - ≥ 2 models → call Service │
+              └────────────┬────────────────┘
+                           │
+              ┌────────────▼────────────────┐
+              │ MultiModelReviewService     │
+              │                             │
+              │ Phase 1: parallel collection│
+              │ - one ContentGenerator per  │
+              │   model                     │
+              │ - p-limit bounded           │
+              │   concurrency               │
+              │ - collect each model's      │
+              │   free-text review          │
+              │ - tolerate individual       │
+              │   failures                  │
+              └────────────┬────────────────┘
+              ┌────────────┼────────────────┐
+              ▼            ▼                ▼
+      ┌────────────┐ ┌────────────┐ ┌────────────┐
+      │ Model A    │ │ Model B    │ │ Model C    │
+      │ .generate  │ │ .generate  │ │ .generate  │
+      │ Content()  │ │ Content()  │ │ Content()  │
+      └────────────┘ └────────────┘ └────────────┘
+              │            │                │
+              └────────────┼────────────────┘
+                           ▼
+              ┌────────────────────────────┐
+              │ Phase 2: arbitration       │
+              │ (always runs)              │
+              │                            │
+              │ arbitratorModel set?       │
+              │  ├─ yes → dedicated        │
+              │  │        arbitrator model │
+              │  └─ no  → session model    │
+              │                            │
+              │ merge & dedupe + verdict,  │
+              │ then emit the report       │
+              └────────────────────────────┘
+```
+
+### 3.2 Model Config Resolution
+
+Model resolution reuses the rules from section 1.3: strings are looked up globally in `modelProviders`; objects are used as-is.
+The same applies to `arbitratorModel`.
+
+### 3.3 Review Model Output
+
+Each review model receives the same review prompt (covering four dimensions: correctness, security, quality, performance) and returns a **free-text** review.
+
+No JSON schema is enforced, because:
+
+- Not all models support function calling well
+- Structured output constraints can suppress deep reasoning
+- The arbitrator model integrates the free-text reviews, so pre-structuring them is unnecessary
+
+**Review prompt template**:
+
+```markdown
+Review the following code changes. Cover these dimensions:
+
+1. Correctness & Security — bugs, edge cases, vulnerabilities
+2. Code Quality — naming, duplication, style consistency
+3. Performance — bottlenecks, memory, unnecessary work
+4. Anything else that looks off
+
+For each finding, include: file path, line number (if applicable), severity
+(Critical / Suggestion / Nice to have), what's wrong, and suggested fix.
+
+End with a verdict: Approve, Request Changes, or Comment.
+
+
+{diff}
+
+```
+
+Internal representation after collection in the Service layer:
+
+```typescript
+interface ModelReviewResult {
+ modelId: string;
+  reviewText: string; // the model's free-text review
+  error?: string; // error message if the call failed
+}
+```
+
+No structured fields such as verdict or severity are extracted from the free text; all of that semantic judgment is handed to the arbitrator model.
+
+### 3.4 Aggregation and Arbitration
+
+Core principle: **the program only collects; the LLM does all the semantic work**.
+
+#### 3.4.1 Flow
+
+```
+Phase 1: parallel collection (Service layer)         Phase 2: arbitration (always runs)
+        │                                                     │
+call models in parallel → collect free-text results ────────▶ arbitrator model
+        │                                                     │
+ tolerate individual failures                  merge & dedupe + verdict + report
+ (failed models are skipped)
+```
+
+There is **no branching** between Phase 1 and Phase 2. The arbitrator model always runs, because:
+
+- Even when all models "agree", the multiple free-text reviews still need to be merged and deduplicated into a single report
+- Programmatically extracting verdict/severity from free text is unreliable and not worth the extra branch complexity
+- When there is no disagreement the arbitrator's work is minimal (merging only), so the overhead is negligible
+
+#### 3.4.2 Phase 1: Parallel Collection (Service layer)
+
+The Service layer does exactly two things: **call models in parallel** and **collect the raw text**.
+
+```typescript
+interface CollectedReview {
+  /** Raw free-text results per model (failed models are excluded) */
+ modelResults: ModelReviewResult[];
+
+ /** 完整 diff(传递给仲裁模型) */
+  /** The full diff (passed on to the arbitrator model) */
+}
+```
+
+No verdict extraction, no finding alignment, no disagreement detection. All of that is semantic work and belongs to Phase 2.
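+Phase 1 can be sketched as below. The real service bounds concurrency with `p-limit`; `Promise.allSettled` is used here for brevity, and `callModel` is a hypothetical stand-in for a per-model `ContentGenerator.generateContent()` call:
+
+```typescript
+interface ModelReviewResult {
+  modelId: string;
+  reviewText: string;
+  error?: string;
+}
+
+async function collectReviews(
+  modelIds: string[],
+  diff: string,
+  callModel: (id: string, diff: string) => Promise<string>,
+): Promise<{ modelResults: ModelReviewResult[]; diff: string }> {
+  const settled = await Promise.allSettled(
+    modelIds.map((id) => callModel(id, diff)),
+  );
+  // Fault tolerance: failed models are skipped rather than aborting the run.
+  const modelResults = settled.flatMap((result, i) =>
+    result.status === 'fulfilled'
+      ? [{ modelId: modelIds[i], reviewText: result.value }]
+      : [],
+  );
+  return { modelResults, diff };
+}
+```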
+
+#### 3.4.3 Phase 2: Arbitration (arbitrator model)
+
+The arbitrator model's job: **merge & deduplicate + verify + decide + emit the final report**.
+
+**Arbitrator selection**:
+
+```
+review.arbitratorModel configured?
+  │
+  ├─ yes → dedicated arbitration: create a separate ContentGenerator and call generateContent()
+  │        input: each model's raw review text + the full diff
+  │
+  └─ no  → session-model arbitration (default): the Tool returns the review texts to the main session;
+           the session model arbitrates in context (full project context + tool access)
+```
+
+| Dimension        | Dedicated arbitrator (`arbitratorModel`)   | Session-model arbitration (default)   |
+| ---------------- | ------------------------------------------ | ------------------------------------- |
+| Project context  | None (sees only model reviews + the diff)  | Full (session history + tool access)  |
+| Recommended when | The session model is a fast, small model   | The session model reasons well enough |
+| Extra API cost   | One additional call                        | None                                  |
+
+**Arbitration prompt**:
+
+```markdown
+You are the senior code reviewer. Multiple models independently reviewed the same
+code changes. Your job is to produce the final unified review report.
+
+Tasks:
+
+1. **Merge & deduplicate**: Identify findings that refer to the same issue
+ (even if described differently or pointing to nearby lines). Consolidate them,
+ noting which models identified each issue.
+2. **Resolve severity conflicts**: When models disagree on severity for the same
+ issue, evaluate the actual code and choose the appropriate level.
+ Default to the HIGHER severity when uncertain.
+3. **Validate isolated findings**: For findings raised by only one model,
+ verify against the code. Keep valid ones, dismiss false positives with reasoning.
+4. **Final verdict**: Approve / Request Changes / Comment, with reasoning.
+
+Output format:
+
+- Group findings by severity (Critical → Suggestion → Nice to have)
+- For each finding: [model names] file:line — title, description, suggested fix
+- End with verdict and one-sentence reasoning
+
+Each model's full review is provided below, followed by the diff.
+Do NOT discard findings just because only one model raised them.
+```
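+For the dedicated-arbitrator path, the arbitrator's input could be assembled roughly as follows. The exact wrapper format is an assumption, and `arbitrationPrompt` stands for the prompt text above:
+
+```typescript
+function buildArbitratorInput(
+  results: Array<{ modelId: string; reviewText: string }>,
+  diff: string,
+  arbitrationPrompt: string,
+): string {
+  // One section per model review, then the diff the reviews refer to.
+  const reviews = results
+    .map((r) => `## Review by ${r.modelId}\n\n${r.reviewText}`)
+    .join('\n\n');
+  return `${arbitrationPrompt}\n\n${reviews}\n\n## Diff\n\n${diff}`;
+}
+```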
+
+---
+
+## 4. Integration with the Existing System
+
+| Component                            | How it is reused                                                             |
+| ------------------------------------ | ---------------------------------------------------------------------------- |
+| `ContentGenerator` factory           | `createContentGenerator()` creates an independent instance per review model   |
+| `ContentGenerator.generateContent()` | Used for free-text calls by both review models and the dedicated arbitrator   |
+| `ModelConfig` type                   | Reuses the type definitions in `models/types.ts`                              |
+| `p-limit` concurrency control        | Reuses the concurrency pattern from insight                                   |
+| Fault-tolerance pattern              | Reuses insight's pattern: one model's failure does not fail the whole run     |
+| settings.json                        | Reuses the existing settings loading and merge machinery                      |
+| SKILL.md                             | Extends the existing review skill to call MultiModelReviewTool                |
+
+---
+
+## 5. Implementation Plan
+
+### Phase 1: Core Flow (MVP)
+
+1. **Settings schema**: add `review.models` and `review.arbitratorModel` to `settingsSchema.ts`
+2. **Config layer**: add `getReviewModels()` / `getArbitratorModel()` methods, including model-ID resolution logic
+3. **Service layer**: implement `MultiModelReviewService`
+   - create a temporary ContentGenerator per model
+   - parallel `generateContent()` calls + free-text collection
+   - arbitrator invocation (dedicated arbitration, or hand results back to the session model)
+4. **Tool layer**: implement `MultiModelReviewTool` (returns guidance when fewer than 2 models are usable)
+5. **Skill layer**: extend the `/review` SKILL.md
+
+### Phase 2: UX Improvements
+
+6. Progress display (live per-model review progress)
+7. `--single` flag support (temporarily use single-model review)
+8. Level 0 zero-config onboarding (`--multi` lists available models and prompts for configuration)
+
+### Phase 3: Advanced Features
+
+9. Per-model review prompts (e.g. one model focused on security)
+10. Review result caching (avoid re-reviewing an identical diff)
+11. Zero-config automatic model selection (pick sensible models from modelProviders)
+
+---
+
+## 6. File Changes
+
+| File                                                          | Change  | Description                         |
+| ------------------------------------------------------------- | ------- | ----------------------------------- |
+| `packages/cli/src/config/settingsSchema.ts`                    | modify  | Add the `review` settings schema    |
+| `packages/core/src/config/config.ts`                           | modify  | Add `getReviewModels()` and friends |
+| `packages/core/src/services/multiModelReviewService.ts`        | **new** | Core multi-model review logic       |
+| `packages/core/src/tools/multiModelReview.ts`                  | **new** | MultiModelReviewTool                |
+| `packages/core/src/tools/tool-names.ts`                        | modify  | Register the new tool name          |
+| `packages/core/src/tools/tool-registry.ts`                     | modify  | Register MultiModelReviewTool       |
+| `packages/core/src/skills/bundled/review/SKILL.md`             | modify  | Add the multi-model branch logic    |
+| `packages/core/src/services/multiModelReviewService.test.ts`   | **new** | Unit tests                          |
+
+---
+
+## 7. Open Questions
+
+1. **Large diffs**: what happens when a diff exceeds some models' context windows?
+   - **Proposal**: in a later iteration, detect context-window limits and skip models whose window is too small (with a warning); eventually support per-file sharding.
+   - **Current state**: the MVP does no context-window detection. Oversized diffs are rejected by the model API, classified by collectReviews as failed models, and surfaced to the user.
+
+2. **Context for the dedicated arbitrator**: a dedicated arbitrator is invoked via plain API calls, has no tool access, and cannot read source files on its own.
+   - **Proposal**: include the full diff in the arbitration prompt (the review models saw the same diff). That is enough for the arbitrator to verify findings; no extra file-context extraction is needed.
diff --git a/packages/cli/src/config/config.ts b/packages/cli/src/config/config.ts
index 88153fe750..9d2292285e 100755
--- a/packages/cli/src/config/config.ts
+++ b/packages/cli/src/config/config.ts
@@ -1035,6 +1035,14 @@ export async function loadCliConfig(
},
hooks: settings.hooks,
hooksConfig: settings.hooksConfig,
+ reviewConfig: settings.review
+ ? {
+ models: settings.review.models as
+          | Array<string | Record<string, unknown>>
+ | undefined,
+ arbitratorModel: settings.review.arbitratorModel,
+ }
+ : undefined,
enableHooks:
argv.experimentalHooks === true || settings.hooksConfig?.enabled === true,
channel: argv.channel,
diff --git a/packages/cli/src/config/settingsSchema.ts b/packages/cli/src/config/settingsSchema.ts
index 4701abc1ab..996b9f7a78 100644
--- a/packages/cli/src/config/settingsSchema.ts
+++ b/packages/cli/src/config/settingsSchema.ts
@@ -76,6 +76,8 @@ export interface SettingDefinition {
mergeStrategy?: MergeStrategy;
/** Enum type options */
options?: readonly SettingEnumOption[];
+ /** Custom JSON Schema for array items (overrides default `{ type: 'string' }`) */
+  items?: Record<string, unknown>;
}
export interface SettingsSchema {
@@ -1247,6 +1249,64 @@ const SETTINGS_SCHEMA = {
},
},
},
+ // Multi-model code review configuration
+ review: {
+ type: 'object',
+ label: 'Code Review',
+ category: 'Tools',
+ requiresRestart: false,
+ default: {},
+ description:
+ 'Multi-model code review configuration. When review.models is configured with 2+ models, /review will use multi-model review automatically.',
+ showInDialog: false,
+ properties: {
+ models: {
+ type: 'array',
+ label: 'Review Models',
+ category: 'Tools',
+ requiresRestart: false,
+        default: [] as Array<string | Record<string, unknown>>,
+ description:
+ 'Models for multi-model review. Each entry can be a model ID string (resolved from modelProviders) or a full model config object with id, authType, baseUrl, envKey.',
+ showInDialog: false,
+ items: {
+ oneOf: [
+ {
+ type: 'string',
+ description: 'Model ID resolved from modelProviders',
+ },
+ {
+ type: 'object',
+ description: 'Inline model configuration',
+ properties: {
+ id: { type: 'string', description: 'Model identifier' },
+ authType: {
+ type: 'string',
+ description: 'Authentication type',
+ },
+ baseUrl: { type: 'string', description: 'API base URL' },
+ envKey: {
+ type: 'string',
+ description: 'Environment variable for API key',
+ },
+ },
+ required: ['id'],
+ },
+ ],
+ },
+ },
+ arbitratorModel: {
+ type: 'string',
+ label: 'Arbitrator Model',
+ category: 'Tools',
+ requiresRestart: false,
+ default: undefined,
+ description:
+ 'Model ID for the final arbitrator (resolved from modelProviders). Falls back to the current session model if not set. Recommended: a high-reasoning model.',
+ showInDialog: false,
+ },
+ },
+ },
} as const satisfies SettingsSchema;
export type SettingsSchemaType = typeof SETTINGS_SCHEMA;
diff --git a/packages/core/src/config/config.test.ts b/packages/core/src/config/config.test.ts
index 828ef9c3ef..3037a1b1ef 100644
--- a/packages/core/src/config/config.test.ts
+++ b/packages/core/src/config/config.test.ts
@@ -1313,6 +1313,289 @@ describe('BaseLlmClient Lifecycle', () => {
});
});
+describe('Review Model Config Resolution', () => {
+ const MODEL = 'qwen3-coder-plus';
+ const TELEMETRY_SETTINGS = { enabled: false };
+
+ const reviewBaseParams: ConfigParameters = {
+ cwd: '/tmp',
+ targetDir: '/tmp',
+ debugMode: false,
+ question: '',
+ model: MODEL,
+ usageStatisticsEnabled: false,
+ telemetry: TELEMETRY_SETTINGS,
+ overrideExtensions: [],
+ };
+
+ beforeEach(() => {
+ vi.clearAllMocks();
+ vi.mocked(canUseRipgrep).mockResolvedValue(true);
+ vi.spyOn(QwenLogger.prototype, 'logStartSessionEvent').mockImplementation(
+ async () => undefined,
+ );
+ vi.mocked(resolveContentGeneratorConfigWithSources).mockImplementation(
+ (_config, authType, generationConfig) => ({
+ config: {
+ ...generationConfig,
+ authType,
+ model: generationConfig?.model || MODEL,
+ apiKey: 'test-key',
+ } as ContentGeneratorConfig,
+ sources: {},
+ }),
+ );
+ });
+
+ describe('getReviewModels', () => {
+ it('should return empty array when no reviewConfig', () => {
+ const config = new Config({ ...reviewBaseParams });
+ expect(config.getReviewModels()).toEqual([]);
+ });
+
+ it('should return empty array when models is empty', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: { models: [] },
+ });
+ expect(config.getReviewModels()).toEqual([]);
+ });
+
+ it('should resolve string model IDs from modelProviders', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {
+ openai: [
+ {
+ id: 'gpt-4o',
+ name: 'GPT-4o',
+ baseUrl: 'https://api.openai.com/v1',
+ },
+ ],
+ },
+ reviewConfig: { models: ['gpt-4o'] },
+ });
+ const models = config.getReviewModels();
+ expect(models).toHaveLength(1);
+ expect(models[0].id).toBe('gpt-4o');
+ expect(models[0].authType).toBe(AuthType.USE_OPENAI);
+ });
+
+ it('should throw for string model ID not found in modelProviders', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {},
+ reviewConfig: { models: ['non-existent'] },
+ });
+ expect(() => config.getReviewModels()).toThrow(
+ /Model 'non-existent' not found in modelProviders/,
+ );
+ });
+
+ it('should deduplicate string model IDs', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {
+ openai: [
+ {
+ id: 'gpt-4o',
+ name: 'GPT-4o',
+ baseUrl: 'https://api.openai.com/v1',
+ },
+ ],
+ },
+ reviewConfig: { models: ['gpt-4o', 'gpt-4o'] },
+ });
+ const models = config.getReviewModels();
+ expect(models).toHaveLength(1);
+ });
+
+ it('should resolve inline object model configs', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: {
+ models: [
+ {
+ id: 'custom-model',
+ authType: 'openai',
+ baseUrl: 'https://custom.api/v1',
+ envKey: 'CUSTOM_KEY',
+ },
+ ],
+ },
+ });
+ const models = config.getReviewModels();
+ expect(models).toHaveLength(1);
+ expect(models[0].id).toBe('custom-model');
+ expect(models[0].authType).toBe(AuthType.USE_OPENAI);
+ expect(models[0].baseUrl).toBe('https://custom.api/v1');
+ expect(models[0].envKey).toBe('CUSTOM_KEY');
+ });
+
+ it('should throw for inline object missing authType', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: {
+ models: [{ id: 'no-auth' }],
+ },
+ });
+ expect(() => config.getReviewModels()).toThrow(
+ /missing required field: authType/,
+ );
+ });
+
+ it('should throw for inline object with invalid authType', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: {
+ models: [{ id: 'bad-auth', authType: 'invalid-provider' }],
+ },
+ });
+ expect(() => config.getReviewModels()).toThrow(/invalid authType/);
+ });
+
+ it('should throw for inline object missing baseUrl when required', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: {
+ models: [{ id: 'no-url', authType: 'openai' }],
+ },
+ });
+ expect(() => config.getReviewModels()).toThrow(/requires a baseUrl/);
+ });
+
+ it('should allow missing baseUrl for qwen-oauth authType', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: {
+ models: [{ id: 'qwen-model', authType: 'qwen-oauth' }],
+ },
+ });
+ const models = config.getReviewModels();
+ expect(models).toHaveLength(1);
+ expect(models[0].baseUrl).toBe('');
+ });
+
+ it('should handle mixed string and object entries', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {
+ openai: [
+ {
+ id: 'gpt-4o',
+ name: 'GPT-4o',
+ baseUrl: 'https://api.openai.com/v1',
+ },
+ ],
+ },
+ reviewConfig: {
+ models: [
+ 'gpt-4o',
+ {
+ id: 'custom-model',
+ authType: 'openai',
+ baseUrl: 'https://custom.api/v1',
+ },
+ ],
+ },
+ });
+ const models = config.getReviewModels();
+ expect(models).toHaveLength(2);
+ expect(models[0].id).toBe('gpt-4o');
+ expect(models[1].id).toBe('custom-model');
+ });
+
+ it('should deduplicate across string and object entries', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {
+ openai: [
+ {
+ id: 'gpt-4o',
+ name: 'GPT-4o',
+ baseUrl: 'https://api.openai.com/v1',
+ },
+ ],
+ },
+ reviewConfig: {
+ models: [
+ 'gpt-4o',
+ {
+ id: 'gpt-4o',
+ authType: 'openai',
+ baseUrl: 'https://api.openai.com/v1',
+ },
+ ],
+ },
+ });
+ const models = config.getReviewModels();
+ expect(models).toHaveLength(1);
+ });
+
+ it('should use id as name when name is not provided in inline config', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: {
+ models: [
+ {
+ id: 'my-model',
+ authType: 'openai',
+ baseUrl: 'https://api.example.com/v1',
+ },
+ ],
+ },
+ });
+ const models = config.getReviewModels();
+ expect(models[0].name).toBe('my-model');
+ });
+ });
+
+ describe('getArbitratorModel', () => {
+ it('should return undefined when no arbitratorModel configured', () => {
+ const config = new Config({ ...reviewBaseParams });
+ expect(config.getArbitratorModel()).toBeUndefined();
+ });
+
+ it('should return undefined when arbitratorModel is empty string', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ reviewConfig: { arbitratorModel: '' },
+ });
+ expect(config.getArbitratorModel()).toBeUndefined();
+ });
+
+ it('should resolve arbitratorModel from modelProviders', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {
+ openai: [
+ {
+ id: 'gpt-4o',
+ name: 'GPT-4o',
+ baseUrl: 'https://api.openai.com/v1',
+ },
+ ],
+ },
+ reviewConfig: { arbitratorModel: 'gpt-4o' },
+ });
+ const model = config.getArbitratorModel();
+ expect(model).toBeDefined();
+ expect(model?.id).toBe('gpt-4o');
+ });
+
+ it('should throw when arbitratorModel not found in modelProviders', () => {
+ const config = new Config({
+ ...reviewBaseParams,
+ modelProvidersConfig: {},
+ reviewConfig: { arbitratorModel: 'non-existent' },
+ });
+ expect(() => config.getArbitratorModel()).toThrow(
+ /Arbitrator model 'non-existent' not found/,
+ );
+ });
+ });
+});
+
describe('Model Switching and Config Updates', () => {
const baseParams: ConfigParameters = {
cwd: '/tmp',
diff --git a/packages/core/src/config/config.ts b/packages/core/src/config/config.ts
index 3663beb8fc..f9da656db8 100644
--- a/packages/core/src/config/config.ts
+++ b/packages/core/src/config/config.ts
@@ -58,6 +58,7 @@ import { SkillTool } from '../tools/skill.js';
import { TaskTool } from '../tools/task.js';
import { TodoWriteTool } from '../tools/todoWrite.js';
import { ToolRegistry } from '../tools/tool-registry.js';
+import { MultiModelReviewTool } from '../tools/multiModelReview.js';
import { WebFetchTool } from '../tools/web-fetch.js';
import { WebSearchTool } from '../tools/web-search/index.js';
import { WriteFileTool } from '../tools/write-file.js';
@@ -126,6 +127,7 @@ import {
ModelsConfig,
type ModelProvidersConfig,
type AvailableModel,
+ type ResolvedModelConfig,
type RuntimeModelSnapshot,
} from '../models/index.js';
import type { ClaudeMarketplaceConfig } from '../extension/claude-converter.js';
@@ -394,6 +396,11 @@ export interface ConfigParameters {
  hooksConfig?: Record<string, unknown>;
/** Warnings generated during configuration resolution */
warnings?: string[];
+ /** Multi-model review configuration */
+ reviewConfig?: {
+    models?: Array<string | Record<string, unknown>>;
+ arbitratorModel?: string;
+ };
}
function normalizeConfigOutputFormat(
@@ -443,6 +450,10 @@ export class Config {
private modelsConfig!: ModelsConfig;
private readonly modelProvidersConfig?: ModelProvidersConfig;
+ private readonly reviewConfig?: {
+    models?: Array<string | Record<string, unknown>>;
+ arbitratorModel?: string;
+ };
private readonly sandbox: SandboxConfig | undefined;
private readonly targetDir: string;
private workspaceContext: WorkspaceContext;
@@ -619,6 +630,7 @@ export class Config {
this.folderTrust = params.folderTrust ?? false;
this.ideMode = params.ideMode ?? false;
this.modelProvidersConfig = params.modelProvidersConfig;
+ this.reviewConfig = params.reviewConfig;
this.cliVersion = params.cliVersion;
this.chatRecordingEnabled = params.chatRecording ?? true;
@@ -1103,6 +1115,111 @@ export class Config {
return this.modelsConfig.getAllConfiguredModels(authTypes);
}
+ /**
+ * Resolve review models from review.models config.
+ * String IDs are resolved from modelProviders via ModelsConfig.findModelById().
+ * Returns ResolvedModelConfig[] ready for createContentGenerator().
+ */
+ getReviewModels(): ResolvedModelConfig[] {
+ const models = this.reviewConfig?.models;
+ if (!models || models.length === 0) {
+ return [];
+ }
+
+ const resolved: ResolvedModelConfig[] = [];
+ const seenIds = new Set();
+
+ for (const entry of models) {
+ if (typeof entry === 'string') {
+ // Resolve string ID from modelProviders
+ const found = this.modelsConfig.findModelById(entry);
+ if (!found) {
+ throw new Error(
+ `Model '${entry}' not found in modelProviders. Add it to modelProviders or use object form with full config.`,
+ );
+ }
+ if (seenIds.has(found.config.id)) {
+ this.debugLogger.warn(
+ `Duplicate model ID '${found.config.id}' in review.models — skipping`,
+ );
+ } else {
+ seenIds.add(found.config.id);
+ resolved.push(found.config);
+ }
+ } else if (entry && typeof entry === 'object' && 'id' in entry) {
+ // Inline model config object — access via index signature
+ const obj = entry as Record;
+ const id = String(obj['id']);
+ const authTypeRaw = obj['authType'] as string | undefined;
+ if (!authTypeRaw) {
+ throw new Error(
+ `Inline model config for '${id}' missing required field: authType`,
+ );
+ }
+ if (!Object.values(AuthType).includes(authTypeRaw as AuthType)) {
+ throw new Error(
+ `Inline model config for '${id}' has invalid authType: '${authTypeRaw}'. Expected one of: ${Object.values(AuthType).join(', ')}`,
+ );
+ }
+ const authType = authTypeRaw as AuthType;
+ const baseUrl =
+ obj['baseUrl'] !== undefined && obj['baseUrl'] !== null
+ ? String(obj['baseUrl'])
+ : '';
+ if (
+ !baseUrl &&
+ authType !== AuthType.QWEN_OAUTH &&
+ authType !== AuthType.USE_VERTEX_AI &&
+ authType !== AuthType.USE_GEMINI
+ ) {
+ throw new Error(
+ `Inline model config for '${id}' with authType '${authType}' requires a baseUrl.`,
+ );
+ }
+ if (seenIds.has(id)) {
+ this.debugLogger.warn(
+ `Duplicate model ID '${id}' in review.models — skipping`,
+ );
+ } else {
+ seenIds.add(id);
+ resolved.push({
+ id,
+ authType,
+ name:
+ obj['name'] !== undefined && obj['name'] !== null
+ ? String(obj['name'])
+ : id,
+ baseUrl,
+ envKey: obj['envKey'] ? String(obj['envKey']) : undefined,
+ generationConfig: {},
+ capabilities: {},
+ });
+ }
+ }
+ }
+
+ return resolved;
+ }
+
+ /**
+ * Resolve the arbitrator model from review.arbitratorModel config.
+ * Returns undefined when session model should act as arbitrator.
+ */
+ getArbitratorModel(): ResolvedModelConfig | undefined {
+ const modelId = this.reviewConfig?.arbitratorModel;
+ if (!modelId) {
+ return undefined;
+ }
+
+ const found = this.modelsConfig.findModelById(modelId);
+ if (!found) {
+ throw new Error(
+ `Arbitrator model '${modelId}' not found in modelProviders. Add it to modelProviders first.`,
+ );
+ }
+ return found.config;
+ }
+
/**
* Get the currently active runtime model snapshot.
* Delegates to ModelsConfig.
@@ -1898,6 +2015,15 @@ export class Config {
registerCoreTool(AskUserQuestionTool, this);
!this.sdkMode && registerCoreTool(ExitPlanModeTool, this);
registerCoreTool(WebFetchTool, this);
+ // Register multi-model review tool when review.models is configured (even with < 2 models,
+ // so the tool can surface guidance about how to configure additional models)
+ if (
+ this.reviewConfig?.models &&
+ Array.isArray(this.reviewConfig.models) &&
+ this.reviewConfig.models.length > 0
+ ) {
+ registerCoreTool(MultiModelReviewTool, this);
+ }
// Conditionally register web search tool if web search provider is configured
// buildWebSearchConfig ensures qwen-oauth users get dashscope provider, so
// if tool is registered, config must exist
diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts
index e1fe65d2ff..a1551c0e74 100644
--- a/packages/core/src/index.ts
+++ b/packages/core/src/index.ts
@@ -91,12 +91,14 @@ export * from './tools/todoWrite.js';
export * from './tools/web-fetch.js';
export * from './tools/web-search/index.js';
export * from './tools/write-file.js';
+export * from './tools/multiModelReview.js';
// ============================================================================
// Services
// ============================================================================
export * from './services/chatRecordingService.js';
+export * from './services/multiModelReviewService.js';
export * from './services/fileDiscoveryService.js';
export * from './services/fileSystemService.js';
export * from './services/gitService.js';
diff --git a/packages/core/src/models/modelRegistry.test.ts b/packages/core/src/models/modelRegistry.test.ts
index 9005dd52a6..46cd96ec5b 100644
--- a/packages/core/src/models/modelRegistry.test.ts
+++ b/packages/core/src/models/modelRegistry.test.ts
@@ -384,6 +384,51 @@ describe('ModelRegistry', () => {
});
});
+ describe('findModelById', () => {
+ it('should find a model that exists in exactly one authType', () => {
+ const registry = new ModelRegistry({
+ openai: [
+ { id: 'gpt-4', name: 'GPT-4', baseUrl: 'https://api.openai.com/v1' },
+ ],
+ });
+
+ const result = registry.findModelById('gpt-4');
+ expect(result).toBeDefined();
+ expect(result?.config.id).toBe('gpt-4');
+ expect(result?.authType).toBe(AuthType.USE_OPENAI);
+ });
+
+ it('should return undefined for a model that does not exist', () => {
+ const registry = new ModelRegistry({
+ openai: [{ id: 'gpt-4', name: 'GPT-4' }],
+ });
+
+ const result = registry.findModelById('non-existent');
+ expect(result).toBeUndefined();
+ });
+
+ it('should throw for ambiguous model id found in multiple authTypes', () => {
+ const registry = new ModelRegistry({
+ openai: [{ id: 'shared-model', name: 'OpenAI Shared' }],
+ gemini: [{ id: 'shared-model', name: 'Gemini Shared' }],
+ });
+
+ expect(() => registry.findModelById('shared-model')).toThrow(
+ /Ambiguous model id/,
+ );
+ });
+
+ it('should find qwen-oauth built-in models', () => {
+ // QWEN_OAUTH_MODELS are always registered by the constructor.
+ // 'coder-model' is the default model defined in QWEN_OAUTH_MODELS (constants.ts).
+ const registry = new ModelRegistry();
+
+ const result = registry.findModelById('coder-model');
+ expect(result).toBeDefined();
+ expect(result?.authType).toBe(AuthType.QWEN_OAUTH);
+ });
+ });
+
describe('reloadModels', () => {
it('should reload models from new config', () => {
const registry = new ModelRegistry({
diff --git a/packages/core/src/models/modelRegistry.ts b/packages/core/src/models/modelRegistry.ts
index c2815fb329..15791d2d27 100644
--- a/packages/core/src/models/modelRegistry.ts
+++ b/packages/core/src/models/modelRegistry.ts
@@ -197,6 +197,32 @@ export class ModelRegistry {
}
}
+ /**
+ * Find a model by ID across all authTypes.
+ * Returns the resolved config if found in exactly one authType.
+ * Throws if the model ID is ambiguous (found in multiple authTypes).
+ */
+ findModelById(
+ modelId: string,
+ ): { config: ResolvedModelConfig; authType: AuthType } | undefined {
+ const matches: Array<{ config: ResolvedModelConfig; authType: AuthType }> =
+ [];
+ for (const [authType, modelMap] of this.modelsByAuthType.entries()) {
+ const model = modelMap.get(modelId);
+ if (model) {
+ matches.push({ config: model, authType });
+ }
+ }
+ if (matches.length === 0) return undefined;
+ if (matches.length > 1) {
+ const authTypes = matches.map((m) => m.authType).join(' and ');
+ throw new Error(
+ `Ambiguous model id '${modelId}', found in ${authTypes}. Use object form to disambiguate, e.g.: { "id": "${modelId}", "authType": "${matches[0].authType}" }`,
+ );
+ }
+ return matches[0];
+ }
+
/**
* Reload models from updated configuration.
* Clears existing user-configured models and re-registers from new config.
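The lookup-with-ambiguity-check pattern used by `findModelById` can be sketched in isolation. This is a minimal illustration, not the real registry: `ModelInfo`, `findById`, and the plain `Map`-of-`Map`s stand in for `ResolvedModelConfig` and the per-authType model maps.

```typescript
// Illustrative stand-ins for the registry's internal types.
type ModelInfo = { id: string; baseUrl?: string };
type Match = { config: ModelInfo; authType: string };

// Scan every authType's model map; succeed only on a unique hit.
function findById(
  byAuthType: Map<string, Map<string, ModelInfo>>,
  modelId: string,
): Match | undefined {
  const matches: Match[] = [];
  for (const [authType, models] of byAuthType) {
    const model = models.get(modelId);
    if (model) matches.push({ config: model, authType });
  }
  if (matches.length === 0) return undefined;
  if (matches.length > 1) {
    // Same contract as the real method: ambiguity is an error, not a guess.
    const where = matches.map((m) => m.authType).join(' and ');
    throw new Error(`Ambiguous model id '${modelId}', found in ${where}.`);
  }
  return matches[0];
}
```

Failing loudly on duplicates (rather than picking the first provider) is what makes the string-ID shorthand in `review.models` safe: the error message can then point users to the object form with an explicit `authType`.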
diff --git a/packages/core/src/models/modelsConfig.ts b/packages/core/src/models/modelsConfig.ts
index d22cc790cb..f053ee37bb 100644
--- a/packages/core/src/models/modelsConfig.ts
+++ b/packages/core/src/models/modelsConfig.ts
@@ -230,6 +230,17 @@ export class ModelsConfig {
return this.authTypeWasExplicitlyProvided;
}
+ /**
+ * Find a model by ID across all authTypes.
+ * Returns the resolved config if found in exactly one authType.
+ * Throws if the model ID is ambiguous (found in multiple authTypes).
+ */
+ findModelById(
+ modelId: string,
+ ): { config: ResolvedModelConfig; authType: AuthType } | undefined {
+ return this.modelRegistry.findModelById(modelId);
+ }
+
/**
* Get available models for current authType
*/
diff --git a/packages/core/src/services/multiModelReviewService.test.ts b/packages/core/src/services/multiModelReviewService.test.ts
new file mode 100644
index 0000000000..807b110aba
--- /dev/null
+++ b/packages/core/src/services/multiModelReviewService.test.ts
@@ -0,0 +1,305 @@
+/**
+ * @license
+ * Copyright 2025 Qwen
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+/* eslint-disable @typescript-eslint/no-explicit-any */
+
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import {
+ MultiModelReviewService,
+ type ModelReviewResult,
+} from './multiModelReviewService.js';
+import type { Config } from '../config/config.js';
+import type { ResolvedModelConfig } from '../models/types.js';
+import { AuthType } from '../core/contentGenerator.js';
+
+// Mock createContentGenerator
+vi.mock('../core/contentGenerator.js', async (importOriginal) => {
+ const mod =
+ await importOriginal<typeof import('../core/contentGenerator.js')>();
+ return {
+ ...mod,
+ createContentGenerator: vi.fn(),
+ };
+});
+
+import { createContentGenerator } from '../core/contentGenerator.js';
+
+const mockedCreateContentGenerator = vi.mocked(createContentGenerator);
+
+function makeModel(id: string): ResolvedModelConfig {
+ return {
+ id,
+ name: id,
+ authType: AuthType.USE_OPENAI,
+ baseUrl: 'https://api.openai.com/v1',
+ generationConfig: {},
+ capabilities: {},
+ };
+}
+
+function makeGeneratorResponse(text: string) {
+ return {
+ candidates: [
+ {
+ content: {
+ parts: [{ text }],
+ },
+ },
+ ],
+ };
+}
+
+describe('MultiModelReviewService', () => {
+ let config: Config;
+ let service: MultiModelReviewService;
+
+ beforeEach(() => {
+ vi.clearAllMocks();
+ config = {} as Config;
+ service = new MultiModelReviewService(config);
+ });
+
+ describe('collectReviews', () => {
+ it('should collect reviews from multiple models in parallel', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi
+ .fn()
+ .mockResolvedValue(makeGeneratorResponse('Review from A')),
+ } as any);
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi
+ .fn()
+ .mockResolvedValue(makeGeneratorResponse('Review from B')),
+ } as any);
+
+ const result = await service.collectReviews('diff content', models);
+
+ expect(result.modelResults).toHaveLength(2);
+ expect(result.modelResults[0].modelId).toBe('model-a');
+ expect(result.modelResults[0].reviewText).toBe('Review from A');
+ expect(result.modelResults[1].modelId).toBe('model-b');
+ expect(result.modelResults[1].reviewText).toBe('Review from B');
+ expect(result.failedModels).toHaveLength(0);
+ expect(result.diff).toBe('diff content');
+ });
+
+ it('should handle partial failures gracefully', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi
+ .fn()
+ .mockResolvedValue(makeGeneratorResponse('Review from A')),
+ } as any);
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi
+ .fn()
+ .mockRejectedValue(new Error('API key invalid')),
+ } as any);
+
+ const result = await service.collectReviews('diff content', models);
+
+ // Only successful results are returned
+ expect(result.modelResults).toHaveLength(1);
+ expect(result.modelResults[0].modelId).toBe('model-a');
+ expect(result.failedModels).toHaveLength(1);
+ expect(result.failedModels[0].modelId).toBe('model-b');
+ expect(result.failedModels[0].error).toBe('API key invalid');
+ });
+
+ it('should return empty results when all models fail', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi.fn().mockRejectedValue(new Error('fail 1')),
+ } as any);
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi.fn().mockRejectedValue(new Error('fail 2')),
+ } as any);
+
+ const result = await service.collectReviews('diff content', models);
+
+ expect(result.modelResults).toHaveLength(0);
+ expect(result.failedModels).toHaveLength(2);
+ });
+
+ it('should treat empty responses as errors', async () => {
+ const models = [makeModel('model-a')];
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi.fn().mockResolvedValue(makeGeneratorResponse('')),
+ } as any);
+
+ const result = await service.collectReviews('diff content', models);
+
+ expect(result.modelResults).toHaveLength(0);
+ });
+
+ it('should handle createContentGenerator failure gracefully', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+
+ mockedCreateContentGenerator.mockRejectedValueOnce(
+ new Error('Failed to create generator'),
+ );
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi
+ .fn()
+ .mockResolvedValue(makeGeneratorResponse('Review from B')),
+ } as any);
+
+ const result = await service.collectReviews('diff content', models);
+
+ expect(result.modelResults).toHaveLength(1);
+ expect(result.modelResults[0].modelId).toBe('model-b');
+ });
+
+ it('should treat missing env var as error for model with envKey', async () => {
+ const model = makeModel('model-a');
+ model.envKey = 'MISSING_API_KEY';
+ // Ensure the env var is not set
+ delete process.env['MISSING_API_KEY'];
+
+ const result = await service.collectReviews('diff content', [model]);
+
+ expect(result.modelResults).toHaveLength(0);
+ expect(result.failedModels).toHaveLength(1);
+ expect(result.failedModels[0].error).toContain('MISSING_API_KEY');
+ });
+
+ it('should respect abort signal', async () => {
+ const models = [makeModel('model-a')];
+ const controller = new AbortController();
+ controller.abort();
+
+ const result = await service.collectReviews(
+ 'diff content',
+ models,
+ controller.signal,
+ );
+
+ // Aborted tasks should be treated as errors
+ expect(result.modelResults).toHaveLength(0);
+ });
+ });
+
+ describe('arbitrateIndependently', () => {
+ it('should produce arbitrated report from collected reviews', async () => {
+ const collected = {
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Found bug X' },
+ { modelId: 'model-b', reviewText: 'Found bug Y' },
+ ] as ModelReviewResult[],
+ failedModels: [],
+ diff: 'some diff',
+ };
+ const arbitrator = makeModel('arbitrator');
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi
+ .fn()
+ .mockResolvedValue(
+ makeGeneratorResponse('Unified report: bugs X and Y'),
+ ),
+ } as any);
+
+ const result = await service.arbitrateIndependently(
+ collected,
+ arbitrator,
+ );
+
+ expect(result.report).toBe('Unified report: bugs X and Y');
+ });
+
+ it('should throw when arbitrator returns empty response', async () => {
+ const collected = {
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Found bug X' },
+ ] as ModelReviewResult[],
+ failedModels: [],
+ diff: 'some diff',
+ };
+ const arbitrator = makeModel('arbitrator');
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi.fn().mockResolvedValue(makeGeneratorResponse('')),
+ } as any);
+
+ await expect(
+ service.arbitrateIndependently(collected, arbitrator),
+ ).rejects.toThrow(/empty response/i);
+ });
+
+ it('should propagate API errors from arbitrator', async () => {
+ const collected = {
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Found bug X' },
+ ] as ModelReviewResult[],
+ failedModels: [],
+ diff: 'some diff',
+ };
+ const arbitrator = makeModel('arbitrator');
+
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: vi.fn().mockRejectedValue(new Error('Auth failed')),
+ } as any);
+
+ await expect(
+ service.arbitrateIndependently(collected, arbitrator),
+ ).rejects.toThrow(/Auth failed/);
+ });
+
+ it('should include diff in the arbitration prompt', async () => {
+ const collected = {
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Review A' },
+ ] as ModelReviewResult[],
+ failedModels: [],
+ diff: 'important diff content',
+ };
+ const arbitrator = makeModel('arbitrator');
+
+ const mockGenerateContent = vi
+ .fn()
+ .mockResolvedValue(makeGeneratorResponse('report'));
+ mockedCreateContentGenerator.mockResolvedValueOnce({
+ generateContent: mockGenerateContent,
+ } as any);
+
+ await service.arbitrateIndependently(collected, arbitrator);
+
+ const prompt =
+ mockGenerateContent.mock.calls[0][0].contents[0].parts[0].text;
+ expect(prompt).toContain('important diff content');
+ expect(prompt).toContain('<diff>');
+ expect(prompt).toContain('Review by model-a');
+ });
+ });
+
+ describe('buildSessionArbitrationPrompt', () => {
+ it('should build prompt containing all model reviews but not the diff', () => {
+ const collected = {
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Review A content' },
+ { modelId: 'model-b', reviewText: 'Review B content' },
+ ] as ModelReviewResult[],
+ failedModels: [],
+ diff: 'the diff',
+ };
+
+ const prompt = service.buildSessionArbitrationPrompt(collected);
+
+ expect(prompt).toContain('Review by model-a');
+ expect(prompt).toContain('Review A content');
+ expect(prompt).toContain('Review by model-b');
+ expect(prompt).toContain('Review B content');
+ // Diff content is excluded because the session model already has it in context
+ expect(prompt).not.toContain('<diff>');
+ expect(prompt).toContain('already available in context');
+ });
+ });
+});
diff --git a/packages/core/src/services/multiModelReviewService.ts b/packages/core/src/services/multiModelReviewService.ts
new file mode 100644
index 0000000000..4e4694a455
--- /dev/null
+++ b/packages/core/src/services/multiModelReviewService.ts
@@ -0,0 +1,256 @@
+/**
+ * @license
+ * Copyright 2025 Qwen
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+import process from 'node:process';
+import pLimit from 'p-limit';
+import {
+ type ContentGeneratorConfig,
+ createContentGenerator,
+} from '../core/contentGenerator.js';
+import type { Config } from '../config/config.js';
+import type { ResolvedModelConfig } from '../models/types.js';
+import { getErrorMessage } from '../utils/errors.js';
+import { getResponseText } from '../utils/partUtils.js';
+import { createDebugLogger } from '../utils/debugLogger.js';
+
+const debugLogger = createDebugLogger('MULTI_MODEL_REVIEW');
+
+const CONCURRENCY_LIMIT = 4;
+
+const REVIEW_PROMPT_PREFIX = `Review the following code changes. Cover these dimensions:
+1. Correctness & Security — bugs, edge cases, vulnerabilities
+2. Code Quality — naming, duplication, style consistency
+3. Performance — bottlenecks, memory, unnecessary work
+4. Anything else that looks off
+
+For each finding, include: file path, line number (if applicable), severity (Critical / Suggestion / Nice to have), what's wrong, and suggested fix.
+
+End with a verdict: Approve, Request Changes, or Comment.
+
+`;
+
+const REVIEW_PROMPT_SUFFIX = `</diff>`;
+
+const ARBITRATION_PROMPT_TEMPLATE = `You are the senior code reviewer. Multiple models independently reviewed the same code changes. Your job is to produce the final unified review report.
+
+Tasks:
+1. **Merge & deduplicate**: Identify findings that refer to the same issue (even if described differently or pointing to nearby lines). Consolidate them, noting which models identified each issue.
+2. **Resolve severity conflicts**: When models disagree on severity for the same issue, evaluate the actual code and choose the appropriate level. Default to the HIGHER severity when uncertain.
+3. **Validate isolated findings**: For findings raised by only one model, verify against the code. Keep valid ones, dismiss false positives with reasoning.
+4. **Final verdict**: Approve / Request Changes / Comment, with reasoning.
+
+Output format:
+- Group findings by severity (Critical → Suggestion → Nice to have)
+- For each finding: [model names] file:line — title, description, suggested fix
+- End with verdict and one-sentence reasoning
+
+Each model's full review is provided below.
+Do NOT discard findings just because only one model raised them.`;
+
+/**
+ * Result from a single review model.
+ */
+export interface ModelReviewResult {
+ modelId: string;
+ reviewText: string;
+ error?: string;
+}
+
+/**
+ * Collected reviews from all models.
+ */
+export interface CollectedReview {
+ modelResults: ModelReviewResult[];
+ failedModels: ModelReviewResult[];
+ diff: string;
+}
+
+/**
+ * Final arbitrated review report.
+ */
+export interface ArbitratedReview {
+ report: string;
+}
+
+/**
+ * Service for multi-model code review.
+ * Phase 1: Parallel collection of reviews from multiple models.
+ * Phase 2: Arbitration by a designated or session model.
+ */
+export class MultiModelReviewService {
+ constructor(private readonly config: Config) {}
+
+ /**
+ * Phase 1: Collect reviews from multiple models in parallel.
+ */
+ async collectReviews(
+ diff: string,
+ reviewModels: ResolvedModelConfig[],
+ signal?: AbortSignal,
+ ): Promise<CollectedReview> {
+ const limit = pLimit(CONCURRENCY_LIMIT);
+ const diffLines = diff.split('\n').length;
+ const diffBytes = Buffer.byteLength(diff, 'utf8');
+ debugLogger.info(
+ `Dispatching diff to ${reviewModels.length} models (${diffLines} lines, ${diffBytes} bytes)`,
+ );
+
+ const prompt = `${REVIEW_PROMPT_PREFIX}<diff>\n${diff}\n${REVIEW_PROMPT_SUFFIX}`;
+
+ const results = await Promise.all(
+ reviewModels.map((model) =>
+ limit(async (): Promise<ModelReviewResult> => {
+ try {
+ signal?.throwIfAborted();
+ debugLogger.info(`Starting review with model: ${model.id}`);
+ const generatorConfig = this.buildGeneratorConfig(model);
+ const generator = await createContentGenerator(
+ generatorConfig,
+ this.config,
+ );
+
+ signal?.throwIfAborted();
+ const response = await generator.generateContent(
+ {
+ model: model.id,
+ contents: [{ role: 'user', parts: [{ text: prompt }] }],
+ config: {
+ abortSignal: signal,
+ },
+ },
+ `review-${model.id}`,
+ );
+
+ const text = getResponseText(response) ?? '';
+
+ if (!text.trim()) {
+ debugLogger.warn(
+ `Model ${model.id} returned empty review response`,
+ );
+ return {
+ modelId: model.id,
+ reviewText: '',
+ error: 'Empty response',
+ };
+ }
+
+ debugLogger.info(`Review complete from model: ${model.id}`);
+ return { modelId: model.id, reviewText: text };
+ } catch (error) {
+ const errorMsg = getErrorMessage(error);
+ debugLogger.error(
+ `Review failed for model ${model.id}: ${errorMsg}`,
+ );
+ return { modelId: model.id, reviewText: '', error: errorMsg };
+ }
+ }),
+ ),
+ );
+
+ const successful = results.filter((r) => !r.error);
+ const failed = results.filter((r) => r.error);
+
+ if (failed.length > 0) {
+ debugLogger.warn(
+ `${failed.length}/${results.length} review models failed: ${failed.map((r) => `${r.modelId} (${r.error})`).join(', ')}`,
+ );
+ }
+
+ return {
+ modelResults: successful,
+ failedModels: failed,
+ diff,
+ };
+ }
+
+ /**
+ * Phase 2: Independent arbitration using a configured arbitrator model.
+ * Used when review.arbitratorModel is set.
+ */
+ async arbitrateIndependently(
+ collected: CollectedReview,
+ arbitratorModel: ResolvedModelConfig,
+ signal?: AbortSignal,
+ ): Promise<ArbitratedReview> {
+ debugLogger.info(
+ `Starting independent arbitration with model: ${arbitratorModel.id}`,
+ );
+
+ const modelReviews = this.formatModelReviews(collected.modelResults);
+
+ const fullPrompt = `${ARBITRATION_PROMPT_TEMPLATE}\n\n${modelReviews}\n\n<diff>\n${collected.diff}\n</diff>`;
+
+ const generatorConfig = this.buildGeneratorConfig(arbitratorModel);
+ const generator = await createContentGenerator(
+ generatorConfig,
+ this.config,
+ );
+
+ const response = await generator.generateContent(
+ {
+ model: arbitratorModel.id,
+ contents: [{ role: 'user', parts: [{ text: fullPrompt }] }],
+ config: {
+ abortSignal: signal,
+ },
+ },
+ 'review-arbitrator',
+ );
+
+ const text = getResponseText(response) ?? '';
+
+ if (!text.trim()) {
+ throw new Error(
+ `Arbitrator model '${arbitratorModel.id}' returned empty response`,
+ );
+ }
+
+ debugLogger.info('Independent arbitration complete');
+ return { report: text };
+ }
+
+ /**
+ * Build the arbitration prompt for session-model arbitration.
+ * Excludes the diff since the session model already has it in context
+ * (it was passed as the tool's input parameter).
+ */
+ buildSessionArbitrationPrompt(collected: CollectedReview): string {
+ const modelReviews = this.formatModelReviews(collected.modelResults);
+
+ return `${ARBITRATION_PROMPT_TEMPLATE}\n\n${modelReviews}\n\nThe diff is already available in context from the tool input — refer to it when validating findings.`;
+ }
+
+ /**
+ * Format model review results into sections for the arbitration prompt.
+ */
+ private formatModelReviews(results: ModelReviewResult[]): string {
+ return results
+ .map((r) => `## Review by ${r.modelId}\n\n${r.reviewText}`)
+ .join('\n\n---\n\n');
+ }
+
+ /**
+ * Map ResolvedModelConfig to ContentGeneratorConfig.
+ */
+ private buildGeneratorConfig(
+ model: ResolvedModelConfig,
+ ): ContentGeneratorConfig {
+ const apiKey = model.envKey ? process.env[model.envKey] : undefined;
+ if (model.envKey && !apiKey) {
+ throw new Error(
+ `Environment variable '${model.envKey}' required for model '${model.id}' is not set.`,
+ );
+ }
+ return {
+ ...model.generationConfig,
+ model: model.id,
+ authType: model.authType,
+ apiKey,
+ apiKeyEnvKey: model.envKey,
+ baseUrl: model.baseUrl,
+ };
+ }
+}
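The phase-1 error-isolation contract above (every per-model task resolves with either a review or an `error` field, never rejects, and the caller partitions afterwards) can be sketched independently of the generator plumbing. In this hedged sketch, `runReview` is a hypothetical stand-in for the `createContentGenerator` + `generateContent` call, and the `p-limit` throttling is omitted:

```typescript
// Mirrors the collectReviews result shape: a failed task carries an error
// string instead of rejecting, so one bad model never aborts the batch.
interface TaskResult {
  modelId: string;
  reviewText: string;
  error?: string;
}

async function collectAll(
  modelIds: string[],
  runReview: (id: string) => Promise<string>, // stand-in for the model call
): Promise<{ ok: TaskResult[]; failed: TaskResult[] }> {
  const results = await Promise.all(
    modelIds.map(async (modelId): Promise<TaskResult> => {
      try {
        const text = await runReview(modelId);
        // An empty review is treated as a failure, same as the service.
        if (!text.trim()) {
          return { modelId, reviewText: '', error: 'Empty response' };
        }
        return { modelId, reviewText: text };
      } catch (e) {
        return { modelId, reviewText: '', error: String(e) };
      }
    }),
  );
  return {
    ok: results.filter((r) => !r.error),
    failed: results.filter((r) => r.error),
  };
}
```

Because each mapped task catches its own error, `Promise.all` here behaves like `Promise.allSettled` with a domain-specific result type, which keeps the partial-failure paths (one model down, all models down) easy to test.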
diff --git a/packages/core/src/skills/bundled/review/SKILL.md b/packages/core/src/skills/bundled/review/SKILL.md
index 14e5f27e6d..a62cb82e5d 100644
--- a/packages/core/src/skills/bundled/review/SKILL.md
+++ b/packages/core/src/skills/bundled/review/SKILL.md
@@ -7,6 +7,7 @@ allowedTools:
- grep_search
- read_file
- glob
+ - multi_model_review
---
# Code Review
@@ -29,7 +30,20 @@ Based on the arguments provided:
- Run `git diff HEAD -- ` to get recent changes
- If no diff, read the file and review its current state
-## Step 2: Parallel multi-dimensional review
+## Step 2: Try multi-model review
+
+Unless the user explicitly specified `--single`, check if the `multi_model_review` tool is available. If it is, call it with the diff obtained in Step 1.
+
+- If the tool returns a **complete review report** (from independent arbitration): present it directly as the final output using the format in Step 4, then stop.
+- If the tool returns **collected reviews from multiple models** (for session-model arbitration): you are the arbitrator. Merge, deduplicate, and validate findings from all models, then produce the final report using the format in Step 4. You have access to the full project context and tools to verify findings if needed.
+- If the tool returns a **single-model result** (only one of several models succeeded): treat this as a complete review and present it using the format in Step 4, then stop.
+- For any other tool result — setup guidance (< 2 models configured), all-models-failed message, or configuration error — proceed to Step 3. These are non-final results and the tool output may contain useful context but should not be treated as the final review.
+
+If the `multi_model_review` tool is not available (not in your tool list), or the user specified `--single`, proceed directly to Step 3.
+
+## Step 3: Single-model parallel multi-dimensional review
+
+This step is used when multi-model review is not available, or when the user specified `--single`.
Launch **four parallel review agents** to analyze the changes from different angles. Each agent should focus exclusively on its dimension.
@@ -77,9 +91,9 @@ Focus areas:
- Unexpected side effects or hidden coupling
- Anything else that looks off — trust your instincts
-## Step 3: Aggregate and present findings
+## Step 4: Aggregate and present findings
-Combine results from all four agents into a single, well-organized review. Use this format:
+Combine results into a single, well-organized review. Use this format:
### Summary
diff --git a/packages/core/src/tools/multiModelReview.test.ts b/packages/core/src/tools/multiModelReview.test.ts
new file mode 100644
index 0000000000..3305eec969
--- /dev/null
+++ b/packages/core/src/tools/multiModelReview.test.ts
@@ -0,0 +1,369 @@
+/**
+ * @license
+ * Copyright 2025 Qwen
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+/* eslint-disable @typescript-eslint/no-explicit-any */
+
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { MultiModelReviewTool } from './multiModelReview.js';
+import type { Config } from '../config/config.js';
+import type { ResolvedModelConfig, AvailableModel } from '../models/types.js';
+import { AuthType } from '../core/contentGenerator.js';
+
+// Mock the service
+vi.mock('../services/multiModelReviewService.js', () => ({
+ MultiModelReviewService: vi.fn().mockImplementation(() => ({
+ collectReviews: vi.fn(),
+ arbitrateIndependently: vi.fn(),
+ buildSessionArbitrationPrompt: vi.fn(),
+ })),
+}));
+
+import { MultiModelReviewService } from '../services/multiModelReviewService.js';
+
+const MockedService = vi.mocked(MultiModelReviewService);
+
+function makeModel(id: string): ResolvedModelConfig {
+ return {
+ id,
+ name: id,
+ authType: AuthType.USE_OPENAI,
+ baseUrl: 'https://api.openai.com/v1',
+ generationConfig: {},
+ capabilities: {},
+ };
+}
+
+function makeConfig(overrides: {
+ reviewModels?: ResolvedModelConfig[];
+ arbitratorModel?: ResolvedModelConfig;
+ allConfiguredModels?: AvailableModel[];
+}): Config {
+ return {
+ getReviewModels: vi.fn().mockReturnValue(overrides.reviewModels ?? []),
+ getArbitratorModel: vi.fn().mockReturnValue(overrides.arbitratorModel),
+ getAllConfiguredModels: vi
+ .fn()
+ .mockReturnValue(overrides.allConfiguredModels ?? []),
+ } as unknown as Config;
+}
+
+describe('MultiModelReviewTool', () => {
+ beforeEach(() => {
+ vi.clearAllMocks();
+ });
+
+ it('should return guidance when fewer than 2 models configured', async () => {
+ const config = makeConfig({
+ reviewModels: [makeModel('only-one')],
+ allConfiguredModels: [],
+ });
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain(
+ 'Multi-model review requires at least 2 configured models',
+ );
+ });
+
+ it('should return guidance when zero models configured', async () => {
+ const config = makeConfig({
+ reviewModels: [],
+ allConfiguredModels: [],
+ });
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain(
+ 'Multi-model review requires at least 2 configured models',
+ );
+ });
+
+ it('should list available models in guidance text', async () => {
+ const config = makeConfig({
+ reviewModels: [makeModel('only-one')],
+ allConfiguredModels: [
+ { id: 'gpt-4o', label: 'GPT-4o', authType: AuthType.USE_OPENAI },
+ {
+ id: 'claude-sonnet',
+ label: 'Claude Sonnet',
+ authType: AuthType.USE_OPENAI,
+ },
+ ] as AvailableModel[],
+ });
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('gpt-4o');
+ expect(text).toContain('claude-sonnet');
+ });
+
+ it('should return error when all review models fail', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+ const config = makeConfig({ reviewModels: models });
+
+ const serviceInstance = {
+ collectReviews: vi.fn().mockResolvedValue({
+ modelResults: [],
+ failedModels: [
+ { modelId: 'model-a', reviewText: '', error: 'timeout' },
+ { modelId: 'model-b', reviewText: '', error: 'rate limit' },
+ ],
+ diff: 'some diff',
+ }),
+ arbitrateIndependently: vi.fn(),
+ buildSessionArbitrationPrompt: vi.fn(),
+ };
+ MockedService.mockImplementation(() => serviceInstance as any);
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('All review models failed');
+ expect(text).toContain('model-a');
+ expect(text).toContain('timeout');
+ expect(result.returnDisplay).toContain('model-a');
+ expect(result.returnDisplay).toContain('model-b');
+ });
+
+ it('should return independent arbitration result when arbitrator is configured', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+ const arbitrator = makeModel('arbitrator');
+ const config = makeConfig({
+ reviewModels: models,
+ arbitratorModel: arbitrator,
+ });
+
+ const serviceInstance = {
+ collectReviews: vi.fn().mockResolvedValue({
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Review A' },
+ { modelId: 'model-b', reviewText: 'Review B' },
+ ],
+ failedModels: [],
+ diff: 'some diff',
+ }),
+ arbitrateIndependently: vi.fn().mockResolvedValue({
+ report: 'Final unified report',
+ }),
+ buildSessionArbitrationPrompt: vi.fn(),
+ };
+ MockedService.mockImplementation(() => serviceInstance as any);
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ expect(result.returnDisplay).toContain('Multi-model review complete');
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('Final unified report');
+ expect(text).toContain('model-a, model-b');
+ expect(text).toContain('arbitrator');
+ });
+
+ it('should fall back to session arbitration when arbitrator fails', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+ const arbitrator = makeModel('arbitrator');
+ const config = makeConfig({
+ reviewModels: models,
+ arbitratorModel: arbitrator,
+ });
+
+ const serviceInstance = {
+ collectReviews: vi.fn().mockResolvedValue({
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Review A' },
+ { modelId: 'model-b', reviewText: 'Review B' },
+ ],
+ failedModels: [],
+ diff: 'some diff',
+ }),
+ arbitrateIndependently: vi
+ .fn()
+ .mockRejectedValue(new Error('arbitrator down')),
+ buildSessionArbitrationPrompt: vi
+ .fn()
+ .mockReturnValue('arbitration prompt'),
+ };
+ MockedService.mockImplementation(() => serviceInstance as any);
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ expect(result.returnDisplay).toContain('Collected');
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain("Arbitrator model 'arbitrator' failed");
+ expect(text).toContain('session model');
+ });
+
+ it('should use session arbitration when no arbitrator configured', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+ const config = makeConfig({ reviewModels: models });
+
+ const serviceInstance = {
+ collectReviews: vi.fn().mockResolvedValue({
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Review A' },
+ { modelId: 'model-b', reviewText: 'Review B' },
+ ],
+ failedModels: [],
+ diff: 'some diff',
+ }),
+ arbitrateIndependently: vi.fn(),
+ buildSessionArbitrationPrompt: vi
+ .fn()
+ .mockReturnValue('session arbitration prompt'),
+ };
+ MockedService.mockImplementation(() => serviceInstance as any);
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ expect(result.returnDisplay).toContain('Collected');
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('Please act as the arbitrator');
+ });
+
+ it('should skip arbitration when only 1 model succeeds', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+ const config = makeConfig({ reviewModels: models });
+
+ const serviceInstance = {
+ collectReviews: vi.fn().mockResolvedValue({
+ modelResults: [{ modelId: 'model-a', reviewText: 'Only review' }],
+ failedModels: [
+ { modelId: 'model-b', reviewText: '', error: 'timeout' },
+ ],
+ diff: 'some diff',
+ }),
+ arbitrateIndependently: vi.fn(),
+ buildSessionArbitrationPrompt: vi.fn(),
+ };
+ MockedService.mockImplementation(() => serviceInstance as any);
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('Only review');
+ expect(text).toContain('Arbitration skipped');
+ expect(serviceInstance.arbitrateIndependently).not.toHaveBeenCalled();
+ expect(
+ serviceInstance.buildSessionArbitrationPrompt,
+ ).not.toHaveBeenCalled();
+ });
+
+ it('should surface arbitrator resolution failure in output', async () => {
+ const models = [makeModel('model-a'), makeModel('model-b')];
+ const config = {
+ getReviewModels: vi.fn().mockReturnValue(models),
+ getArbitratorModel: vi.fn().mockImplementation(() => {
+ throw new Error("Arbitrator model 'bad' not found");
+ }),
+ getAllConfiguredModels: vi.fn().mockReturnValue([]),
+ } as unknown as Config;
+
+ const serviceInstance = {
+ collectReviews: vi.fn().mockResolvedValue({
+ modelResults: [
+ { modelId: 'model-a', reviewText: 'Review A' },
+ { modelId: 'model-b', reviewText: 'Review B' },
+ ],
+ failedModels: [],
+ diff: 'some diff',
+ }),
+ arbitrateIndependently: vi.fn(),
+ buildSessionArbitrationPrompt: vi
+ .fn()
+ .mockReturnValue('arbitration prompt'),
+ };
+ MockedService.mockImplementation(() => serviceInstance as any);
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('could not be resolved');
+ expect(text).toContain('falling back to session model');
+ });
+
+ it('should handle config resolution errors gracefully', async () => {
+ const config = {
+ getReviewModels: vi.fn().mockImplementation(() => {
+ throw new Error("Model 'bad-model' not found");
+ }),
+ getAllConfiguredModels: vi.fn().mockReturnValue([]),
+ } as unknown as Config;
+
+ const tool = new MultiModelReviewTool(config);
+ const invocation = (tool as any).createInvocation({ diff: 'some diff' });
+ const result = await invocation.execute(new AbortController().signal);
+
+ const text = Array.isArray(result.llmContent)
+ ? result.llmContent[0].text
+ : result.llmContent;
+ expect(text).toContain('configuration error');
+ });
+
+ describe('validateToolParams', () => {
+ it('should reject empty diff', () => {
+ const config = makeConfig({});
+ const tool = new MultiModelReviewTool(config);
+
+ expect(tool.validateToolParams({ diff: '' })).toBe(
+ 'Parameter "diff" must be a non-empty string.',
+ );
+ });
+
+ it('should reject whitespace-only diff', () => {
+ const config = makeConfig({});
+ const tool = new MultiModelReviewTool(config);
+
+ expect(tool.validateToolParams({ diff: ' ' })).toBe(
+ 'Parameter "diff" must be a non-empty string.',
+ );
+ });
+
+ it('should accept valid diff', () => {
+ const config = makeConfig({});
+ const tool = new MultiModelReviewTool(config);
+
+ expect(tool.validateToolParams({ diff: '+ added line' })).toBeNull();
+ });
+ });
+});
diff --git a/packages/core/src/tools/multiModelReview.ts b/packages/core/src/tools/multiModelReview.ts
new file mode 100644
index 0000000000..2904696c8b
--- /dev/null
+++ b/packages/core/src/tools/multiModelReview.ts
@@ -0,0 +1,261 @@
+/**
+ * @license
+ * Copyright 2025 Qwen
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+import { BaseDeclarativeTool, BaseToolInvocation, Kind } from './tools.js';
+import { ToolNames, ToolDisplayNames } from './tool-names.js';
+import type { ToolResult, ToolResultDisplay } from './tools.js';
+import type { Config } from '../config/config.js';
+import type { ResolvedModelConfig } from '../models/types.js';
+import {
+ MultiModelReviewService,
+ type CollectedReview,
+} from '../services/multiModelReviewService.js';
+import { createDebugLogger } from '../utils/debugLogger.js';
+
+const debugLogger = createDebugLogger('MULTI_MODEL_REVIEW_TOOL');
+
+export interface MultiModelReviewParams {
+ diff: string;
+}
+
+/**
+ * Tool for multi-model code review.
+ * Sends the diff to multiple configured review models in parallel,
+ * then arbitrates results into a unified report.
+ */
+export class MultiModelReviewTool extends BaseDeclarativeTool<
+ MultiModelReviewParams,
+ ToolResult
+> {
+ static readonly Name: string = ToolNames.MULTI_MODEL_REVIEW;
+
+ constructor(private readonly config: Config) {
+ const schema = {
+ type: 'object',
+ properties: {
+ diff: {
+ type: 'string',
+ description: 'The code diff to review',
+ },
+ },
+ required: ['diff'],
+ additionalProperties: false,
+ $schema: 'http://json-schema.org/draft-07/schema#',
+ };
+
+ super(
+ MultiModelReviewTool.Name,
+ ToolDisplayNames.MULTI_MODEL_REVIEW,
+ 'Run multi-model code review. Sends the diff to multiple configured review models in parallel, then produces a unified review report. Requires review.models to be configured in settings with at least 2 models.',
+ Kind.Read,
+ schema,
+ true, // isOutputMarkdown
+ false, // canUpdateOutput
+ );
+ }
+
+ override validateToolParams(params: MultiModelReviewParams): string | null {
+ if (
+ !params.diff ||
+ typeof params.diff !== 'string' ||
+ !params.diff.trim()
+ ) {
+ return 'Parameter "diff" must be a non-empty string.';
+ }
+ return null;
+ }
+
+ protected createInvocation(params: MultiModelReviewParams) {
+ return new MultiModelReviewInvocation(this.config, params);
+ }
+}
+
+class MultiModelReviewInvocation extends BaseToolInvocation<
+ MultiModelReviewParams,
+ ToolResult
+> {
+ constructor(
+ private readonly config: Config,
+ params: MultiModelReviewParams,
+ ) {
+ super(params);
+ }
+
+ getDescription(): string {
+ return 'Run multi-model code review';
+ }
+
+ override async shouldConfirmExecute(): Promise<false> {
+ return false;
+ }
+
+ async execute(
+ signal: AbortSignal,
+ _updateOutput?: (output: ToolResultDisplay) => void,
+ ): Promise<ToolResult> {
+ // Resolve review models from config
+ let reviewModels: ResolvedModelConfig[];
+ try {
+ reviewModels = this.config.getReviewModels();
+ } catch (error) {
+ const msg = error instanceof Error ? error.message : String(error);
+ debugLogger.error(`Failed to resolve review models: ${msg}`);
+ return {
+ llmContent: [
+ {
+ text: `Multi-model review configuration error: ${msg}`,
+ },
+ ],
+ returnDisplay: `Configuration error: ${msg}`,
+ };
+ }
+
+ if (reviewModels.length < 2) {
+ // Return guidance — SKILL.md will naturally fall back to 4-agent flow
+ const guidance = this.buildGuidanceText();
+ return {
+ llmContent: [{ text: guidance }],
+ returnDisplay:
+ 'Multi-model review not available (< 2 models configured)',
+ };
+ }
+
+ const service = new MultiModelReviewService(this.config);
+
+ // Phase 1: Collect reviews
+ const collected = await service.collectReviews(
+ this.params.diff,
+ reviewModels,
+ signal,
+ );
+
+ const failureSummary = this.formatFailureSummary(collected);
+
+ if (collected.modelResults.length === 0) {
+ return {
+ llmContent: [
+ {
+ text: `All review models failed. Please proceed with standard single-model review using the 4-agent approach.\n\n${failureSummary}`,
+ },
+ ],
+ returnDisplay: `All review models failed: ${collected.failedModels.map((r) => r.modelId).join(', ')}`,
+ };
+ }
+
+ if (collected.modelResults.length === 1) {
+ // Only one model succeeded — arbitration adds no value, return its review directly
+ const single = collected.modelResults[0];
+ return {
+ llmContent: [
+ {
+ text: `**Review model:** ${single.modelId}\n**Note:** Only 1 of ${reviewModels.length} review models succeeded. Arbitration skipped.\n${failureSummary}\n\n${single.reviewText}`,
+ },
+ ],
+ returnDisplay: `Single model review (${reviewModels.length - 1} model(s) failed)`,
+ };
+ }
+
+ // Phase 2: Arbitration
+ let arbitratorFallbackReason: string | undefined;
+ let arbitratorModel: ResolvedModelConfig | undefined;
+ try {
+ arbitratorModel = this.config.getArbitratorModel();
+ } catch (error) {
+ const errorMsg = error instanceof Error ? error.message : String(error);
+ debugLogger.warn(
+ `Failed to resolve arbitrator model, falling back to session model: ${errorMsg}`,
+ );
+ arbitratorFallbackReason = `Configured arbitrator model could not be resolved (${errorMsg}), falling back to session model.`;
+ }
+
+ if (arbitratorModel) {
+ // Independent arbitration
+ try {
+ const result = await service.arbitrateIndependently(
+ collected,
+ arbitratorModel,
+ signal,
+ );
+
+ const header = this.buildReportHeader(
+ collected,
+ arbitratorModel.id,
+ failureSummary,
+ );
+ return {
+ llmContent: [{ text: `${header}\n\n${result.report}` }],
+ returnDisplay: `Multi-model review complete (${collected.modelResults.length} models + arbitrator)`,
+ };
+ } catch (error) {
+ const errorMsg = error instanceof Error ? error.message : String(error);
+ debugLogger.warn(
+ `Independent arbitration failed, falling back to session model arbitration: ${errorMsg}`,
+ );
+ arbitratorFallbackReason = `Arbitrator model '${arbitratorModel.id}' failed (${errorMsg}), falling back to session model.`;
+ // Fall through to session model arbitration
+ }
+ }
+
+ // Session model arbitration: return collected reviews for the session model
+ const arbitrationPrompt = service.buildSessionArbitrationPrompt(collected);
+ const header = this.buildReportHeader(
+ collected,
+ 'session model',
+ failureSummary,
+ );
+
+ const fallbackNote = arbitratorFallbackReason
+ ? `\n\n> **Note:** ${arbitratorFallbackReason}\n`
+ : '';
+
+ return {
+ llmContent: [
+ {
+ text: `${header}${fallbackNote}\n\nThe following reviews were collected from ${collected.modelResults.length} models. Please act as the arbitrator and produce the final unified review report.\n\n${arbitrationPrompt}`,
+ },
+ ],
+ returnDisplay: `Collected ${collected.modelResults.length} model reviews for arbitration`,
+ };
+ }
+
+ private buildReportHeader(
+ collected: CollectedReview,
+ arbitratorId: string,
+ failureSummary: string,
+ ): string {
+ const modelNames = collected.modelResults.map((r) => r.modelId).join(', ');
+ const header = `**Review models:** ${modelNames}\n**Arbitrator:** ${arbitratorId}`;
+ return failureSummary ? `${header}\n${failureSummary}` : header;
+ }
+
+ private formatFailureSummary(collected: CollectedReview): string {
+ if (collected.failedModels.length === 0) {
+ return '';
+ }
+ const details = collected.failedModels
+ .map((r) => `- ${r.modelId}: ${r.error ?? 'unknown error'}`)
+ .join('\n');
+ return `**Failed models (${collected.failedModels.length}):**\n${details}`;
+ }
+
+ private buildGuidanceText(): string {
+ const availableModels = this.config.getAllConfiguredModels();
+ const modelList =
+ availableModels.length > 0
+ ? availableModels.map((m) => ` - ${m.id} (${m.authType})`).join('\n')
+ : ' (none configured)';
+
+ return `Multi-model review requires at least 2 configured models.
+
+Available models from modelProviders:
+${modelList}
+
+To enable multi-model review, add to settings.json:
+ "review": { "models": ["model-a", "model-b"] }
+
+Please proceed with standard single-model review using the 4-agent approach.`;
+ }
+}
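The `execute()` method above encodes a four-stage fallback chain: independent arbitration when an arbitrator is configured and healthy, session-model arbitration otherwise, single-model passthrough when only one review survives, and guidance text when all models fail. A minimal sketch of that decision order as a pure function (the `ReviewOutcome` type and `decideArbitration` name are hypothetical, introduced here only for illustration and not part of the patch):

```typescript
// Hypothetical sketch: the fallback chain in MultiModelReviewInvocation.execute,
// reduced to a pure decision function. Names are illustrative only.

type ReviewOutcome =
  | { kind: 'all-failed' }
  | { kind: 'single'; modelId: string }
  | { kind: 'independent'; arbitratorId: string }
  | { kind: 'session'; fallbackReason?: string };

function decideArbitration(
  succeededModelIds: string[],
  arbitratorId: string | undefined,
  arbitratorHealthy: boolean,
): ReviewOutcome {
  // All models failed: caller falls back to the standard 4-agent review.
  if (succeededModelIds.length === 0) return { kind: 'all-failed' };
  // One survivor: arbitration adds no value, return its review directly.
  if (succeededModelIds.length === 1) {
    return { kind: 'single', modelId: succeededModelIds[0] };
  }
  // Configured and reachable arbitrator: independent arbitration.
  if (arbitratorId !== undefined && arbitratorHealthy) {
    return { kind: 'independent', arbitratorId };
  }
  // Otherwise the session model arbitrates the collected reviews.
  return {
    kind: 'session',
    fallbackReason:
      arbitratorId !== undefined
        ? `arbitrator '${arbitratorId}' unavailable`
        : undefined,
  };
}
```

Keeping the branching in this order mirrors the actual implementation: the cheap structural checks (zero or one result) short-circuit before any arbitrator call is attempted.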
diff --git a/packages/core/src/tools/tool-names.ts b/packages/core/src/tools/tool-names.ts
index c118bffbdf..6781b40727 100644
--- a/packages/core/src/tools/tool-names.ts
+++ b/packages/core/src/tools/tool-names.ts
@@ -26,6 +26,7 @@ export const ToolNames = {
LS: 'list_directory',
LSP: 'lsp',
ASK_USER_QUESTION: 'ask_user_question',
+ MULTI_MODEL_REVIEW: 'multi_model_review',
} as const;
/**
@@ -50,6 +51,7 @@ export const ToolDisplayNames = {
LS: 'ListFiles',
LSP: 'Lsp',
ASK_USER_QUESTION: 'AskUserQuestion',
+ MULTI_MODEL_REVIEW: 'MultiModelReview',
} as const;
// Migration from old tool names to new tool names
diff --git a/packages/vscode-ide-companion/schemas/settings.schema.json b/packages/vscode-ide-companion/schemas/settings.schema.json
index d0eef6ae98..3e2317fb29 100644
--- a/packages/vscode-ide-companion/schemas/settings.schema.json
+++ b/packages/vscode-ide-companion/schemas/settings.schema.json
@@ -612,6 +612,53 @@
}
}
},
+ "review": {
+ "description": "Multi-model code review configuration. When review.models is configured with 2+ models, /review will use multi-model review automatically.",
+ "type": "object",
+ "properties": {
+ "models": {
+ "description": "Models for multi-model review. Each entry can be a model ID string (resolved from modelProviders) or a full model config object with id, authType, baseUrl, envKey.",
+ "type": "array",
+ "items": {
+ "oneOf": [
+ {
+ "type": "string",
+ "description": "Model ID resolved from modelProviders"
+ },
+ {
+ "type": "object",
+ "description": "Inline model configuration",
+ "properties": {
+ "id": {
+ "type": "string",
+ "description": "Model identifier"
+ },
+ "authType": {
+ "type": "string",
+ "description": "Authentication type"
+ },
+ "baseUrl": {
+ "type": "string",
+ "description": "API base URL"
+ },
+ "envKey": {
+ "type": "string",
+ "description": "Environment variable for API key"
+ }
+ },
+ "required": [
+ "id"
+ ]
+ }
+ ]
+ }
+ },
+ "arbitratorModel": {
+ "description": "Model ID for the final arbitrator (resolved from modelProviders). Falls back to the current session model if not set. Recommended: a high-reasoning model.",
+ "type": "string"
+ }
+ }
+ },
"$version": {
"type": "number",
"description": "Settings schema version for migration tracking.",
diff --git a/scripts/generate-settings-schema.ts b/scripts/generate-settings-schema.ts
index 9d13e81666..3d5e833e61 100644
--- a/scripts/generate-settings-schema.ts
+++ b/scripts/generate-settings-schema.ts
@@ -34,6 +34,8 @@ interface JsonSchemaProperty {
description?: string;
properties?: Record<string, JsonSchemaProperty>;
items?: JsonSchemaProperty;
+ oneOf?: JsonSchemaProperty[];
+ required?: string[];
enum?: (string | number)[];
default?: unknown;
additionalProperties?: boolean | JsonSchemaProperty;
@@ -60,7 +62,9 @@ function convertSettingToJsonSchema(
break;
case 'array':
schema.type = 'array';
- schema.items = { type: 'string' };
+ schema.items = (setting.items as JsonSchemaProperty) ?? {
+ type: 'string',
+ };
break;
case 'enum':
if (setting.options && setting.options.length > 0) {
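The `generate-settings-schema.ts` change in the final hunk replaces the hardcoded `{ type: 'string' }` array items with a passthrough of a setting-supplied `items` schema, which is what lets `review.models` declare its `oneOf` (string or inline object) items. A reduced sketch of the new branch, with illustrative names that are not the script's actual internals:

```typescript
// Illustrative reduction of the updated 'array' branch in
// convertSettingToJsonSchema: prefer the setting's own `items` schema,
// falling back to the previous default of string items.
interface JsonSchemaProperty {
  type?: string;
  description?: string;
  oneOf?: JsonSchemaProperty[];
  items?: JsonSchemaProperty;
}

function resolveArrayItems(setting: {
  items?: JsonSchemaProperty;
}): JsonSchemaProperty {
  return setting.items ?? { type: 'string' };
}
```

Settings that never set `items` keep their old generated schema unchanged, so the passthrough is backward compatible.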