t1338.1: Extend model-routing.md with local tier by marcusquinn · Pull Request #2326 · marcusquinn/aidevops

marcusquinn · 2026-02-25T23:10:38Z

Summary

Adds local as tier 0 ($0 cost) in the model routing hierarchy: local -> haiku -> flash -> sonnet -> pro -> opus
Local tier uses llama.cpp with user-selected GGUF models for privacy-sensitive, offline, and bulk processing tasks
Falls back to haiku when local server is not running or no model is installed

Changes to `model-routing.md`

Section	Change
Quick Reference	Added cost spectrum line
Model Tiers	New `local` row: llama.cpp, Free ($0)
Routing Rules	New `Use local when` section (privacy, offline, bulk, experimentation)
Cost Estimation	`local` at 0x input/output ($0 electricity only)
Model-Specific Subagents	`local` -> `tools/local-models/local-models.md`, fallback: haiku
Provider Discovery	`local-model-helper.sh status` and `models` commands
Fallback Routing	`local` -> haiku when server not running
Decision Flowchart	Local-first branch before cloud routing
Examples	3 new local examples (bulk processing, offline, privacy)
Subagent Frontmatter	`local` added to valid model values
Related	Links to future local-models subagents and helper script

Verification

markdownlint: 0 errors
All acceptance criteria from t1338-brief.md verified for subtask t1338.1

Context

Part of plan p032 (Local AI Model Support). This is subtask 1 of 6 — documentation-only change to model-routing.md. Future subtasks (t1338.2-t1338.6) will create the local-models subagents, helper script, and usage logging.

Closes #2320

Summary by CodeRabbit

Documentation
- Added a new local model tier and updated cost/tiers table.
- Clarified when to use local models, limitations, and offline/privacy scenarios.
- Expanded decision flowchart and routing guidance to include local-first and cost-driven fallbacks.
- Added examples for bulk local processing and offline use.
- Documented local-model discovery, status/management commands, fallback behavior, and upcoming local runtime notes.

Add local as tier 0 (/bin/zsh cost) in the model routing hierarchy: local -> haiku -> flash -> sonnet -> pro -> opus Changes to model-routing.md: - Model Tiers table: local tier with llama.cpp GGUF - Routing Rules: new 'Use local when' section with privacy/offline/bulk criteria - Cost Estimation: local at 0x (electricity only) - Model-Specific Subagents: local -> tools/local-models/local-models.md - Fallback Routing: local falls back to haiku when server not running - Decision Flowchart: local-first branch before cloud routing - Examples: 3 local tier examples (bulk, offline, privacy) - Provider Discovery: local-model-helper.sh status/models commands - Subagent Frontmatter: local added to valid model values - Related: links to future local-models subagents and helper script Closes #2320

gemini-code-assist · 2026-02-25T23:10:56Z

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates support for local AI models into the existing model routing system. It introduces a new "local" tier, enabling users to leverage on-device llama.cpp models for tasks requiring privacy, offline capability, or bulk processing, with a clear fallback to cloud models when local options are unavailable.

Highlights

Local Model Tier Introduction: Introduced a new "local" model tier (tier 0) into the model routing hierarchy, positioned as the cheapest option before "haiku".
Technology and Use Cases: The "local" tier utilizes llama.cpp with user-selected GGUF models, designed for privacy-sensitive, offline, and bulk processing tasks.
Fallback Mechanism: Defined a fallback mechanism where the system will route to "haiku" if the local server is not running or no local model is installed.
Documentation Updates: Comprehensively updated model-routing.md to reflect the new "local" tier, including routing rules, cost estimation, subagent definitions, fallback logic, decision flowchart, and examples.

Changelog

.agents/tools/context/model-routing.md
- Added "local" as the lowest-cost tier in the model routing hierarchy and to the model tiers table.
- Detailed the use cases and limitations for the "local" model tier in a new section.
- Updated valid model values to include "local" for subagent frontmatter.
- Included "local" in the cost estimation table with zero cost.
- Defined the subagent and fallback for the "local" tier in the model-specific subagents table.
- Provided commands for discovering local model status and available models.
- Modified the decision flowchart to prioritize local models.
- Added new examples demonstrating the use of local models for specific tasks.
- Linked to new documentation and scripts related to local models in the "Related" section.

Activity

This is the first subtask (t1338.1) of a larger plan (p032) to support local AI models, focusing solely on documentation.
Future subtasks (t1338.2-t1338.6) are planned to implement local-models subagents, helper scripts, and usage logging.
This pull request addresses and closes issue t1338.1: Extend model-routing.md with local tier #2320.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

coderabbitai · 2026-02-25T23:11:01Z

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 53 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 83a43fa and 71fddb9.

📒 Files selected for processing (1)

.agents/tools/context/model-routing.md

Walkthrough

This PR extends the model-routing.md documentation to include a new "local" tier for on-device AI model inference via llama.cpp. It adds routing rules, cost estimation, decision flowchart paths, examples, and helper command references for local model management alongside existing cloud-based tiers.

Changes

Cohort / File(s)	Summary
Model Routing Documentation `.agents/tools/context/model-routing.md`	Added local model tier throughout: new entry in cost spectrum and model table, "Use local" routing rules section, cost estimation row (0x/0x), local-first decision flowchart path, local/offline examples, and helper command references for status checking and model discovery.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

PR #758: Extends model-routing documentation with model-specific subagent tier definitions; directly related through shared routing tier documentation patterns.
PR #782: Implements model-registry-helper.sh "route" command logic that will consume the new local tier routing rules and decision flowchart paths defined in this PR.

Poem

🚀 Local tides turn, no cloud in sight,
llama.cpp whispers through the night,
Routing tiers complete their dance—
Free to budget, budget to branch,
*Now devices glow with models bright. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically identifies the main change: extending model-routing.md documentation with a new local model tier.
Linked Issues check	✅ Passed	The PR comprehensively addresses all coding and documentation requirements from issue `#2320`: adds local tier to model-routing.md with cost spectrum, routing rules, decision flowchart, examples, and subagent references as specified.
Out of Scope Changes check	✅ Passed	All changes are scoped to extending model-routing.md documentation with the local tier; no unrelated modifications or code changes outside the stated objectives.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch chore/t1338.1-model-routing-local-tier

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-02-25T23:11:18Z

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 90 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Wed Feb 25 23:11:13 UTC 2026: Code review monitoring started
Wed Feb 25 23:11:14 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 90

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 90
VULNERABILITIES: 0

Generated on: Wed Feb 25 23:11:16 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request provides a comprehensive documentation update to model-routing.md, introducing the new local model tier. The changes are well-structured and cover all necessary aspects, including routing rules, cost, and fallback mechanisms. I've identified one inconsistency regarding the model cost ordering and have a suggestion to improve the clarity of the limitations section. Overall, this is a great enhancement to the documentation.

gemini-code-assist · 2026-02-25T23:12:43Z

.agents/tools/context/model-routing.md

 - **Purpose**: Route tasks to the cheapest model that can handle them well
 - **Philosophy**: Use the smallest model that produces acceptable quality
 - **Default**: sonnet (best balance of cost/capability for most tasks)
+- **Cost spectrum**: local (free) -> haiku -> flash -> sonnet -> pro -> opus (highest)


There's an inconsistency in the model cost ordering. The 'Cost spectrum' line lists haiku as cheaper than flash. However, the 'Cost Estimation' table below indicates that flash (~0.20x) is more cost-effective than haiku (~0.25x). To ensure consistency and prevent confusion, the order in the cost spectrum should be updated to reflect the data in the table.

Suggested change

- **Cost spectrum**: local (free) -> haiku -> flash -> sonnet -> pro -> opus (highest)

- **Cost spectrum**: local (free) -> flash -> haiku -> sonnet -> pro -> opus (highest)

gemini-code-assist · 2026-02-25T23:12:44Z

.agents/tools/context/model-routing.md

+- Simple tasks where network latency exceeds local inference time
+- The task fits within the local model's capability (typically <32K context, simpler reasoning)
+
+**Limitations**: Local models are smaller and less capable than cloud models. Do not route complex reasoning, large-context analysis, or architecture decisions to local. If a local model is not running or not installed, skip to `haiku`.


The 'Limitations' section currently mixes model capability constraints with the fallback mechanism. To improve clarity, I suggest separating these. This makes a clear distinction between guidance (when not to use local) and automatic behavior (what happens if local is unavailable).

Suggested change

**Limitations**: Local models are smaller and less capable than cloud models. Do not route complex reasoning, large-context analysis, or architecture decisions to local. If a local model is not running or not installed, skip to `haiku`.

**Limitations**: Local models are smaller and less capable than cloud models. Do not route complex reasoning, large-context analysis, or architecture decisions to local.

If a local model is not running or not installed, the request automatically falls back to `haiku`.

augmentcode · 2026-02-25T23:13:26Z

🤖 Augment PR Summary

Summary: Extends model-routing.md to introduce a tier-0 local option (llama.cpp + user-selected GGUF) for offline/privacy/bulk workloads.

Changes:

Adds local to the tier hierarchy, cost spectrum, and cost estimation table
Documents when to route to local, its limitations, and a local-first decision flow
Defines local→haiku fallback behavior when the local server/model is unavailable
Adds examples and related links/commands for upcoming local-model tooling

Technical Notes: Keeps sonnet as the default; local is conditional and intended to fall back to cloud tiers when not viable.

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-25T23:13:27Z

.agents/tools/context/model-routing.md

 ```

-Valid values: `haiku`, `flash`, `sonnet`, `pro`, `opus`
+Valid values: `local`, `haiku`, `flash`, `sonnet`, `pro`, `opus`


This doc now states model: local is a valid tier, but several scripts appear to validate/resolve tiers against a fixed set that currently excludes local (e.g., tier checks in model-availability-helper.sh and model-label-helper.sh). That mismatch could cause routing/labeling failures if someone starts using local immediately based on this guidance.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-25T23:13:27Z

.agents/tools/context/model-routing.md


 | Tier | Subagent | Primary Model | Fallback |
 |------|----------|---------------|----------|
+| `local` | `tools/local-models/local-models.md` | llama.cpp (user GGUF) | haiku |


tools/local-models/local-models.md (and related local-model docs) are referenced here, but those files don’t exist in the repo yet, so this will be a broken reference for readers until the follow-up subtasks land. Consider marking these as “coming in t1338.x” or deferring the references until the files are added.

Severity: medium

Other Locations

.agents/tools/context/model-routing.md:300

.agents/tools/context/model-routing.md:301

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-02-25T23:13:27Z

.agents/tools/context/model-routing.md


 Discovery checks three sources (in order): environment variables, gopass encrypted secrets, plaintext `credentials.sh`. Use discovery output to constrain routing to models the user can actually access.

+For local models, use `local-model-helper.sh status` to check if a local model server is running:


This section instructs running local-model-helper.sh, but that helper script is not present in the repo yet, so the command is currently non-actionable. Consider explicitly noting it’s planned for a later subtask to avoid confusing users following the doc today.

Severity: medium

Other Locations

.agents/tools/context/model-routing.md:302

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.agents/tools/context/model-routing.md (1)
121-126: Clarify subagent location scope to avoid path confusion.

The sentence on Line 121 says model subagents are defined under tools/ai-assistants/models/, but the new local entry on Line 125 points to tools/local-models/local-models.md. Update wording to “primarily” or “across these paths” so the docs stay operationally accurate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/model-routing.md around lines 121 - 126, The sentence
stating that concrete model subagents are defined under
tools/ai-assistants/models/ is too narrow given the new local entry pointing to
tools/local-models/local-models.md; update the wording (the paragraph
introducing the table) to indicate subagents are "primarily defined under
tools/ai-assistants/models/" or "defined across these paths (e.g.,
tools/ai-assistants/models/ and tools/local-models/)" so the table entries like
the local subagent and the haiku entry are accurately covered.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/tools/context/model-routing.md:
- Around line 40-50: The policy currently allows silent fallback from the
"local" route to "haiku", which violates on-device/privacy constraints; change
the wording and logic so that when a task requires on-device handling ("local")
and the local model is not running or installed, the system must fail closed and
not auto-route to "haiku" or any cloud option without an explicit override;
update the text replacing "skip to `haiku`" with a clear fail-closed statement
and add a requirement that any escalation to `haiku` (or cloud) must be
performed only after an explicit, auditable override, and mirror this change in
the routing flowchart references to "local" -> "haiku".

---

Nitpick comments:
In @.agents/tools/context/model-routing.md:
- Around line 121-126: The sentence stating that concrete model subagents are
defined under tools/ai-assistants/models/ is too narrow given the new local
entry pointing to tools/local-models/local-models.md; update the wording (the
paragraph introducing the table) to indicate subagents are "primarily defined
under tools/ai-assistants/models/" or "defined across these paths (e.g.,
tools/ai-assistants/models/ and tools/local-models/)" so the table entries like
the local subagent and the haiku entry are accurately covered.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0dea12 and 83a43fa.

📒 Files selected for processing (1)

.agents/tools/context/model-routing.md

.agents/tools/context/model-routing.md

…review) - Local route no longer silently falls back to cloud haiku when the task requires on-device processing (privacy, compliance, air-gapped). It now fails with an error unless --allow-cloud is explicitly passed. - Cost-optimisation and experimentation use cases still fall back to haiku. - Updated decision flowchart, fallback table, and subagent table to reflect the fail-closed/fail-open distinction. - Updated subagent table intro to cover both tools/ai-assistants/models/ (cloud) and tools/local-models/ (local) paths.

marcusquinn · 2026-02-25T23:17:57Z

@coderabbitai review

Addressed your feedback:

Fail-closed for privacy/on-device: Local tier no longer silently falls back to cloud. Privacy/compliance tasks FAIL when local is unavailable — requires explicit --allow-cloud override. Cost-optimisation tasks still fall back to haiku.
Subagent table intro: Updated to say subagents are "defined across these paths" covering both tools/ai-assistants/models/ (cloud) and tools/local-models/ (local).
Flowchart updated: Decision tree now branches on privacy-constrained vs cost-optimisation before checking local availability.
Fallback table updated: Local tier row now shows conditional fallback behaviour.

github-actions · 2026-02-25T23:18:23Z

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 90 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Wed Feb 25 23:18:19 UTC 2026: Code review monitoring started
Wed Feb 25 23:18:19 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 90

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 90
VULNERABILITIES: 0

Generated on: Wed Feb 25 23:18:22 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

coderabbitai · 2026-02-25T23:18:52Z

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the model-routing.md file to verify the changes
cat .agents/tools/context/model-routing.md

Length of output: 13109

@marcusquinn Excellent work addressing the feedback! I've verified all four changes:

✅ Fail-closed for privacy: The fallback behavior now correctly distinguishes between privacy/on-device requirements (FAIL with explicit --allow-cloud override needed) and cost-optimization scenarios (fallback to haiku). This is a critical security improvement.
✅ Subagent table intro: Now clearly states subagents are "defined across these paths" covering both tools/ai-assistants/models/ (cloud) and tools/local-models/ (local).
✅ Flowchart logic: The decision tree now properly branches on privacy-constrained vs cost-optimization before checking local availability, making the routing logic explicit and unambiguous.
✅ Fallback table: The local tier row now shows the conditional fallback behavior matching the flowchart and narrative documentation.

The documentation is now internally consistent across all sections (routing rules, flowchart, fallback table, and examples). The fail-closed approach for privacy-sensitive tasks is a best practice that prevents accidental data leakage to cloud providers.

Triggering a full review now.

🧠 Learnings used

Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- Fix cost spectrum ordering: flash (0.20x) < haiku (0.25x) per cost table (Gemini) - Reorder all tables/sections to match: flash before haiku consistently - Update local fallback from haiku to flash (cheapest cloud by blended cost) - Add (planned: t1338.x) annotations to not-yet-existing files/scripts (Augment) - Add note that local tier runtime support is coming in t1338.2-t1338.6 (Augment) - Separate limitations from fallback was already done in prior commit (Gemini)

marcusquinn · 2026-02-25T23:30:33Z

@coderabbitai review

Addressed all remaining reviewer feedback:

Cost spectrum ordering (Gemini): Fixed flash/haiku order across all tables, spectrum line, routing rules, flowchart, and fallback tables. Flash (~0.20x) is cheaper than haiku (~0.25x) per the cost estimation table.
Local fallback target: Updated from haiku to flash (cheapest cloud tier by blended cost) for consistency.
Planned file annotations (Augment): All not-yet-existing files (local-models.md, huggingface.md, local-model-helper.sh) now marked with (planned: t1338.x).
Runtime support note (Augment): Added blockquote note that model: local in frontmatter will fall back to haiku until t1338.2-t1338.6 land.
Limitations/fallback separation (Gemini): Already addressed in prior commit a19eadb.

github-actions · 2026-02-25T23:30:43Z

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 90 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Wed Feb 25 23:30:39 UTC 2026: Code review monitoring started
Wed Feb 25 23:30:39 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 90

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 90
VULNERABILITIES: 0

Generated on: Wed Feb 25 23:30:42 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

coderabbitai · 2026-02-25T23:30:50Z

@marcusquinn Thank you for the comprehensive updates! I can see you've systematically addressed all the feedback points:

✅ Cost spectrum ordering: flash/haiku order corrected consistently
✅ Local fallback: Updated to flash (cheapest cloud tier)
✅ Planned file annotations: Clear markers for future implementation
✅ Runtime support note: Sets proper expectations for current behavior
✅ Limitations/fallback separation: Already addressed previously

Triggering a full review now to verify all changes are consistent across the documentation.

🧠 Learnings used

Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

sonarqubecloud · 2026-02-25T23:31:31Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

marcusquinn · 2026-02-25T23:57:25Z

@coderabbitai review

Both CodeRabbit review items are already addressed in commits a19eadb and 71fddb9:

Fail-closed for privacy/on-device (actionable comment on lines 40-50): Lines 49-54 now distinguish privacy/on-device (FAIL with --allow-cloud override required) vs cost-optimisation (fallback to flash). The flowchart (lines 247-264) and fallback table (line 183) are consistent.
Subagent location scope (nitpick on lines 121-126): Line 128 now reads "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)".

All CI checks pass. Requesting re-review to clear the CHANGES_REQUESTED state.

coderabbitai · 2026-02-25T23:58:10Z

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the model-routing.md file to verify the changes mentioned by the user
cat .agents/tools/context/model-routing.md

Length of output: 13457

@marcusquinn Perfect! I've verified both changes are correctly implemented:

✅ Fail-closed for privacy/on-device:
- Lines 49-54: Clear distinction between privacy (FAIL with --allow-cloud override) vs cost-optimization (fallback to flash)
- Lines 247-264: Flowchart correctly branches on privacy-constrained vs cost-optimization first
- Line 183: Fallback table accurately reflects conditional behavior
✅ Subagent location scope:
- Line 128: Now explicitly states "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)"

All documentation is internally consistent. Triggering review to clear the CHANGES_REQUESTED state.

🧠 Learnings used

Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

All review feedback addressed in commits a19eadb and 71fddb9. CodeRabbit confirmed all changes correct. Dismissing stale CHANGES_REQUESTED.

marcusquinn · 2026-02-26T00:00:13Z

@coderabbitai review

All review feedback has been addressed across commits a19eadb and 71fddb9:

Fail-closed for privacy/on-device (actionable): Lines 49-54 now distinguish privacy (FAIL with --allow-cloud override) vs cost-optimisation (fallback to flash). Flowchart (lines 247-264) and fallback table (line 183) are consistent.
Subagent location scope (nitpick): Line 128 now reads "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)".
Cost spectrum ordering (Gemini): flash/haiku order corrected across all tables, spectrum line, routing rules, flowchart, and fallback tables.
Planned file annotations (Augment): All not-yet-existing files marked with (planned: t1338.x).
Runtime support note (Augment): Blockquote note that model: local falls back to haiku until t1338.2-t1338.6 land.

markdownlint: 0 errors. All CI checks pass. Requesting fresh review to clear stale CHANGES_REQUESTED state.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.agents/tools/context/model-routing.md (1)
53-53: --allow-cloud is referenced but has no definition anchor or planned-subtask annotation.

Both Line 53 and Line 250 cite --allow-cloud as the escape hatch for the fail-closed privacy path, but there is no in-document definition of what accepts this flag (CLI? dispatch.sh? subagent invocation?), nor a (planned: t1338.x) annotation to indicate where it will be implemented — unlike the other planned artefacts (local-model-helper.sh, local-models.md, etc.) that are consistently annotated throughout. A brief parenthetical pointing to the implementing subtask would keep the doc internally consistent.
✏️ Proposed annotation
-- **Privacy/on-device requirement**: FAIL — do not route to cloud. Return an error instructing the user to start the local server or pass `--allow-cloud` to explicitly override.
+- **Privacy/on-device requirement**: FAIL — do not route to cloud. Return an error instructing the user to start the local server or pass `--allow-cloud` to explicitly override (planned: t1338.x).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/model-routing.md at line 53, Add a short in-document
definition/anchor for the CLI flag `--allow-cloud` and annotate it with the
planned subtask (e.g., `(planned: t1338.x)`) so readers know where it will be
implemented; update the two occurrences that reference `--allow-cloud` (the
privacy fail message and the line at 250) to either link to that anchor or
include the parenthetical planned-subtask note, ensuring the flag name
`--allow-cloud` is consistently defined and annotated like other artifacts
(`local-model-helper.sh`, `local-models.md`).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/tools/context/model-routing.md:
- Around line 107-110: The runtime-support note for the "local" tier incorrectly
states it will fall back to "haiku"; update that sentence to say it will fall
back to "flash" so it matches the rest of the document (the "Fall back to
`flash`" line, the Fallback Routing table, and the cost spectrum). Locate the
note text that mentions "`local` in frontmatter will fall back to `haiku`" and
replace "haiku" with "flash" to align the policy and tables.

---

Nitpick comments:
In @.agents/tools/context/model-routing.md:
- Line 53: Add a short in-document definition/anchor for the CLI flag
`--allow-cloud` and annotate it with the planned subtask (e.g., `(planned:
t1338.x)`) so readers know where it will be implemented; update the two
occurrences that reference `--allow-cloud` (the privacy fail message and the
line at 250) to either link to that anchor or include the parenthetical
planned-subtask note, ensuring the flag name `--allow-cloud` is consistently
defined and annotated like other artifacts (`local-model-helper.sh`,
`local-models.md`).

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83a43fa and 71fddb9.

📒 Files selected for processing (1)

.agents/tools/context/model-routing.md

coderabbitai · 2026-02-26T00:01:28Z

.agents/tools/context/model-routing.md

+Valid values: `local`, `haiku`, `flash`, `sonnet`, `pro`, `opus`
+
+> **Note**: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `haiku`.



⚠️ Potential issue | 🟡 Minor

haiku fallback in the runtime-support note contradicts the flash fallback defined everywhere else.

Line 109 says local frontmatter will fall back to haiku until the runtime subtasks land. However, the rest of the document consistently targets flash as the cost-optimisation fallback — Line 54 ("Fall back to flash"), the Fallback Routing table (Line 183: flash (cost-only)), and the cost spectrum itself (Line 25) which places flash below haiku in cost (~0.20x vs ~0.25x). The note should align with the established policy.

✏️ Proposed fix

-> **Note**: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `haiku`. +> **Note**: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `flash`.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.agents/tools/context/model-routing.md around lines 107 - 110, The runtime-support note for the "local" tier incorrectly states it will fall back to "haiku"; update that sentence to say it will fall back to "flash" so it matches the rest of the document (the "Fall back to `flash`" line, the Fallback Routing table, and the cost spectrum). Locate the note text that mentions "`local` in frontmatter will fall back to `haiku`" and replace "haiku" with "flash" to align the policy and tables.

coderabbitai · 2026-02-26T00:01:36Z

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the model-routing.md file to verify the changes mentioned by the user
cat .agents/tools/context/model-routing.md

Length of output: 13457

@marcusquinn Excellent work! I've verified all 5 items you addressed:

✅ Fail-closed for privacy/on-device (Lines 49-54, 247-264, 183): Privacy tasks correctly FAIL with --allow-cloud override required; cost-optimization tasks fall back to flash. Flowchart, routing rules, and fallback table are all consistent.
✅ Subagent location scope (Line 128): Now explicitly states "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)".
✅ Cost spectrum ordering: flash/haiku order corrected throughout - cost spectrum line (26), model tiers table (30-37), routing rules sections, cost estimation table (114-121), flowchart (247-264), and fallback table (183). Flash (~0.20x) correctly shown as cheaper than haiku (~0.25x).
✅ Planned file annotations: All not-yet-existing files properly marked:
- local-models.md (planned: t1338.2)
- huggingface.md (planned: t1338.3)
- local-model-helper.sh (planned: t1338.4)
✅ Runtime support note (Lines 102-104): Blockquote clearly states model: local in frontmatter will fall back to haiku until t1338.2-t1338.6 land, properly setting expectations for current behavior.

The documentation is internally consistent across all sections. Triggering review to clear the CHANGES_REQUESTED state.

🧠 Learnings used

Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- Fix cost spectrum order: flash (~0.20x) is cheaper than haiku (~0.25x), so the spectrum now reads local -> flash -> haiku -> sonnet -> pro -> opus - Add `local` tier to VALID_MODELS in model-label-helper.sh so local-tier tasks can be labelled and queried without falling through to the unknown model warning - Add `local` to is_known_tier() and get_tier_models() in model-availability-helper.sh so routing/availability checks accept the local tier without returning exit 1 - Broken references to tools/local-models/ and local-model-helper.sh are now resolved: those files were created in t1338.2-6 after PR #2326 merged Closes #3299

gemini-code-assist bot reviewed Feb 25, 2026

View reviewed changes

augmentcode bot reviewed Feb 25, 2026

View reviewed changes

coderabbitai bot previously requested changes Feb 25, 2026

View reviewed changes

.agents/tools/context/model-routing.md Show resolved Hide resolved

marcusquinn mentioned this pull request Feb 25, 2026

t1338.2: Create local-models.md subagent #2328

Merged

marcusquinn merged commit e7225e0 into main Feb 26, 2026
11 checks passed

github-actions bot mentioned this pull request Feb 26, 2026

t1338.1: Extend model-routing.md with local tier #2320

Closed

coderabbitai bot requested changes Feb 26, 2026

View reviewed changes

coderabbitai bot mentioned this pull request Feb 26, 2026

t1338.1: Align local tier fallback to haiku and add local to routing table #2385

Merged

coderabbitai bot mentioned this pull request Feb 26, 2026

t1338.6: Add huggingface.md reference to AGENTS.md domain index #2394

Merged

4 tasks

marcusquinn mentioned this pull request Feb 27, 2026

t1338: PR-merge triggered issue-sync for closing hygiene #2327

Closed

marcusquinn deleted the chore/t1338.1-model-routing-local-tier branch March 3, 2026 03:24

marcusquinn mentioned this pull request Mar 7, 2026

quality-debt: .agents/tools/context/model-routing.md — PR #2326 review feedback (high) #3299

Closed

marcusquinn mentioned this pull request Mar 14, 2026

t3299: fix PR #2326 review feedback on model-routing.md #4598

Merged

alex-solovyev mentioned this pull request Mar 14, 2026

[Supervisor:alex-solovyev] 1 PR, 0 assigned, 1 worker at 17:30 UTC #4433

Closed

marcusquinn mentioned this pull request Mar 14, 2026

[Supervisor:marcusquinn] 0 PRs, 0 assigned, 0 workers at 13:17 UTC #2645

Closed

	- Cost spectrum: local (free) -> haiku -> flash -> sonnet -> pro -> opus (highest)
	- Cost spectrum: local (free) -> flash -> haiku -> sonnet -> pro -> opus (highest)


		Discovery checks three sources (in order): environment variables, gopass encrypted secrets, plaintext `credentials.sh`. Use discovery output to constrain routing to models the user can actually access.

		For local models, use `local-model-helper.sh status` to check if a local model server is running:

		Valid values: `local`, `haiku`, `flash`, `sonnet`, `pro`, `opus`

		> Note: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `haiku`.

Conversation

marcusquinn commented Feb 25, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes to model-routing.md

Verification

Context

Summary by CodeRabbit

Uh oh!

gemini-code-assist bot commented Feb 25, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

coderabbitai bot commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

github-actions bot commented Feb 25, 2026

🔍 Code Quality Report

📈 Current Quality Metrics

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot commented Feb 25, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

augmentcode bot Feb 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

marcusquinn commented Feb 25, 2026

Uh oh!

github-actions bot commented Feb 25, 2026

🔍 Code Quality Report

📈 Current Quality Metrics

Uh oh!

coderabbitai bot commented Feb 25, 2026

Uh oh!

marcusquinn commented Feb 25, 2026

Uh oh!

github-actions bot commented Feb 25, 2026

🔍 Code Quality Report

📈 Current Quality Metrics

Uh oh!

coderabbitai bot commented Feb 25, 2026

Uh oh!

sonarqubecloud bot commented Feb 25, 2026

Quality Gate passed

Uh oh!

marcusquinn commented Feb 25, 2026

Uh oh!

coderabbitai bot commented Feb 25, 2026

Uh oh!

marcusquinn commented Feb 26, 2026

Uh oh!

marcusquinn commented Feb 25, 2026 •

edited by coderabbitai bot

Loading

Changes to `model-routing.md`

coderabbitai bot commented Feb 25, 2026 •

edited

Loading