
t1338.1: Extend model-routing.md with local tier#2326

Merged
marcusquinn merged 3 commits into main from
chore/t1338.1-model-routing-local-tier
Feb 26, 2026

Conversation

@marcusquinn
Owner

@marcusquinn marcusquinn commented Feb 25, 2026

Summary

  • Adds local as tier 0 ($0 cost) in the model routing hierarchy: local -> haiku -> flash -> sonnet -> pro -> opus
  • Local tier uses llama.cpp with user-selected GGUF models for privacy-sensitive, offline, and bulk processing tasks
  • Falls back to haiku when local server is not running or no model is installed
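The hierarchy and fallback rule above can be sketched as a small routing function. This is an illustrative sketch only, assuming a `route_tier` helper that takes the requested tier and a precomputed availability flag; neither is part of the helper scripts this PR documents.

```shell
#!/bin/sh
# Hypothetical sketch of tier-0 fallback: "local" drops to "haiku"
# when the local llama.cpp server is not running.

route_tier() {
  requested="$1"   # tier requested by the caller, e.g. "local" or "sonnet"
  local_up="$2"    # "yes" if a local-server health check succeeded
  if [ "$requested" = "local" ] && [ "$local_up" != "yes" ]; then
    echo "haiku"   # cheapest cloud tier; local is unavailable
  else
    echo "$requested"
  fi
}

route_tier local no    # prints: haiku
route_tier local yes   # prints: local
```

In practice the `local_up` flag would come from probing the local server (for example via a status command), keeping the routing decision itself a pure function that is easy to test.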

Changes to model-routing.md

  • Quick Reference: Added cost spectrum line
  • Model Tiers: New local row (llama.cpp, Free, $0)
  • Routing Rules: New "Use local when" section (privacy, offline, bulk, experimentation)
  • Cost Estimation: local at 0x input/output ($0, electricity only)
  • Model-Specific Subagents: local -> tools/local-models/local-models.md, fallback: haiku
  • Provider Discovery: local-model-helper.sh status and models commands
  • Fallback Routing: local -> haiku when server not running
  • Decision Flowchart: Local-first branch before cloud routing
  • Examples: 3 new local examples (bulk processing, offline, privacy)
  • Subagent Frontmatter: local added to valid model values
  • Related: Links to future local-models subagents and helper script

Verification

  • markdownlint: 0 errors
  • All acceptance criteria from t1338-brief.md verified for subtask t1338.1

Context

Part of plan p032 (Local AI Model Support). This is subtask 1 of 6 — documentation-only change to model-routing.md. Future subtasks (t1338.2-t1338.6) will create the local-models subagents, helper script, and usage logging.

Closes #2320

Summary by CodeRabbit

  • Documentation
    • Added a new local model tier and updated cost/tiers table.
    • Clarified when to use local models, limitations, and offline/privacy scenarios.
    • Expanded decision flowchart and routing guidance to include local-first and cost-driven fallbacks.
    • Added examples for bulk local processing and offline use.
    • Documented local-model discovery, status/management commands, fallback behavior, and upcoming local runtime notes.

Add local as tier 0 ($0 cost) in the model routing hierarchy:
local -> haiku -> flash -> sonnet -> pro -> opus

Changes to model-routing.md:
- Model Tiers table: local tier with llama.cpp GGUF
- Routing Rules: new 'Use local when' section with privacy/offline/bulk criteria
- Cost Estimation: local at 0x (electricity only)
- Model-Specific Subagents: local -> tools/local-models/local-models.md
- Fallback Routing: local falls back to haiku when server not running
- Decision Flowchart: local-first branch before cloud routing
- Examples: 3 local tier examples (bulk, offline, privacy)
- Provider Discovery: local-model-helper.sh status/models commands
- Subagent Frontmatter: local added to valid model values
- Related: links to future local-models subagents and helper script

Closes #2320
@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates support for local AI models into the existing model routing system. It introduces a new "local" tier, enabling users to leverage on-device llama.cpp models for tasks requiring privacy, offline capability, or bulk processing, with a clear fallback to cloud models when local options are unavailable.

Highlights

  • Local Model Tier Introduction: Introduced a new "local" model tier (tier 0) into the model routing hierarchy, positioned as the cheapest option before "haiku".
  • Technology and Use Cases: The "local" tier utilizes llama.cpp with user-selected GGUF models, designed for privacy-sensitive, offline, and bulk processing tasks.
  • Fallback Mechanism: Defined a fallback mechanism where the system will route to "haiku" if the local server is not running or no local model is installed.
  • Documentation Updates: Comprehensively updated model-routing.md to reflect the new "local" tier, including routing rules, cost estimation, subagent definitions, fallback logic, decision flowchart, and examples.
Changelog
  • .agents/tools/context/model-routing.md
    • Added "local" as the lowest-cost tier in the model routing hierarchy and to the model tiers table.
    • Detailed the use cases and limitations for the "local" model tier in a new section.
    • Updated valid model values to include "local" for subagent frontmatter.
    • Included "local" in the cost estimation table with zero cost.
    • Defined the subagent and fallback for the "local" tier in the model-specific subagents table.
    • Provided commands for discovering local model status and available models.
    • Modified the decision flowchart to prioritize local models.
    • Added new examples demonstrating the use of local models for specific tasks.
    • Linked to new documentation and scripts related to local models in the "Related" section.
Activity
  • This is the first subtask (t1338.1) of a larger plan (p032) to support local AI models, focusing solely on documentation.
  • Future subtasks (t1338.2-t1338.6) are planned to implement local-models subagents, helper scripts, and usage logging.
  • This pull request addresses and closes issue #2320 (t1338.1: Extend model-routing.md with local tier).
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 53 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 83a43fa and 71fddb9.

📒 Files selected for processing (1)
  • .agents/tools/context/model-routing.md

Walkthrough

This PR extends the model-routing.md documentation to include a new "local" tier for on-device AI model inference via llama.cpp. It adds routing rules, cost estimation, decision flowchart paths, examples, and helper command references for local model management alongside existing cloud-based tiers.

Changes

Model Routing Documentation (.agents/tools/context/model-routing.md):
Added local model tier throughout: new entry in cost spectrum and model table, "Use local" routing rules section, cost estimation row (0x/0x), local-first decision flowchart path, local/offline examples, and helper command references for status checking and model discovery.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • PR #758: Extends model-routing documentation with model-specific subagent tier definitions; directly related through shared routing tier documentation patterns.
  • PR #782: Implements model-registry-helper.sh "route" command logic that will consume the new local tier routing rules and decision flowchart paths defined in this PR.

Poem

🚀 Local tides turn, no cloud in sight,
llama.cpp whispers through the night,
Routing tiers complete their dance—
Free to budget, budget to branch,
Now devices glow with models bright.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically identifies the main change: extending model-routing.md documentation with a new local model tier.
  • Linked Issues Check: ✅ Passed. The PR comprehensively addresses all coding and documentation requirements from issue #2320: adds local tier to model-routing.md with cost spectrum, routing rules, decision flowchart, examples, and subagent references as specified.
  • Out of Scope Changes Check: ✅ Passed. All changes are scoped to extending model-routing.md documentation with the local tier; no unrelated modifications or code changes outside the stated objectives.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chore/t1338.1-model-routing-local-tier

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 90 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 23:11:13 UTC 2026: Code review monitoring started
Wed Feb 25 23:11:14 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 90

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 90
  • VULNERABILITIES: 0

Generated on: Wed Feb 25 23:11:16 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request provides a comprehensive documentation update to model-routing.md, introducing the new local model tier. The changes are well-structured and cover all necessary aspects, including routing rules, cost, and fallback mechanisms. I've identified one inconsistency regarding the model cost ordering and have a suggestion to improve the clarity of the limitations section. Overall, this is a great enhancement to the documentation.

- **Purpose**: Route tasks to the cheapest model that can handle them well
- **Philosophy**: Use the smallest model that produces acceptable quality
- **Default**: sonnet (best balance of cost/capability for most tasks)
- **Cost spectrum**: local (free) -> haiku -> flash -> sonnet -> pro -> opus (highest)

high

There's an inconsistency in the model cost ordering. The 'Cost spectrum' line lists haiku as cheaper than flash. However, the 'Cost Estimation' table below indicates that flash (~0.20x) is more cost-effective than haiku (~0.25x). To ensure consistency and prevent confusion, the order in the cost spectrum should be updated to reflect the data in the table.

Suggested change
- **Cost spectrum**: local (free) -> haiku -> flash -> sonnet -> pro -> opus (highest)
- **Cost spectrum**: local (free) -> flash -> haiku -> sonnet -> pro -> opus (highest)

- Simple tasks where network latency exceeds local inference time
- The task fits within the local model's capability (typically <32K context, simpler reasoning)

**Limitations**: Local models are smaller and less capable than cloud models. Do not route complex reasoning, large-context analysis, or architecture decisions to local. If a local model is not running or not installed, skip to `haiku`.

medium

The 'Limitations' section currently mixes model capability constraints with the fallback mechanism. To improve clarity, I suggest separating these. This makes a clear distinction between guidance (when not to use local) and automatic behavior (what happens if local is unavailable).

Suggested change
**Limitations**: Local models are smaller and less capable than cloud models. Do not route complex reasoning, large-context analysis, or architecture decisions to local. If a local model is not running or not installed, skip to `haiku`.
**Limitations**: Local models are smaller and less capable than cloud models. Do not route complex reasoning, large-context analysis, or architecture decisions to local.
If a local model is not running or not installed, the request automatically falls back to `haiku`.

@augmentcode

augmentcode bot commented Feb 25, 2026

🤖 Augment PR Summary

Summary: Extends model-routing.md to introduce a tier-0 local option (llama.cpp + user-selected GGUF) for offline/privacy/bulk workloads.

Changes:

  • Adds local to the tier hierarchy, cost spectrum, and cost estimation table
  • Documents when to route to local, its limitations, and a local-first decision flow
  • Defines local→haiku fallback behavior when the local server/model is unavailable
  • Adds examples and related links/commands for upcoming local-model tooling

Technical Notes: Keeps sonnet as the default; local is conditional and intended to fall back to cloud tiers when not viable.

🤖 Was this summary useful? React with 👍 or 👎


@augmentcode augmentcode bot left a comment


Review completed. 3 suggestions posted.


Comment augment review to trigger a new review at any time.


Valid values: `haiku`, `flash`, `sonnet`, `pro`, `opus`
Valid values: `local`, `haiku`, `flash`, `sonnet`, `pro`, `opus`

This doc now states model: local is a valid tier, but several scripts appear to validate/resolve tiers against a fixed set that currently excludes local (e.g., tier checks in model-availability-helper.sh and model-label-helper.sh). That mismatch could cause routing/labeling failures if someone starts using local immediately based on this guidance.

Severity: medium


🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
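The mismatch Augment flags above can be illustrated with a whitelist-style tier check. This is a hypothetical sketch under stated assumptions: the real validation lives in model-availability-helper.sh and model-label-helper.sh, whose actual contents are not shown in this PR; `validate_tier` and `VALID_TIERS` are illustrative names.

```shell
#!/bin/sh
# Hypothetical whitelist check: until "local" is added to the list,
# any script validating this way rejects it even though the docs allow it.
VALID_TIERS="local haiku flash sonnet pro opus"   # "local" newly included

validate_tier() {
  for t in $VALID_TIERS; do
    if [ "$1" = "$t" ]; then
      return 0
    fi
  done
  echo "error: unknown tier '$1'" >&2
  return 1
}

validate_tier local && echo "local accepted"   # prints: local accepted
```

Keeping the valid-tier list in one shared place (rather than duplicated per script) would make this class of doc/script drift harder to introduce.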


| Tier | Subagent | Primary Model | Fallback |
|------|----------|---------------|----------|
| `local` | `tools/local-models/local-models.md` | llama.cpp (user GGUF) | haiku |

tools/local-models/local-models.md (and related local-model docs) are referenced here, but those files don’t exist in the repo yet, so this will be a broken reference for readers until the follow-up subtasks land. Consider marking these as “coming in t1338.x” or deferring the references until the files are added.

Severity: medium

Other Locations
  • .agents/tools/context/model-routing.md:300
  • .agents/tools/context/model-routing.md:301


🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.


Discovery checks three sources (in order): environment variables, gopass encrypted secrets, plaintext `credentials.sh`. Use discovery output to constrain routing to models the user can actually access.

For local models, use `local-model-helper.sh status` to check if a local model server is running:

This section instructs running local-model-helper.sh, but that helper script is not present in the repo yet, so the command is currently non-actionable. Consider explicitly noting it’s planned for a later subtask to avoid confusing users following the doc today.

Severity: medium

Other Locations
  • .agents/tools/context/model-routing.md:302


🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

coderabbitai[bot]
coderabbitai bot previously requested changes Feb 25, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
.agents/tools/context/model-routing.md (1)

121-126: Clarify subagent location scope to avoid path confusion.

The sentence on Line 121 says model subagents are defined under tools/ai-assistants/models/, but the new local entry on Line 125 points to tools/local-models/local-models.md. Update wording to “primarily” or “across these paths” so the docs stay operationally accurate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/model-routing.md around lines 121 - 126, The sentence
stating that concrete model subagents are defined under
tools/ai-assistants/models/ is too narrow given the new local entry pointing to
tools/local-models/local-models.md; update the wording (the paragraph
introducing the table) to indicate subagents are "primarily defined under
tools/ai-assistants/models/" or "defined across these paths (e.g.,
tools/ai-assistants/models/ and tools/local-models/)" so the table entries like
the local subagent and the haiku entry are accurately covered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/tools/context/model-routing.md:
- Around line 40-50: The policy currently allows silent fallback from the
"local" route to "haiku", which violates on-device/privacy constraints; change
the wording and logic so that when a task requires on-device handling ("local")
and the local model is not running or installed, the system must fail closed and
not auto-route to "haiku" or any cloud option without an explicit override;
update the text replacing "skip to `haiku`" with a clear fail-closed statement
and add a requirement that any escalation to `haiku` (or cloud) must be
performed only after an explicit, auditable override, and mirror this change in
the routing flowchart references to "local" -> "haiku".

---

Nitpick comments:
In @.agents/tools/context/model-routing.md:
- Around line 121-126: The sentence stating that concrete model subagents are
defined under tools/ai-assistants/models/ is too narrow given the new local
entry pointing to tools/local-models/local-models.md; update the wording (the
paragraph introducing the table) to indicate subagents are "primarily defined
under tools/ai-assistants/models/" or "defined across these paths (e.g.,
tools/ai-assistants/models/ and tools/local-models/)" so the table entries like
the local subagent and the haiku entry are accurately covered.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0dea12 and 83a43fa.

📒 Files selected for processing (1)
  • .agents/tools/context/model-routing.md

…review)

- Local route no longer silently falls back to cloud haiku when the task
  requires on-device processing (privacy, compliance, air-gapped). It now
  fails with an error unless --allow-cloud is explicitly passed.
- Cost-optimisation and experimentation use cases still fall back to haiku.
- Updated decision flowchart, fallback table, and subagent table to
  reflect the fail-closed/fail-open distinction.
- Updated subagent table intro to cover both tools/ai-assistants/models/
  (cloud) and tools/local-models/ (local) paths.
@marcusquinn
Owner Author

@coderabbitai review

Addressed your feedback:

  1. Fail-closed for privacy/on-device: Local tier no longer silently falls back to cloud. Privacy/compliance tasks FAIL when local is unavailable — requires explicit --allow-cloud override. Cost-optimisation tasks still fall back to haiku.
  2. Subagent table intro: Updated to say subagents are "defined across these paths" covering both tools/ai-assistants/models/ (cloud) and tools/local-models/ (local).
  3. Flowchart updated: Decision tree now branches on privacy-constrained vs cost-optimisation before checking local availability.
  4. Fallback table updated: Local tier row now shows conditional fallback behaviour.
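The fail-closed/fail-open distinction in point 1 can be sketched as a routing function. This is illustrative only: `route_local` and its arguments are assumptions, the real dispatch logic lands in later subtasks, and the fallback target shown matches the haiku behaviour described at this point in the review.

```shell
#!/bin/sh
# Hypothetical sketch: fail-closed for privacy tasks, fail-open
# (cloud fallback) for cost-optimisation tasks.

route_local() {
  privacy="$1"      # "yes" when the task is privacy/compliance constrained
  local_up="$2"     # "yes" when the local llama.cpp server is running
  allow_cloud="$3"  # "--allow-cloud" to explicitly override fail-closed

  if [ "$local_up" = "yes" ]; then
    echo "local"
    return 0
  fi
  if [ "$privacy" = "yes" ] && [ "$allow_cloud" != "--allow-cloud" ]; then
    echo "error: local unavailable; refusing cloud fallback for privacy task" >&2
    return 1   # fail closed: no silent cloud routing
  fi
  echo "haiku"  # fail open: cost-optimisation tasks fall back to cloud
}

route_local no no ""                # prints: haiku
route_local yes no --allow-cloud    # prints: haiku (explicit override)
```

The key property is that the privacy branch returns a non-zero status instead of a cloud tier, so callers cannot accidentally route sensitive work off-device.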

@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 90 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 23:18:19 UTC 2026: Code review monitoring started
Wed Feb 25 23:18:19 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 90

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 90
  • VULNERABILITIES: 0

Generated on: Wed Feb 25 23:18:22 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the model-routing.md file to verify the changes
cat .agents/tools/context/model-routing.md

Length of output: 13109


@marcusquinn Excellent work addressing the feedback! I've verified all four changes:

  1. Fail-closed for privacy: The fallback behavior now correctly distinguishes between privacy/on-device requirements (FAIL with explicit --allow-cloud override needed) and cost-optimization scenarios (fallback to haiku). This is a critical security improvement.

  2. Subagent table intro: Now clearly states subagents are "defined across these paths" covering both tools/ai-assistants/models/ (cloud) and tools/local-models/ (local).

  3. Flowchart logic: The decision tree now properly branches on privacy-constrained vs cost-optimization before checking local availability, making the routing logic explicit and unambiguous.

  4. Fallback table: The local tier row now shows the conditional fallback behavior matching the flowchart and narrative documentation.

The documentation is now internally consistent across all sections (routing rules, flowchart, fallback table, and examples). The fail-closed approach for privacy-sensitive tasks is a best practice that prevents accidental data leakage to cloud providers.

Triggering a full review now.


🧠 Learnings used
Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

- Fix cost spectrum ordering: flash (0.20x) < haiku (0.25x) per cost table (Gemini)
- Reorder all tables/sections to match: flash before haiku consistently
- Update local fallback from haiku to flash (cheapest cloud by blended cost)
- Add (planned: t1338.x) annotations to not-yet-existing files/scripts (Augment)
- Add note that local tier runtime support is coming in t1338.2-t1338.6 (Augment)
- Separate limitations from fallback was already done in prior commit (Gemini)
@marcusquinn
Owner Author

@coderabbitai review

Addressed all remaining reviewer feedback:

  1. Cost spectrum ordering (Gemini): Fixed flash/haiku order across all tables, spectrum line, routing rules, flowchart, and fallback tables. Flash (~0.20x) is cheaper than haiku (~0.25x) per the cost estimation table.
  2. Local fallback target: Updated from haiku to flash (cheapest cloud tier by blended cost) for consistency.
  3. Planned file annotations (Augment): All not-yet-existing files (local-models.md, huggingface.md, local-model-helper.sh) now marked with (planned: t1338.x).
  4. Runtime support note (Augment): Added blockquote note that model: local in frontmatter will fall back to haiku until t1338.2-t1338.6 land.
  5. Limitations/fallback separation (Gemini): Already addressed in prior commit a19eadb.

@github-actions
Contributor

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 90 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 23:30:39 UTC 2026: Code review monitoring started
Wed Feb 25 23:30:39 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 90

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 90
  • VULNERABILITIES: 0

Generated on: Wed Feb 25 23:30:42 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

@marcusquinn Thank you for the comprehensive updates! I can see you've systematically addressed all the feedback points:

  1. Cost spectrum ordering: flash/haiku order corrected consistently
  2. Local fallback: Updated to flash (cheapest cloud tier)
  3. Planned file annotations: Clear markers for future implementation
  4. Runtime support note: Sets proper expectations for current behavior
  5. Limitations/fallback separation: Already addressed previously

Triggering a full review now to verify all changes are consistent across the documentation.


🧠 Learnings used
Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcusquinn
Owner Author

@coderabbitai review

Both CodeRabbit review items are already addressed in commits a19eadb and 71fddb9:

  1. Fail-closed for privacy/on-device (actionable comment on lines 40-50): Lines 49-54 now distinguish privacy/on-device (FAIL with --allow-cloud override required) vs cost-optimisation (fallback to flash). The flowchart (lines 247-264) and fallback table (line 183) are consistent.

  2. Subagent location scope (nitpick on lines 121-126): Line 128 now reads "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)".

All CI checks pass. Requesting re-review to clear the CHANGES_REQUESTED state.

@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the model-routing.md file to verify the changes mentioned by the user
cat .agents/tools/context/model-routing.md

Length of output: 13457


@marcusquinn Perfect! I've verified both changes are correctly implemented:

  1. Fail-closed for privacy/on-device:

    • Lines 49-54: Clear distinction between privacy (FAIL with --allow-cloud override) vs cost-optimization (fallback to flash)
    • Lines 247-264: Flowchart correctly branches on privacy-constrained vs cost-optimization first
    • Line 183: Fallback table accurately reflects conditional behavior
  2. Subagent location scope:

    • Line 128: Now explicitly states "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)"

All documentation is internally consistent. Triggering review to clear the CHANGES_REQUESTED state.


🧠 Learnings used
Learnt from: marcusquinn
Repo: marcusquinn/aidevops PR: 1633
File: .agents/tools/build-agent/build-agent.md:441-450
Timestamp: 2026-02-18T03:15:45.314Z
Learning: In the aidevops framework, agent frontmatter uses tier aliases (`haiku`, `sonnet`, `opus`) for the `model:` field, which are resolved by the supervisor's model routing system. These are NOT OpenCode's native `provider/model-id` format — they're project-specific aliases that map to actual model IDs based on configuration. Full `provider/model-id` format is only used when intentionally pinning to a specific model version.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcusquinn marcusquinn dismissed coderabbitai[bot]’s stale review February 25, 2026 23:59

All review feedback addressed in commits a19eadb and 71fddb9. CodeRabbit confirmed all changes correct. Dismissing stale CHANGES_REQUESTED.

@marcusquinn
Owner Author

@coderabbitai review

All review feedback has been addressed across commits a19eadb and 71fddb9:

  1. Fail-closed for privacy/on-device (actionable): Lines 49-54 now distinguish privacy (FAIL with --allow-cloud override) vs cost-optimisation (fallback to flash). Flowchart (lines 247-264) and fallback table (line 183) are consistent.
  2. Subagent location scope (nitpick): Line 128 now reads "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)".
  3. Cost spectrum ordering (Gemini): flash/haiku order corrected across all tables, spectrum line, routing rules, flowchart, and fallback tables.
  4. Planned file annotations (Augment): All not-yet-existing files marked with (planned: t1338.x).
  5. Runtime support note (Augment): Blockquote note that model: local falls back to haiku until t1338.2-t1338.6 land.

markdownlint: 0 errors. All CI checks pass. Requesting fresh review to clear stale CHANGES_REQUESTED state.

@marcusquinn marcusquinn merged commit e7225e0 into main Feb 26, 2026
11 checks passed
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
.agents/tools/context/model-routing.md (1)

53-53: --allow-cloud is referenced but has no definition anchor or planned-subtask annotation.

Both Line 53 and Line 250 cite --allow-cloud as the escape hatch for the fail-closed privacy path, but there is no in-document definition of what accepts this flag (CLI? dispatch.sh? subagent invocation?), nor a (planned: t1338.x) annotation to indicate where it will be implemented — unlike the other planned artefacts (local-model-helper.sh, local-models.md, etc.) that are consistently annotated throughout. A brief parenthetical pointing to the implementing subtask would keep the doc internally consistent.

✏️ Proposed annotation
-- **Privacy/on-device requirement**: FAIL — do not route to cloud. Return an error instructing the user to start the local server or pass `--allow-cloud` to explicitly override.
+- **Privacy/on-device requirement**: FAIL — do not route to cloud. Return an error instructing the user to start the local server or pass `--allow-cloud` to explicitly override (planned: t1338.x).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/model-routing.md at line 53, Add a short in-document
definition/anchor for the CLI flag `--allow-cloud` and annotate it with the
planned subtask (e.g., `(planned: t1338.x)`) so readers know where it will be
implemented; update the two occurrences that reference `--allow-cloud` (the
privacy fail message and the line at 250) to either link to that anchor or
include the parenthetical planned-subtask note, ensuring the flag name
`--allow-cloud` is consistently defined and annotated like other artifacts
(`local-model-helper.sh`, `local-models.md`).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/tools/context/model-routing.md:
- Around line 107-110: The runtime-support note for the "local" tier incorrectly
states it will fall back to "haiku"; update that sentence to say it will fall
back to "flash" so it matches the rest of the document (the "Fall back to
`flash`" line, the Fallback Routing table, and the cost spectrum). Locate the
note text that mentions "`local` in frontmatter will fall back to `haiku`" and
replace "haiku" with "flash" to align the policy and tables.

---

Nitpick comments:
In @.agents/tools/context/model-routing.md:
- Line 53: Add a short in-document definition/anchor for the CLI flag
`--allow-cloud` and annotate it with the planned subtask (e.g., `(planned:
t1338.x)`) so readers know where it will be implemented; update the two
occurrences that reference `--allow-cloud` (the privacy fail message and the
line at 250) to either link to that anchor or include the parenthetical
planned-subtask note, ensuring the flag name `--allow-cloud` is consistently
defined and annotated like other artifacts (`local-model-helper.sh`,
`local-models.md`).

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83a43fa and 71fddb9.

📒 Files selected for processing (1)
  • .agents/tools/context/model-routing.md

Comment on lines +107 to 110
Valid values: `local`, `haiku`, `flash`, `sonnet`, `pro`, `opus`

> **Note**: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `haiku`.

Contributor


⚠️ Potential issue | 🟡 Minor

haiku fallback in the runtime-support note contradicts the flash fallback defined everywhere else.

Line 109 says local frontmatter will fall back to haiku until the runtime subtasks land. However, the rest of the document consistently targets flash as the cost-optimisation fallback — Line 54 ("Fall back to flash"), the Fallback Routing table (Line 183: flash (cost-only)), and the cost spectrum itself (Line 25) which places flash below haiku in cost (~0.20x vs ~0.25x). The note should align with the established policy.

✏️ Proposed fix
-> **Note**: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `haiku`.
+> **Note**: The `local` tier is documented here but runtime support (helper scripts, availability checks) is being added in subtasks t1338.2-t1338.6. Until those land, `local` in frontmatter will fall back to `flash`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/tools/context/model-routing.md around lines 107 - 110, The
runtime-support note for the "local" tier incorrectly states it will fall back
to "haiku"; update that sentence to say it will fall back to "flash" so it
matches the rest of the document (the "Fall back to `flash`" line, the Fallback
Routing table, and the cost spectrum). Locate the note text that mentions
"`local` in frontmatter will fall back to `haiku`" and replace "haiku" with
"flash" to align the policy and tables.

@coderabbitai
Contributor

coderabbitai bot commented Feb 26, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the model-routing.md file to verify the changes mentioned by the user
cat .agents/tools/context/model-routing.md

Length of output: 13457


@marcusquinn Excellent work! I've verified all 5 items you addressed:

  1. Fail-closed for privacy/on-device (Lines 49-54, 247-264, 183): Privacy tasks correctly FAIL with --allow-cloud override required; cost-optimization tasks fall back to flash. Flowchart, routing rules, and fallback table are all consistent.

  2. Subagent location scope (Line 128): Now explicitly states "defined across these paths (tools/ai-assistants/models/ for cloud tiers, tools/local-models/ for the local tier)".

  3. Cost spectrum ordering: flash/haiku order corrected throughout: cost spectrum line (26), model tiers table (30-37), routing rules sections, cost estimation table (114-121), flowchart (247-264), and fallback table (183). Flash (~0.20x) correctly shown as cheaper than haiku (~0.25x).

  4. Planned file annotations: All not-yet-existing files properly marked:

    • local-models.md (planned: t1338.2)
    • huggingface.md (planned: t1338.3)
    • local-model-helper.sh (planned: t1338.4)
  5. Runtime support note (Lines 102-104): Blockquote clearly states model: local in frontmatter will fall back to haiku until t1338.2-t1338.6 land, properly setting expectations for current behavior.

The documentation is internally consistent across all sections. Triggering review to clear the CHANGES_REQUESTED state.


✅ Actions performed

Review triggered.


@marcusquinn marcusquinn deleted the chore/t1338.1-model-routing-local-tier branch March 3, 2026 03:24
marcusquinn added a commit that referenced this pull request Mar 14, 2026
- Fix cost spectrum order: flash (~0.20x) is cheaper than haiku (~0.25x),
  so the spectrum now reads local -> flash -> haiku -> sonnet -> pro -> opus
- Add `local` tier to VALID_MODELS in model-label-helper.sh so local-tier
  tasks can be labelled and queried without falling through to the unknown
  model warning
- Add `local` to is_known_tier() and get_tier_models() in
  model-availability-helper.sh so routing/availability checks accept the
  local tier without returning exit 1
- Broken references to tools/local-models/ and local-model-helper.sh are
  now resolved: those files were created in t1338.2-6 after PR #2326 merged

Closes #3299
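The helper additions listed in the commit above can be sketched as follows. The function names `is_known_tier` and `get_tier_models` are taken from the commit message, but the bodies and model IDs here are illustrative placeholders, not the actual contents of model-availability-helper.sh.

```shell
#!/usr/bin/env bash
# Illustrative sketch of accepting the local tier in the availability helpers.
# Tier list matches the routing hierarchy; model IDs are placeholders.
VALID_MODELS="local haiku flash sonnet pro opus"

# Return 0 if the argument is a known tier alias, 1 otherwise.
is_known_tier() {
  case " $VALID_MODELS " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Map a tier alias to candidate model IDs (placeholders for illustration);
# local resolves to the user-selected GGUF model served by llama.cpp.
get_tier_models() {
  case "$1" in
    local) echo "llama.cpp:user-selected-gguf" ;;
    haiku) echo "anthropic/claude-haiku" ;;
    flash) echo "google/gemini-flash" ;;
    *) return 1 ;;
  esac
}
```

With `local` included in the tier list, availability checks accept it rather than returning exit 1, which is the behaviour the commit describes.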
alex-solovyev pushed a commit that referenced this pull request Mar 14, 2026
- Fix cost spectrum order: flash (~0.20x) is cheaper than haiku (~0.25x),
  so the spectrum now reads local -> flash -> haiku -> sonnet -> pro -> opus
- Add `local` tier to VALID_MODELS in model-label-helper.sh so local-tier
  tasks can be labelled and queried without falling through to the unknown
  model warning
- Add `local` to is_known_tier() and get_tier_models() in
  model-availability-helper.sh so routing/availability checks accept the
  local tier without returning exit 1
- Broken references to tools/local-models/ and local-model-helper.sh are
  now resolved: those files were created in t1338.2-6 after PR #2326 merged

Closes #3299
