feat(models): improve Hugging Face source ergonomics#146

Merged
leehack merged 2 commits into main from feat/hf-source-ergonomics-136
May 14, 2026

Conversation

@leehack
Owner

@leehack leehack commented May 14, 2026

Summary

  • Adds hf://...?revision= parsing for slash-containing Hugging Face refs while keeping simple @revision shorthand.
  • Encodes Hugging Face resolve URLs and canonical cache identities safely, including non-colliding cache keys for slash revisions.
  • Documents current hf:// behavior, private/gated repo token guidance, and current limitations around single-file sources, mmproj, file listing, and sharded GGUFs.
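
To make the two revision forms concrete, here is a minimal, self-contained sketch of how such a reference could be parsed. It is not llamadart's implementation: `HfRef`, `parseHfRef`, and `_decodeRevision` are hypothetical names, and the position of the `@revision` shorthand (directly after the repo name) is an assumption inferred from the cache-key discussion in this PR.

```dart
/// Hypothetical value type for this sketch; not llamadart's actual class.
class HfRef {
  const HfRef(this.owner, this.repo, this.revision, this.filePath);
  final String owner;
  final String repo;
  final String? revision; // null: no revision was pinned
  final String filePath;
}

/// Percent-decodes a revision. Uri.decodeComponent leaves a literal '+' as is.
String _decodeRevision(String raw) {
  try {
    return Uri.decodeComponent(raw);
  } catch (_) {
    // Malformed escapes end up here; the raw input is deliberately not echoed.
    throw const FormatException('Invalid percent-encoding in revision');
  }
}

HfRef parseHfRef(String input) {
  const scheme = 'hf://';
  if (!input.startsWith(scheme)) {
    throw const FormatException('Expected an hf:// reference');
  }
  final rest = input.substring(scheme.length);

  // Split the query off by hand so a literal '+' is never read as a space.
  final q = rest.indexOf('?');
  final pathPart = q < 0 ? rest : rest.substring(0, q);
  final query = q < 0 ? null : rest.substring(q + 1);

  final segments = pathPart.split('/');
  if (segments.length < 3 || segments.any((s) => s.isEmpty)) {
    throw const FormatException('Expected hf://owner/repo/path/to/file');
  }

  // Assumed shorthand position: hf://owner/repo@revision/path/to/file
  var repo = segments[1];
  String? inlineRevision;
  final at = repo.indexOf('@');
  if (at >= 0) {
    inlineRevision = repo.substring(at + 1);
    repo = repo.substring(0, at);
  }

  String? queryRevision;
  if (query != null) {
    const key = 'revision=';
    if (!query.startsWith(key) || query.contains('&')) {
      throw const FormatException('Only ?revision=<ref> is supported');
    }
    queryRevision = _decodeRevision(query.substring(key.length));
  }

  // Both forms at once would be ambiguous, so the sketch rejects the pair.
  if (inlineRevision != null && queryRevision != null) {
    throw const FormatException('Use @revision or ?revision=, not both');
  }
  final revision = inlineRevision ?? queryRevision;
  if (repo.isEmpty || (revision != null && revision.isEmpty)) {
    throw const FormatException('Empty repo or revision');
  }
  return HfRef(segments[0], repo, revision, segments.sublist(2).join('/'));
}
```

Under these assumptions, a slash-containing ref written inline, such as `hf://owner/repo@refs/pr/12/model.gguf`, would be read as revision `refs` with file path `pr/12/model.gguf`, which is why the query form exists for refs like `refs/pr/12`.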

Closes #136

Production-readiness scope

  • Users can load public Hugging Face GGUF files with ModelSource.parse('hf://owner/repo/path/to/model.gguf').
  • Users can pin simple branches/tags/SHAs with @revision, and slash-containing refs such as refs/pr/12 with ?revision=refs/pr/12.
  • Supported behavior is source parsing, resolved download URL construction, deterministic cache identity, and documentation of private/gated downloads through ModelLoadOptions credentials.
  • Unsupported paths remain explicit, documented non-goals: an hf:// source identifies exactly one file, separate mmproj assets require separate sources and load steps, sharded GGUF manifests are not expanded automatically, and llamadart does not list or choose Hugging Face files.
  • Existing public API signatures remain compatible; the change is additive plus stricter rejection of invalid/unsafe revision strings.
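
As an illustration of the resolved download URL construction mentioned above, the following sketch builds a Hugging Face resolve URL by encoding each component separately. `buildResolveUrl` is a hypothetical helper, not part of llamadart's API, and the `main` default revision is an assumption made only for this example.

```dart
/// Builds a Hugging Face "resolve" download URL, percent-encoding each
/// component on its own. Hypothetical helper for illustration only.
Uri buildResolveUrl({
  required String owner,
  required String repo,
  required String filePath,
  String revision = 'main', // assumed default for this sketch
}) {
  // File path separators stay as '/', but each segment is encoded by itself.
  final encodedFile = filePath.split('/').map(Uri.encodeComponent).join('/');

  // Encoding the revision as one component turns 'refs/pr/12' into
  // 'refs%2Fpr%2F12', so it occupies a single path segment after /resolve/.
  final path = '/${Uri.encodeComponent(owner)}/${Uri.encodeComponent(repo)}'
      '/resolve/${Uri.encodeComponent(revision)}/$encodedFile';

  return Uri.parse('https://huggingface.co$path');
}
```

With this approach, `buildResolveUrl(owner: 'owner', repo: 'repo', filePath: 'model.gguf', revision: 'refs/pr/12')` yields a URL whose fourth path segment decodes back to `refs/pr/12`, rather than spilling the ref across three segments.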

Changes

  • Parse ?revision= on hf:// sources and reject ambiguous @revision + ?revision= combinations.
  • Preserve literal + in revision queries and reject invalid percent-encoding / unsafe decoded revision strings before they enter metadata or logs.
  • Encode resolved Hugging Face URLs component-by-component so slash-containing revisions are a single /resolve/{revision}/... segment.
  • Use unambiguous query-form canonical keys for slash-containing revisions to avoid cache-key collisions with inline simple revisions plus nested file paths.
  • Add regression coverage for revision queries, encoded slash revisions, factory URL encoding, cache-key non-collision, literal + decoding, invalid query syntax, and invalid revision input.
  • Update README, website docs, changelog, and recent release notes.
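
The cache-key collision described in the list above can be sketched as follows. The key formats here are assumptions chosen for illustration, and `naiveKey` and `canonicalKey` are hypothetical names; the actual canonical identity in llamadart may look different.

```dart
/// Naive form: always inline the revision after '@'.
/// Shown only to expose the collision; not a recommended key format.
String naiveKey(String owner, String repo, String filePath, String? revision) =>
    revision == null
        ? 'hf://$owner/$repo/$filePath'
        : 'hf://$owner/$repo@$revision/$filePath';

/// Query form for slash-containing revisions keeps the revision out of the
/// path, so it cannot be confused with a nested file path.
String canonicalKey(
    String owner, String repo, String filePath, String? revision) {
  if (revision == null || !revision.contains('/')) {
    return naiveKey(owner, repo, filePath, revision);
  }
  final encoded = Uri.encodeComponent(revision);
  return 'hf://$owner/$repo/$filePath?revision=$encoded';
}
```

In this sketch, revision `refs/pr/12` with file `model.gguf` and revision `refs` with file `pr/12/model.gguf` both map to `hf://o/r@refs/pr/12/model.gguf` under `naiveKey`, while `canonicalKey` gives the first one `hf://o/r/model.gguf?revision=refs%2Fpr%2F12` and so keeps the two cache entries apart.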

Test Plan

  • dart format --output=none --set-exit-if-changed lib/src/core/models/model_source.dart test/unit/core/models/model_source_test.dart
  • dart analyze lib/src/core/models/model_source.dart test/unit/core/models/model_source_test.dart
  • dart analyze
  • dart test test/unit/core/models/model_source_test.dart
  • dart test test/unit/core/models
  • dart test
  • (cd website && npm run build)
  • git diff --check origin/main...HEAD
  • Static added-line scan for hardcoded secrets, dangerous exec, raw URL logging, and token-like URL literals: 0 findings

Review notes

  • Deep review initially found blockers around ambiguous canonical/cache identities for slash-containing revisions, + query decoding, and unsafe decoded revision strings.
  • Those blockers were fixed before submission and covered by new regression tests.
  • Final independent review verdict: approved / ready to push and open PR; no blockers.
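
The PR does not spell out which decoded revision strings count as unsafe, so the sketch below picks its own rules as an assumption: no control characters or whitespace, and no empty, `.` or `..` path segments. `decodeAndValidateRevision` is a hypothetical name. It also shows why a literal `+` survives: `Uri.decodeComponent` only handles `%XX` escapes, whereas `Uri.decodeQueryComponent` would turn `+` into a space.

```dart
/// Percent-decodes a raw ?revision= value without treating '+' as a space.
String _percentDecode(String raw) {
  try {
    return Uri.decodeComponent(raw);
  } on ArgumentError {
    // Malformed or truncated %XX escape.
    throw const FormatException('Invalid percent-encoding in revision');
  } on FormatException {
    // Escapes that do not decode to valid UTF-8.
    throw const FormatException('Invalid percent-encoding in revision');
  }
}

/// Decodes and validates a revision under this sketch's assumed safety rules.
/// Error messages never include the raw input, so rejected strings do not
/// leak into logs or metadata.
String decodeAndValidateRevision(String raw) {
  final revision = _percentDecode(raw);

  // Control characters, spaces, and DEL could forge log lines or break URLs.
  final hasControlOrSpace =
      revision.codeUnits.any((c) => c <= 0x20 || c == 0x7f);

  // Empty, '.' and '..' segments also cover leading and trailing slashes.
  final hasBadSegment = revision
      .split('/')
      .any((s) => s.isEmpty || s == '.' || s == '..');

  if (hasControlOrSpace || hasBadSegment) {
    throw const FormatException('Unsafe revision');
  }
  return revision;
}
```

Running validation after decoding matters here: an input such as `main%0Ainjected` looks harmless in its encoded form and only reveals the newline once decoded.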

Support revision query syntax for hf:// sources with slash-containing refs, encode Hugging Face resolve URLs safely, and document current hf:// behavior, private-token guidance, and non-goals.

Verification: dart analyze; dart analyze lib/src/core/models/model_source.dart test/unit/core/models/model_source_test.dart; dart test test/unit/core/models/model_source_test.dart; dart test test/unit/core/models; npm ci && npm run build (website).
Copilot AI review requested due to automatic review settings May 14, 2026 11:57
@codecov-commenter

codecov-commenter commented May 14, 2026

Codecov Report

❌ Patch coverage is 94.64286% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.32%. Comparing base (05cb9e9) to head (9df8b61).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| lib/src/core/models/model_source.dart | 94.64% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #146      +/-   ##
==========================================
+ Coverage   78.25%   78.32%   +0.07%     
==========================================
  Files          75       75              
  Lines        9715     9764      +49     
==========================================
+ Hits         7602     7648      +46     
- Misses       2113     2116       +3     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 78.32% <94.64%> (+0.07%) ⬆️ |

Copilot AI left a comment

Pull request overview

This PR improves Hugging Face model source handling by adding query-based revision parsing, safer URL/canonical identity encoding for slash-containing refs, and updated user-facing documentation around hf:// usage and limitations.

Changes:

  • Adds ?revision= parsing for Hugging Face refs and regression tests for slash refs, plus signs, invalid query syntax, and cache identity behavior.
  • Updates Hugging Face URL construction and canonical key generation for slash-containing revisions.
  • Expands README, website docs, and changelog entries for hf://, private/gated repos, mmproj, and sharded GGUF limitations.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lib/src/core/models/model_source.dart | Adds query revision parsing, validation, URL encoding, and canonical key updates. |
| test/unit/core/models/model_source_test.dart | Adds regression coverage for new Hugging Face parsing and identity cases. |
| README.md | Documents hf:// reference forms and limitations. |
| CHANGELOG.md | Adds unreleased changelog entry for Hugging Face ergonomics. |
| website/docs/guides/model-lifecycle.md | Adds detailed hf:// usage, auth guidance, and limitations. |
| website/docs/getting-started/finding-models.md | Adds concise guidance for using hf:// sources. |
| website/docs/changelog/recent-releases.md | Adds website release-note summary for the feature. |

Comment thread website/docs/guides/model-lifecycle.md Outdated
Comment thread lib/src/core/models/model_source.dart Outdated
@leehack leehack merged commit 40acad3 into main May 14, 2026
6 checks passed
@leehack leehack deleted the feat/hf-source-ergonomics-136 branch May 14, 2026 12:34

Development

Successfully merging this pull request may close these issues.

feat(models): improve Hugging Face source ergonomics

3 participants