feat: add embedding APIs, batching, and docs by leehack · Pull Request #75 · leehack/llamadart

leehack · 2026-02-28T17:38:47Z

Summary

Add first-class embedding support in LlamaEngine, with optional backend capability interfaces and native backend implementations (embed and embedBatch).
Introduce native worker/service embedding paths with multi-sequence batching controls (ModelParams.maxParallelSequences), plus benchmark/sweep tooling for throughput analysis.
Add an embedding example CLI, refresh the README/example/website docs, and update the template parity mapping for the new vendored llama.cpp template fixture.
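
To illustrate the idea behind a parallel-sequence limit, here is a minimal, self-contained sketch of splitting inputs into groups no larger than that limit. It is not the llamadart implementation: `chunkInputs` and its `maxParallel` parameter are hypothetical names used only for this example, and the real batching in the native service may work differently.

```dart
/// Hypothetical helper (not part of llamadart): splits [inputs] into
/// consecutive groups of at most [maxParallel] items, preserving order.
List<List<String>> chunkInputs(List<String> inputs, int maxParallel) {
  if (maxParallel < 1) {
    throw ArgumentError.value(maxParallel, 'maxParallel', 'must be >= 1');
  }
  final chunks = <List<String>>[];
  for (var start = 0; start < inputs.length; start += maxParallel) {
    // Clamp the end index so the last group may be shorter.
    final end = (start + maxParallel < inputs.length)
        ? start + maxParallel
        : inputs.length;
    chunks.add(inputs.sublist(start, end));
  }
  return chunks;
}

void main() {
  // Five inputs with a limit of two sequences per batch.
  final chunks = chunkInputs(['a', 'b', 'c', 'd', 'e'], 2);
  print(chunks); // [[a, b], [c, d], [e]]
}
```

Under this assumption, each group would be handed to one batched embedding call, so a larger limit means fewer calls at the cost of more memory per call.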

Validation

dart analyze
dart test -p vm -j 1 --exclude-tags local-only
dart test (in example/basic_app)
npm run build (in website)

codecov-commenter · 2026-02-28T17:47:04Z

Codecov Report

❌ Patch coverage is 29.79592% with 172 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.81%. Comparing base (d0734a0) to head (52d156c).
⚠️ Report is 1 commit behind head on main.

Files with missing lines	Patch %	Lines
lib/src/backends/llama_cpp/llama_cpp_service.dart	7.18%	168 Missing ⚠️
lib/src/backends/llama_cpp/llama_cpp_backend.dart	92.59%	2 Missing ⚠️
lib/src/backends/llama_cpp/worker.dart	83.33%	2 Missing ⚠️

❌ Your patch status has failed because the patch coverage (29.79%) is below the target coverage (70.00%). You can increase the patch coverage or adjust the target coverage.
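
The patch figure can be cross-checked against the per-file table, assuming patch coverage is simply covered changed lines divided by total changed lines. The total of 245 changed lines is inferred from the reported percentage; it is not stated in the report itself.

```latex
\begin{aligned}
\text{missing} &= 168 + 2 + 2 = 172 \\
\text{total changed lines} &= \frac{172}{1 - 0.2979592} \approx 245 \\
\text{patch coverage} &= \frac{245 - 172}{245} = \frac{73}{245} \approx 29.79592\% \\
\text{project coverage (head)} &= \frac{6590}{8579} \approx 76.81\%
\end{aligned}
```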

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   76.48%   76.81%   +0.33%     
==========================================
  Files          66       66              
  Lines        8338     8579     +241     
==========================================
+ Hits         6377     6590     +213     
- Misses       1961     1989      +28

Flag	Coverage Δ
unittests	`76.81% <29.79%> (+0.33%)`	⬆️

leehack · 2026-02-28T17:59:47Z

Coverage update after backend/worker/model-params test expansion:

Fixed a platform-sensitive assertion in llama_cpp_service_test (backend info now asserts CPU presence instead of an exact backend list).
Added and expanded unit coverage in backend routing, worker routing, and model params paths.
Validation: dart analyze, targeted tests, and full VM suite (dart test -p vm -j 1 --exclude-tags local-only) all pass.
Refreshed VM coverage (coverage/lcov.info). Current file coverage:
- lib/src/backends/llama_cpp/llama_cpp_service.dart: 42.55% (680/1598)
- lib/src/backends/llama_cpp/llama_cpp_backend.dart: 94.47% (239/253)
- lib/src/backends/llama_cpp/worker.dart: 90.98% (111/122)
- lib/src/core/models/inference/model_params.dart: 92.31% (12/13)
Overall lib coverage is now 76.27% (6544/8580).

leehack · 2026-02-28T18:43:58Z

Web bridge update completed:

Published new bridge asset release: leehack/llama-web-bridge-assets@v0.1.6 (llama.cpp b8157).
Updated llamadart web defaults/pins/docs to v0.1.6:
- scripts/fetch_webgpu_bridge_assets.sh
- example/chat_app/web/index.html
- doc/webgpu_bridge.md
- website/docs/platforms/webgpu-bridge.md
- website/versioned_docs/version-0.6.4/platforms/webgpu-bridge.md
- CHANGELOG.md (Unreleased note)
Validation: dart analyze passed in llamadart.

Reference release: https://github.com/leehack/llama-web-bridge-assets/releases/tag/v0.1.6

leehack · 2026-02-28T18:44:24Z

Upstream tracking for the web bridge bump:

Bridge source PR: "build: bump web bridge llama.cpp pin to b8157" (llama-web-bridge#1)
Published assets tag: https://github.com/leehack/llama-web-bridge-assets/releases/tag/v0.1.6

leehack · 2026-02-28T19:52:18Z

Web embeddings are now wired and validated.

What changed in this branch:

Added web backend embedding support (LlamaEngine.embed / embedBatch) via WebGPU bridge APIs.
Updated web backend wrappers/capabilities so embeddings resolve correctly on web.
Added browser unit coverage for web embeddings and legacy bridge fallback behavior.
Bumped default bridge assets to leehack/llama-web-bridge-assets@v0.1.7.

Validation run:

dart analyze
dart test -p chrome test/unit/backends/webgpu/webgpu_backend_test.dart test/unit/backends/web/web_backend_test.dart
WEBGPU_BRIDGE_ASSETS_TAG=v0.1.7 ./scripts/fetch_webgpu_bridge_assets.sh (checksum verification passed)

Upstream references:

Bridge PR: "build: bump web bridge llama.cpp pin to b8157" (llama-web-bridge#1)
Assets release: https://github.com/leehack/llama-web-bridge-assets/releases/tag/v0.1.7

leehack · 2026-02-28T19:53:54Z

Additional browser integration coverage added:

test/integration/backends/webgpu/webgpu_engine_multimodal_browser_integration_test.dart
- new assertion path for LlamaEngine.embed(...) and embedBatch(...) over the mock WebGPU bridge.

Re-validated:

dart analyze
dart test -p chrome test/unit/backends/webgpu/webgpu_backend_test.dart test/unit/backends/web/web_backend_test.dart test/integration/backends/webgpu/webgpu_engine_multimodal_browser_integration_test.dart

feat(embeddings): add native embedding APIs and batching workflows

ad07d7b

leehack added 3 commits February 28, 2026 13:02

test(backends): expand native backend and worker coverage

f23c945

feat(server): add OpenAI-compatible embeddings endpoint

42e4ac9

chore(web): bump bridge assets pin to v0.1.6

975611d

feat(web): add bridge-backed embeddings support

5d7c090

test(web): cover engine embeddings on webgpu bridge

52d156c

leehack merged commit 9a84976 into main Feb 28, 2026
6 checks passed

leehack deleted the feat/embedding-support branch February 28, 2026 23:31

leehack merged 6 commits into main from feat/embedding-support