Pull requests: llm-d/llm-d-kv-cache

Pull requests list

Add Hybrid Multi-head Attention (HMA) support for KV-Cache scoring [size/XXL]
#533 opened Apr 19, 2026 by kapiljain1989
add group_id tracking for HMA model support [size/XL]
#532 opened Apr 19, 2026 by kapiljain1989
Added new model registry [size/XL]
#531 opened Apr 19, 2026 by kapiljain1989
Removed Old Helm Setup [size/XXL]
#530 opened Apr 19, 2026 by kapiljain1989
revamp docs [size/L]
#528 opened Apr 16, 2026 by vMaroon (Member)
feat(fs_backend): add performance and stress tests [size/XXL]
#527 opened Apr 16, 2026 by kfirtoledo (Collaborator)
2 tasks done
deps(actions): bump softprops/action-gh-release from 2 to 3 [dependencies, size/XS]
#516 opened Apr 14, 2026 by dependabot bot
fix: prevent write queue deadlock under high concurrency [size/L]
#512 opened Apr 13, 2026 by kfirtoledo (Collaborator)
5 tasks done
Handling Attention Group id in KV events [size/XXL]
#510 opened Apr 10, 2026 by kapiljain1989
fix: register MaxPodHitCount metric in Collectors() [size/M]
#509 opened Apr 10, 2026 by wenhug (Contributor)
3 tasks done
deps(go): bump go.opentelemetry.io/otel/sdk from 1.39.0 to 1.43.0 [dependencies, size/M]
#503 opened Apr 8, 2026 by dependabot bot
deps(actions): bump docker/setup-buildx-action from 3 to 4 [dependencies, size/XS]
#501 opened Apr 7, 2026 by dependabot bot
deps(actions): bump docker/build-push-action from 6 to 7 [dependencies, size/XS]
#500 opened Apr 7, 2026 by dependabot bot
Add object store support to llm-d storage offloading [size/XXL]
#499 opened Apr 6, 2026 by effi-ofer
feat: Add golden test case for multi-modal
#485 opened Mar 30, 2026 by gyliu513 (Contributor)
feat: Add HMA support to FS connector [size/L]
#476 opened Mar 29, 2026 by kfirtoledo (Collaborator), Draft
4 tasks done
test: Add test and usage example for mm requests [size/L]
#453 opened Mar 23, 2026 by sagearc (Collaborator)
fix lint errors
#446 opened Mar 23, 2026 by roytman (Contributor)
build: Upgrade golangci-lint to v2.9.0 [size/S]
#439 opened Mar 20, 2026 by gyliu513 (Contributor)
deps(go): bump google.golang.org/grpc from 1.77.0 to 1.79.3 [dependencies, lifecycle/stale]
#438 opened Mar 19, 2026 by dependabot bot
feat:add support to invalidate KV cache via AllBlocksCleared event [size/XL]
#437 opened Mar 18, 2026 by yash9263
deps(go): bump the go-dependencies group across 1 directory with 16 updates [dependencies, size/L]
#430 opened Mar 17, 2026 by dependabot bot
feat: Add Hybrid Model Architecture (HMA) Support in Prefix-Cache Aware Scheduling [size/XXL]
#427 opened Mar 16, 2026 by kapiljain1989