Closed
Description
Checklist
- 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose instead. Otherwise, it will be closed.
- 2. Please use English, otherwise it will be closed.
Motivation
This issue tracks the work to support a CUTLASS / CuTe kernel for LoRA and to check whether it offers any performance gain compared with the current Triton kernel.
Dependency: this task should happen after #7809 as the FlashInfer deprecation is expected to change / simplify the kernel interface.
(cc @Fridge003 @Ying1123 )
Related resources
No response