[Feature] Cutlass kernels for LoRA #7910

@lifuhuang

Motivation

Creating an issue to track the work of supporting a CUTLASS / CuTe kernel for LoRA, to determine whether it offers any performance gain over the current Triton kernel.

Dependency: this task should start after #7809, since the FlashInfer deprecation is expected to change or simplify the kernel interface.

(cc @Fridge003 @Ying1123)

