[Feature] Roadmap for Prefill (Piecewise) CUDA Graph #11490

@Oasis-Git

Motivation

With PR #10062 merged, we have implemented the foundational framework for piecewise CUDA Graph and torch.compile backend support.
In this issue, we aim to outline the key follow-up tasks and enhancements planned for future iterations.

Todo List

Below we list the models that do not yet support piecewise CUDA graph. Contributors and users are welcome to report models that have issues here.

Model List

Contributions and discussions are highly welcome.

