FA3 now supports FP8 (torch.float8_e4m3fn) inference; however, the forward pass in xDiT still lacks FP8 support.
This PR adds the q/k/v scale parameters required for quantization and, based on the AttnType in yunchang (USP), selects an attention implementation that supports FP8 for the forward pass. With this change, the USP forward supports FP8 quantization; a rough sketch of the quantization step is shown below.
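To make the mechanism concrete, here is a minimal, self-contained sketch of per-tensor FP8 quantization of q/k/v together with the descale factors an FP8 attention kernel multiplies back in. Names such as `quantize_fp8` and `fp8_attention_reference` are hypothetical (not the PR's actual code), and the dequantize-then-SDPA path below stands in for the real FA3 FP8 kernel call:

```python
import torch
import torch.nn.functional as F

FP8_E4M3_MAX = 448.0  # largest finite value representable in torch.float8_e4m3fn

def quantize_fp8(t: torch.Tensor):
    # Per-tensor symmetric quantization: scale so the max magnitude maps to
    # the FP8 range, cast, and return the inverse scale ("descale") that the
    # attention kernel multiplies back into the product.
    amax = t.abs().amax().clamp(min=1e-12)
    scale = FP8_E4M3_MAX / amax
    t_fp8 = (t * scale).to(torch.float8_e4m3fn)
    descale = (1.0 / scale).float()
    return t_fp8, descale

def fp8_attention_reference(q, k, v):
    # Reference path: quantize q/k/v to FP8, dequantize with the descale
    # factors, and run standard SDPA. The actual USP forward would instead
    # hand the FP8 tensors plus their descale factors to the FP8-capable
    # attention implementation selected via yunchang's AttnType.
    q8, dq = quantize_fp8(q)
    k8, dk = quantize_fp8(k)
    v8, dv = quantize_fp8(v)
    qd = q8.to(q.dtype) * dq
    kd = k8.to(k.dtype) * dk
    vd = v8.to(v.dtype) * dv
    return F.scaled_dot_product_attention(qd, kd, vd)

if __name__ == "__main__":
    q, k, v = (torch.randn(1, 8, 128, 64) for _ in range(3))
    out = fp8_attention_reference(q, k, v)
    print(out.shape, out.dtype)  # torch.Size([1, 8, 128, 64]) torch.float32
```

The descale factors in this sketch correspond to the quantization-related parameters added to yunchang and FA3 in the companion pull requests linked below.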
In addition, companion pull requests have been opened to add quantization-parameter support to yunchang and FA3:
yunchang (USP): https://github.com/feifeibear/long-context-attention/pull/151 (merged into main)
FA3: Dao-AILab/flash-attention#1686