
Add softmax_d in mha_bwd #905

Closed

MayDomine wants to merge 7 commits into Dao-AILab:main from MayDomine:main

Conversation


MayDomine commented Apr 1, 2024

This is helpful in our case for optimizing a distributed flash-attention implementation. Our work, BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences, benefits from this PR.
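For context, the quantity usually called `softmax_d` in the FlashAttention backward pass is the per-row reduction D = rowsum(dO ∘ O), which the kernel computes internally before the main dK/dV/dQ loop. A distributed implementation that splits the sequence across devices can reuse this tensor rather than recomputing it per shard. A minimal NumPy sketch of how this quantity is defined (the function name and shapes here are illustrative, not the actual kernel API):

```python
import numpy as np

def softmax_d(dO, O):
    """Compute D = rowsum(dO * O) for one attention head.

    dO, O: arrays of shape (seqlen, head_dim).
    Returns one scalar per query row, shape (seqlen,).
    This mirrors the D tensor the backward kernel precomputes;
    exposing it lets callers avoid recomputing it downstream.
    """
    return np.sum(dO * O, axis=-1)

# Tiny example with random data
rng = np.random.default_rng(0)
O = rng.standard_normal((4, 8))   # attention output
dO = rng.standard_normal((4, 8))  # gradient w.r.t. the output
D = softmax_d(dO, O)              # shape (4,)
```

Because D depends only on the local O and dO rows, each sequence shard in a distributed setup can compute its own slice independently, which is what makes returning it from `mha_bwd` convenient.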

MayDomine marked this pull request as a draft on Aug 19, 2024
MayDomine closed this on Aug 19, 2024