
Add support for NV Sage attention#670

Merged
avjves merged 3 commits into xdit-project:main from avjves:feature/nv-sage-attn
Mar 17, 2026

Conversation

@avjves
Collaborator

@avjves avjves commented Mar 16, 2026

What?

Adds support for NV Sage attention

Tests

Run command:

xdit --model Wan-AI/Wan2.2-T2V-A14B-Diffusers --attention_backend flash_3 --prompt "test"
wan2.2_t2v_u1r1_tc_False_720x1280_0.1.mp4

Time: 917s (1 GPU)

Run command:

xdit --model Wan-AI/Wan2.2-T2V-A14B-Diffusers --attention_backend sage --prompt "test"
wan2.2_t2v_u1r1_tc_False_720x1280_0.mp4

Time: 885s (1 GPU)

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds Sage Attention as a new high-performance attention backend option. The changes integrate Sage Attention into the core attention mechanism, add environment detection for the library, and validate its availability at runtime. Users can select the new backend for potentially improved performance, as the benchmark results above suggest, while the compatibility checks keep the system stable when the library is absent.

Highlights

  • New Attention Backend: Added support for Sage Attention, expanding the range of available attention mechanisms.
  • Integration: Integrated Sage Attention into the system by adding it to the AttentionBackendType enum and registering its corresponding attention function.
  • Runtime Checks: Implemented checks to verify the availability of the SageAttention library at runtime and during environment initialization.
  • Documentation Update: Updated the README to include Sage Attention in the list of supported backends.
Changelog
  • README.md
    • Added Sage Attention to the list of supported attention backends in the documentation.
  • xfuser/core/distributed/attention_backend.py
    • Imported the sageattn function conditionally based on environment information.
    • Added SAGE as a new member to the AttentionBackendType enumeration.
    • Registered _sage_attn_call as the attention function for AttentionBackendType.SAGE, implementing the call to sageattn.
  • xfuser/core/distributed/runtime_state.py
    • Added a runtime compatibility check to ensure SageAttention is installed when the SAGE backend is selected.
  • xfuser/envs.py
    • Implemented a new private method _check_sage to detect the presence of the sageattention library.
    • Integrated the _check_sage method into the environment's initialize function to update packages_info.
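The changelog entries above can be sketched as a single flow: a conditional import, a new enum member, and a registered attention function. This is an illustrative sketch only, assuming names like ATTN_FUNCS and HAS_SAGE that mirror the changelog but are not the actual xfuser internals:

```python
# Hedged sketch: how a SAGE backend might be wired in, per the changelog.
# AttentionBackendType and _sage_attn_call are named in the changelog;
# ATTN_FUNCS and HAS_SAGE are illustrative helper names.
from enum import Enum


class AttentionBackendType(Enum):
    FLASH_3 = "flash_3"
    SAGE = "sage"


# Conditional import: only mark the backend usable if the package exists.
try:
    from sageattention import sageattn
    HAS_SAGE = True
except ImportError:
    HAS_SAGE = False


def _sage_attn_call(q, k, v, is_causal=False):
    """Dispatch to sageattn, failing loudly if the library is missing."""
    if not HAS_SAGE:
        raise RuntimeError(
            "SAGE backend selected but the 'sageattention' package "
            "is not installed."
        )
    return sageattn(q, k, v, is_causal=is_causal)


# Registry mapping backend types to their attention functions.
ATTN_FUNCS = {AttentionBackendType.SAGE: _sage_attn_call}
```

Registering the callable in a dict keyed by the enum keeps backend selection a simple lookup at call time.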

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the NV Sage attention backend. The changes are well-implemented, covering the addition of the new backend type, the registration of its attention function, and environment checks to ensure the necessary package is installed. I have one minor suggestion to make the environment check more robust by specifying the exception being caught.

@avjves avjves merged commit 5fc47c3 into xdit-project:main Mar 17, 2026
@avjves avjves deleted the feature/nv-sage-attn branch March 17, 2026 08:43