-
-
Notifications
You must be signed in to change notification settings - Fork 12.4k
[Misc][PCP&DCP] relocate PCP feature check #30050
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively relocates the feature checks for Prefill Context Parallelism (PCP) to a more consolidated and appropriate location, improving code organization and clarity. The introduction of new feature flags in the attention backend's abstract class is a good step towards more granular control over backend capabilities. While the refactoring is well-executed, I've identified one high-severity issue where a compatibility check was incorrectly scoped, which could lead to runtime errors in specific distributed configurations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
55ba613 to
0a76876
Compare
e2ebf25 to
38a6494
Compare
|
Hi @pisceskkk, the pre-commit checks have failed. Please run: uv pip install pre-commit
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Signed-off-by: QiuChunshuo <[email protected]>
LucasWilkinson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: QiuChunshuo <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: Nathan Price <[email protected]>
Signed-off-by: QiuChunshuo <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: Nathan Price <[email protected]> Signed-off-by: Nathan Price <[email protected]>
Signed-off-by: QiuChunshuo <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: Nathan Price <[email protected]>
Signed-off-by: QiuChunshuo <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: QiuChunshuo <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: QiuChunshuo <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: Ubuntu <[email protected]>
Recently, vllm-ascend has been advancing the main-to-main adaptation work with the vllm repository. We have now fully adapted PCP and other features on vllm-ascend, so we believe it is necessary to add essential feature adaptation flags for the attention backend and modify the logic of the PCP feature check code to ensure that the functional switches for GPU and NPU do not interfere with each other.
Modifications:
CC @LucasWilkinson @FENP @LookAround0301 @zhenwenqi2024