[XPU] [CI] Xpu ci lock PaddlePaddle Version #5218
plusNew001 merged 10 commits into PaddlePaddle:develop
Comment out the previous paddlepaddle-xpu installation command and replace it with the installation of a specific version, because of an EP parallel error.
Thanks for your contribution!
Pull request overview
This PR addresses a CI bug in the XPU environment by temporarily pinning the paddlepaddle-xpu package to a specific development version (3.3.0.dev20251123) instead of installing the latest nightly build, because of errors related to EP (expert parallelism).
Key Changes:
- Commented out the original nightly build installation command
- Added a specific paddlepaddle-xpu version installation from a direct URL
- Included a Chinese comment explaining the temporary fix
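One way to confirm in CI that the pin actually took effect is to compare the installed version against the expected one. The following is a minimal sketch, not part of the PR: the helper names `installed_version` and `check_pin` are hypothetical, and it assumes the usual `pip show` output with a `Version:` line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): extract the "Version:" field
# from `pip show` output supplied on stdin.
installed_version() {
    awk -F': ' '$1 == "Version" { print $2; exit }'
}

# Hypothetical helper: succeed only when the installed version equals the pin.
check_pin() {
    local expected="$1" actual="$2"
    if [ "$actual" != "$expected" ]; then
        echo "Error: expected paddlepaddle-xpu ${expected}, found '${actual}'" >&2
        return 1
    fi
}

# Usage in CI (assumes paddlepaddle-xpu was installed in an earlier step):
#   check_pin "3.3.0.dev20251123" \
#       "$(python -m pip show paddlepaddle-xpu | installed_version)" || exit 1
```

Keeping the parsing separate from the comparison means the check can be exercised without the package installed.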
# python -m pip install paddlepaddle-xpu -i https://www.paddlepaddle.org.cn/packages/nightly/xpu-p800/
# 由于ep并行报错暂时锁死paddle版本
python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-3.3.0.dev20251123-cp310-cp310-linux_x86_64.whl
The PR description is incomplete. Please provide:
- Motivation: Explain why this change is needed (e.g., "Due to EP parallelism errors in the nightly build")
- Modifications: Detail what was changed (e.g., "Temporarily pinned paddlepaddle-xpu to version 3.3.0.dev20251123 instead of using the latest nightly build")
- Usage or Command: Include the command to run the CI test
- Checklist: Complete the checklist items, especially regarding unit tests and accuracy tests
# 由于ep并行报错暂时锁死paddle版本
python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-3.3.0.dev20251123-cp310-cp310-linux_x86_64.whl
Hardcoding a specific development version URL creates maintainability issues. Consider:
- Adding a version variable at the top of the script for easy updates (e.g., PADDLE_XPU_VERSION="3.3.0.dev20251123")
- Adding a TODO comment with a tracking issue number to revert to the nightly build once the EP parallelism bug is fixed
- Documenting when this temporary fix should be removed
Example:
# TODO(issue-#XXXX): Revert to nightly build once EP parallelism bug is fixed
PADDLE_XPU_VERSION="3.3.0.dev20251123"
python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-${PADDLE_XPU_VERSION}-cp310-cp310-linux_x86_64.whl
Suggested change:
- # 由于ep并行报错暂时锁死paddle版本
- python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-3.3.0.dev20251123-cp310-cp310-linux_x86_64.whl
+ # TODO(issue-#XXXX): Revert to nightly build once EP parallelism bug is fixed. Temporary fix: lock paddlepaddle-xpu version due to EP parallelism error.
+ PADDLE_XPU_VERSION="3.3.0.dev20251123"
+ python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-${PADDLE_XPU_VERSION}-cp310-cp310-linux_x86_64.whl
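To make the question of when the temporary fix should be removed harder to forget, the CI script could also warn once the pin gets old. The following is a minimal sketch and not part of the PR: the helper name `pin_age_days` and the 30-day threshold are hypothetical, and it assumes GNU `date` (for `-d`) and a version string that ends in `devYYYYMMDD`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of days between the build date embedded in a
# version string such as "3.3.0.dev20251123" and a reference date (YYYYMMDD).
# Assumes GNU date, which accepts `-d YYYYMMDD`.
pin_age_days() {
    local version="$1" today="$2"
    local stamp="${version##*dev}"   # keep what follows "dev", e.g. "20251123"
    local pin_s today_s
    pin_s=$(date -u -d "$stamp" +%s) || return 1
    today_s=$(date -u -d "$today" +%s) || return 1
    echo $(( (today_s - pin_s) / 86400 ))
}

PADDLE_XPU_VERSION="3.3.0.dev20251123"
MAX_PIN_AGE_DAYS=30   # arbitrary threshold chosen for illustration

age=$(pin_age_days "$PADDLE_XPU_VERSION" "$(date -u +%Y%m%d)")
if [ "$age" -gt "$MAX_PIN_AGE_DAYS" ]; then
    echo "Warning: paddlepaddle-xpu pin is ${age} days old; check whether the EP bug is fixed." >&2
fi
```

A warning rather than a hard failure keeps CI green while still surfacing the stale pin in the logs.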
python -m pip install paddlepaddle-xpu -i https://www.paddlepaddle.org.cn/packages/nightly/xpu-p800/
# python -m pip install paddlepaddle-xpu -i https://www.paddlepaddle.org.cn/packages/nightly/xpu-p800/
# 由于ep并行报错暂时锁死paddle版本
Consider adding an English translation alongside the Chinese comment so that international contributors can follow it:
# 由于ep并行报错暂时锁死paddle版本
# Temporarily locking paddle version due to EP parallelism error
Suggested change:
- # 由于ep并行报错暂时锁死paddle版本
+ # 由于ep并行报错暂时锁死paddle版本
+ # Temporarily locking paddle version due to EP parallelism error
# python -m pip install paddlepaddle-xpu -i https://www.paddlepaddle.org.cn/packages/nightly/xpu-p800/
# 由于ep并行报错暂时锁死paddle版本
python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-3.3.0.dev20251123-cp310-cp310-linux_x86_64.whl
Installing a specific dev version from a direct URL can cause issues:
- The URL targets a specific Python version (cp310), so the installation will fail if the environment uses a different Python version
- There is no error handling: if the download fails, the script does not stop with a clear error message
Suggested improvement:
PADDLE_XPU_VERSION="3.3.0.dev20251123"
PADDLE_XPU_URL="https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-${PADDLE_XPU_VERSION}-cp310-cp310-linux_x86_64.whl"
echo "Installing paddlepaddle-xpu ${PADDLE_XPU_VERSION}..."
python -m pip install "${PADDLE_XPU_URL}" || {
    echo "Error: Failed to install paddlepaddle-xpu from ${PADDLE_XPU_URL}"
    exit 1
}
Suggested change:
- python -m pip install https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-3.3.0.dev20251123-cp310-cp310-linux_x86_64.whl
+ # Detect the current Python version dynamically and build the matching wheel URL
+ PYTHON_VERSION=$(python -c "import sys; print(f'cp{sys.version_info.major}{sys.version_info.minor}')")
+ PADDLE_XPU_VERSION="3.3.0.dev20251123"
+ PADDLE_XPU_URL="https://paddle-whl.bj.bcebos.com/nightly/xpu-p800/paddlepaddle-xpu/paddlepaddle_xpu-${PADDLE_XPU_VERSION}-${PYTHON_VERSION}-${PYTHON_VERSION}-linux_x86_64.whl"
+ echo "Installing paddlepaddle-xpu ${PADDLE_XPU_VERSION} for Python ${PYTHON_VERSION}..."
+ python -m pip install "${PADDLE_XPU_URL}" || {
+     echo "Error: Failed to install paddlepaddle-xpu from ${PADDLE_XPU_URL}"
+     exit 1
+ }
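Downloads from a wheel bucket can also fail transiently, so the explicit error handling suggested above could be combined with a few retries. The following is a minimal sketch and not part of the PR: the `retry` helper and the attempt and delay values in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, otherwise the exit
# status of the last failed attempt.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local n=1 status=0
    while true; do
        "$@" && return 0
        status=$?
        if [ "$n" -ge "$attempts" ]; then
            echo "Error: '$*' failed after ${n} attempt(s)" >&2
            return "$status"
        fi
        echo "Attempt ${n} failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Usage in the CI script (PADDLE_XPU_URL as defined in the suggestion above):
#   retry 3 5 python -m pip install "${PADDLE_XPU_URL}" || exit 1
```

Retries only help with transient network failures; a wheel that does not exist for the detected Python version will still fail on every attempt.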
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #5218 +/- ##
==========================================
Coverage ? 59.84%
==========================================
Files ? 317
Lines ? 38682
Branches ? 5813
==========================================
Hits ? 23150
Misses ? 13707
Partials ? 1825
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.