Add scripts for building CMake files #372

Open
1329009851 wants to merge 27 commits into sgl-project:main from 1329009851:sgl-cmake2

Conversation

@1329009851
Contributor

Add scripts for building CMake files

@1329009851 1329009851 changed the title from "Sgl cmake2" to "Add scripts for building CMake files" on Feb 11, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @1329009851, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the build system for the 'deepep' module by introducing a new CMake-based approach. It centralizes the build logic, adds support for custom operators, and improves compatibility with different CANN versions. The changes also involve a substantial cleanup of outdated or redundant build utility scripts, leading to a more streamlined and maintainable build process.

Highlights

  • New CMake Build Orchestration: A new top-level script, 'csrc/deepep_cmake_build.sh', has been introduced to orchestrate the CMake build process for 'deepep' modules, centralizing build commands.
  • Integrated DeepEP CMake Functionality: The main 'build.sh' script now includes a 'create_deepep_cmake' function, which is called during the build process to set up and execute the DeepEP CMake build scripts, ensuring proper permissions and execution.
  • CANN Version Detection in CMakeLists: The 'CMakeLists.txt' files for 'csrc/deepep/ops' and 'csrc/deepep/ops2' have been updated to detect the installed CANN version (8.2+ or 8.3+) and set appropriate compile definitions, improving compatibility and build flexibility.
  • Custom Operator Definition and Compilation Scripts: New files 'csrc/deepep/AddCustom.json', 'csrc/deepep/build.sh', and 'csrc/deepep/compile_ascend_proj.sh' were added to define a custom 'AddCustom' operator and provide the necessary shell scripts for its compilation and project setup.
  • Build System Simplification and Cleanup: A large number of redundant or deprecated utility scripts and CMake configuration files under 'csrc/deepep/ops/cmake/util/' and 'csrc/deepep/ops2/cmake/util/' have been removed, indicating a significant cleanup and streamlining of the build infrastructure.
  • CMake File Renaming for Clarity: The 'func.cmake' files within 'csrc/deepep/ops/cmake/' and 'csrc/deepep/ops2/cmake/' were renamed and moved to their respective parent directories ('csrc/deepep/ops/' and 'csrc/deepep/ops2/'), simplifying their include paths.

Changelog
  • build.sh
    • Added a new function create_deepep_cmake to handle the DeepEP CMake build process.
    • Integrated the call to create_deepep_cmake within the main function to ensure its execution.
  • csrc/deepep/AddCustom.json
    • Added a JSON configuration file for the 'AddCustom' operator, detailing its language, input descriptions (x, y as float16), and output description (z as float16).
  • csrc/deepep/build.sh
    • Added a new shell script to manage the build process for the 'deepep' module, including environment setup, pybind and test build functions, and argument parsing for SOC version and debug options.
    • Included logic to compile Ascend projects using 'compile_ascend_proj.sh'.
  • csrc/deepep/compile_ascend_proj.sh
    • Added a new shell script containing functions to copy operator files, build operators using 'msopgen' based on 'AddCustom.json', and delete temporary operator files.
    • Implemented 'BuildAscendProj' to orchestrate the operator compilation for specified SOC versions.
  • csrc/deepep/ops/CMakeLists.txt
    • Updated CMake logic to detect CANN 8.2+ or 8.3+ include structures and define 'CANN_VERSION_MACRO' accordingly.
    • Removed direct inclusion of 'cmake/func.cmake' and instead included 'func.cmake' from the current directory.
    • Added a custom command using 'execute_process' to compile shared libraries for operators, incorporating the detected CANN version macro.
  • csrc/deepep/ops/cmake/config.cmake
    • Removed the file, indicating a refactoring of global CMake configuration.
  • csrc/deepep/ops/cmake/device_task.cmake
    • Removed the file, suggesting a change in how device tasks are managed or defined.
  • csrc/deepep/ops/cmake/intf.cmake
    • Removed the file, indicating a change in interface library definitions.
  • csrc/deepep/ops/cmake/makeself/COPYING
    • Removed the file, removing the GPL license text for makeself utility.
  • csrc/deepep/ops/cmake/makeself/README.md
    • Removed the file, removing the README for makeself utility.
  • csrc/deepep/ops/cmake/makeself/VERSION
    • Removed the file, removing the version information for makeself utility.
  • csrc/deepep/ops/cmake/makeself/make-release.sh
    • Removed the file, removing the release script for makeself utility.
  • csrc/deepep/ops/cmake/makeself/makeself-header.sh
    • Removed the file, removing the header script for makeself utility.
  • csrc/deepep/ops/cmake/makeself/makeself.1
    • Removed the file, removing the man page for makeself utility.
  • csrc/deepep/ops/cmake/makeself/makeself.lsm
    • Removed the file, removing the LSM file for makeself utility.
  • csrc/deepep/ops/cmake/makeself/makeself.sh
    • Removed the file, removing the main makeself utility script.
  • csrc/deepep/ops/cmake/makeself/run-tests.sh
    • Removed the file, removing the test runner for makeself utility.
  • csrc/deepep/ops/cmake/merge_aicpu_info_json.sh
    • Removed the file, indicating a change in AICPU info merging.
  • csrc/deepep/ops/cmake/util/__init__.py
    • Removed the file, removing Python package initialization.
  • csrc/deepep/ops/cmake/util/ascendc_bin_param_build.py
    • Removed the file, removing the AscendC binary parameter build script.
  • csrc/deepep/ops/cmake/util/ascendc_compile_kernel.py
    • Removed the file, removing the AscendC kernel compilation script.
  • csrc/deepep/ops/cmake/util/ascendc_gen_options.py
    • Removed the file, removing the AscendC option generation script.
  • csrc/deepep/ops/cmake/util/ascendc_get_op_name.py
    • Removed the file, removing the AscendC operator name retrieval script.
  • csrc/deepep/ops/cmake/util/ascendc_impl_build.py
    • Removed the file, removing the AscendC implementation build script.
  • csrc/deepep/ops/cmake/util/ascendc_op_info.py
    • Removed the file, removing the AscendC operator information script.
  • csrc/deepep/ops/cmake/util/ascendc_ops_config.py
    • Removed the file, removing the AscendC operator configuration script.
  • csrc/deepep/ops/cmake/util/ascendc_pack_kernel.py
    • Removed the file, removing the AscendC kernel packing script.
  • csrc/deepep/ops/cmake/util/ascendc_pack_opregistry.py
    • Removed the file, removing the AscendC operator registry packing script.
  • csrc/deepep/ops/cmake/util/ascendc_replay_build.py
    • Removed the file, removing the AscendC replay build script.
  • csrc/deepep/ops/cmake/util/batch_replay_impl.temp
    • Removed the file, removing a batch replay implementation template.
  • csrc/deepep/ops/cmake/util/code_channel_infer.py
    • Removed the file, removing the code channel inference script.
  • csrc/deepep/ops/cmake/util/const_var.py
    • Removed the file, removing constant variable definitions.
  • csrc/deepep/ops/cmake/util/gen_impl_and_mrege_json.sh
    • Removed the file, removing a script for generating and merging JSON.
  • csrc/deepep/ops/cmake/util/gen_ops_filter.sh
    • Removed the file, removing an operator filter generation script.
  • csrc/deepep/ops/cmake/util/gen_version_info.sh
    • Removed the file, removing a version information generation script.
  • csrc/deepep/ops/cmake/util/insert_op_info.py
    • Removed the file, removing a script for inserting operator information.
  • csrc/deepep/ops/cmake/util/insert_simplified_keys.py
    • Removed the file, removing a script for inserting simplified keys.
  • csrc/deepep/ops/cmake/util/kernel_entry.py
    • Removed the file, removing a kernel entry generation script.
  • csrc/deepep/ops/cmake/util/kernel_impl.temp
    • Removed the file, removing a kernel implementation template.
  • csrc/deepep/ops/cmake/util/merge_aicpu_info_json.sh
    • Removed the file, removing a script for merging AICPU info JSON.
  • csrc/deepep/ops/cmake/util/opdesc_parser.py
    • Removed the file, removing the operator description parser script.
  • csrc/deepep/ops/cmake/util/parse_ini_to_json.py
    • Removed the file, removing the INI to JSON parsing script.
  • csrc/deepep/ops/cmake/util/preset_parse.py
    • Removed the file, removing the preset parsing script.
  • csrc/deepep/ops/cmake/util/replay_codegen.py
    • Removed the file, removing the replay code generation script.
  • csrc/deepep/ops/cmake/util/replay_impl.temp
    • Removed the file, removing a replay implementation template.
  • csrc/deepep/ops/cmake/util/tiling_data_def_build.py
    • Removed the file, removing the tiling data definition build script.
  • csrc/deepep/ops/func.cmake
    • Renamed from csrc/deepep/ops/cmake/func.cmake to csrc/deepep/ops/func.cmake.
  • csrc/deepep/ops2/CMakeLists.txt
    • Updated CMake logic to detect CANN 8.2+ or 8.3+ include structures and define 'CANN_VERSION_MACRO' accordingly.
    • Removed direct inclusion of 'cmake/func.cmake' and instead included 'func.cmake' from the current directory.
    • Added a custom command using 'execute_process' to compile shared libraries for operators, incorporating the detected CANN version macro.
  • csrc/deepep/ops2/cmake/config.cmake
    • Removed the file, indicating a refactoring of global CMake configuration.
  • csrc/deepep/ops2/cmake/device_task.cmake
    • Removed the file, suggesting a change in how device tasks are managed or defined.
  • csrc/deepep/ops2/cmake/intf.cmake
    • Removed the file, indicating a change in interface library definitions.
  • csrc/deepep/ops2/cmake/makeself/COPYING
    • Removed the file, removing the GPL license text for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/README.md
    • Removed the file, removing the README for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/VERSION
    • Removed the file, removing the version information for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/make-release.sh
    • Removed the file, removing the release script for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/makeself-header.sh
    • Removed the file, removing the header script for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/makeself.1
    • Removed the file, removing the man page for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/makeself.lsm
    • Removed the file, removing the LSM file for makeself utility.
  • csrc/deepep/ops2/cmake/makeself/makeself.sh
    • Removed the file, removing the main makeself utility script.
  • csrc/deepep/ops2/cmake/makeself/run-tests.sh
    • Removed the file, removing the test runner for makeself utility.
  • csrc/deepep/ops2/cmake/merge_aicpu_info_json.sh
    • Removed the file, indicating a change in AICPU info merging.
  • csrc/deepep/ops2/cmake/util/__init__.py
    • Removed the file, removing Python package initialization.
  • csrc/deepep/ops2/cmake/util/ascendc_bin_param_build.py
    • Removed the file, removing the AscendC binary parameter build script.
  • csrc/deepep/ops2/cmake/util/ascendc_compile_kernel.py
    • Removed the file, removing the AscendC kernel compilation script.
  • csrc/deepep/ops2/cmake/util/ascendc_gen_options.py
    • Removed the file, removing the AscendC option generation script.
  • csrc/deepep/ops2/cmake/util/ascendc_get_op_name.py
    • Removed the file, removing the AscendC operator name retrieval script.
  • csrc/deepep/ops2/cmake/util/ascendc_impl_build.py
    • Removed the file, removing the AscendC implementation build script.
  • csrc/deepep/ops2/cmake/util/ascendc_op_info.py
    • Removed the file, removing the AscendC operator information script.
  • csrc/deepep/ops2/cmake/util/ascendc_ops_config.py
    • Removed the file, removing the AscendC operator configuration script.
  • csrc/deepep/ops2/cmake/util/ascendc_pack_kernel.py
    • Removed the file, removing the AscendC kernel packing script.
  • csrc/deepep/ops2/cmake/util/ascendc_pack_opregistry.py
    • Removed the file, removing the AscendC operator registry packing script.
  • csrc/deepep/ops2/cmake/util/ascendc_replay_build.py
    • Removed the file, removing the AscendC replay build script.
  • csrc/deepep/ops2/cmake/util/batch_replay_impl.temp
    • Removed the file, removing a batch replay implementation template.
  • csrc/deepep/ops2/cmake/util/code_channel_infer.py
    • Removed the file, removing the code channel inference script.
  • csrc/deepep/ops2/cmake/util/const_var.py
    • Removed the file, removing constant variable definitions.
  • csrc/deepep/ops2/cmake/util/gen_impl_and_mrege_json.sh
    • Removed the file, removing a script for generating and merging JSON.
  • csrc/deepep/ops2/cmake/util/gen_ops_filter.sh
    • Removed the file, removing an operator filter generation script.
  • csrc/deepep/ops2/cmake/util/gen_version_info.sh
    • Removed the file, removing a version information generation script.
  • csrc/deepep/ops2/cmake/util/insert_op_info.py
    • Removed the file, removing a script for inserting operator information.
  • csrc/deepep/ops2/cmake/util/insert_simplified_keys.py
    • Removed the file, removing a script for inserting simplified keys.
  • csrc/deepep/ops2/cmake/util/kernel_entry.py
    • Removed the file, removing a kernel entry generation script.
  • csrc/deepep/ops2/cmake/util/kernel_impl.temp
    • Removed the file, removing a kernel implementation template.
  • csrc/deepep/ops2/cmake/util/merge_aicpu_info_json.sh
    • Removed the file, removing a script for merging AICPU info JSON.
  • csrc/deepep/ops2/cmake/util/opdesc_parser.py
    • Removed the file, removing the operator description parser script.
  • csrc/deepep/ops2/cmake/util/parse_ini_to_json.py
    • Removed the file, removing the INI to JSON parsing script.
  • csrc/deepep/ops2/cmake/util/preset_parse.py
    • Removed the file, removing the preset parsing script.
  • csrc/deepep/ops2/cmake/util/replay_codegen.py
    • Removed the file, removing the replay code generation script.
  • csrc/deepep/ops2/cmake/util/replay_impl.temp
    • Removed the file, removing a replay implementation template.
  • csrc/deepep/ops2/cmake/util/tiling_data_def_build.py
    • Removed the file, removing the tiling data definition build script.
  • csrc/deepep/ops2/func.cmake
    • Renamed from csrc/deepep/ops2/cmake/func.cmake to csrc/deepep/ops2/func.cmake.
  • csrc/deepep_cmake_build.sh
    • Added a new top-level shell script to set up environment variables and orchestrate the build of 'deepep' modules based on command-line arguments.
Activity
  • The pull request was created by 1329009851 to add scripts for building CMake files.
  • The changes introduce a new CMake-based build system for the 'deepep' module, including new configuration and orchestration scripts.
  • Numerous old build utility scripts and CMake modules have been removed, indicating a significant refactoring and cleanup of the build infrastructure.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces scripts and configuration for building CMake files for the deepep module. The changes include new build scripts, CMake configuration updates, and the removal of old build utility scripts.

My review has identified a few areas for improvement:

  • A critical issue with potential infinite recursion in one of the new build scripts.
  • Some unused variables in another script that affect readability.
  • Code duplication in CMake files which could be refactored for better maintainability.

Overall, the move towards a more standard CMake-based build system is a positive change. Addressing the identified issues will make the new build process more robust and easier to maintain.

…-npu into sgl-cmake2

* 'sgl-cmake2' of https://github.com/1329009851/sgl-kernel-npu:
  CI execution requirements for separating a2 and a3 (sgl-project#367)
  Fix the bug that total expert num greater than 256 or local expert num is less than 8 (sgl-project#364)
  adapt ant moving to A2 single machine (sgl-project#362)
  reset ci -- run test mixed running for experts on a2. (sgl-project#365)
  Revert "Build the deepep package with the chip model included. (sgl-project#274)" (sgl-project#363)
  fix:buffer control (sgl-project#361)
  Build the deepep package with the chip model included. (sgl-project#274)
  bugfix wrong packages build dir (sgl-project#360)
  bump version to 2026.02.01 (sgl-project#359)
  Cover the workflows cases on a3 (sgl-project#321)
  release follows naming convention (sgl-project#356)
  Modify notifydispatch to support DEEPEP_NORMAL_LONG_SEQ_ROUND up to 128. (sgl-project#352)
  fix the hanging bug (sgl-project#355)
  [Bugfix] Fix build script working with cann 8.5.0 (sgl-project#354)
  Modify the description of DeepEP in the README file. (sgl-project#348)
  Revert "Add scripts for building CMake files (sgl-project#344)" (sgl-project#353)
  Add scripts for building CMake files (sgl-project#344)
  Support x86_64 and aarch64 binary release (sgl-project#325)
  add function for deep-ep tests (sgl-project#301)
  [Doc] Improved README.md content and English grammar and integrated the DeepWiki badge for Ask AI (sgl-project#345)
@1329009851
Contributor Author

1329009851 commented Feb 26, 2026

The git status output before and after building with CANN 8.2, 8.3, and 8.5 is as follows.

Before the build:
Untracked files:
    .github/
    .gitignore
    .pre-commit-config.yaml

After the build:
Changes not staged for commit:
    modified: csrc/deepep/ops/op_kernel/moe_distribute_base.h

Untracked files:
    .github/
    .gitignore
    .pre-commit-config.yaml

The build modifies moe_distribute_base.h. The cause has not been identified yet and will be investigated later.

This change has been verified on the A2, A3, and A5 environments.
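
The before/after comparison described above could be scripted so that files touched by the build are reported automatically. The following is a minimal sketch, not part of this PR: `status_delta` is a hypothetical helper name, and it assumes the two snapshots were captured with `git status --porcelain`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): print the status lines that are
# present after a build but were absent before it.
#   $1: status text captured before the build
#   $2: status text captured after the build
status_delta() {
    local before_file after_file
    before_file=$(mktemp)
    after_file=$(mktemp)
    # comm requires sorted input, so sort both snapshots the same way.
    printf '%s\n' "$1" | sort > "$before_file"
    printf '%s\n' "$2" | sort > "$after_file"
    # -13 hides lines unique to "before" and lines common to both,
    # leaving only what the build introduced.
    comm -13 "$before_file" "$after_file"
    rm -f "$before_file" "$after_file"
}

# Intended use around a build (commented out so the sketch has no side effects):
#   before=$(git status --porcelain)
#   bash build.sh -a deepep
#   after=$(git status --porcelain)
#   status_delta "$before" "$after"
```

With the snapshots from this comment, the only line reported would be the modified moe_distribute_base.h entry, which makes this kind of regression easy to catch in CI.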

@1329009851
Contributor Author

Build command parameters (for compiling DeepEP):
bash build.sh {operator version} -a deepep/deepep2
Example: bash build.sh Ascend910_9382 -a deepep
If the operator version is omitted, the default for the selected module is used: Ascend910_9382 for deepep and Ascend910B1 for deepep2.
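
The defaulting rule described above can be sketched as follows. This is an illustration only, not the code in this PR: `default_soc_version` and `resolve_soc_version` are hypothetical names, and the argument handling is an assumption based solely on the command form shown in this comment.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a module name to the default stated in this comment.
default_soc_version() {
    case "$1" in
        deepep)  echo "Ascend910_9382" ;;
        deepep2) echo "Ascend910B1" ;;
        *)       echo "unknown module: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: resolve_soc_version [operator_version] -a <module>
# Prints the explicit version if one was given, otherwise the module default.
resolve_soc_version() {
    local soc="" module="" default
    while [ $# -gt 0 ]; do
        case "$1" in
            -a)
                # Guard against "-a" with no value so the loop cannot spin.
                [ $# -ge 2 ] || { echo "-a requires a module name" >&2; return 1; }
                module="$2"
                shift 2
                ;;
            *)
                soc="$1"
                shift
                ;;
        esac
    done
    [ -n "$module" ] || { echo "missing -a <module>" >&2; return 1; }
    # Validate the module even when an explicit version is supplied.
    default=$(default_soc_version "$module") || return 1
    echo "${soc:-$default}"
}
```

For example, `resolve_soc_version -a deepep2` prints the deepep2 default, while `resolve_soc_version Ascend910_9382 -a deepep` keeps the explicit value.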

Collaborator

@Yael-X Yael-X left a comment

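The overall direction is good, and this greatly simplifies the build system. A few suggestions for reference:

1. Duplicated CANN version detection code

ops/CMakeLists.txt and ops2/CMakeLists.txt contain 30+ lines of identical CANN version detection logic:

set(CANN82_PATH "${ASCEND_CANN_PACKAGE_PATH}/include/experiment/platform/platform/platform_infos_def.h")
set(CANN83_PATH "${ASCEND_CANN_PACKAGE_PATH}/include/platform/platform_infos_def.h")
# ... 30+ lines

Suggestion: extract this into cmake/cann_version.cmake and include() it from both files, so the two copies cannot drift apart during future maintenance.

2. Stronger error handling in build.sh

csrc/deepep_cmake_build.sh does not check for errors when it calls the sub-script:

echo "./deepep/build.sh $@"
./deepep/build.sh $@
# Suggested addition:
if [ $? -ne 0 ]; then
    echo "ERROR: deepep build failed"
    exit 1
fi

3. Consistent log output

The echo "Use SOC_VERSION: $SOC_VERSION" in build.sh is good. Consider adding a similar log line at the start of the BuildAscendProj function in compile_ascend_proj.sh, so that the chip model actually in use is easy to trace when debugging.

4. Comment on the A2/A3/A5 chip mapping

Consider adding a comment near build.sh:84-88 that states explicitly:

  • deepep → A3+ (Ascend910_9382)
  • deepep2 → A2 (Ascend910B1)
  • A5 is not open-sourced yet

This makes the logic behind the default values easier for future maintainers to understand.


These are all nice-to-have suggestions and do not block merging. 👍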

message(FATAL_ERROR "Unsupported host processor: ${CMAKE_SYSTEM_PROCESSOR}. Please specify a valid architecture.")
endif()

set(CANN82_PATH "${ASCEND_CANN_PACKAGE_PATH}/include/experiment/platform/platform/platform_infos_def.h")
Collaborator

Consider extracting this CANN version detection logic into a shared cmake/cann_version.cmake file and including it from both ops/CMakeLists.txt and ops2/CMakeLists.txt. This avoids code duplication (30+ identical lines) and ensures consistency when updating CANN version support in the future.
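
To make the probe concrete, here is a rough shell analogue of the header check, assuming the two header paths quoted in the review are the only signals used. It is a sketch rather than the CMake logic in this PR: `detect_cann_layout` is a hypothetical name, and both the check order and the printed labels are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical shell analogue of the header probe (the PR does this in CMake).
#   $1: CANN package root, i.e. the value of ASCEND_CANN_PACKAGE_PATH
# Prints "8.3" or "8.2" depending on which header layout is found.
detect_cann_layout() {
    local cann_root="$1"
    # Assumption: the newer layout is checked first.
    if [ -f "$cann_root/include/platform/platform_infos_def.h" ]; then
        echo "8.3"
    elif [ -f "$cann_root/include/experiment/platform/platform/platform_infos_def.h" ]; then
        echo "8.2"
    else
        echo "platform_infos_def.h not found under $cann_root/include" >&2
        return 1
    fi
}
```

Keeping the probe in one place, whether as a shared CMake module or a helper like this, means a future CANN layout change needs only one edit.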

export BUILD_TYPE="Release"
MODULE_NAME="all"
MODULE_BUILD_ARG=""
IS_MODULE_EXIST=0
Collaborator

Consider adding error handling after calling ./deepep/build.sh. If the build fails, the script should exit with an error code:

./deepep/build.sh $@
if [ $? -ne 0 ]; then
    echo "ERROR: deepep build failed"
    exit 1
fi
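
One possible way to generalize this check is a small wrapper that logs the command, runs it, and propagates its exit code. This is a sketch under assumptions, not code from the PR: `run_step` is a hypothetical name. It also passes arguments as `"$@"`, which keeps arguments containing spaces intact, whereas unquoted `$@` would split them.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: echo the command, run it, and report a failure
# together with the original exit code.
run_step() {
    local rc=0
    echo "+ $*"
    # "|| rc=$?" captures the status without tripping "set -e" in the caller.
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "ERROR: '$*' failed with exit code $rc" >&2
        return "$rc"
    fi
}

# Intended use in a build script (commented out so the sketch has no side effects):
#   run_step ./deepep/build.sh "$@" || exit 1
```

Returning the original exit code instead of a fixed 1 lets callers and CI logs distinguish between failure modes of the sub-script.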

local soc_version=$2

if [ -d "./${proj_name}" ]; then
rm -rf ${proj_name}/cmake
Collaborator

Consider adding a log statement at the beginning of this function to show which SOC version is being used, similar to the echo "Use SOC_VERSION: $SOC_VERSION" in build.sh. This helps with debugging:

BuildAscendProj() {
  local soc_version=$2
  echo "[${FUNCNAME[0]}] Building for SOC: $soc_version"
  # ... rest of the function

export DEBUG_MODE=$DEBUG_MODE

SOC_VERSION="${1:-Ascend910_9382}"
if [[ "$BUILD_DEEPEP_OPS" == "ON" ]]; then
Collaborator

Consider adding a comment here to clarify the chip mapping logic for future maintainers:

# Chip mapping:
# - deepep  → A3+ (Ascend910_9382)
# - deepep2 → A2  (Ascend910B1)
# - A5 is not open-sourced yet

This makes it clearer why different default SOC_VERSION values are used based on the BUILD_DEEPEP_OPS flag.

3 participants