docker: add manifest to versioned docker releases #11268

Merged
zhyncs merged 2 commits into main from ishan/more-unif on Oct 6, 2025

@ishandhanani
Collaborator

The goal is that running docker pull lmsysorg/sglang:v0.5.3 gives you the image that matches your machine's architecture. I also added a latest tag. @zhyncs
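For readers unfamiliar with multi-arch manifests, here is a minimal sketch of the idea, not the workflow from this PR. It prints (rather than runs) a docker buildx imagetools create command that groups per-architecture images under one versioned tag. The -amd64 and -arm64 source tags and the manifest_cmd helper are hypothetical names chosen for illustration.

```shell
# Sketch only: the -amd64 / -arm64 per-architecture tags below are
# hypothetical, not the tags the sglang release workflow actually pushes.
manifest_cmd() {
  version="$1"
  repo="lmsysorg/sglang"
  # `imagetools create` combines existing images under a single tag, so
  # `docker pull` can resolve to the image for the host platform.
  printf 'docker buildx imagetools create -t %s:v%s %s:v%s-amd64 %s:v%s-arm64\n' \
    "$repo" "$version" "$repo" "$version" "$repo" "$version"
}

# Dry run: print the command instead of executing it.
manifest_cmd 0.5.3
```

Printing the command first makes it easy to review the tags before anything is pushed to the registry.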

@gemini-code-assist
Contributor

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

@kyleliang-nv
Contributor

Can you also fix https://github.com/sgl-project/sglang/blob/main/docker/Dockerfile#L102 to add another OR condition for all_aarch64?
This line (https://github.com/sgl-project/sglang/blob/main/docker/Dockerfile#L115) should also be changed to check whether BUILD_TYPE is blackwell, blackwell_aarch64, or all_aarch64.
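A check of the shape requested above could look roughly like this. It is a sketch only: the build_type_matches helper is a hypothetical name, and the actual Dockerfile lines may express the condition differently.

```shell
# Hypothetical helper: succeeds when the given BUILD_TYPE is one of the
# three values named in the comment above.
build_type_matches() {
  case "$1" in
    blackwell|blackwell_aarch64|all_aarch64) return 0 ;;
    *) return 1 ;;
  esac
}

# Example guard; defaults to all_aarch64 when BUILD_TYPE is not set.
BUILD_TYPE="${BUILD_TYPE:-all_aarch64}"
if build_type_matches "$BUILD_TYPE"; then
  echo "BUILD_TYPE=$BUILD_TYPE takes the guarded path"
fi
```

A case statement keeps the list of accepted values in one place, which is easier to extend than a chain of OR conditions.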

@kyleliang-nv
Contributor

kyleliang-nv commented Oct 6, 2025

Another issue: this step (https://github.com/sgl-project/sglang/blob/main/docker/Dockerfile#L143-L149) will fail if CUDA_VERSION=12.6.1.
You will want to set the env var FLASH_MLA_DISABLE_SM100=1 when CUDA_VERSION is lower than 12.9.1; otherwise the build fails with an error like this:

#22 4.375 Processing /sgl-workspace/flash-mla
#22 4.375   Preparing metadata (setup.py): started
#22 4.375   Running command python setup.py egg_info
#22 4.833   /usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
#22 4.833     import pynvml  # type: ignore[import]
#22 5.573   W1006 05:49:58.407000 114 torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
#22 5.612   Traceback (most recent call last):
#22 5.612     File "<string>", line 2, in <module>
#22 5.612     File "<pip-setuptools-caller>", line 35, in <module>
#22 5.612     File "/sgl-workspace/flash-mla/setup.py", line 93, in <module>
#22 5.612       ] + get_features_args() + get_arch_flags() + get_nvcc_thread_args(),
#22 5.612                                 ^^^^^^^^^^^^^^^^
#22 5.612     File "/sgl-workspace/flash-mla/setup.py", line 39, in get_arch_flags
#22 5.613       assert DISABLE_SM100, "sm100 compilation for Flash MLA requires NVCC 12.9 or higher. Please set FLASH_MLA_DISABLE_SM100=1 to disable sm100 compilation, or update your environment."
#22 5.613              ^^^^^^^^^^^^^
#22 5.613   AssertionError: sm100 compilation for Flash MLA requires NVCC 12.9 or higher. Please set FLASH_MLA_DISABLE_SM100=1 to disable sm100 compilation, or update your environment.
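One way to apply the suggested workaround is to compare CUDA_VERSION against 12.9.1 before building flash-mla. This is a minimal sketch, not the fix that landed in the Dockerfile: version_lt is a hypothetical helper, and it assumes a sort that supports -V (as GNU coreutils does).

```shell
# Succeeds when version $1 is strictly lower than version $2.
# Assumes `sort -V` (version sort) is available.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example: disable sm100 compilation for older CUDA toolkits.
CUDA_VERSION="${CUDA_VERSION:-12.6.1}"
if version_lt "$CUDA_VERSION" "12.9.1"; then
  export FLASH_MLA_DISABLE_SM100=1
fi
echo "FLASH_MLA_DISABLE_SM100=${FLASH_MLA_DISABLE_SM100:-unset}"
```

Version sort is used instead of a plain string comparison so that a value such as 12.10.0 is correctly treated as newer than 12.9.1.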

docker buildx build --platform linux/arm64 --push -f docker/Dockerfile --build-arg CUDA_VERSION=${{ matrix.variant.cuda_version }} --build-arg BUILD_TYPE=${{ matrix.variant.build_type }} -t lmsysorg/sglang:${tag}${tag_suffix} --no-cache .
# Create versioned manifest
docker buildx imagetools create \
-t lmsysorg/sglang:v${version}-cu129 \
Collaborator

We don't need the -cu129 suffix.

Contributor

@merrymercy merrymercy Oct 6, 2025

Maybe still keep it somewhere?
Later we might have -cu129 and -cu130 variants, which cannot be unified.

But for lmsysorg/sglang:latest, we can remove it and have a default CUDA version.

Collaborator

For example:
lmsysorg/sglang:v0.5.3 for cu129
lmsysorg/sglang:v0.5.3-cu130 for cu130

How about this?

Collaborator Author

This makes sense. I think the default for almost all users will be cu129, so we don't need the suffix there. As we push on cu130, let's keep it tagged as such.
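The naming scheme discussed in this thread can be sketched as a small helper. This is illustrative only: image_tag and DEFAULT_CUDA are hypothetical names, and the release workflow may compute its tags differently.

```shell
# The CUDA variant that gets the bare version tag (cu129 per this thread).
DEFAULT_CUDA="cu129"

# $1 = version (e.g. 0.5.3), $2 = CUDA variant (e.g. cu130)
image_tag() {
  if [ "$2" = "$DEFAULT_CUDA" ]; then
    echo "lmsysorg/sglang:v$1"
  else
    echo "lmsysorg/sglang:v$1-$2"
  fi
}

image_tag 0.5.3 cu129   # prints lmsysorg/sglang:v0.5.3
image_tag 0.5.3 cu130   # prints lmsysorg/sglang:v0.5.3-cu130
```

Keeping the default variant in a single variable means a future change of default only touches one line.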

Collaborator

I am actually worried that this may confuse users. For example, this image can also be used on Hopper machines with cu126/cu128/cu129, not just cu129.

@zhyncs zhyncs merged commit 73ea484 into main Oct 6, 2025
23 of 24 checks passed
@zhyncs zhyncs deleted the ishan/more-unif branch October 6, 2025 21:53
ch-tiger1 pushed a commit to ch-tiger1/sglang that referenced this pull request Oct 9, 2025
lpc0220 pushed a commit to lpc0220/sglang that referenced this pull request Oct 29, 2025