
add back flashinfer jit cache to dev docker #12851

Merged
Fridge003 merged 1 commit into main from brayden/add-flashinfer-jit-cache
Nov 7, 2025

Conversation

@b8zhong
Collaborator

@b8zhong b8zhong commented Nov 7, 2025

For ease of development, and because the dev docker image has no requirement to stay slim.

@gemini-code-assist
Contributor

Summary of Changes

Hello @b8zhong, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request reintroduces the flashinfer-jit-cache into the development Docker image to enhance the developer experience. It achieves this by adding a new build argument, INSTALL_FLASHINFER_JIT_CACHE, which allows for the conditional installation of the cache. This change acknowledges that development environments prioritize convenience over strict size optimization, thereby streamlining the compilation process for FlashInfer components.

Highlights

  • Conditional FlashInfer JIT Cache Installation: The flashinfer-jit-cache package is now conditionally installed in the development Docker image, enabling faster compilation for FlashInfer operations during development.
  • New Build Argument for Control: A new build argument, INSTALL_FLASHINFER_JIT_CACHE, has been introduced in the Dockerfile, allowing users to explicitly enable or disable the installation of the JIT cache.
  • Enhanced Development Experience: This change aims to improve the development workflow by providing the FlashInfer JIT cache in the development environment, leveraging the fact that development Docker images do not have strict size constraints.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/release-docker-dev.yml
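
The gating described in this summary can be sketched as a small shell function. This is a dry-run illustration under stated assumptions, not the Dockerfile from the PR: `jit_cache_step` is a hypothetical helper, the fallback value of `0` and the sample `CUINDEX` of `129` are assumptions, and the function only prints the pip command instead of running it.

```shell
#!/bin/sh
# Dry-run sketch: prints the pip command instead of executing it.
# Assumptions (not taken from the PR): the default of "0" and the sample CUINDEX.

jit_cache_step() {
    install_flag="${1:-0}"   # stands in for INSTALL_FLASHINFER_JIT_CACHE
    cuindex="$2"             # stands in for CUINDEX, e.g. 129

    if [ "$install_flag" = "1" ]; then
        # Same package pin and index URL as the line under review in this PR.
        echo "python3 -m pip install flashinfer-jit-cache==0.5.0 --index-url https://flashinfer.ai/whl/cu${cuindex}"
    else
        echo "skipping flashinfer-jit-cache"
    fi
}

jit_cache_step 1 129   # flag enabled: prints the install command
jit_cache_step 0 129   # flag disabled: prints the skip message
```

In an actual image build the flag would typically be supplied as `docker build --build-arg INSTALL_FLASHINFER_JIT_CACHE=1 ...`; whether the dev workflow does exactly this is not shown on this page.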

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds back the installation of flashinfer-jit-cache to the Dockerfile, controlled by a new build argument INSTALL_FLASHINFER_JIT_CACHE. This is useful for development environments. My review identifies a critical syntax error in the shell command that will break the build, and also suggests an improvement for robustness. Please see the detailed comment.

fi \
&& python3 -m pip install -e "python[${BUILD_TYPE}]" --extra-index-url https://download.pytorch.org/whl/cu${CUINDEX} \
&& if [ "$INSTALL_FLASHINFER_JIT_CACHE" = "1" ]; then \
python3 -m pip install flashinfer-jit-cache==0.5.0 --index-url https://flashinfer.ai/whl/cu${CUINDEX} ; \

critical

This line has two issues:

  1. Syntax Error: The trailing backslash \ is incorrect. It escapes the newline, which will cause the if statement to be malformed and break the build. It should be removed.

  2. Robustness: Using --index-url replaces the default package index. If flashinfer-jit-cache has dependencies that are not on the flashinfer.ai index, this step will fail. Using --extra-index-url is safer as it adds the new index without removing the default one.

The suggested code below fixes both issues.

      python3 -m pip install flashinfer-jit-cache==0.5.0 --extra-index-url https://flashinfer.ai/whl/cu${CUINDEX} ;
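
To make the second point concrete, here is a minimal dry-run sketch that prints the two pip invocations side by side. The helper names `index_only_cmd` and `extra_index_cmd` and the sample `CUINDEX` value are assumptions for illustration; only the package pin and the index URL come from the lines under review.

```shell
#!/bin/sh
# Dry-run sketch: prints each pip invocation instead of running it.
# The fallback CUINDEX value is a sample, not taken from the PR.
CUINDEX="${CUINDEX:-129}"
FLASHINFER_INDEX="https://flashinfer.ai/whl/cu${CUINDEX}"

# --index-url: the flashinfer index replaces the default index (PyPI),
# so every dependency must also be resolvable from that index.
index_only_cmd() {
    echo "python3 -m pip install flashinfer-jit-cache==0.5.0 --index-url ${FLASHINFER_INDEX}"
}

# --extra-index-url: the flashinfer index is consulted in addition to
# the default index, so dependencies can still come from PyPI.
extra_index_cmd() {
    echo "python3 -m pip install flashinfer-jit-cache==0.5.0 --extra-index-url ${FLASHINFER_INDEX}"
}

index_only_cmd
extra_index_cmd
```

With `--extra-index-url`, pip considers candidates from both indexes, which is the trade-off to weigh against the stricter `--index-url` form.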

@b8zhong
Collaborator Author

b8zhong commented Nov 7, 2025

Ready to merge once this run is green: https://github.com/sgl-project/sglang/actions/runs/19182192233

@b8zhong
Collaborator Author

b8zhong commented Nov 7, 2025

[Screenshot, 2025-11-07 at 2:48:24 PM]

@Fridge003 Fridge003 merged commit 55e8e39 into main Nov 7, 2025
30 checks passed
@Fridge003 Fridge003 deleted the brayden/add-flashinfer-jit-cache branch November 7, 2025 22:51
@sglang-bot
Member

This failed the nightly dev image build. Can you take a look? We also need a more automated way to maintain the version in two places.

https://github.com/sgl-project/sglang/actions/runs/19397619609/job/55499873230#step:7:4792

3 participants

Comments