Fix model loading from S3 buckets #11865
ganochenkodg wants to merge 21 commits into sgl-project:main from
Conversation
Summary of Changes

Hello @ganochenkodg, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves an issue where models failed to load from S3 buckets due to incorrect local file path generation. The fix involves a precise adjustment to how file paths are handled when downloading from S3, preventing files from being inadvertently saved to the root directory. Additionally, it updates the Docker build process to include essential S3 client libraries, ensuring the application has all required components for S3 integration.
Code Review
This pull request addresses a critical bug in the SGLang framework that prevented models from being loaded correctly from S3 buckets. The issue stemmed from an incorrect path manipulation within the pull_files function, leading to files being saved in the wrong directory. The fix involves correcting the file.removeprefix call to properly construct the destination file path. Additionally, the PR adds necessary S3 dependencies to the Docker image to ensure the framework can interact with S3 storage. I have identified a critical issue in the pull_files function that needs to be addressed to ensure correct functionality.
```python
# pull_files loop in python/sglang/srt/connector/s3.py, as reviewed:
for file in files:
    destination_file = os.path.join(self.local_dir, file.removeprefix(base_dir))
```
The original code file.removeprefix(base_dir) can lead to incorrect file paths if base_dir does not end with a /. This can cause files to be saved in the root directory instead of the intended subdirectory. The corrected code file.removeprefix(f"{base_dir}/") ensures that the prefix always includes the trailing slash, preventing this issue.
It's important to ensure that base_dir always ends with a / to avoid unexpected behavior. If base_dir is guaranteed to always end with a /, then the change is correct. However, if there's a possibility that base_dir might not end with a /, it would be safer to add a check to ensure that it does before calling removeprefix.
I'm marking this as critical because the bug causes all files to be saved in the root folder, which is a major issue.
Suggested change:
```diff
- destination_file = os.path.join(self.local_dir, file.removeprefix(base_dir))
+ destination_file = os.path.join(self.local_dir, file.removeprefix(f"{base_dir}/"))
```
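For illustration, here is a minimal sketch of the defensive variant the review suggests, normalizing `base_dir` before stripping it. `destination_path` is a hypothetical helper invented for this example; the names mirror the `pull_files` loop, but this is not the code the PR ships:

```python
import os

def destination_path(local_dir: str, file: str, base_dir: str) -> str:
    # Normalize so the stripped prefix always ends with "/"; otherwise the
    # remainder starts with "/" and os.path.join discards local_dir entirely.
    prefix = base_dir if base_dir.endswith("/") else f"{base_dir}/"
    return os.path.join(local_dir, file.removeprefix(prefix))

# The file lands under local_dir instead of the filesystem root:
assert destination_path("/tmp/model", "models/llama/config.json", "models/llama") == "/tmp/model/config.json"
```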
To fully support loading from S3 buckets, the call to create_remote_connector needs the device parameter as well. This is because the device parameter is passed only when the model weights are loaded, but not in the other calls.
@noa-neria is it really necessary to fix this concrete bug? I tested this fix with sglang 0.5.2 and it works fine; you can find my image here: https://hub.docker.com/layers/dganochenko/sglang/v0.5.2/

Yes, it is necessary.

@noa-neria Done, I fixed it and tested with S3 and Hugging Face (S3 logs, HF logs).

@merrymercy @hnyls2002 can you please help review?
```python
# ModelConfig.__init__ parameters under review:
    model_impl: Union[str, ModelImpl] = ModelImpl.AUTO,
    sampling_defaults: str = "openai",
    quantize_and_serve: bool = False,
    device: Optional[str] = None,
```
Is there a way to avoid adding this to model config? @ganochenkodg
Edit: disregard this comment.
@ganochenkodg, why would you need this device config?
@hnyls2002 In the ModelConfig init function we run self._maybe_pull_model_tokenizer_from_remote() (line 116), which executes create_remote_connector, and that requires 2 params: model_path and device (line 812). I can take this device variable in the from_server_args method (line 240), but I must store it somewhere to pass it into create_remote_connector.
I can remove storing device as a model parameter and pass it to the pull_model_tokenizer function directly, like this:
```diff
- self.device = device
- self._maybe_pull_model_tokenizer_from_remote()
+ self._maybe_pull_model_tokenizer_from_remote(device)
```
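For context, a minimal sketch of how that plumbing fits together. This is not the actual sglang code: ModelConfig is heavily simplified, and create_remote_connector here is a stand-in with only the two parameters mentioned above:

```python
from typing import Optional

def create_remote_connector(model_path: str, device: Optional[str]):
    """Stand-in for the create_remote_connector referenced above,
    which requires both model_path and device."""
    ...

class ModelConfig:
    def __init__(self, model_path: str, device: Optional[str] = None):
        self.model_path = model_path
        # Forward device to the remote pull instead of storing it on self:
        self._maybe_pull_model_tokenizer_from_remote(device)

    def _maybe_pull_model_tokenizer_from_remote(self, device: Optional[str]) -> None:
        if self.model_path.startswith("s3://"):
            # Without device, create_remote_connector crashes (see above).
            connector = create_remote_connector(self.model_path, device)
```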
please do it, the device parameter should not be stored in model config.
@ganochenkodg It still seems to be in ModelConfig?
@b8zhong I left it in the parameters because I have no choice. We don't store it in self.device now, we only pass it along. The ModelConfig init function calls self._maybe_pull_model_tokenizer_from_remote (here), which calls create_remote_connector (here), which requires device as a parameter. So I added passing the device parameter to the init function in from_server_args (here); I can't find any other way to obtain device other than passing it here. Without device, create_remote_connector obviously crashes.
docker/Dockerfile
Outdated
```dockerfile
    && rm -rf /root/.cargo /root/.rustup target dist ~/.cargo

# Install S3 dependencies
RUN python3 -m pip install --no-cache-dir --break-system-packages \
```
Why break system packages?
I always add this to avoid warnings from pip. Looks like it's not needed here; fixed.
/tag-and-rerun-ci

@b8zhong @hnyls2002 @ganochenkodg The Model Streamer team just opened PR #17948 for a solid integration with runai model streamer, as was done in vLLM.

Closing because of the updated PR. But thanks @ganochenkodg
Motivation
The goal of this PR is to fix loading models from S3 buckets. The code contains a bug in the iterator function: https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/connector/s3.py#L104
file.removeprefix leaves a / at the beginning of the filename, which causes issues during os.path.join execution. The Python documentation for os.path.join says: "If any component is an absolute path, all previous components (on Windows, including the previous drive letter, if there was one) are thrown away, and joining continues." With this bug, all files from S3 end up saved in the root folder, and SGLang can't find them during model loading:
ValueError: Unrecognized model in /tmp/tmpgtn2cumr. Should have a model_type key in its config.json, or contain one of the following strings in its name: list of model types
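A minimal repro of the path behavior described above (the directory names are illustrative; removeprefix requires Python 3.9+):

```python
import os

base_dir = "models/llama"
file = "models/llama/config.json"

# Without the trailing slash, a leading "/" survives, and os.path.join
# throws away the local directory, so the file lands in the root folder:
bad = file.removeprefix(base_dir)              # "/config.json"
print(os.path.join("/tmp/tmpgtn2cumr", bad))   # "/config.json"

# With the fix, the relative name is preserved:
good = file.removeprefix(f"{base_dir}/")       # "config.json"
print(os.path.join("/tmp/tmpgtn2cumr", good))  # "/tmp/tmpgtn2cumr/config.json"
```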
Modifications
This PR fixes the path.join call and also adds the S3-related Python dependencies to the Docker image, because for some reason they are not present in the current images.
Accuracy Tests
Benchmarking and Profiling
Checklist