[Docs] Add docs for Qwen3-VL image and video support#12554
[Docs] Add docs for Qwen3-VL image and video support#12554hnyls2002 merged 6 commits intosgl-project:mainfrom
Conversation
Summary of ChangesHello @adarshxs, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces comprehensive documentation for integrating and utilizing the Qwen3-VL multimodal large language model within SGLang. The new guide covers the process of launching the model and provides practical code examples for sending both image and video input requests, thereby enhancing SGLang's support for advanced multimodal capabilities and making it easier for users to leverage these features. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds documentation for the Qwen3-VL model family, covering image and video input support. The new documentation page is well-structured and provides clear examples for launching the server and sending requests. I've identified a small but critical issue in the server launch command example that would cause it to fail. The suggested fix will ensure the command is runnable as-is.
| To serve the model: | ||
|
|
||
| ```bash | ||
| python3 -m sglang.launch_server \ |
There was a problem hiding this comment.
We could mention:
python3 -m sglang.launch_server --model-path Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 --tp 8 --ep 8
There was a problem hiding this comment.
thanks, but sorry for the confusion, I meant we should list commands for both models😂
|
And also, please make the examples in this doc runs ok |
|
Yes i've ran these examples for both video and image and they work well |
merrymercy
left a comment
There was a problem hiding this comment.
- list usage for both fp8 and non fp8
- list the command for more hardware (A100, H200)
- try to match the vllm repice https://docs.vllm.ai/projects/recipes/en/latest/Qwen/Qwen3-VL.html
Updated launch commands and added hardware-specific recommendations for Qwen3-VL model in SGLang documentation.
Added standalone page for Qwen3 VL model family.
cc @mickqian