[Feature] Multi Modal Models Support #476

@sii-xinglong

SGLang-Jax multimodal models support

If you run into any problems, raise them in #488.

Goal

  1. Design and implement inference sequences for multimodal models.
  2. Implement the Wan model and ensure its correctness.
  3. Implement the Mimo-Audio model and ensure its correctness.
  4. Implement the Qwen2.5-VL model and ensure its correctness.

Plans

Research Doc

We can use a Google Doc to record the research notes.

Design Doc

Host Component @SII-limingliu @pathfinder-pf

Device Component @zkkython @SiqiLi-Fighting

Work Assignment

  1. Implement the Wan baseline model in bonsai and ensure its correctness. @labyrinth-ssr @Iamleos
  2. Implement the Mimo-Audio baseline model in bonsai and ensure its correctness. @Mozoltov821 @SiqiLi-Fighting

TODO

Review

TODO

Test

TODO

Benchmark & Profile

TODO

Community Members

SII-Team: @SII-limingliu @yangdian96 @liao1995 @lianga1

SGLang-Jax Team: @zkkython @pathfinder-pf @SiqiLi-Fighting @JamesBrianD

Discord
