[BugFix] fix VL fp8 bug when moe token_num is 0 by ming1753 · Pull Request #4928 · PaddlePaddle/FastDeploy

ming1753 · 2025-11-10T10:15:06Z

Motivation

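A multimodal model may receive text-only input (no images) or image-only input (no text). In that case the image MoE or the text MoE receives an input with token_num=0, which the Triton operator does not support.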

Modifications

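One option is to concatenate a token vector of length 1 so that the kernel is never skipped, then slice the result back to its true length after the computation. The advantage is that prefill could then, in theory, run under CUDA Graph; the drawback is a slight increase in GPU memory usage.
Since prefill does not need to run under CUDA Graph in the near term, this PR instead skips execution when token_num=0.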
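The two options discussed above can be illustrated with a minimal pure-Python sketch. This is not the code changed in this PR: `fake_moe_kernel`, `moe_skip_when_empty`, and `moe_pad_then_slice` are hypothetical names, the stand-in kernel only mimics an operator that rejects empty input, and padding unconditionally (rather than only when the input is empty) is an assumption.

```python
from typing import Callable, List

# A batch of tokens with shape [token_num, hidden_size], modelled as nested lists.
Tokens = List[List[float]]


def fake_moe_kernel(x: Tokens) -> Tokens:
    """Hypothetical stand-in for the Triton MoE kernel: it rejects empty input."""
    if len(x) == 0:
        raise ValueError("kernel launched with token_num == 0")
    # Placeholder computation; the real kernel would run the expert FFNs.
    return [[2.0 * v for v in row] for row in x]


def moe_skip_when_empty(
    x: Tokens, kernel: Callable[[Tokens], Tokens] = fake_moe_kernel
) -> Tokens:
    """Skip-on-empty approach: return an empty result without launching the kernel."""
    if len(x) == 0:
        return []
    return kernel(x)


def moe_pad_then_slice(
    x: Tokens,
    hidden_size: int,
    kernel: Callable[[Tokens], Tokens] = fake_moe_kernel,
) -> Tokens:
    """Alternative approach: append one dummy token so the kernel always runs,
    then slice the output back to the true token count."""
    token_num = len(x)
    padded = x + [[0.0] * hidden_size]  # the extra row costs a little memory
    out = kernel(padded)
    return out[:token_num]
```

In this sketch both functions return the same result for any input; the difference is that the padded variant always launches the kernel, which is what would keep the execution path static for graph capture, at the cost of one extra token of memory.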

Usage or Command

None

Accuracy Tests

None

Checklist

- Add at least one tag to the PR title.
  - Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
  - You can add new tags based on the PR content, but the semantics must be clear.
- Format your code and run pre-commit before committing.
- Add unit tests. If no unit tests are added, state the reason in this PR.
- Provide accuracy results.
- If the current PR targets the release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2025-11-10T10:15:11Z

Thanks for your contribution!

…into dev_fp8

[BugFix] fix VL fp8 bug when moe token_num is 0

1d98454

ming1753 and others added 5 commits November 10, 2025 18:31

fix bug

42358f5

format

09097a9

Merge branch 'develop' into dev_fp8

aa8fdc3

fix bug

067576e

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

f594ae3

…into dev_fp8

YuanRisheng added the skip-ci: coverage label Nov 12, 2025

EmmonsCurse approved these changes Nov 12, 2025


EmmonsCurse merged commit 3148dbc into PaddlePaddle:develop Nov 12, 2025
19 of 23 checks passed

EmmonsCurse merged 6 commits into PaddlePaddle:develop from ming1753:dev_fp8


3 participants
