Deepseek-R1 MTP poor performance #4360

@jokerwyt

I expect the time spent on the "verify" step to be close to that of a normal decode forward pass (under 100 ms; my setting is bs=16 and ctx=12k), but it currently takes about 400 ms. This severely reduces my output throughput. Could this be a kernel performance issue?
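As a rough sanity check of why the verify time matters, here is a minimal back-of-the-envelope sketch. The 100 ms and 400 ms step times and bs=16 come from the description above (100 ms is an upper bound there); the helper names, the accept length of 2.0, and the choice to ignore draft-model time are assumptions made only for illustration, not measurements.

```python
def tokens_per_second(batch_size, tokens_per_step, step_ms):
    """Output throughput when each step takes step_ms milliseconds and
    emits tokens_per_step tokens for every sequence in the batch."""
    return batch_size * tokens_per_step * 1000.0 / step_ms


def break_even_accept_length(verify_ms, decode_ms, draft_ms=0.0):
    """Average tokens per speculative step needed to match plain decoding.
    draft_ms defaults to 0, which is optimistic (assumption)."""
    return (verify_ms + draft_ms) / decode_ms


BATCH_SIZE = 16      # bs from the issue
DECODE_MS = 100.0    # normal decode forward, upper bound from the issue
VERIFY_MS = 400.0    # observed verify time from the issue
ACCEPT_LEN = 2.0     # hypothetical average accepted tokens per step

baseline = tokens_per_second(BATCH_SIZE, 1.0, DECODE_MS)
with_mtp = tokens_per_second(BATCH_SIZE, ACCEPT_LEN, VERIFY_MS)
needed = break_even_accept_length(VERIFY_MS, DECODE_MS)

print(f"baseline: {baseline:.0f} tok/s")
print(f"with MTP: {with_mtp:.0f} tok/s")
print(f"accept length needed to break even: {needed:.1f}")
```

Under these assumptions, a 400 ms verify step would need more than four tokens per step on average just to match plain decoding, which is why a verify cost close to a normal decode forward is what makes MTP worthwhile.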

Profile when decoding with MTP:
Image

Profile for normal decode:
Image

The commit I tested:
commit 4a05bdf (gh/main)
Author: Lianmin Zheng <lianminzheng@gmail.com>
Date: Sun Mar 9 18:53:33 2025 -0700

Revert "Check eagle server args" (#4242)

Originally posted by @jokerwyt in #3582 (comment)
