
Support Dp and Dp attn for MTP#297

Open
ZhangLirong-amd wants to merge 2 commits into main from zlr/mtp_dp

Conversation


@ZhangLirong-amd ZhangLirong-amd commented Mar 10, 2026

Motivation

  1. Support the MTP draft model running in DP dummy decode/prefill.
  2. Fix several seqlen and mtp_k issues under DP.

Launch command used for testing:

python3 -m atom.entrypoints.openai_server --model /data/DeepSeek-R1-0528/ -tp 8 --port 5678 --server-port 7777 --kv_cache_dtype fp8 --torch-profiler-dir ./log --method mtp --num-speculative-tokens 3 --block-size 10000 --gpu-memory-utilization 0.41 --enable-dp-attention --enable-expert-parallel

Technical Details

Test Plan

Test Result

Submission Checklist

Copilot AI review requested due to automatic review settings March 10, 2026 09:16
@ZhangLirong-amd ZhangLirong-amd changed the title Support Dp and Dp attn for DS MTP Support Dp and Dp attn for MTP Mar 10, 2026

Copilot AI left a comment


Pull request overview

This PR extends Data Parallel (DP) support and DP attention to the DeepSeek MTP (Multi-Token Prediction) speculative decoding path. It fixes crashes and incorrect behavior that occur when dummy runs, which are used for DP synchronization, are executed alongside a drafter model.

Changes:

  • Added is_dummy_run guards around slot mapping computation and spec decode metadata calculation to prevent crashes during DP synchronization
  • Updated dummy_execution and dummy_prefill_execution to capture and propagate hidden_states through the drafter model for CUDA graph capture
  • Added defensive guards in SpecStats._log() to prevent division-by-zero in DP edge cases, and initialized num_rejected/num_bonus in tokenIDProcessor.clean()
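The is_dummy_run guard pattern described above can be sketched as follows. This is a minimal, hypothetical illustration: the names DecodeMetadata, compute_slot_mapping, and the shape of prepare_decode are assumptions for clarity, not atom's actual APIs.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DecodeMetadata:
    # Simplified stand-in for the attention metadata built per batch.
    seq_lens: List[int]
    is_dummy_run: bool = False
    slot_mapping: Optional[List[int]] = None
    sum_tokens: int = 0

def compute_slot_mapping(seq_lens: List[int], block_size: int) -> List[int]:
    # Toy slot computation: map each sequence's last token to a
    # position within its current KV-cache block.
    return [(s - 1) % block_size for s in seq_lens]

def prepare_decode(meta: DecodeMetadata, block_size: int = 16) -> DecodeMetadata:
    # During DP synchronization, ranks with no real requests still execute
    # a dummy batch so collectives stay aligned. Skipping the slot-mapping
    # and token-sum computation on those ranks avoids indexing into
    # sequences that do not exist.
    if not meta.is_dummy_run:
        meta.slot_mapping = compute_slot_mapping(meta.seq_lens, block_size)
        meta.sum_tokens = sum(meta.seq_lens)
    return meta
```

The key point is that the dummy path leaves the metadata in a safe default state rather than computing values from an empty or fake batch.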

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

Files changed:

  • atom/model_ops/attentions/aiter_mla.py: Updated max_seqlen_qo to account for MTP tokens; wrapped slot mapping and sum-token computation with is_dummy_run guards in prepare_decode.
  • atom/model_engine/model_runner.py: Captures hidden_states from run_model in dummy execution paths; runs the drafter model in dummy runs for CUDA graph capture; skips calc_spec_decode_metadata during dummy runs; initializes num_rejected/num_bonus in clean().
  • atom/model_engine/scheduler.py: Added ts == 0 and iv_steps == 0 early returns in SpecStats._log() to prevent division-by-zero in DP scenarios.
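The early-return guards in SpecStats._log() can be illustrated with the sketch below. SpecStats here is a simplified stand-in with assumed field names (ts, iv_steps, num_accepted); only the guard pattern reflects the change described above.

```python
class SpecStats:
    """Toy speculative-decoding stats tracker (illustrative only)."""

    def __init__(self) -> None:
        self.ts = 0            # total speculated tokens in the interval
        self.iv_steps = 0      # verification steps in the interval
        self.num_accepted = 0  # accepted speculated tokens

    def _log(self):
        # In DP scenarios a rank may reach the logging path having done
        # only dummy work, so both counters can still be zero. Returning
        # early prevents the divisions below from raising
        # ZeroDivisionError.
        if self.ts == 0 or self.iv_steps == 0:
            return None
        accept_rate = self.num_accepted / self.ts
        tokens_per_step = self.num_accepted / self.iv_steps
        return accept_rate, tokens_per_step
```

The guard is deliberately an early return rather than a clamped denominator, so idle DP ranks simply skip the log entry instead of emitting misleading zero-rate metrics.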


