Add DeepSeek-V3 CP support #821

@yfw

Description

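Requires upgrading to the latest Megatron Core (mcore), at a commit after NVIDIA/Megatron-LM@d07e4e5, and to Transformer Engine (TE) 2.6, both of which carry fixes for MLA (multi-head latent attention) with CP (context parallelism).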
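
A minimal sketch of how the TE version requirement above could be checked before enabling CP. Everything here is an assumption for illustration: the helper names (`parse_version`, `meets_minimum`, `te_supports_mla_cp`) are hypothetical, and the distribution name `transformer_engine` is assumed to match how TE is installed in your environment. The mcore requirement is pinned to a commit rather than a release, so it is not covered by this check.

```python
import re
from importlib import metadata


def parse_version(text):
    """Return the leading numeric release part, e.g. '2.6.0+cu12' -> (2, 6, 0).

    Suffixes such as '.dev0' or '+cu12' are ignored, so a pre-release of 2.6
    is treated the same as 2.6 itself (a simplification for this sketch).
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, required):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def te_supports_mla_cp(dist_name="transformer_engine", minimum="2.6"):
    """Return True if the installed distribution is at least `minimum`.

    `dist_name` is an assumed distribution name; returns False when no such
    distribution is installed.
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)
```

A caller could use `te_supports_mla_cp()` to fail early with a clear message instead of hitting an error deep inside attention code. Comparing integer tuples rather than raw strings matters here, since a string comparison would rank a hypothetical "2.10" below "2.6".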
