Subtask 2: Model Implementation #274

@rahul-tuli

Description

Problem Statement

Implement the FastMTP model architecture: a single multi-token prediction (MTP) head with position-shared weights that performs recursive K-step prediction.

We need to understand the exact architecture of the reference FastMTP implementation and reproduce it within the speculators framework.
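
To make "position-shared weights" and "recursive K-step prediction" concrete, here is a minimal pure-Python sketch of the control flow. It is not the reference architecture. The class `ToySharedMTPHead`, the function `draft_k_tokens`, the concatenate-then-project fusion, and the greedy token choice are all assumptions made for illustration; the real head also contains a transformer layer that is omitted here, and how hidden states and embeddings are actually combined is one of the open questions in Notes.

```python
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) list-of-lists matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToySharedMTPHead:
    """Hypothetical toy head: ONE set of weights reused at every draft step."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.embed = rand_matrix(vocab_size, hidden_size)      # token embeddings
        self.fuse = rand_matrix(hidden_size, 2 * hidden_size)  # [hidden; embedding] -> hidden
        self.lm_head = rand_matrix(vocab_size, hidden_size)    # hidden -> logits

    def step(self, hidden, token_id):
        # ASSUMPTION: fuse by concatenating the hidden state with the token embedding.
        fused_input = hidden + self.embed[token_id]  # list concatenation
        new_hidden = matvec(self.fuse, fused_input)
        logits = matvec(self.lm_head, new_hidden)
        return new_hidden, logits


def draft_k_tokens(head, hidden, last_token, k):
    """Apply the same head k times, feeding each step's output into the next."""
    drafted, all_logits = [], []
    for _ in range(k):
        hidden, logits = head.step(hidden, last_token)
        # ASSUMPTION: greedy selection of the next draft token.
        last_token = max(range(len(logits)), key=logits.__getitem__)
        drafted.append(last_token)
        all_logits.append(logits)
    return drafted, all_logits
```

Calling `draft_k_tokens(head, hidden, last_token, k=3)` returns three draft token ids and three logit vectors, each of length `vocab_size`. No per-step parameters exist: the recursion reuses `head.step` unchanged at every position.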

References

What We Need

  1. FastMTPConfig - Configuration class registered with the speculators framework
  2. FastMTPSpeculator - Model class implementing the MTP head and recursive prediction
  3. Tests - Verify model instantiation, forward pass, verifier attachment
  4. Documentation - Architecture explanation and usage
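
As a rough illustration of item 1, the sketch below shows one way a config class could be registered under a name and validated. It uses a standalone, hypothetical registry (`CONFIG_REGISTRY`, `register_config`) and does not use the speculators registration API; the class name `FastMTPConfigSketch` and all field names and defaults are assumptions, not the framework's or the reference implementation's.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a framework-level registry of config classes.
CONFIG_REGISTRY = {}


def register_config(name):
    """Return a class decorator that stores the class under `name`."""
    def decorator(cls):
        CONFIG_REGISTRY[name] = cls
        return cls
    return decorator


@register_config("fastmtp")
@dataclass
class FastMTPConfigSketch:
    # All field names below are illustrative assumptions.
    hidden_size: int
    vocab_size: int
    num_speculative_steps: int = 3  # K in "recursive K-step prediction"
    model_type: str = "fastmtp"

    def __post_init__(self):
        # Reject configurations that would draft zero or negative steps.
        if self.num_speculative_steps < 1:
            raise ValueError("num_speculative_steps must be at least 1")
```

With this in place, `CONFIG_REGISTRY["fastmtp"](hidden_size=8, vocab_size=32)` builds a config by name, which is the kind of lookup a `from_pretrained()` path would need in order to resolve the right class.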

Success Criteria

  • Model can be loaded via SpeculatorModel.from_pretrained()
  • Forward pass produces correct output shapes for K-step prediction
  • Verifier attachment works (train_only mode for training)
  • Integration tests pass

Notes

The architecture must match the reference implementation. Key questions: How is the MTP head structured? How are hidden states and embeddings combined? How do position-shared weights work?
