Subtask 2: Model Implementation #274

## Problem Statement

Implement the FastMTP model architecture: a single multi-token prediction (MTP) head whose weights are shared across prediction positions and which is applied recursively to predict K steps ahead.

We need to understand the reference implementation's exact architecture and reproduce it within the speculators framework.
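To make "position-shared weights" and "recursive K-step prediction" concrete, here is a minimal control-flow sketch. It is framework-free and is not taken from the FastMTP or speculators code: the `head` and `embed` signatures, the greedy token choice, and the toy stand-ins are all assumptions made for illustration. The only point it shows is that one head is called K times, each call consuming the previous call's hidden state and predicted token.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures, chosen for this sketch only:
#   embed(token_id)         -> embedding vector
#   head(hidden, embedding) -> (next_hidden, logits)
Vector = List[float]
EmbedFn = Callable[[int], Vector]
HeadFn = Callable[[Vector, Vector], Tuple[Vector, Vector]]


def recursive_draft(
    head: HeadFn, embed: EmbedFn, hidden: Vector, last_token: int, k: int
) -> List[int]:
    """Draft k tokens by calling the same head k times (position-shared weights)."""
    drafted: List[int] = []
    token = last_token
    for _ in range(k):
        # Each step consumes the previous step's hidden state and token.
        hidden, logits = head(hidden, embed(token))
        # Greedy choice for illustration; a real drafter may sample instead.
        token = max(range(len(logits)), key=logits.__getitem__)
        drafted.append(token)
    return drafted


# Toy stand-ins so the sketch runs without any ML framework.
VOCAB = 5


def toy_embed(token_id: int) -> Vector:
    return [float(token_id)]


def toy_head(hidden: Vector, embedding: Vector) -> Tuple[Vector, Vector]:
    new_hidden = [hidden[0] + embedding[0] + 1.0]
    target = int(new_hidden[0]) % VOCAB
    logits = [1.0 if i == target else 0.0 for i in range(VOCAB)]
    return new_hidden, logits


print(recursive_draft(toy_head, toy_embed, hidden=[0.0], last_token=2, k=3))
# prints [3, 2, 0]
```

Because the head is passed in as a single callable, there is no per-step parameter set to select. That is the property the real module would need to preserve, whatever its internal layers turn out to be.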

## References

- **Paper:** [FastMTP arXiv:2509.18362](https://arxiv.org/abs/2509.18362) - Section 3.1, Figure 1
- **Reference Implementation:** [Tencent-BAC/FastMTP](https://github.com/Tencent-BAC/FastMTP)
- **Existing Code:** `src/speculators/models/eagle3/`, as a pattern for structuring the new model

## What We Need

1. **FastMTPConfig** - Configuration class registered with the speculators framework
2. **FastMTPSpeculator** - Model class implementing the MTP head and recursive prediction
3. **Tests** - Verify model instantiation, forward pass, and verifier attachment
4. **Documentation** - Architecture explanation and usage
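As a starting point for discussing item 1, the sketch below shows the kind of fields a configuration for this model might carry. It is a plain standard-library dataclass, not the speculators config base class. Every field name and default here (`hidden_size`, `vocab_size`, `num_speculative_steps`, `model_type`) is a hypothetical placeholder, and registration with the framework is deliberately omitted.

```python
from dataclasses import asdict, dataclass


@dataclass
class FastMTPConfigSketch:
    """Illustrative only: field names and defaults are assumptions."""

    hidden_size: int = 4096          # width of the verifier's hidden states
    vocab_size: int = 32000          # size of the shared vocabulary
    num_speculative_steps: int = 3   # K, the number of recursive draft steps
    model_type: str = "fastmtp"      # hypothetical registry key

    def __post_init__(self) -> None:
        # Fail early on values that cannot describe a usable model.
        if self.num_speculative_steps < 1:
            raise ValueError("num_speculative_steps must be at least 1")
        if self.hidden_size < 1 or self.vocab_size < 1:
            raise ValueError("hidden_size and vocab_size must be positive")


config = FastMTPConfigSketch(num_speculative_steps=4)
print(asdict(config))
```

The real class would subclass the framework's config base and register itself the way the Eagle3 config does. The sketch is only meant to show which values must be fixed before the model can be built.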

## Success Criteria

- Model can be loaded via `SpeculatorModel.from_pretrained()`
- Forward pass produces correct output shapes for K-step prediction
- Verifier attachment works (`train_only` mode for training)
- Integration tests pass

## Notes

The architecture must match the reference implementation. Key questions: How is the MTP head structured? How are hidden states and embeddings combined? How do position-shared weights work?
