
sync attention, deepseek doc #14335

Merged

Fridge003 merged 5 commits into sgl-project:main from bzhng-development:brayden/sync-doc on Dec 3, 2025

Conversation

b8zhong (Collaborator) commented on Dec 3, 2025

b8zhong requested review from Fridge003 and Copilot on December 3, 2025 at 04:31

github-actions bot added the documentation and deepseek labels on Dec 3, 2025

Copilot AI left a comment

Pull request overview

This PR synchronizes and updates documentation for attention backends and DeepSeek model support. The changes focus on improving clarity, adding new deployment guides, and updating technical specifications for various hardware architectures.

Key changes:

  • Updated attention backend documentation with refined FA4 specifications and removed outdated warnings
  • Enhanced DeepSeek V3/R1 documentation with expanded hardware configurations, new deployment guides, and improved formatting using structured callout blocks
  • Updated expert parallelism backend descriptions to use "Blackwell" instead of "SM100+" for clarity (a quick way to check for that hardware class is sketched below)
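
As an aside on the naming change: "Blackwell" here refers to the SM100-class data-center GPUs that the old "SM100+" wording pointed at. A minimal sketch of how one might check for that hardware class at runtime, using PyTorch's standard capability query (treating compute capability 10 as the cutoff is the assumption behind "SM100+"):

```python
import torch

# Compute capability as (major, minor); data-center Blackwell parts report major == 10,
# which is where the previous "SM100+" phrasing in the docs came from.
major, minor = torch.cuda.get_device_capability()
is_blackwell_or_newer = major >= 10
print(f"Detected SM{major}{minor}; Blackwell-or-newer: {is_blackwell_or_newer}")
```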

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docs/index.rst | Added new documentation entries for multi-modal encoder DP and classify models; reordered references section |
| docs/basic_usage/deepseek_v3.md | Expanded hardware configurations, added deployment guides/blog links, improved documentation structure with callout blocks, and clarified MTP usage |
| docs/advanced_features/expert_parallelism.md | Updated backend descriptions to use "Blackwell" architecture name instead of "SM100+" |
| docs/advanced_features/attention_backend.md | Updated FA4 page size specifications, removed outdated FP8 KV cache warning, and streamlined speculative decoding constraints |


| **FA3 (FlashAttention 3)** | n/a | ❌ | ✅ | ✅ | ⚠️ (page_size=1 only) |
| **Triton** | n/a | ❌ | ❌ | ✅ | ⚠️ (page_size=1 only) |
| **FA4** | 128 | ❌ | ❌ | ❌ | ❌ |
| **FA4** | 1 | ❌ | ❌ | ❌ | ❌ |

Copilot AI Dec 3, 2025

There's an inconsistency in FA4's page size specification between the MHA and MLA tables. The MHA table (line 20) shows FA4 with page size "128", but the MLA table (line 41) shows FA4 with page size "1". Please verify which is correct and ensure consistency across both tables.

Suggested change
| **FA4** | 1 | ❌ | ❌ | ❌ | ❌ |
| **FA4** | 128 | ❌ | ❌ | ❌ | ❌ |

b8zhong (Collaborator, Author)

(It's actually like this.)
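
To put the tables in context: the backend and KV-cache page size they describe are chosen at launch time. Below is a hedged sketch using sglang's offline `Engine` API; the keyword names (`attention_backend`, `page_size`) are assumed to mirror the corresponding server arguments covered in `attention_backend.md`, so check that page for the exact spelling and supported values.

```python
import sglang as sgl

# Illustrative launch: pick an attention backend and page size explicitly.
# Per the MHA table above, FA3/Triton support speculative decoding only with
# page_size=1, while FA4's page size differs between the MHA and MLA tables.
llm = sgl.Engine(
    model_path="meta-llama/Llama-3.1-8B-Instruct",  # any supported dense model
    attention_backend="fa3",                        # e.g. "fa3", "triton", "flashinfer"
    page_size=1,                                    # required for spec decoding with FA3/Triton
)
print(llm.generate("The capital of France is", {"max_new_tokens": 8}))
```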

| **Quantized weights ([W4A8](https://huggingface.co/novita/Deepseek-R1-0528-W4AFP8))** | 8 x H20/100, 4 x H200 |
| **Quantized weights ([AWQ](https://huggingface.co/QuixiAI/DeepSeek-R1-0528-AWQ))** | 8 x H100/800/20 |
| | 8 x A100/A800 |
| **Quantized weights ([MXFP4](https://huggingface.co/amd/DeepSeek-R1-MXFP4-Preview))** | 8, 4 x MI355X/350X |
b8zhong (Collaborator, Author)

I have personally tried the W4A8 and MXFP4 combinations, so they work fine.
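
For reference, a quantized checkpoint like the W4AFP8 one linked above is launched the same way as the BF16 weights, since the quantization config ships with the checkpoint. A rough sketch with the offline `Engine` API, assuming the 8-GPU row from the table; argument names follow sglang's server arguments and should be cross-checked against the deepseek_v3.md guide this PR updates.

```python
import sglang as sgl

# Hypothetical 8-way tensor-parallel launch of the W4AFP8 checkpoint from the table.
# No explicit quantization flag is passed here on the assumption that the checkpoint's
# own config is picked up automatically; verify against the DeepSeek V3 usage guide.
llm = sgl.Engine(
    model_path="novita/Deepseek-R1-0528-W4AFP8",
    tp_size=8,                 # matches the "8 x H20/100" hardware row above
    trust_remote_code=True,
)
```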

Fridge003 merged commit 65c8568 into sgl-project:main on Dec 3, 2025
45 checks passed
b8zhong deleted the brayden/sync-doc branch on December 3, 2025 at 05:34
yingluosanqian pushed a commit to yingluosanqian/sglang that referenced this pull request Dec 4, 2025
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
sunxxuns pushed a commit to sunxxuns/sglang that referenced this pull request Dec 5, 2025
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>
yuchengz816-bot pushed a commit to yuchengz816-bot/sglang that referenced this pull request Dec 8, 2025
Co-authored-by: Brayden Zhong <b8zhong@users.noreply.github.com>

Labels: deepseek, documentation