[Feature][NVIDIA] EPLB enablement for DSR1 disagg on GB200 #14661

### Checklist

- [ ] If this is not a feature request but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
- [ ] Please use English. Otherwise, it will be closed.

### Motivation

This issue tracks NVIDIA's ongoing work to enable EPLB for DSR1 disaggregated serving on GB200.

1. The current EPLB implementation is broken because the first dimension of the global scaling factor does not match `num_local_expert`. As a temporary workaround, see https://github.com/sgl-project/sglang/pull/13715. @shifangx @wenscarl
2. NVIDIA has proposed a Linear Programming (LP)-based expert-parallelism algorithm. The integration will proceed in two stages:

   a) Integrate the LP kernel into FlashInfer. @feliang-git

   b) Integrate the FlashInfer operator into sglang.
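
To make the shape mismatch in item 1 concrete, here is a minimal, hypothetical sketch in plain Python. It assumes the scaling factor is stored per logical expert while each EP rank holds a fixed number of physical expert slots (including redundant replicas). The names `local_scales`, `global_scales`, and `physical_to_logical` are illustrative only and are not taken from the sglang code base or from the linked PR.

```python
def local_scales(global_scales, physical_to_logical, ep_rank, num_local_experts):
    """Gather one scale per local physical expert slot (hypothetical helper).

    global_scales:        one scale per logical expert (assumed layout)
    physical_to_logical:  logical expert id for every physical slot, all ranks
    ep_rank:              index of this expert-parallel rank
    num_local_experts:    physical slots owned by each rank
    """
    start = ep_rank * num_local_experts
    local_ids = physical_to_logical[start:start + num_local_experts]
    if len(local_ids) != num_local_experts:
        raise ValueError(
            f"rank {ep_rank} owns {len(local_ids)} slots, "
            f"expected {num_local_experts}"
        )
    # The result's first dimension now equals num_local_experts,
    # regardless of how many logical experts exist globally.
    return [global_scales[i] for i in local_ids]


# 4 logical experts, 6 physical slots (2 redundant replicas), 2 EP ranks.
global_scales = [0.5, 1.0, 2.0, 4.0]
physical_to_logical = [0, 1, 0, 2, 3, 2]

scales_rank0 = local_scales(global_scales, physical_to_logical, 0, 3)
scales_rank1 = local_scales(global_scales, physical_to_logical, 1, 3)
```

In this toy setup the global list has length 4, but each rank needs exactly 3 entries, which is the kind of first-dimension disagreement the item describes. Whether the actual fix gathers, slices, or re-quantizes is not stated in this issue.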

### Related resources

_No response_
