@stmatengss
Collaborator

Motivation

Currently, PD disaggregation does not support assigning specific IB devices to individual GPU ids (TP/DP ranks).
This PR adds that capability by introducing a new JSON format that maps each GPU id to its IB devices.

Usage

# Both formats are supported.
--disaggregation-ib-device ib0,ib1,ib2
--disaggregation-ib-device {0: "ib0, ib1", 1: "ib2, ib3", 2: "ib4"}
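To illustrate how the two formats above might be told apart, here is a minimal sketch, not the code merged in this PR. The function name `resolve_ib_devices` is hypothetical, and the sketch assumes the mapping is passed as strict JSON, in which object keys must be quoted strings (for example `{"0": "ib0, ib1"}`). It also assumes that a non-JSON value is a plain comma-separated list used for every GPU.

```python
import json
from typing import Optional


def resolve_ib_devices(ib_device_str: Optional[str], gpu_id: int) -> Optional[str]:
    """Hypothetical helper: return the IB device list to use for one GPU id."""
    if ib_device_str is None:
        return None

    raw = ib_device_str.strip()
    if not raw:
        return None

    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON: assume a plain comma-separated list shared by all GPUs.
        return raw

    if not isinstance(parsed, dict):
        # Valid JSON but not a mapping: fall back to the raw string.
        return raw

    # Strict JSON object keys are always strings, so look up str(gpu_id).
    devices = parsed.get(str(gpu_id))
    if not isinstance(devices, str):
        raise ValueError(f"no IB devices configured for GPU id {gpu_id}")

    # Normalize "ib0, ib1" to "ib0,ib1".
    return ",".join(d.strip() for d in devices.split(",") if d.strip())
```

With this sketch, the plain list is returned unchanged for any GPU id, while the mapping form returns only the entry for the requested id and raises an error when that id is missing.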

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@stmatengss stmatengss force-pushed the add_gpu_id_device_topo_support branch from 66a78d4 to 81cdaf2 Compare November 10, 2025 15:37
@stmatengss
Collaborator Author

@ShangmingCai PTAL. thx

Collaborator

@ShangmingCai ShangmingCai left a comment

LGTM. But we definitely should fix auto-discovery.

nit: the ib_device_str is None case is already checked before the try block, so it cannot be None here; we can just return ib_device_str.
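The nit can be pictured with a small sketch of the control flow. The function name `pick_ib_device` and the body are assumptions made for illustration, not the code under review. The point is only that once None is handled before the try block, the fallback path can return the string directly.

```python
import json
from typing import Optional


def pick_ib_device(ib_device_str: Optional[str], gpu_id: int) -> Optional[str]:
    """Hypothetical sketch of the control flow discussed in the review."""
    # None is handled once, before the try block.
    if ib_device_str is None:
        return None

    try:
        mapping = json.loads(ib_device_str)
        if isinstance(mapping, dict):
            # Assumes strict JSON, where keys are strings.
            return mapping.get(str(gpu_id))
    except json.JSONDecodeError:
        pass

    # ib_device_str cannot be None here, so no second None check is needed.
    return ib_device_str
```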

@ShangmingCai ShangmingCai merged commit 44e391b into sgl-project:main Nov 12, 2025
149 of 157 checks passed