Conversation
|
@ByronHsu Although I've implemented the transfer, the model output is still garbage. Do you know if there's any more dummy code for PD that needs to be implemented? |
|
Cool! The change looks pretty neat! I suspect the KV tensors on the prefill and decode sides are misaligned. Can you print the KV tensor on both the prefill and decode sides to compare? |
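One way to make that comparison manageable is to fingerprint each layer's KV buffer on both sides and diff the fingerprints instead of reading raw tensor dumps. The following is only a minimal sketch, assuming each side can dump its per-layer KV buffer as raw bytes; `fingerprint`, `diff_kv`, and the toy buffers are hypothetical names for illustration and are not part of SGLang or NIXL.

```python
import hashlib
from typing import Dict, List


def fingerprint(buf: bytes) -> str:
    """Return a short SHA-256 digest of a raw KV buffer."""
    return hashlib.sha256(buf).hexdigest()[:16]


def diff_kv(prefill: Dict[int, bytes], decode: Dict[int, bytes]) -> List[int]:
    """Return layer ids whose KV bytes differ, or are missing, on either side."""
    mismatched = []
    for layer in sorted(set(prefill) | set(decode)):
        p = prefill.get(layer)
        d = decode.get(layer)
        if p is None or d is None or fingerprint(p) != fingerprint(d):
            mismatched.append(layer)
    return mismatched


# Toy data standing in for dumped KV buffers (hypothetical, for illustration).
prefill_kv = {0: b"\x01\x02\x03\x04", 1: b"\x05\x06\x07\x08"}
decode_kv = {0: b"\x01\x02\x03\x04", 1: b"\x05\x06\x08\x07"}  # last two bytes swapped

print(diff_kv(prefill_kv, decode_kv))  # layers that need a closer look
```

A byte-level digest only says whether a layer matches. If a layer is flagged, printing a small slice of that layer on each side should show whether the data is shifted, transposed, or simply missing.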
|
@ByronHsu Good news, I fixed the model output issue. |
|
Neat work! Could we have a README on how to deploy a demo? I tried to run it with Guess: must we build UCX with --with-cuda or --with-gdrcopy? I didn't enable either of them. |
|
@trevor-m Great work, could you please share the installation guide and the commands to test the demo? Thanks a lot! |
|
Hi @hnyls2002, the installation instructions for NIXL can be found here: https://github.com/ai-dynamo/nixl
I'm using a container with UCX already installed at
git clone https://github.com/ai-dynamo/nixl.git
cd nixl
pip install meson
meson setup build
cd build
meson configure -Ducx_path=/opt/hpcx/ucx
ninja
ninja install
cd ..
pip install .
To run the demo: |
Hi, I still cannot reproduce it; I get the same error as in my previous comments. Do you need GDRCopy enabled for NIXL? |
|
Hi @jokerwyt, that error is coming from UCX. It could be a build config issue, or a mismatch between UCX versions. One suggestion I have is to use one of NVIDIA's containers which has UCX already installed, such as |
|
Hi, NIXL is easy to use. I made some effort to extend this PR to fully support xPyD and tensor parallelism. I now have a demo for 2P2D, with TP=2 for each instance, on a single node with 8 L20 GPUs. |
|
closing in favor of #5477 |
Currently supports 1P+1D
Motivation
#4655
Modifications
Checklist