[ray] launch multiple GPU with ray#396

Merged
feifeibear merged 10 commits into xdit-project:main from lihuahua123:main on Dec 20, 2024

@lihuahua123
Contributor

Add support for launching the pipeline with Ray.

@feifeibear
Collaborator

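This PR launches multiple processes through Ray. Following vLLM, it uses RayGPUExecutor to manage multiple workers, each of which runs the diffusers pipeline logic.

At present, this approach differs too much in usage from launching the program with torchrun (example.py).

I suggest designing a Ray-distributed version of DiffusionPipeline, RayDiffusionPipeline, and having that class provide interfaces such as from_pretrained and forward.

The PR hardcodes a few things, for example the handling of text_encoder during model initialization. Since text_encoder is currently not sharded across GPUs, each worker can simply load its own copy of text_encoder. Please keep the interface as consistent with the torchrun one as possible.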
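
To make the suggestion concrete, here is a rough sketch of the interface shape I have in mind. It is not the PR's code and it does not use Ray at all: `LocalWorker` is a hypothetical in-process stand-in for what would be a Ray actor, and `load_fn` and `world_size` are illustrative parameters. Only the names RayDiffusionPipeline, from_pretrained and forward come from the comment above.

```python
from typing import Any, Callable, List


class LocalWorker:
    """Hypothetical stand-in for a Ray actor that holds one pipeline replica."""

    def __init__(self, rank: int, load_fn: Callable[[str], Any]):
        self.rank = rank
        self.load_fn = load_fn
        self.pipe = None

    def load(self, model_path: str) -> None:
        # Each worker loads the whole pipeline, so every worker ends up
        # with its own copy of text_encoder (no special-casing needed).
        self.pipe = self.load_fn(model_path)

    def forward(self, *args, **kwargs):
        return self.pipe(*args, **kwargs)


class RayDiffusionPipeline:
    """Sketch of a pipeline-like wrapper that fans calls out to workers."""

    def __init__(self, workers: List[LocalWorker]):
        self.workers = workers

    @classmethod
    def from_pretrained(cls, model_path: str, world_size: int,
                        load_fn: Callable[[str], Any]) -> "RayDiffusionPipeline":
        workers = [LocalWorker(rank, load_fn) for rank in range(world_size)]
        for worker in workers:
            worker.load(model_path)
        return cls(workers)

    def forward(self, *args, **kwargs):
        # Run the same call on every worker and return the last worker's result.
        outputs = [w.forward(*args, **kwargs) for w in self.workers]
        return outputs[-1]

    # Allow pipe(...) as well as pipe.forward(...), like a torchrun-style script.
    __call__ = forward
```

With something along these lines, user code would build the pipeline with from_pretrained and call it, much like example.py does, while the worker management stays hidden inside the class. A real version would replace `LocalWorker` with Ray actors and remote calls.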

feifeibear changed the title from "[WIP] Ray Support" to "[WIP] launch multiple GPU with ray" on Dec 20, 2024
# output is a list of results from each worker, we take the last one
for i, image in enumerate(output[-1].images):
    image.save(
        f"/data/results/{model_name}_result_{i}.png"
Collaborator

Save to a relative path such as ./results/xxx instead of the absolute /data/results path.
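
Something along these lines would do it. This is only a sketch: the helper name `save_images` and the `out_dir` parameter are made up for illustration, and it assumes each image object has a `save(path)` method, as PIL images do.

```python
from pathlib import Path


def save_images(images, model_name, out_dir="./results"):
    """Save each image under a relative output directory and return the paths."""
    out_path = Path(out_dir)
    # Create the directory if it is missing so the save does not fail.
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, image in enumerate(images):
        target = out_path / f"{model_name}_result_{i}.png"
        image.save(target)
        saved.append(target)
    return saved
```

In the example script this would be called as `save_images(output[-1].images, model_name)`.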

tp_config: TensorParallelConfig
distributed_executor_backend: Optional[str] = None
world_size: int = 1 # FIXME: remove this
worker_cls: str = "xfuser.ray.worker.worker.Worker"
Collaborator

do we need distributed_executor_backend and worker_cls?

Contributor Author

We don't need distributed_executor_backend, but we do need worker_cls so that Ray can initialize the worker from its class name:

def init_worker(self, *args, **kwargs):
    # Resolve the worker class from its fully qualified name, then instantiate it.
    worker_class = resolve_obj_by_qualname(self.worker_cls)
    self.worker = worker_class(*args, **kwargs)
    assert self.worker is not None
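
For readers unfamiliar with this pattern, a helper like resolve_obj_by_qualname can be sketched with the standard library's importlib. This is an assumption about how such a helper behaves, not the code used in this PR, and the actual helper may differ.

```python
import importlib
from typing import Any


def resolve_obj_by_qualname(qualname: str) -> Any:
    """Import and return the object named by a dotted path (sketch only)."""
    # Split "package.module.ClassName" into the module path and attribute name.
    module_name, obj_name = qualname.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, obj_name)
```

Passing the class as a string keeps the config a plain value, and each Ray worker process can import the class on its own side.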

Collaborator

feifeibear left a comment

LGTM

feifeibear changed the title from "[WIP] launch multiple GPU with ray" to "[ray] launch multiple GPU with ray" on Dec 20, 2024
feifeibear marked this pull request as ready for review on December 20, 2024 07:26
feifeibear merged commit f58302a into xdit-project:main on Dec 20, 2024
3 participants