
feat: use cuda_graph by default for vllm#116

Merged
parthchadha merged 2 commits into main from
pchadha/vllm-cuda-graph
Apr 1, 2025
Conversation

@parthchadha
Contributor

What does this PR do ?

Enables CUDA graphs by default for vLLM generation.
For Llama-8B I am seeing 10-15% better generation speed compared to eager mode.

Issues

Closes #115.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
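As a hedged illustration only (this PR's actual change lives in `nemo_reinforcer/models/generation/vllm.py` and is not shown here): at the vLLM engine level, CUDA graph usage is governed by the `enforce_eager` engine argument, where `enforce_eager=False` lets vLLM capture CUDA graphs for decoding. A minimal sketch, assuming the standard vLLM Python API and a hypothetical model name:

```python
# Sketch of enabling CUDA graphs in vLLM directly (illustrative, not this
# PR's code). enforce_eager=False permits CUDA graph capture; True forces
# eager-mode execution. Model name is an example placeholder.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B",  # hypothetical example model
    enforce_eager=False,              # allow CUDA graph capture (the new default)
)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
```

Requires a CUDA-capable GPU; this fragment is a configuration sketch rather than a runnable test.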

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Comment thread on nemo_reinforcer/models/generation/vllm.py
@parthchadha parthchadha enabled auto-merge (squash) April 1, 2025 22:56
@parthchadha parthchadha merged commit d9277a8 into main Apr 1, 2025
11 checks passed
@parthchadha parthchadha deleted the pchadha/vllm-cuda-graph branch April 1, 2025 22:58
yfw pushed a commit that referenced this pull request Apr 2, 2025
Signed-off-by: Parth Chadha <pchadha@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
KiddoZhu pushed a commit that referenced this pull request May 6, 2025
Signed-off-by: Parth Chadha <pchadha@nvidia.com>


Development

Successfully merging this pull request may close these issues.

vllm use cuda graph for generation

2 participants