
[rollout] perf: replace AsyncOpenAI with aiohttp client in ChatCompletionScheduler #1588

Merged
vermouth1992 merged 1 commit into main from wuxibin/async_vllm_perf on May 20, 2025
Conversation

@wuxibin89 (Collaborator)

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

AsyncOpenAI has a severe performance issue due to httpx, so replace it with an aiohttp client. For train_batch_size=1024, AsyncOpenAI adds ~25s per generation phase.
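The replacement can be sketched roughly as below. This is a hedged illustration, not the PR's actual code: `build_chat_request`, the address format, and the `token-abc123` api key are assumptions based on the snippet quoted later in the thread.

```python
# Sketch: calling an OpenAI-compatible /v1/chat/completions endpoint
# with aiohttp instead of the AsyncOpenAI client.

def build_chat_request(address: str, api_key: str, **chat_complete_request):
    """Build the URL, headers, and JSON payload for the request."""
    url = f"http://{address}/v1/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers, chat_complete_request


async def chat_completions_aiohttp(address: str, **chat_complete_request):
    import aiohttp  # imported lazily so the sketch has no hard dependency

    url, headers, payload = build_chat_request(address, "token-abc123", **chat_complete_request)
    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as resp:
            return await resp.json()
```

Compared to AsyncOpenAI, this skips httpx's per-request overhead, which is where the reported ~25s per generation phase was going.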

High-Level Design

Demonstrate the high-level design if this PR is complex.

Specific Changes

List the specific changes.

API

Demonstrate how the API changes if any.

Usage Example

Provide usage example(s) for easier usage.

# Add code snippet or script demonstrating how to use this 

Test

For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

Additional Info.

  • Issue Number: Fixes issue # or discussion # if any.
  • Training: [Note which backend this PR will affect: FSDP, Megatron, both, or none]
  • Inference: [Note which backend this PR will affect: vLLM, SGLang, both, or none]

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

@hongpeng-guo (Collaborator) left a comment


LGTM, just with one small nit question.

client = AsyncOpenAI(base_url=f"http://{address}/v1", api_key="token-abc123", timeout=None, max_retries=0)
@hongpeng-guo (Collaborator) May 20, 2025


It seems this line is the same as before, only the formatting changed. Just want to double-check that the current form passes lint with the pre-commit hook :)

@wuxibin89 (Collaborator, Author)


Yes, it was auto-formatted by the pre-commit hook.

@vermouth1992 vermouth1992 merged commit 3eaaf24 into main May 20, 2025
40 of 43 checks passed
@vermouth1992 vermouth1992 deleted the wuxibin/async_vllm_perf branch May 20, 2025 03:31
@casper-hansen (Contributor)

@wuxibin89 I found that this PR reintroduced the problem fixed in #1483, because we switched from httpx to aiohttp, which has a default timeout of 5 minutes. Would you mind having a look at this error and fixing it? CC @U-rara.

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/verl/workers/rollout/async_server.py", line 188, in submit_chat_completions
    completions = await self._chat_completions_aiohttp(address, **chat_complete_request)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/verl/workers/rollout/async_server.py", line 203, in _chat_completions_aiohttp
    async with session.post(
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/aiohttp/client.py", line 1425, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/aiohttp/client.py", line 730, in _request
    await resp.start(conn)
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1054, in start
    with self._timer:
  File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/aiohttp/helpers.py", line 685, in __exit__
    raise asyncio.TimeoutError from exc_val
TimeoutError
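The traceback comes from aiohttp's session-level default timeout of 5 minutes total per request, which long generation phases can exceed. A minimal, hedged sketch of how the deadline could be disabled; `post_without_deadline` is an illustrative helper, not necessarily the fix that was ultimately merged.

```python
import aiohttp

# aiohttp's ClientSession applies a default timeout of 5 minutes total
# per request, so any generation that runs longer raises
# asyncio.TimeoutError. ClientTimeout(total=None) disables the overall
# deadline, matching the old AsyncOpenAI setting of timeout=None.
no_deadline = aiohttp.ClientTimeout(total=None)


async def post_without_deadline(url: str, payload: dict) -> dict:
    # The timeout is set once on the session and applies to every
    # request made with it.
    async with aiohttp.ClientSession(timeout=no_deadline) as session:
        async with session.post(url, json=payload) as resp:
            return await resp.json()
```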

@casper-hansen casper-hansen mentioned this pull request May 26, 2025
@casper-hansen (Contributor)

I ended up creating PR #1702 @U-rara @wuxibin89. Please take a look.

chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
…onScheduler (verl-project#1588)

TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025
…onScheduler (verl-project#1588)

vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026
…onScheduler (verl-project#1588)


Labels: none yet. Projects: none yet.

4 participants