Bug Report: ValueError with Paper Hyperparameters (training_batch_size vs ppo_mini_batch) #13

@ji-zhe

Bug Context

Running the code with exactly the hyperparameters specified in the paper (rollout batch_size=32, ppo_mini_batch_size=128) raises a ValueError:
training_batch_size must be greater than ppo_mini_batch

The error originates from verl/workers/config/actor.py at line 198, caused by an incorrect validation check on line 197.

Root Cause

The current validation logic compares the raw batch size (32) against ppo_mini_batch_size (128) without accounting for rollout_n, so it raises a false-positive error when the paper's official hyperparameters are used.
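To make the arithmetic concrete, here is a minimal sketch using the two values from this report. The issue does not state rollout_n, so the value 8 below is only a placeholder, and the form of the existing comparison is an assumption inferred from the error message rather than a quote from actor.py.

```python
# Values taken from the report.
train_batch_size = 32
ppo_mini_batch_size = 128

# Hypothetical: the issue does not say which rollout_n the paper uses.
rollout_n = 8

# Assumed shape of the current check (not quoted in the issue):
# the raw batch size is compared directly with the mini-batch size.
fails_without_rollout_n = train_batch_size < ppo_mini_batch_size

# Effective batch size once rollout_n is taken into account.
effective_batch_size = train_batch_size * rollout_n
fails_with_rollout_n = effective_batch_size < ppo_mini_batch_size

print(fails_without_rollout_n)  # True: the check rejects the config
print(effective_batch_size)     # 256
print(fails_with_rollout_n)     # False: the config is accepted
```

With any rollout_n of 4 or more, the effective batch size reaches 128 and the configuration would be accepted.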

Fix Suggestion

Modify line 197 in verl/workers/config/actor.py from the original check to:

if train_batch_size * self.rollout_n < self.ppo_mini_batch_size:

This corrects the validation to use the total effective batch size (accounting for rollout_n) when comparing against ppo_mini_batch_size.
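The suggested check can be sketched in isolation as follows. ActorConfigSketch and its validate method are hypothetical stand-ins written for this illustration, not the actual class or method in verl/workers/config/actor.py, and the error message wording is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class ActorConfigSketch:
    """Hypothetical stand-in for the actor config; not verl's real class."""

    ppo_mini_batch_size: int
    rollout_n: int

    def validate(self, train_batch_size: int) -> None:
        # Compare the effective batch size, not the raw one,
        # against the PPO mini-batch size.
        effective = train_batch_size * self.rollout_n
        if effective < self.ppo_mini_batch_size:
            raise ValueError(
                f"train_batch_size * rollout_n ({effective}) must be >= "
                f"ppo_mini_batch_size ({self.ppo_mini_batch_size})"
            )


# Accepted: 32 * 8 = 256 >= 128 (rollout_n=8 is a placeholder value).
ActorConfigSketch(ppo_mini_batch_size=128, rollout_n=8).validate(32)

# Still rejected when the effective batch size is too small: 32 * 1 = 32 < 128.
try:
    ActorConfigSketch(ppo_mini_batch_size=128, rollout_n=1).validate(32)
except ValueError as exc:
    print(f"rejected: {exc}")
```

The check still guards against genuinely undersized batches; it only stops rejecting configurations whose effective batch size is large enough.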
