[veomni] feat: enable VeOmni engine for on-policy distillation#6072

Merged
wuxibin89 merged 2 commits into verl-project:main from hjshi84:feat/veomni-opd-support
Apr 21, 2026
Conversation

hjshi84 (Contributor) commented Apr 20, 2026

What does this PR do?

Add VeOmni strategy support to the distillation and separation worker components so that VeOmniEngine can be used as a training backend in on-policy distillation (OPD) scenarios.

Design & Code Changes

  1. verl/trainer/distillation/losses.py: Match 'veomni' alongside 'fsdp' in compute_topk_loss() so VeOmni can reuse FSDP's topk forward KL loss implementation. VeOmni inherits from FSDP2 and shares the same loss computation logic.

  2. verl/experimental/separation/engine_workers.py: Add 'veomni' to DetachActorWorker's save/restore strategy handlers (fsdp2 sharded save/load), enabling model offload in separation mode. Also pass **kwargs through __init__ to parent class for distillation_config compatibility.

  3. examples/on_policy_distillation_trainer/run_qwen_gsm8k_veomni.sh: Add example script for testing OPD with VeOmni backend, following the model_engine=veomni configuration pattern with actor.veomni.* settings.
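The strategy dispatch described in item 1 can be sketched as follows. This is an illustrative stand-in, not verl's actual implementation: the function name compute_topk_loss and the strategy strings follow the PR description, but the loss body and signature are hypothetical.

```python
import math

def _fsdp_topk_forward_kl(student_logps, teacher_logps):
    # Stand-in for the FSDP top-k forward-KL loss: sum over the teacher's
    # top-k tokens of p_teacher * (log p_teacher - log p_student).
    return sum(
        math.exp(t) * (t - s) for s, t in zip(student_logps, teacher_logps)
    )

def compute_topk_loss(strategy, student_topk_logps, teacher_topk_logps):
    """Dispatch the top-k forward-KL loss by training strategy."""
    # 'veomni' is matched alongside 'fsdp' so VeOmni reuses the FSDP loss
    # path, since VeOmniEngine inherits the FSDP2 engine's loss logic.
    if strategy in ("fsdp", "veomni"):
        return _fsdp_topk_forward_kl(student_topk_logps, teacher_topk_logps)
    raise NotImplementedError(f"No top-k loss for strategy: {strategy}")
```

Identical student and teacher log-probs yield zero loss under either matched strategy, and any other strategy string falls through to the error branch.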

API and Usage Example

export DATA_PATH=/path/to/data
bash examples/on_policy_distillation_trainer/run_qwen_gsm8k_veomni.sh

Test

  • Verified with GSM8K dataset using Qwen2.5-0.5B (student) and Qwen2.5-3B-Instruct (teacher) on 2+4 GPUs.

Checklist Before Starting

  • Search for similar PRs: no existing PR adds veomni OPD support.
  • PR title follows [veomni, trainer] feat: ... format.

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
  • Add unit or end-to-end test(s) to cover the code. If not feasible, explain why: this is an integration-level feature requiring multi-GPU and the veomni external dependency; e2e testing covered by manual script validation.
  • Add / Update the documentation.

gemini-code-assist bot (Contributor) left a comment:


Code Review

This pull request introduces support for the veomni engine in on-policy distillation, including a new example script for GSM8K and updates to worker initialization and loss computation. Feedback suggests explicitly defining the distillation_config parameter in the DetachActorWorker constructor to prevent potential TypeError issues when arguments are passed positionally.

Comment thread: verl/experimental/separation/engine_workers.py (Outdated)
Add VeOmni strategy support to the distillation and separation worker
components so that VeOmniEngine can be used as a training backend in
on-policy distillation (OPD) scenarios.

Changes:
- Match 'veomni' alongside 'fsdp' in compute_topk_loss() so VeOmni
  can reuse FSDP's topk forward KL loss implementation.
- Add 'veomni' to DetachActorWorker's save/restore strategy handlers
  (fsdp2 sharded save/load), enabling model offload in separation mode.
- Pass **kwargs through DetachActorWorker.__init__ to parent class
  for distillation_config compatibility.
- Add example script run_qwen_gsm8k_veomni.sh for testing OPD with
  VeOmni backend.
@hjshi84 hjshi84 force-pushed the feat/veomni-opd-support branch from f814978 to 5c081d8 Compare April 20, 2026 11:13
@hjshi84 hjshi84 marked this pull request as ready for review April 21, 2026 01:44
"""

def __init__(self, config: DictConfig, role: str):
def __init__(self, config: DictConfig, role: str, distillation_config: Optional[DistillationConfig] = None, **kwargs):
Collaborator:
We don't support distillation in fully async training yet.

Contributor (author):
This change is not about enabling distillation in fully async training. It's a signature-alignment fix: the parent class ActorRolloutRefWorker.__init__ already accepts distillation_config and **kwargs, and ray_trainer.py already passes distillation_config when constructing the worker. The previous DetachActorWorker.__init__(self, config, role) silently dropped these arguments because its signature was narrower than the base class's. This PR simply makes the subclass forward them to super().__init__.
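The signature alignment can be sketched as below. Class names follow the PR discussion, but the bodies and the config types are illustrative stand-ins for verl's actual classes.

```python
from typing import Any, Optional

class ActorRolloutRefWorker:
    # Parent class: already accepts distillation_config and **kwargs.
    def __init__(self, config, role: str,
                 distillation_config: Optional[dict] = None, **kwargs: Any):
        self.config = config
        self.role = role
        self.distillation_config = distillation_config

class DetachActorWorkerBefore(ActorRolloutRefWorker):
    # Narrower signature than the base class: a distillation_config
    # keyword from the trainer raises TypeError instead of reaching
    # the parent constructor.
    def __init__(self, config, role: str):
        super().__init__(config, role)

class DetachActorWorkerAfter(ActorRolloutRefWorker):
    # The fix: accept and forward the extra arguments to the parent.
    def __init__(self, config, role: str,
                 distillation_config: Optional[dict] = None, **kwargs: Any):
        super().__init__(config, role,
                         distillation_config=distillation_config, **kwargs)
```

With the aligned signature, a distillation_config passed by the trainer reaches the parent class intact; with the old one, the same call fails.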

Collaborator:
Got it. I think passing **kwargs should be enough?

Contributor (author):
Yes, but Gemini suggests explicitly defining the distillation_config parameter in the DetachActorWorker constructor to prevent potential TypeError issues when arguments are passed positionally.

wuxibin89 (Collaborator) previously approved these changes Apr 21, 2026
- Explain why VeOmni shares FSDP2 save/load handlers (inherits FSDPEngine,
  parameters are DTensors compatible with fsdp2_sharded_save/load_from_cpu)
- Document known caveat: param_offload=True may cause device mismatch in
  save_model_to_cpu / restore_model_from_cpu (historical issue)
- Add inline comment in compute_topk_loss clarifying VeOmni uses FSDP loss path
- Update class docstring to list VeOmni as a supported strategy
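The save/restore reuse described above can be sketched as a strategy-keyed handler table. Handler names and the table layout are hypothetical, not verl's API; the point is that 'veomni' maps to the same fsdp2 handlers because VeOmniEngine inherits FSDPEngine and its parameters are DTensors compatible with the fsdp2 sharded save/load path.

```python
def fsdp2_sharded_save(model_state):
    # Stand-in for the fsdp2 sharded save-to-CPU path. VeOmni parameters
    # are DTensors, so this path works for 'veomni' unchanged.
    return dict(model_state)

def fsdp2_load_from_cpu(saved_state):
    # Stand-in for the matching fsdp2 restore path.
    return dict(saved_state)

# 'veomni' reuses the fsdp2 handlers rather than defining its own.
SAVE_HANDLERS = {
    "fsdp2": fsdp2_sharded_save,
    "veomni": fsdp2_sharded_save,
}
RESTORE_HANDLERS = {
    "fsdp2": fsdp2_load_from_cpu,
    "veomni": fsdp2_load_from_cpu,
}
```

A save followed by a restore round-trips the state for either strategy key, and both keys point at the identical handler objects.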
@wuxibin89 wuxibin89 merged commit 0114e2a into verl-project:main Apr 21, 2026
62 of 74 checks passed