[ckpt, model] fix: preserve lora_alpha in model_merger via training meta #5326

Open
Yatogaii wants to merge 3 commits into verl-project:main from Yatogaii:fix/lora_alpha_set_0

Conversation

@Yatogaii

What does this PR do?

This PR fixes the issue where model_merger generates a merged LoRA adapter with lora_alpha=0, even though a non-zero value was specified during training.

The fix persists the LoRA training-time config and updates the base merger logic to read it as the source of truth during merging.

Key points:

  1. Save the LoRA config during training into lora_train_meta.json (sketched below).
  2. Update the base merger logic to load lora_train_meta.json when merging, ensuring lora_alpha is correct.
  3. This approach is self-contained and deterministic (no inference / heuristic fallback).
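
For illustration, a minimal sketch of the training-side persistence (the helper name, signature, and field names below are assumptions for this sketch, not the exact implementation in this PR):

```python
import json
import os


def save_lora_train_meta(checkpoint_dir: str, lora_rank: int, lora_alpha: float,
                         target_modules=None) -> None:
    """Persist the training-time LoRA config next to the checkpoint.

    Hypothetical helper: the file name matches this PR, but the field
    names and signature are illustrative only.
    """
    meta = {
        "lora_rank": lora_rank,
        "lora_alpha": lora_alpha,
        "target_modules": list(target_modules or []),
    }
    with open(os.path.join(checkpoint_dir, "lora_train_meta.json"), "w") as f:
        json.dump(meta, f, indent=2)
```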

Compared to #3100:

  • #3100 ([SFT] fix: lora_alpha set default to twice lora_rank with warning) adds a warning and uses a heuristic fallback (lora_alpha = rank * 2).
  • That approach infers the rank from the checkpoint .pt weights instead of reading the original training configuration.
  • This PR directly reads the training-time config saved by verl, avoiding ambiguity and incorrect defaults.

Test

This change is not covered by CI.

I validated the fix with a local end-to-end workflow:

  1. Train SFT + LoRA with:

    • model.lora_rank=8
    • model.lora_alpha=16
  2. Merge with:

    python3 -m verl.model_merger merge ...
  3. Verify the merged adapter config (a quick check is sketched after this list):

    • Before: lora_alpha = 0
    • After: lora_alpha = 16
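
As a quick sanity check, something like the following confirms the merged adapter's values (a sketch; the output path is illustrative and it assumes the merger writes a standard PEFT adapter_config.json):

```python
import json

# Illustrative path to the merged adapter produced by verl.model_merger.
with open("merged_adapter/adapter_config.json") as f:
    cfg = json.load(f)

# Before this fix: lora_alpha == 0; after: the training-time value (16 here).
print("r =", cfg.get("r"), "lora_alpha =", cfg.get("lora_alpha"))
assert cfg.get("lora_alpha") == 16
```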

API and Usage Example

No API changes are introduced.

This PR only adds an additional training artifact:

  • lora_train_meta.json

The merge process remains unchanged for users.


Design & Code Changes

Design

  • Training stage persists LoRA-related configuration into lora_train_meta.json.
  • Merge stage loads lora_train_meta.json and uses it as the source of truth for LoRA parameters.

This PR intentionally does not rely on adapter_config.json, since that file may be ambiguous (it may represent merged/runtime state rather than training-time configuration).
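
A rough sketch of that precedence on the merge side (the function name, fallback behavior, and field names are assumptions for illustration, not the exact code in this PR):

```python
import json
import os
import warnings


def load_lora_params(checkpoint_dir: str) -> dict:
    """Prefer the persisted training meta; fall back only if it is absent."""
    meta_path = os.path.join(checkpoint_dir, "lora_train_meta.json")
    if os.path.exists(meta_path):
        with open(meta_path) as f:
            meta = json.load(f)
        # Training-time values are the source of truth.
        return {"r": meta["lora_rank"], "lora_alpha": meta["lora_alpha"]}

    # Older checkpoints without the meta file: fall back to the adapter
    # config and warn that lora_alpha may be unreliable.
    warnings.warn("lora_train_meta.json not found; lora_alpha may be incorrect")
    with open(os.path.join(checkpoint_dir, "adapter_config.json")) as f:
        cfg = json.load(f)
    return {"r": cfg.get("r"), "lora_alpha": cfg.get("lora_alpha")}
```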

Code Changes

  • Save LoRA training metadata during SFT training.

  • Update model_merger base merge logic:

    • Prefer reading lora_train_meta.json
    • Avoid inferring LoRA parameters from .pt weights

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@CLAassistant

CLAassistant commented Feb 15, 2026

CLA assistant check
All committers have signed the CLA.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a solid fix for preserving lora_alpha during model merging by persisting LoRA training metadata. The approach of saving lora_train_meta.json during training and then using it as the source of truth in the merger is clean and effective. The changes in base_model_merger.py to read this metadata and handle potential mismatches with warnings are well-implemented. I've identified one critical issue in the checkpoint saving logic that could lead to a crash if certain configuration values are None. Please see my detailed comment.


Successfully merging this pull request may close these issues.

[BUG] model_merger sets lora_alpha=0 when merging SFT LoRA checkpoints
