[ckpt, model] fix: preserve lora_alpha in model_merger via training meta #5326

Open
Yatogaii wants to merge 3 commits into verl-project:main from Yatogaii:fix/lora_alpha_set_0

Conversation

@Yatogaii

What does this PR do?

This PR fixes the issue where model_merger generates a merged LoRA adapter with lora_alpha=0, even though a non-zero value was specified during training.

The fix persists the LoRA training-time config and updates the base merger logic to read it as the source of truth during merging.

Key points:

  1. Save the LoRA config during training into lora_train_meta.json (sketched below).
  2. Update the base merger logic to load lora_train_meta.json when merging, ensuring lora_alpha is correct.
  3. This approach is self-contained and deterministic (no inference / heuristic fallback).
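
For illustration, a minimal sketch of the training-side persistence (the helper name, signature, and field names below are assumptions for this sketch, not the exact implementation in this PR):

```python
import json
import os


def save_lora_train_meta(checkpoint_dir: str, lora_rank: int, lora_alpha: float,
                         target_modules=None) -> None:
    """Persist the training-time LoRA config next to the checkpoint.

    Hypothetical helper: the file name matches this PR, but the field
    names and signature are illustrative only.
    """
    meta = {
        "lora_rank": lora_rank,
        "lora_alpha": lora_alpha,
        "target_modules": list(target_modules or []),
    }
    with open(os.path.join(checkpoint_dir, "lora_train_meta.json"), "w") as f:
        json.dump(meta, f, indent=2)
```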

Compared to #3100:

  • #3100 ([SFT] fix: lora_alpha set default to twice lora_rank with warning) adds a warning and uses a heuristic fallback (lora_alpha = rank * 2).
  • That approach infers the rank from the checkpoint .pt weights instead of reading the original training configuration.
  • This PR directly reads the training-time config saved by verl, avoiding ambiguity and incorrect defaults.

Test

This change is not covered by CI.

I validated the fix with a local end-to-end workflow:

  1. Train SFT + LoRA with:

    • model.lora_rank=8
    • model.lora_alpha=16
  2. Merge with:

    python3 -m verl.model_merger merge ...
  3. Verify the merged adapter config (a quick check is sketched after this list):

    • Before: lora_alpha = 0
    • After: lora_alpha = 16
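
As a quick sanity check, something like the following confirms the merged adapter's values (a sketch; the output path is illustrative and it assumes the merger writes a standard PEFT adapter_config.json):

```python
import json

# Illustrative path to the merged adapter produced by verl.model_merger.
with open("merged_adapter/adapter_config.json") as f:
    cfg = json.load(f)

# Before this fix: lora_alpha == 0; after: the training-time value (16 here).
print("r =", cfg.get("r"), "lora_alpha =", cfg.get("lora_alpha"))
assert cfg.get("lora_alpha") == 16
```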

API and Usage Example

No API changes are introduced.

This PR only adds an additional training artifact:

  • lora_train_meta.json

The merge process remains unchanged for users.


Design & Code Changes

Design

  • Training stage persists LoRA-related configuration into lora_train_meta.json.
  • Merge stage loads lora_train_meta.json and uses it as the source of truth for LoRA parameters.

This PR intentionally does not rely on adapter_config.json, since that file may be ambiguous (it may represent merged/runtime state rather than training-time configuration).
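
A rough sketch of that precedence on the merge side (the function name, fallback behavior, and field names are assumptions for illustration, not the exact code in this PR):

```python
import json
import os
import warnings


def load_lora_params(checkpoint_dir: str) -> dict:
    """Prefer the persisted training meta; fall back only if it is absent."""
    meta_path = os.path.join(checkpoint_dir, "lora_train_meta.json")
    if os.path.exists(meta_path):
        with open(meta_path) as f:
            meta = json.load(f)
        # Training-time values are the source of truth.
        return {"r": meta["lora_rank"], "lora_alpha": meta["lora_alpha"]}

    # Older checkpoints without the meta file: fall back to the adapter
    # config and warn that lora_alpha may be unreliable.
    warnings.warn("lora_train_meta.json not found; lora_alpha may be incorrect")
    with open(os.path.join(checkpoint_dir, "adapter_config.json")) as f:
        cfg = json.load(f)
    return {"r": cfg.get("r"), "lora_alpha": cfg.get("lora_alpha")}
```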

Code Changes

  • Save LoRA training metadata during SFT training.

  • Update model_merger base merge logic:

    • Prefer reading lora_train_meta.json
    • Avoid inferring LoRA parameters from .pt weights

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@CLAassistant

CLAassistant commented Feb 15, 2026

CLA assistant check
All committers have signed the CLA.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a solid fix for preserving lora_alpha during model merging by persisting LoRA training metadata. The approach of saving lora_train_meta.json during training and then using it as the source of truth in the merger is clean and effective. The changes in base_model_merger.py to read this metadata and handle potential mismatches with warnings are well-implemented. I've identified one critical issue in the checkpoint saving logic that could lead to a crash if certain configuration values are None. Please see my detailed comment.


Successfully merging this pull request may close these issues.

[BUG] model_merger sets lora_alpha=0 when merging SFT LoRA checkpoints
