
Changes for transformers 5 weight conversion #3083

Open
BenjaminBossan wants to merge 15 commits into huggingface:main from BenjaminBossan:transformers-weight-conversion-additions

Conversation


@BenjaminBossan BenjaminBossan commented Mar 5, 2026

See the accompanying transformers PR: huggingface/transformers#44478.

  • better handling of swapped in and out features
  • move PEFT config update functions to PEFT
  • move PEFT-specific weight conversion logic to PEFT

Note that the newly added tests will fail until a new transformers release containing the linked PR is out. This should be v5.4, so the corresponding tests only run with that transformers version. I tested locally against the current transformers main branch and the tests pass.

- better handling of swapped in and out features
- move PEFT config update functions to PEFT
This allows the weight conversion to be correctly applied without going
through transformer_model.load_adapter.
@BenjaminBossan BenjaminBossan marked this pull request as ready for review March 6, 2026 17:03
Move weight conversion code to its own module.
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member Author

@githubnemo The PR should now be ready for review.

@BenjaminBossan BenjaminBossan changed the title [WIP] Changes for transformers 5 weight conversion Changes for transformers 5 weight conversion Mar 11, 2026
- always apply in/out feature swapping for MoE params
- add a test for this with Qwen3 MoE
- expose swapping argument to provide escape hatch
Whether to tie weights or not after peft initialization. This will ensure that the adapters added to the
tied layers are also tied. This is only applicable for layers passed via `modules_to_save` and
`target_modules`.
param_wrapper_swap_in_out_features (`bool`, *optional*)


Is this parameter used to resolve #3112? If so, maybe automatic detection would be better?

Member Author


In my latest commit, I changed the code to use module.is_transposed.
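
The idea behind an `is_transposed` check can be sketched as follows. This is a minimal illustration only, assuming a Conv1D-style layout as used by some Hugging Face models; `orient_weight` is a hypothetical helper, not PEFT's actual API:

```python
import numpy as np

def orient_weight(weight: np.ndarray, is_transposed: bool) -> np.ndarray:
    """Return `weight` oriented as (out_features, in_features).

    Layers such as HF's Conv1D store their weight as
    (in_features, out_features), the transpose of nn.Linear's
    (out_features, in_features). A converter can consult a flag on the
    wrapped module and swap the dimensions instead of requiring the
    caller to pass the orientation manually.
    """
    return weight.T if is_transposed else weight

# Conv1D-style weight with in_features=4, out_features=2:
w_stored = np.arange(8, dtype=np.float32).reshape(4, 2)
w_linear = orient_weight(w_stored, is_transposed=True)
print(w_linear.shape)  # (2, 4)
```

Detecting the orientation from the module itself avoids the escape-hatch argument in the common case, while still allowing an explicit override where detection is impossible.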

@jeejeelee

jeejeelee commented Mar 23, 2026

I tested your branch; the saved LoRA weights for qwen35-moe still have the same issues, see #3112.

@BenjaminBossan
Member Author

I tested your branch; the saved LoRA weights for qwen35-moe still have the same issues, see #3112.

Could you please show a small reproducer for that error?
