
Refactor trainer init#43807

Merged
SunMarc merged 16 commits into main from refactor-trainer
Feb 10, 2026

Conversation

@SunMarc
Member

@SunMarc SunMarc commented Feb 6, 2026

What does this PR do?

This PR simplifies Trainer __init__:

  • Quantization validation extracted
  • PEFT unwrapping deduplicated
  • Liger Kernel extracted — apply_liger_kernel
  • Label smoother simplified
  • Validations grouped — _validate_args() method consolidates three scattered validation blocks (arg checks, optimizer checks, dataset checks) into one place
  • and more
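For illustration, the "validations grouped" change could be sketched as below. This is a simplified stand-in, not the actual transformers API: TrainingArgs, the field names, and the specific checks are hypothetical, and only the shape of the refactor (three scattered validation blocks collapsed into one _validate_args() called from __init__) mirrors the PR.

```python
from dataclasses import dataclass


@dataclass
class TrainingArgs:
    # illustrative fields, not the real TrainingArguments
    per_device_train_batch_size: int = 8
    eval_strategy: str = "no"
    optim: str = "adamw_torch"


class Trainer:
    SUPPORTED_OPTIMS = {"adamw_torch", "sgd"}

    def __init__(self, args: TrainingArgs, train_dataset=None):
        self.args = args
        self.train_dataset = train_dataset
        # one call site instead of three scattered validation blocks
        self._validate_args()

    def _validate_args(self):
        # 1. argument checks
        if self.args.per_device_train_batch_size <= 0:
            raise ValueError("per_device_train_batch_size must be positive")
        # 2. optimizer checks
        if self.args.optim not in self.SUPPORTED_OPTIMS:
            raise ValueError(f"unsupported optimizer: {self.args.optim}")
        # 3. dataset checks (illustrative only)
        if self.args.eval_strategy != "no" and self.train_dataset is None:
            raise ValueError("evaluation requested but no dataset provided")


trainer = Trainer(TrainingArgs())  # default args pass all three checks
```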

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec
Copy link
Member

/trl-ci

Comment on lines +45 to +53
from liger_kernel.transformers import _apply_liger_kernel_to_instance

kernel_config = args.liger_kernel_config if args.liger_kernel_config is not None else {}
base_model = unwrap_peft_model(model)

if isinstance(base_model, PreTrainedModel):
    _apply_liger_kernel_to_instance(model=base_model, **kernel_config)
else:
    logger.warning("The model is not an instance of PreTrainedModel. No liger kernels will be applied.")
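A helper like the unwrap_peft_model used in the snippet above can be sketched as follows. The real code would check for peft.PeftModel; here a duck-typed stand-in (PeftWrapper, Base, both hypothetical) keeps the sketch runnable without peft installed, relying only on the fact that PEFT wrappers expose the underlying model via get_base_model().

```python
def unwrap_peft_model(model):
    """Return the base model if `model` is a PEFT wrapper, else `model` itself."""
    # PEFT wrappers expose the wrapped model via get_base_model()
    if hasattr(model, "get_base_model"):
        return model.get_base_model()
    return model


class Base:
    """Stand-in for a plain (non-PEFT) model."""


class PeftWrapper:
    """Stand-in for a PEFT wrapper around a base model."""

    def __init__(self, base):
        self._base = base

    def get_base_model(self):
        return self._base


base = Base()
assert unwrap_peft_model(PeftWrapper(base)) is base  # wrapper is unwrapped
assert unwrap_peft_model(base) is base               # plain model passes through
```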
Member Author

simpler now

@qgallouedec
Member

/trl-ci

The PAT is not yet approved for this bot, so I ran it manually here: https://github.com/huggingface/trl/actions/runs/21760373311

Comment on lines 476 to 485
@@ -546,18 +484,20 @@ def __init__(
):
self.place_model_on_device = False
Collaborator

this pattern feels a bit off. args.place_model_on_device is a @property on args that is computed from not is_sagemaker_mp_enabled(). A developer reading this might assume they have some control over this arg and would want to set this state manually. I'm pretty sure there are a whole host of edge cases, as there are also about 5 or so cases above that we currently account for, so making this settable from the TrainingArguments would be a nice-to-have.

Member Author

Indeed, this is a bit off. Let me fix this! Note that for now, I'm only making simple changes to improve readability. But I will soon start cleaning up the things that feel a bit off in Trainer =)

Collaborator

@winglian winglian left a comment

Generally, I love seeing this method become more compact 🤗. Added some additional thoughts, but they can be addressed separately from this PR as well

Member

@qgallouedec qgallouedec left a comment

lgtm, just a question and suggestions

SunMarc and others added 5 commits February 6, 2026 18:44
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Comment on lines +476 to +479
# Postpone switching model to cuda when MP, DeepSpeed, full bf16/fp16 eval, or FSDP
if args.place_model_on_device is not None:
    self.place_model_on_device = args.place_model_on_device
elif (
Member Author

@SunMarc SunMarc commented Feb 6, 2026

lmk if this is better @winglian. The changes should be BC as the user can still overwrite place_model_on_device as a property
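The tri-state pattern discussed here can be sketched as follows. Args, Trainer, the using_deepspeed flag, and is_sagemaker_mp_enabled are simplified stand-ins for illustration: None means "let the Trainer decide", while an explicit True/False from the user wins.

```python
from dataclasses import dataclass
from typing import Optional


def is_sagemaker_mp_enabled() -> bool:
    # stand-in for the real environment check
    return False


@dataclass
class Args:
    # None => Trainer auto-detects; True/False => explicit user override
    place_model_on_device: Optional[bool] = None


class Trainer:
    def __init__(self, args: Args, using_deepspeed: bool = False):
        if args.place_model_on_device is not None:
            # the user explicitly set it; honor the override
            self.place_model_on_device = args.place_model_on_device
        elif using_deepspeed or is_sagemaker_mp_enabled():
            # frameworks that manage device placement themselves
            self.place_model_on_device = False
        else:
            self.place_model_on_device = True


assert Trainer(Args()).place_model_on_device is True
assert Trainer(Args(), using_deepspeed=True).place_model_on_device is False
assert Trainer(Args(place_model_on_device=False)).place_model_on_device is False
```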

@qgallouedec
Member

(just a test, now that the PAT is approved)
/trl-ci

@SunMarc SunMarc requested a review from winglian February 9, 2026 16:03
@SunMarc
Member Author

SunMarc commented Feb 10, 2026

@bot /style

@github-actions
Contributor

github-actions bot commented Feb 10, 2026

Style fix bot fixed some files and pushed the changes.

@SunMarc SunMarc merged commit e12aa2d into main Feb 10, 2026
26 checks passed
@SunMarc SunMarc deleted the refactor-trainer branch February 10, 2026 15:00
jiosephlee pushed a commit to jiosephlee/transformers_latest that referenced this pull request Feb 11, 2026
* refactor trainer init

* update init

* simplify liger

* udapte

* better

* comments

* do_train not reliable at all

* This should make more sense now

* Apply suggestion from @qgallouedec

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Apply suggestion from @qgallouedec

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Apply suggestion from @qgallouedec

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* Apply suggestion from @qgallouedec

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>

* move sagemaker mixed precision to validation

* Apply repo consistency fixes

---------

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
4 participants