The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

/trl-ci
```python
from liger_kernel.transformers import _apply_liger_kernel_to_instance

kernel_config = args.liger_kernel_config if args.liger_kernel_config is not None else {}
base_model = unwrap_peft_model(model)

if isinstance(base_model, PreTrainedModel):
    _apply_liger_kernel_to_instance(model=base_model, **kernel_config)
else:
    logger.warning("The model is not an instance of PreTrainedModel. No liger kernels will be applied.")
```
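The snippet relies on a helper `unwrap_peft_model` that is not shown in this hunk. A minimal sketch of what such a helper might look like (hypothetical; it assumes PEFT-style wrappers expose `get_base_model()`, and the PR's actual implementation may differ):

```python
def unwrap_peft_model(model):
    """Return the underlying base model, unwrapping a PEFT wrapper if present.

    Hypothetical sketch; the actual helper in the PR may differ.
    """
    # peft.PeftModel exposes the wrapped transformers model via get_base_model()
    if hasattr(model, "get_base_model"):
        return model.get_base_model()
    # Plain models are returned unchanged
    return model
```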
The PAT is not yet approved for this bot, so I ran it manually here: https://github.com/huggingface/trl/actions/runs/21760373311
```diff
@@ -546,18 +484,20 @@ def __init__(
 ):
     self.place_model_on_device = False
```
This pattern feels a bit off. `args.place_model_on_device` is a `@property` on `args` that is computed as `not is_sagemaker_mp_enabled()`. A developer reading this might assume they have some control over this arg and would want to set this state manually. I'm pretty sure there are a whole host of edge cases, as there are also about five or so cases above that we currently account for, so making this settable from `TrainingArguments` would be a nice-to-have.
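For context, the pattern being described, a computed read-only property, can be sketched as follows. This is an illustrative stand-in, not the real `TrainingArguments`, and `is_sagemaker_mp_enabled` is stubbed here for the example:

```python
def is_sagemaker_mp_enabled():
    # Stub for illustration; the real check lives in transformers utilities.
    return False


class ArgsSketch:
    """Illustrative stand-in for TrainingArguments, not the real class."""

    @property
    def place_model_on_device(self):
        # Computed from the environment: there is no setter, so assigning
        # args.place_model_on_device = False raises AttributeError.
        return not is_sagemaker_mp_enabled()
```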
Indeed, this is a bit off. Let me fix this! Note that for now, I'm only trying to make simple changes so that it's easier to read. But I will soon start cleaning up things that feel a bit off in Trainer =)
winglian left a comment:
Generally, I love seeing this method become more compact 🤗. I added some additional thoughts, but they can be addressed separately from this PR as well.
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
```python
# Postpone switching model to cuda when MP, DeepSpeed, full bf16/fp16 eval, or FSDP
if args.place_model_on_device is not None:
    self.place_model_on_device = args.place_model_on_device
elif (
```
lmk if this is better @winglian. The changes should be backward compatible (BC), as the user can still override `place_model_on_device` as a property.
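A self-contained sketch of why this stays backward compatible (all names here are illustrative stand-ins, not the real classes): the new logic only takes the computed default when no explicit value is provided, so user code that still defines `place_model_on_device` as a property wins:

```python
def is_sagemaker_mp_enabled():
    return False  # stub for illustration


class ArgsSketch:
    """Illustrative stand-in for TrainingArguments."""

    place_model_on_device = None  # None means "use the computed default"


class LegacyArgs(ArgsSketch):
    @property
    def place_model_on_device(self):
        # Pre-existing user code that overrode the property keeps working
        return False


def resolve_place_model_on_device(args):
    # Sketch of the new Trainer logic: an explicit (or property-provided)
    # value wins; otherwise fall back to the computed default.
    if args.place_model_on_device is not None:
        return args.place_model_on_device
    return not is_sagemaker_mp_enabled()
```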
(just a test, now that the PAT is approved)

@bot /style

Style fix bot fixed some files and pushed the changes.
* refactor trainer init
* update init
* simplify liger
* udapte
* better
* comments
* do_train not reliable at all
* This should make more sense now
* Apply suggestion from @qgallouedec
* Apply suggestion from @qgallouedec
* Apply suggestion from @qgallouedec
* Apply suggestion from @qgallouedec
* move sagemaker mixed precision to validation
* Apply repo consistency fixes

Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
What does this PR do?
This PR simplifies `Trainer.__init__`:

* a `_validate_args()` method consolidates three scattered validation blocks (arg checks, optimizer checks, dataset checks) into one place