
[Trainer] Allow passing image processor#29896

Merged
NielsRogge merged 2 commits into huggingface:main from NielsRogge:add_image_processor_to_trainer
Apr 5, 2024

Conversation

@NielsRogge (Collaborator) commented Mar 27, 2024

What does this PR do?

Fixes #29790. Also reported here: https://discuss.huggingface.co/t/vitimageprocessor-object-has-no-attribute-pad/32511.

Currently, passing an image processor to the Trainer is pretty hacky, as users need to do `tokenizer=image_processor`. 🤔 Yes, that's right.

This PR adds an `image_processor` argument to the Trainer, so that the default data collator is used when an image processor is passed.

To do:

  • update the example scripts to no longer use `tokenizer=image_processor`
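The collator-selection behavior described above can be sketched in plain Python. The stub classes below are illustrative assumptions, not the real transformers classes:

```python
# Toy sketch of the collator-selection logic this PR is about. The stub
# classes are assumptions for illustration, not the actual transformers code.

class FakeTokenizer:
    # Tokenizers expose `pad`, so they work with a padding collator.
    def pad(self, features):
        return {"input_ids": [f["input_ids"] for f in features]}

class FakeImageProcessor:
    # Image processors typically have no `pad` method; they should fall
    # back to the default data collator instead.
    pass

def default_data_collator(features):
    return {"pixel_values": [f["pixel_values"] for f in features]}

def pick_collator(tokenizer=None, image_processor=None):
    # Mirrors the idea of the PR: only wrap a padding collator around an
    # object that can actually pad.
    if tokenizer is not None and hasattr(tokenizer, "pad"):
        return lambda features: tokenizer.pad(features)
    return default_data_collator

collator = pick_collator(image_processor=FakeImageProcessor())
batch = collator([{"pixel_values": [1, 2]}, {"pixel_values": [3, 4]}])
print(batch)  # {'pixel_values': [[1, 2], [3, 4]]}
```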

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@NielsRogge NielsRogge requested a review from ArthurZucker March 27, 2024 10:01
@ArthurZucker (Collaborator) left a comment


Very good to have

```python
self.place_model_on_device = False
```

```diff
- default_collator = default_data_collator if tokenizer is None else DataCollatorWithPadding(tokenizer)
+ default_collator = DataCollatorWithPadding(tokenizer) if tokenizer is not None else default_data_collator
```
@ArthurZucker (Collaborator)

This might not be backwards compatible if people pass an ImageProcessor. Let's deprecate passing image processors by checking the type.

@NielsRogge (Collaborator, Author) Mar 27, 2024

If people currently pass an image processor as `tokenizer`, they also need to pass a data collator, otherwise they hit the issue mentioned in #29790 (image processors typically don't have a `pad` method). So this doesn't break backwards compatibility.
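The failure mode described here can be reproduced with stubs. The class names below are simplified stand-ins (assumptions), not the actual transformers implementation:

```python
# Toy reproduction of the error behind #29790: a padding collator built
# around an image processor crashes, because image processors have no
# `pad` method. The classes are simplified stand-ins (assumptions).

class PaddingCollator:
    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def __call__(self, features):
        # DataCollatorWithPadding-style behaviour: delegate to `pad`.
        return self.tokenizer.pad(features)

class StubImageProcessor:
    pass  # like ViTImageProcessor, no `pad` attribute

collator = PaddingCollator(StubImageProcessor())
try:
    collator([{"pixel_values": [0.1, 0.2]}])
except AttributeError as err:
    # Mirrors "'ViTImageProcessor' object has no attribute 'pad'"
    print(type(err).__name__)  # AttributeError
```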

@amyeroberts (Contributor)

Can we also enable passing a `feature_extractor` and `processor`?

@NielsRogge (Collaborator, Author)

@ArthurZucker feel free to approve the PR; @amyeroberts I would add those in a separate PR.

@ArthurZucker (Collaborator) left a comment


Okay, let's do a follow-up PR then for the feature extractor and the rest.

@NielsRogge NielsRogge merged commit 1ab7136 into huggingface:main Apr 5, 2024
@daniellok-db (Contributor)

cc @NielsRogge @tomaarsen The addition of the new param to `trainer_callback.CallbackHandler` is currently causing problems for the `setfit.Trainer` class in the SetFit library. Can we either make the param optional, or update `setfit.Trainer` to pass a value for the new param?
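The breakage can be illustrated with a simplified sketch (the signature below is an assumption, not the real `CallbackHandler`): giving the inserted parameter a default keeps old fixed-arity call sites from raising a TypeError, though callers should still move to keyword arguments so positional bindings do not silently shift.

```python
# Simplified sketch (an assumption, not the actual transformers code) of how
# a defaulted new parameter keeps existing CallbackHandler callers working.

class NewCallbackHandler:
    def __init__(self, callbacks, model, tokenizer, image_processor=None,
                 optimizer=None, lr_scheduler=None):
        self.image_processor = image_processor
        self.optimizer = optimizer

# A keyword-based call site is robust to the inserted parameter:
handler = NewCallbackHandler([], model=None, tokenizer=None, optimizer="opt")
print(handler.image_processor, handler.optimizer)  # None opt
```

Purely positional callers would bind their fourth argument to `image_processor` after the change, which is why updating downstream libraries like SetFit is still the safer fix.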

@chenin-wang (Contributor)

```python
self.callback_handler = CallbackHandler(callbacks, self.model, self.model.model_body.tokenizer, None, None, None)
```

> Can we add enabling passing a feature_extractor and processor too?

> @ArthurZucker feel free to approve the PR, @amyeroberts I would add those in a separate PR

useful, waiting...

@chenin-wang (Contributor)

> cc @NielsRogge @tomaarsen the addition of the new param to trainer_callback.CallbackHandler is currently causing problems for the setfit.Trainer class in the SetFit library. can we either make the param optional, or update setfit.Trainer to pass a value in for the new param?

```python
self.callback_handler = CallbackHandler(callbacks, self.model, self.model.model_body.tokenizer, None, None, None)
```

@chenin-wang (Contributor)

@NielsRogge @ArthurZucker For the preprocessing classes, I suggest renaming the `tokenizer` parameter instead of adding new parameters. Consider changing `tokenizer` to `processor`.

@NielsRogge (Collaborator, Author)

Yes @chenin-wang, I was also considering this. Rather than having separate attributes for tokenizer, image processor, feature extractor and multimodal processor, it probably makes more sense to have a single argument called `processor`. Although that would be a breaking change, since many people already use `tokenizer=...`.

Will await opinion of Arthur and Amy

@chenin-wang (Contributor)

> Yes @chenin-wang I was also considering this, rather than having various attributes for tokenizer, image processor, feature extractor and multimodal processor it probably makes more sense to just have a single argument called processor. Although that would be a breaking change since many people already use tokenizer=...
>
> Will await opinion of Arthur and Amy

You are right, it would change user habits.

@NielsRogge (Collaborator, Author)

Reflecting more on this, I think the best way is to have a proper deprecation message.

Basically, whenever people pass the `tokenizer` argument, we should add a message saying "the `tokenizer` argument is going to be deprecated in v4.xx of Transformers; update your code to pass `processor` instead."

=> This way people can safely update their code without breaking changes from the start. Will open a follow-up PR for this.
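A minimal sketch of such a deprecation path (the function name, argument names, and warning wording below are illustrative assumptions, not the final implementation):

```python
# Hedged sketch of the proposed deprecation: accept both arguments for a
# transition period, warn on `tokenizer`, and funnel both into one value.
import warnings

def resolve_processor(tokenizer=None, processor=None):
    if tokenizer is not None:
        warnings.warn(
            "The `tokenizer` argument is deprecated and will be removed in a "
            "future version of Transformers; pass `processor` instead.",
            FutureWarning,
        )
        # Keep old behaviour working: fall back to the deprecated argument.
        processor = processor if processor is not None else tokenizer
    return processor
```

Using `FutureWarning` (rather than `DeprecationWarning`) makes the message visible to end users by default, which suits a user-facing API change like this.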

NielsRogge added a commit that referenced this pull request on Apr 9, 2024:

* Undo
* Use tokenizer
* Undo data collator

@chenin-wang (Contributor)

@NielsRogge @amyeroberts The problem remains. #30102 (comment)

@NielsRogge (Collaborator, Author)

Hi @chenin-wang could you clarify your issue?

@chenin-wang (Contributor)

@NielsRogge `tokenizer=image_processor` is a confusing API (as highlighted by several user issues), and with more and more multimodal, audio and vision models it is increasingly out of date; see what amyeroberts said in #30102 (comment).

@NielsRogge (Collaborator, Author) commented Apr 10, 2024

Yes @chenin-wang I agree. So indeed we should continue with #30102. However I'm thinking about the term preprocessor rather than processor, as processor is already used for multimodal processors like CLIP, BLIP-2, etc.

Not sure if people prefer processor=image_processor or preprocessor=image_processor.

Hence I'll ask the question to some HF members and see which term they prefer. Will then proceed.



Development

Successfully merging this pull request may close these issues:

* Missing preprocessor_config.json file after training segformer model

6 participants