@tomaarsen
Member

Supersedes #327

Hello!

Pull Request overview

  • Save differentiable model head on CPU
  • Move differentiable model heads to the right device after loading

Details

See #327 for more information:

If we train a differentiable head on a CUDA machine, the head checkpoint cannot be loaded via Joblib on a CPU-only machine (it fails with a serialization error).
To address this issue, I modified the code to store the head checkpoint with its tensors mapped to the CPU.
When we load the head checkpoint, we map it to the target device.
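
For illustration, the save-on-CPU and move-after-load pattern described above could be sketched roughly as follows. This is a minimal sketch, not the code from this PR: `save_head` and `load_head` are hypothetical helper names, the `torch.nn.Linear` head and the `model_head.pkl` file name are stand-ins, and it assumes `torch` and `joblib` are installed.

```python
import os
import tempfile

import joblib
import torch


def save_head(head: torch.nn.Module, path: str) -> None:
    """Hypothetical helper: dump the head with its tensors on the CPU."""
    params = list(head.parameters())
    original_device = params[0].device if params else torch.device("cpu")
    head.to("cpu")            # make the pickle loadable on machines without CUDA
    joblib.dump(head, path)
    head.to(original_device)  # restore the device so training can continue


def load_head(path: str, device: str = "cpu") -> torch.nn.Module:
    """Hypothetical helper: load the head, then move it to the target device."""
    head = joblib.load(path)
    if isinstance(head, torch.nn.Module):
        head.to(device)
    return head


# Stand-in head; uses CUDA only if it is available on this machine.
device = "cuda" if torch.cuda.is_available() else "cpu"
head = torch.nn.Linear(4, 2).to(device)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_head.pkl")
    save_head(head, path)
    restored = load_head(path, device=device)
```

Moving the head back to its original device after dumping keeps the in-memory model usable, so saving a checkpoint mid-training does not silently leave the head on the CPU.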

Thanks @karter-liner for providing the PR. I've roughly applied their changes on top of the upcoming v1.0.0-pre branch.

  • Tom Aarsen

And move models to the right head after loading

Co-authored-by: karter-liner <[email protected]>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@tomaarsen tomaarsen merged commit 4123609 into huggingface:v1.0.0-pre Nov 24, 2023
@tomaarsen tomaarsen deleted the feat/cpu_load_diff_head branch November 24, 2023 12:55