Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PyTorchModelHubMixin] Fix saving model with shared tensors #2086

Merged
merged 4 commits into from
Mar 6, 2024

Conversation

NielsRogge
Copy link
Contributor

@NielsRogge NielsRogge commented Mar 4, 2024

This PR uses save_model instead of save_file in order to properly save shared tensors for the PyTorchModelHubMixin.

In the future, this may be replaced by save_file again when we support saving sharded checkpoints, which deduplicates shared tensors as in the Transformers library.

@NielsRogge NielsRogge changed the title Feature/add save model [PyTorchModelHubMixin] Use save_model instead of save_file Mar 4, 2024
@NielsRogge NielsRogge requested a review from Wauplin March 4, 2024 13:56
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@codecov-commenter
Copy link

codecov-commenter commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 0% with 6 lines in your changes are missing coverage. Please review.

Project coverage is 82.92%. Comparing base (0ab9391) to head (3cc9f71).

❗ Current head 3cc9f71 differs from pull request most recent head eb0111f. Consider uploading reports for the commit eb0111f to get more accurate results

Files Patch % Lines
src/huggingface_hub/hub_mixin.py 0.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2086      +/-   ##
==========================================
+ Coverage   82.90%   82.92%   +0.02%     
==========================================
  Files         102      102              
  Lines        9480     9477       -3     
==========================================
  Hits         7859     7859              
+ Misses       1621     1618       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Wauplin
Copy link
Contributor

Wauplin commented Mar 5, 2024

Thanks for the PR @NielsRogge. I've added a test + used safetensors.torch.load_model to correctly load models with shared tensors. Everything should be fine now.

Regarding load_model, it does not support the device parameter at the moment. I have open a PR to fix this in safetensors directly: huggingface/safetensors#449. Will add back proper support once merged. Current workaround is to load on cpu first and then move to gpu (=> increase loading time but 🤷)

Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this merged :)

@Wauplin Wauplin changed the title [PyTorchModelHubMixin] Use save_model instead of save_file [PyTorchModelHubMixin] Fix saving model with shared tensors Mar 6, 2024
@Wauplin Wauplin merged commit 4286577 into huggingface:main Mar 6, 2024
14 checks passed
Wauplin added a commit that referenced this pull request Mar 6, 2024
* Use save_model

* Use model

* add tests + comments + load_model

* move to device

---------

Co-authored-by: Lucain Pouget <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants