
Weight conversion to safetensors format #153

Closed
Adversarian opened this issue Apr 14, 2024 · 2 comments · May be fixed by #157
Comments

@Adversarian

In the wake of security concerns associated with the pickle module (see this), almost all new entries on HF use the safetensors format, which seems to be becoming the de facto replacement for pickle at this rate (see this by Cohere, for instance). Not only does safetensors rule out the arbitrary code execution that pickle permits, it also reduces potential RAM usage for bigger models through sharding, and it demonstrates a loading speed-up of up to 2x on GPU and up to 76.6x on CPU.

I would like to propose a simple solution that allows you to serve your trained models on HF using the new safetensors format. The official safetensors HF page has a space that allows for very simple conversion of already existing models. This page, although very streamlined, has the drawback that existing model weights need to follow a certain naming convention (pytorch_model.bin or pytorch_model-xxxx-of-xxxx.bin).

Moreover, for offline conversion and a more customizable approach, I can refer you to a script from the text-generation-webui repository, which is actively maintained, and this.

I understand that this may not be a priority at this time and I can potentially volunteer a PR myself later when I find some free time if this pitch sounds reasonable to you.

I'd like to close this out by thanking you for your tremendous work on advancing Persian NLP and saving countless hours of research time.

@arxyzan
Member

arxyzan commented Apr 15, 2024

Hi @Adversarian, thanks a lot for the suggestion.
Actually, I worked on this a while back. As a first step, I tried to provide safetensors weights alongside the old .pt weights for our current models, for the same reasons you mentioned, but unfortunately I ran into some errors when trying to convert Hezar Model instances to the safetensors format. I don't remember exactly what the error was, but I'm pretty sure I put a good amount of time into it. The thing is, Hezar Model classes are regular PyTorch nn.Module subclasses, which should (in theory) be easily serializable with safetensors, so I do know it's definitely possible. If you're willing to help, I'm happy to start this task again. Since it's pretty straightforward to test, I'd be glad if you could try it on a few of our models and share the results.

Best,
Aryan

@Adversarian
Author

Adversarian commented Apr 22, 2024

I'm afraid you were right. Since Hezar doesn't piggyback on HF for its model save/load utilities, adding safetensors support is not trivial and requires a fair bit of alteration to the codebase. In the spirit of not reinventing the wheel, I strongly recommend subclassing HF's AutoModel classes to minimize technical debt and to ease further development in the future. That said, I'm working on a quick-and-dirty proof-of-concept PR, which I will submit as soon as I have an MWE.
