Weight conversion to safetensors format #153
In the wake of security concerns associated with the `pickle` module (see this), almost all of the new entries at HF are in the new `safetensors` format, which seems to be becoming the de facto replacement for `pickle` at this rate (see this by Cohere for instance). Not only has it been shown that, unlike `pickle`, arbitrary code execution is impossible through `safetensors`; `safetensors` also reduces potential RAM usage for bigger models through sharding, and it demonstrates a loading speed-up of up to 2x on GPU and up to 76.6x on CPU.
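For context, here is a minimal sketch of the two load paths; the file names are placeholders for illustration only, not files that exist in this repository:

```python
import torch
from safetensors import safe_open
from safetensors.torch import load_file

# pickle-based checkpoint: torch.load unpickles the file, which can in
# principle execute arbitrary code embedded in a malicious checkpoint.
pickled_state = torch.load("pytorch_model.bin", map_location="cpu")

# safetensors checkpoint: load_file only parses a flat tensor format,
# so no code is executed while loading.
safe_state = load_file("model.safetensors", device="cpu")

# The format also allows reading individual tensors without materializing
# the whole checkpoint, which is what helps with RAM on larger models.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    first_key = next(iter(f.keys()))
    one_tensor = f.get_tensor(first_key)
```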
I would like to propose a simple solution that allows you to serve your trained models on HF using the new `safetensors` format. The official `safetensors` HF page has a space that allows for very simple conversion of already existing models. This page, although very streamlined, has the drawback that existing model weights need to follow a certain naming convention (`pytorch_model.bin` or `pytorch_model-xxxx-of-xxxx.bin`).

Moreover, for offline conversion and a more customizable approach, I can refer you to a script from the text-generation-webui repository, which is actively maintained, and this.
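For the offline route, a minimal sketch of what such a conversion could look like with the `safetensors` library; the checkpoint and output paths are placeholders, and the `weights_only` flag is only available on recent PyTorch versions:

```python
import torch
from safetensors.torch import save_file

# Load the pickled state dict on CPU; weights_only=True asks torch.load to
# refuse anything other than plain tensor data on versions that support it.
state_dict = torch.load("pytorch_model.bin", map_location="cpu", weights_only=True)

# save_file rejects tensors that share storage, so make each entry
# contiguous and independent before writing.
state_dict = {k: v.contiguous().clone() for k, v in state_dict.items()}

save_file(state_dict, "model.safetensors")
```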
I understand that this may not be a priority at this time. If this pitch sounds reasonable to you, I can potentially volunteer a PR myself later when I find some free time.
I'd like to close this out by thanking you for your tremendous work on advancing Persian NLP and saving countless hours of research time.