2:4 Sparse training #1425
cc @jcaip
Hi @phyllispeng123, the tutorial describes a one-shot pruning flow: we calculate the mask only once (before fine-tuning) and then train to update the weights. So if you're following the tutorial, I would expect the mask to be the same for each epoch. Sparsifier.step() should update the mask, though, so if you've modified the code to call step() during training, that is unexpected. Can you share your code in that case?
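For reference, the one-shot flow from the linked tutorial looks roughly like the sketch below (the toy model and sizes are placeholders; the sparsifier settings are the tutorial's 2:4 configuration):

```python
from torch import nn
from torch.ao.pruning import WeightNormSparsifier

# Placeholder model; any nn.Module with nn.Linear layers works.
model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))

# 2:4 semi-structured sparsity: 2 zeros in every block of 4 weights.
sparsifier = WeightNormSparsifier(
    sparsity_level=1.0,
    sparse_block_shape=(1, 4),
    zeros_per_block=2,
)

# Attach a FakeSparsity parametrization to every Linear weight.
sparse_config = [
    {"tensor_fqn": f"{fqn}.weight"}
    for fqn, module in model.named_modules()
    if isinstance(module, nn.Linear)
]
sparsifier.prepare(model, sparse_config)

# One-shot: compute the mask once, before fine-tuning.
sparsifier.step()

# ... fine-tune `model` here; the mask stays fixed, only the weights update ...

# When training is done, fold the mask into the weights permanently.
sparsifier.squash_mask()
```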
@jcaip Thank you for your reply!!!! I am training a transformer model; my simplified training code looks like the sketch below. I hope you can give me some hints about fine-tuning the weights together with the mask, many thanks!!!!
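A minimal, hypothetical reconstruction of the kind of loop described (the model, data, and hyperparameters here are stand-ins, not the original code), with sparsifier.step() called once per epoch so the mask is recomputed as the weights change:

```python
import torch
from torch import nn
from torch.ao.pruning import WeightNormSparsifier

# Hypothetical stand-ins; the real transformer, data, and optimizer differ.
model = nn.Sequential(nn.Linear(64, 64))
train_loader = [(torch.randn(8, 64), torch.randn(8, 64)) for _ in range(4)]
criterion = nn.MSELoss()

sparsifier = WeightNormSparsifier(sparsity_level=1.0,
                                  sparse_block_shape=(1, 4),
                                  zeros_per_block=2)
sparsifier.prepare(model, [{"tensor_fqn": "0.weight"}])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for epoch in range(3):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    # Recompute the 2:4 mask from the current weight magnitudes, so the
    # mask evolves together with the weights (unlike the one-shot flow).
    sparsifier.step()
```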
@phyllispeng123 can you try using the
Hi, when I was training a model with a 2:4 semi-sparse mask following the tutorial https://pytorch.org/tutorials/prototype/semi_structured_sparse.html?highlight=transformer, I found that sparsity.step() does not update the mask during training (I saved FakeSparsity.mask at each layer at each epoch), because the saved masks are always the same. I thought 2:4 semi-sparse training would train the mask and also fine-tune the model itself. Did I misunderstand?
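One way to check whether the mask actually changes between calls to step() (a sketch; it assumes a model prepared with a sparsifier as in the snippets above, and relies on prepare() registering FakeSparsity as the weight's parametrization):

```python
import torch
from torch.nn.utils import parametrize

def snapshot_masks(model):
    # Clone the FakeSparsity mask attached to each parametrized weight.
    masks = {}
    for name, module in model.named_modules():
        if parametrize.is_parametrized(module, "weight"):
            fake_sparsity = module.parametrizations["weight"][0]
            masks[name] = fake_sparsity.mask.detach().clone()
    return masks

# Compare snapshots taken before and after a sparsifier.step() call.
before = snapshot_masks(model)
sparsifier.step()
after = snapshot_masks(model)
for name in before:
    print(name, "mask changed:", not torch.equal(before[name], after[name]))
```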