I've noticed that the SuperLoss implementation here is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss); both use scipy to compute lambertw. However, as stated in AlanChou's implementation:

> The lambertw function should be implemented with PyTorch instead of using the scipy library, as mentioned in AlanChou/Truncated-Loss#3 (comment). There is a mistake because the additive regularization part doesn't have any gradients for Autograd.

Does this implementation solve the above problem?
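For reference, one way to restore the missing gradient is to keep scipy for the forward value but wrap it in a custom `torch.autograd.Function` that supplies the analytic derivative `dW/dx = W / (x * (1 + W))` in `backward`. A minimal sketch of that idea (the `LambertW` class name and the real-principal-branch assumption are mine, not taken from either repo):

```python
import torch
from scipy.special import lambertw as scipy_lambertw


class LambertW(torch.autograd.Function):
    """Lambert W (principal branch) that exposes gradients to Autograd.

    Forward evaluates scipy.special.lambertw on the detached input;
    backward applies the analytic derivative dW/dx = W / (x * (1 + W)).
    """

    @staticmethod
    def forward(ctx, x):
        # scipy returns a complex result; on the principal branch with
        # real input x >= -1/e it is real-valued, so keep the real part.
        w = torch.as_tensor(
            scipy_lambertw(x.detach().cpu().numpy()).real,
            dtype=x.dtype, device=x.device,
        )
        ctx.save_for_backward(x, w)
        return w

    @staticmethod
    def backward(ctx, grad_output):
        x, w = ctx.saved_tensors
        # dW/dx = W / (x * (1 + W)); at x == 0 we have W(0) = 0 and the
        # derivative's limit is 1, so special-case it to avoid 0/0.
        grad = torch.where(x == 0, torch.ones_like(x), w / (x * (1 + w)))
        return grad_output * grad
```

With this, `LambertW.apply(x)` gives the same forward value as the scipy call but no longer cuts the graph, so the additive regularization term receives gradients. (A pure-PyTorch alternative would be a few Newton/Halley iterations on `w * exp(w) = x`, which would also avoid the CPU round-trip.)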