I've noticed that the SuperLoss implementation here is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss); both use scipy to compute lambertw. However, as stated in AlanChou's implementation:

> The lambertw function should be implemented with PyTorch instead of using the scipy library, as mentioned in AlanChou/Truncated-Loss#3 (comment). There is a mistake because the additive regularization part doesn't have any gradients for Autograd.

Does this implementation solve the above problem?
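For reference, one way to restore the missing gradient is to keep scipy for the forward value but wrap it in a custom `torch.autograd.Function` that supplies the analytic derivative `dW/dx = W / (x * (1 + W))` in `backward`. A minimal sketch of that idea (the `LambertW` class name and the real-principal-branch assumption are mine, not taken from either repo):

```python
import torch
from scipy.special import lambertw as scipy_lambertw


class LambertW(torch.autograd.Function):
    """Lambert W (principal branch) that exposes gradients to Autograd.

    Forward evaluates scipy.special.lambertw on the detached input;
    backward applies the analytic derivative dW/dx = W / (x * (1 + W)).
    """

    @staticmethod
    def forward(ctx, x):
        # scipy returns a complex result; on the principal branch with
        # real input x >= -1/e it is real-valued, so keep the real part.
        w = torch.as_tensor(
            scipy_lambertw(x.detach().cpu().numpy()).real,
            dtype=x.dtype, device=x.device,
        )
        ctx.save_for_backward(x, w)
        return w

    @staticmethod
    def backward(ctx, grad_output):
        x, w = ctx.saved_tensors
        # dW/dx = W / (x * (1 + W)); at x == 0 we have W(0) = 0 and the
        # derivative's limit is 1, so special-case it to avoid 0/0.
        grad = torch.where(x == 0, torch.ones_like(x), w / (x * (1 + w)))
        return grad_output * grad
```

With this, `LambertW.apply(x)` gives the same forward value as the scipy call but no longer cuts the graph, so the additive regularization term receives gradients. (A pure-PyTorch alternative would be a few Newton/Halley iterations on `w * exp(w) = x`, which would also avoid the CPU round-trip.)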