Multiple forward per backward #81

ClashLuke · 2022-09-11T20:45:58Z

Currently, our model does one forward pass and uses the intermediate states to do one backward pass. However, a backward pass is over 3x as expensive as a forward pass, so we could run more forward passes per backward pass to speed up training.
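One such approach would be MESA, which adds a KL(model(x), ema_model(x)) term to the loss. Another method is RHO-Loss, which prioritizes some samples over others by ranking them on the gap between the model's loss and an oracle's loss, roughly (loss(model(x)) - loss(oracle(x))).topk(), and training only on the top-k. Both of these methods claim to improve sample efficiency by up to 18x.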
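
To make the two ideas concrete, here is a minimal pure-Python sketch, not the implementation either method ships. The helper names (`mesa_penalty`, `rho_select`) are hypothetical, the KL argument order simply follows the expression above, and the RHO-Loss score is assumed to be the per-sample cross-entropy of the model minus that of the oracle.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(logits, target):
    """Per-sample cross-entropy for an integer class target."""
    return -math.log(softmax(logits)[target])


def mesa_penalty(model_logits, ema_logits):
    # Assumption: argument order mirrors KL(model(x), ema_model(x)) as written
    # in the issue; the direction used in practice may differ.
    return kl_divergence(softmax(model_logits), softmax(ema_logits))


def rho_select(model_logits, oracle_logits, targets, k):
    # Assumption: score = model loss - oracle loss, computed from forward
    # passes only; the backward pass would then run on the k selected samples.
    scores = [
        cross_entropy(m, t) - cross_entropy(o, t)
        for m, o, t in zip(model_logits, oracle_logits, targets)
    ]
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return ranked[:k]


# Toy batch of three samples with two classes each.
model_out = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
oracle_out = [[2.0, 0.0], [2.0, 0.0], [1.0, 0.0]]
labels = [0, 0, 0]
selected = rho_select(model_out, oracle_out, labels, k=2)
```

Under the estimate above that a backward pass costs about 3x a forward pass, scoring n samples and backpropagating through only k of them would cost roughly n + 3k forward-equivalents instead of 4n. This ignores the oracle's forward pass and assumes the activations from the scoring pass can be reused.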

ClashLuke added the labels research (Creative project that might fail but could give high returns), engineering (Software-engineering problems that don't require ML-Expertise), and core (Improves core model while keeping core idea intact) on Sep 11, 2022
