I was training a model with a cyclic learning rate, and after the 7th epoch I get a NaN validation loss. Isn't this the "exploding gradient problem"? Would gradient accumulation be able to solve this issue? I don't get such errors when I use Adam or AdamW.
I run into this problem whenever I try the deepmemory or diffgrad optimizer. Any help?
Training statistics so far for the model I am trying:
Train Epoch: 0 LR: 0.0049400000 Loss: 1.878861
Dev loss: 2.1991
Train Epoch: 1 LR: 0.0086800000 Loss: 1.840539
Dev loss: 1.7849
Train Epoch: 2 LR: 0.0075800000 Loss: 1.847198
Dev loss: 1.9127
Train Epoch: 3 LR: 0.0038400000 Loss: 1.287447
Dev loss: 1.3331
Train Epoch: 4 LR: 0.0023000000 Loss: 1.416327
Dev loss: 1.2588
Train Epoch: 5 LR: 0.0060400000 Loss: 1.299999
Dev loss: 1.4838
Train Epoch: 6 LR: 0.0097800000 Loss: 1.540868
Dev loss: 1.5280
Train Epoch: 7 LR: 0.0064800000 Loss: 1.790969
Dev loss: 1.2738
Train Epoch: 8 LR: 0.0027400000 Loss: 1.092477
Dev loss: nan
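For reference, here is a minimal sketch of a PyTorch training loop with a cyclic learning-rate schedule and gradient-norm clipping, a common mitigation for NaN losses caused by exploding gradients. The model, data, and optimizer below are placeholders (the actual architecture and the deepmemory/diffgrad setup are not shown in this issue), so treat it as an illustration rather than the reporter's exact code:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy model and data stand in for the real setup described in the issue.
model = nn.Linear(128, 10)
data = TensorDataset(torch.randn(1024, 128), torch.randint(0, 10, (1024,)))
train_loader = DataLoader(data, batch_size=32, shuffle=True)

# Stand-in optimizer; the issue actually uses deepmemory/diffgrad.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=1e-2, step_size_up=200
)
criterion = nn.CrossEntropyLoss()

for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    # Clip the gradient norm so a single large step (e.g. near the cyclic-LR
    # peak) cannot blow up the weights and produce NaN losses.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()  # CyclicLR is stepped per batch, not per epoch
```

Note that gradient accumulation only sums gradients over several mini-batches before an update and does not bound their magnitude, whereas clipping (or lowering `max_lr` in the cycle) directly limits the step size.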