I attempted to apply the `evaluate()` method from your `finetune.py` to the training process of MiniLLM, but I noticed that `model.eval()` causes anomalies in the `rl_loss`. Specifically, at certain fixed steps, the `pg_loss` becomes extremely large and the `reg_loss` also increases slightly, leading to the error: "Current loss scale already at minimum - cannot decrease scale anymore. Exiting run." However, after removing `model.eval()`, the training proceeds normally. Do you understand the reason behind this? Is `model.eval()` necessary?
Here is my `evaluate()` method:
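For context, a common source of this symptom is switching the model into eval mode for evaluation and not restoring train mode before the next RL step: in eval mode, dropout is disabled and normalization layers use running statistics, so the policy's log-probs no longer match the rollout distribution, which can inflate `pg_loss` until fp16 loss scaling gives up. Below is a minimal PyTorch sketch of the usual toggle-and-restore pattern; the toy model, the `data` list, and this `evaluate()` are hypothetical illustrations, not the MiniLLM or `finetune.py` source.

```python
import torch
import torch.nn as nn

# Toy model with dropout, so train/eval mode actually changes behavior.
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))

def evaluate(model, data):
    """Evaluate in eval mode, then restore the mode the model came in with."""
    was_training = model.training
    model.eval()                 # disable dropout for deterministic evaluation
    with torch.no_grad():        # no autograd graph needed during evaluation
        loss = sum(model(x).square().mean().item() for x in data) / len(data)
    if was_training:
        model.train()            # restore train mode before the next RL step
    return loss

data = [torch.randn(4, 8) for _ in range(3)]
model.train()
eval_loss = evaluate(model, data)
assert model.training            # train mode restored after evaluation
```

If the training loop resumes while the model is still in eval mode (i.e. the `model.train()` restore is missing), the mismatch between rollout-time and update-time behavior is a plausible explanation for the loss blow-up at fixed (evaluation) steps.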