Set Model Temperature to 0 for Consistent Leaderboard Results #562
Replies: 6 comments 1 reply
-
What do folks think about this? I'm mostly OK with it as long as we are consistent across all models.
-
I fully agree with a much lower temperature; 0.7 is extremely high. However, if I remember correctly, it cannot be exactly zero for some models I have tried, only strictly positive. Therefore, I think a small positive value would be better, for example temperature=0.01.
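For illustration, a minimal sketch of the kind of guard this implies; the `MIN_TEMPERATURE` value and function name are hypothetical, not taken from the repo:

```python
# Hypothetical sketch: clamp the requested temperature to a small positive
# floor for backends that reject a temperature of exactly 0.
MIN_TEMPERATURE = 0.01  # assumed floor, as suggested above


def effective_temperature(requested: float, min_temp: float = MIN_TEMPERATURE) -> float:
    """Return a strictly positive temperature for APIs that disallow 0."""
    return max(requested, min_temp)


print(effective_temperature(0.0))  # -> 0.01
print(effective_temperature(0.7))  # -> 0.7
```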
-
hello.
-
Thank you @aastroza and @hexists for weighing in. OK @HuanzhiMao, let's go with a lower temperature then, maybe something like 0.1? But this would fundamentally change all the numbers on the leaderboard. So, once we land all the existing PRs, we can do this? I'll keep this issue open.
-
Yeah, agreed. Let's wait until all PRs are merged and then update this.
-
Temperature has been set to 0 in #574.
-
The current model generation script (model_handlers) uses a default temperature of 0.7 for inference. This introduces randomness into the model output generation, leading to potential variability in the evaluation scores from run to run.
For benchmarking purposes, we should set it to 0 to ensure the consistency and reliability of the evaluation results.
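To illustrate the proposal, here is a minimal sketch of passing a fixed temperature of 0 to an OpenAI-style chat completion call; the model name, prompt, and client setup are placeholders, not the actual model_handlers code:

```python
# Minimal sketch, assuming an OpenAI-style backend; not the repo's handler code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    temperature=0,   # deterministic-as-possible decoding for benchmarking
    messages=[{"role": "user", "content": "What is the weather in Berkeley?"}],
)
print(response.choices[0].message.content)
```

Note that even with temperature=0, some providers do not guarantee fully deterministic outputs, but run-to-run variability is substantially reduced compared to 0.7.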