Set Model Temperature to 0 for Consistent Leaderboard Results #562
Replies: 6 comments 1 reply
-
What do folks think about this? I'm mostly OK with it as long as we are consistent across all models.
-
I fully agree with a much lower temperature; 0.7 is extremely high. However, if I remember correctly, it cannot be exactly zero for some models I have tried, only strictly positive. Therefore, I think a small positive value would be better, for example temperature=0.01.
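For illustration, a minimal sketch of the kind of guard this implies; the `MIN_TEMPERATURE` value and function name are hypothetical, not taken from the repo:

```python
# Hypothetical sketch: clamp the requested temperature to a small positive
# floor for backends that reject a temperature of exactly 0.
MIN_TEMPERATURE = 0.01  # assumed floor, as suggested above


def effective_temperature(requested: float, min_temp: float = MIN_TEMPERATURE) -> float:
    """Return a strictly positive temperature for APIs that disallow 0."""
    return max(requested, min_temp)


print(effective_temperature(0.0))  # -> 0.01
print(effective_temperature(0.7))  # -> 0.7
```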
-
hello.
-
Thank you @aastroza and @hexists for weighing in. OK @HuanzhiMao, let's go with a lower temperature then, maybe something like 0.1? But this would fundamentally change all the numbers on the leaderboard. So, once we land all the existing PRs, we can do this? I'll keep this issue open.
-
Yeah, agreed. Let's wait until all PRs are merged and then update this.
-
Temperature has been set to 0 in #574.
-
The current model generation script (model_handlers) uses a default temperature of 0.7 for inference. This introduces randomness into the model output generation, leading to potential variability in the evaluation scores from run to run.
For benchmarking purposes, we should set it to 0 to ensure the consistency and reliability of the evaluation results.
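To illustrate the proposal, here is a minimal sketch of passing a fixed temperature of 0 to an OpenAI-style chat completion call; the model name, prompt, and client setup are placeholders, not the actual model_handlers code:

```python
# Minimal sketch, assuming an OpenAI-style backend; not the repo's handler code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    temperature=0,   # deterministic-as-possible decoding for benchmarking
    messages=[{"role": "user", "content": "What is the weather in Berkeley?"}],
)
print(response.choices[0].message.content)
```

Note that even with temperature=0, some providers do not guarantee fully deterministic outputs, but run-to-run variability is substantially reduced compared to 0.7.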