Bug in the float limit handling #2324
Labels: feature request (a feature that isn't implemented yet)

Comments
Hi! Nice catch! Yes, a PR would be appreciated!
cjluo-omniml added a commit to cjluo-omniml/lm-evaluation-harness that referenced this issue on Sep 19, 2024:
See: EleutherAI#2324. The float limit is overridden by the previous int limit when multiple tasks are triggered together. This PR fixes the issue.
@baberabb could you help review the fix above?
@baberabb friendly ping? This should be an easy fix.
Hi @cjluo-omniml! I left a comment in the PR earlier. The script uses …
@baberabb fixed. Could you review again?
@baberabb friendly ping?
Hi LM eval team,
In this line: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/evaluator.py#L439
The limit is reassigned to an int if the original limit flag is a float. However, this does not cover the case where I'm running two tasks together.
E.g. TASK A has 100 data samples in total, TASK B has 1000, and I use a limit of 0.1.
I expect the eval to run 10 samples from TASK A and 100 samples from TASK B.
However, the current logic never recomputes the limit for TASK B, because the shared limit variable is already set to 10 after processing TASK A. So we end up running only 10 samples for TASK B as well.
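For illustration, here is a minimal, self-contained sketch of the pattern and one possible fix. The function and variable names are hypothetical, not the harness's actual code: the buggy version reassigns the shared `limit` variable inside the per-task loop, while the fixed version computes a per-task value and leaves the original float untouched.

```python
# Minimal sketch of the bug and one possible fix; names are illustrative,
# not the actual lm-evaluation-harness code.

def resolve_limits_buggy(task_sizes, limit):
    resolved = []
    for n_docs in task_sizes:
        if limit is not None:
            # BUG: reassigning the shared `limit` turns the float fraction
            # into an int after the first task, so later tasks reuse the
            # first task's absolute count instead of the fraction.
            limit = int(n_docs * limit) if limit < 1.0 else int(limit)
        resolved.append(limit)
    return resolved

def resolve_limits_fixed(task_sizes, limit):
    resolved = []
    for n_docs in task_sizes:
        task_limit = None
        if limit is not None:
            # FIX: compute a per-task limit and leave `limit` untouched,
            # so the float fraction applies to every task.
            task_limit = int(n_docs * limit) if limit < 1.0 else int(limit)
        resolved.append(task_limit)
    return resolved

print(resolve_limits_buggy([100, 1000], 0.1))  # [10, 10]  <- TASK B is wrong
print(resolve_limits_fixed([100, 1000], 0.1))  # [10, 100] <- expected
```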
This should be an easy fix; could you help update the code? Thanks!