
Bug in the float limit handling #2324

Open
cjluo-omniml opened this issue Sep 19, 2024 · 6 comments
Labels
feature request A feature that isn't implemented yet.

Comments

@cjluo-omniml

Hi, LM eval team,

In this line: https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/evaluator.py#L439

The limit is reassigned to an int if the original limit flag is a float. However, this does not cover the case where I'm running two tasks together.

E.g. TASK A has 100 data samples in total, TASK B has 1000, and I use a limit of 0.1.

I'm expecting the eval to run 10 samples from TASK A and 100 samples from TASK B.

However, the current logic never computes TASK B's limit, because the limit is already set to 10 after processing TASK A.

So we end up running only 10 samples for TASK B as well.

This should be an easy fix; could you help update the code? Thanks!
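The buggy pattern and a possible fix can be sketched as follows. This is a minimal illustration, not the actual `evaluator.py` code; `get_task_limit` is a hypothetical helper, and the fix assumes the original limit flag is kept around so a fresh per-task value can be derived from it.

```python
def get_task_limit(limit, task_size):
    """Resolve a limit flag into a per-task sample count (hypothetical helper).

    A float limit is treated as a fraction of the task's dataset;
    an int limit is an absolute sample count.
    """
    if limit is None:
        return task_size
    if isinstance(limit, float):
        return int(task_size * limit)  # recompute fresh for each task
    return min(int(limit), task_size)

task_sizes = {"TASK_A": 100, "TASK_B": 1000}

# Buggy pattern: the float is overwritten by the first task's int limit,
# so every later task reuses TASK_A's absolute count.
limit = 0.1
buggy = {}
for name, size in task_sizes.items():
    if isinstance(limit, float):
        limit = int(size * limit)  # limit becomes 10 after TASK_A
    buggy[name] = min(limit, size)
# buggy == {"TASK_A": 10, "TASK_B": 10}

# Fixed pattern: never mutate the original flag; derive a per-task value.
fixed = {name: get_task_limit(0.1, size) for name, size in task_sizes.items()}
# fixed == {"TASK_A": 10, "TASK_B": 100}
```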

@baberabb
Contributor

Hi! Nice catch! Yes, a PR would be appreciated!

cjluo-omniml added a commit to cjluo-omniml/lm-evaluation-harness that referenced this issue Sep 19, 2024
See: EleutherAI#2324

The float limit is overridden by the previous task's int limit when multiple tasks are triggered together.

This PR fixes the issue.
@cjluo-omniml
Author

@baberabb could you help review the fix above?

@cjluo-omniml
Author

@baberabb friendly ping? This should be an easy fix.

@baberabb
Contributor

Hi @cjluo-omniml! I left a comment on the PR earlier. The script uses limit again downstream, so that needs to be handled as well.

@baberabb baberabb added the feature request A feature that isn't implemented yet. label Sep 23, 2024
@cjluo-omniml
Author

@baberabb fixed. Could you review again?

@cjluo-omniml
Author

@baberabb friendly ping?
