gptjudge empty response handling #34

Draft
wants to merge 4 commits into
base: main

snova-nidhih
Collaborator

Occasionally, when GPT is used as a judge, it returns an empty response. In our test set this occurred only once. This PR scores such a response as incorrect (score 0) by default.

Test command:
python3 -m accelerate.commands.launch \
    --num_processes=1 \
    -m lmms_eval \
    --model internvl2 \
    --model_args pretrained="/import/ml-sc-scratch3/nidhih/mm_hungarian/finetune_outputs/pretrain_hupdf2_synthdog_hu_all_unfrozen" \
    --tasks docvqa_hu_syn \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix intervl2_8b \
    --output_path /import/ml-sc-scratch3/nidhih/mm_hungarian/lmms_eval_op/internvl2_8b_docvqasyn_pt_test \
    --limit 200

@@ -405,7 +405,10 @@ def gpt4judge(references, predictions, query): # This is a passthrough function
eval_logger.error(f"All 5 attempts failed. Last error message: {str(e)}.\nResponse: {str(error_msg)}")
response = ""

score = int(extract_number_from_brackets(response))
if response is None:  # Rare case of gpt returning empty response
    score = 0
Collaborator

Can we do a try+except wrapping extract_number_from_brackets, and also add logging whenever this is caught?
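
For illustration, the suggestion above could look roughly like the sketch below. This is not the project's code: `parse_judge_score` is a hypothetical helper name, the `extract_number_from_brackets` shown here is a stand-in whose `[[N]]` format and return value are assumptions, and the standard-library `logging` module stands in for whatever `eval_logger` actually is.

```python
import logging
import re

# Assumption: stdlib logger used as a stand-in for the project's eval_logger.
eval_logger = logging.getLogger("lmms-eval")


def extract_number_from_brackets(text):
    """Hypothetical stand-in: return the digits inside [[...]], or None if absent."""
    match = re.search(r"\[\[(\d+)\]\]", text)
    return match.group(1) if match else None


def parse_judge_score(response, default=0):
    """Return the judge's integer score, or `default` when it cannot be parsed."""
    try:
        return int(extract_number_from_brackets(response))
    except (TypeError, ValueError) as e:
        # Covers None/empty responses and responses with no parsable score.
        eval_logger.warning(
            f"Could not parse judge score ({e!r}); response={response!r}. "
            f"Defaulting to {default}."
        )
        return default
```

With this shape, an empty string, `None`, or a response without a bracketed number all fall through to the default score and leave a log entry, so the case no longer depends on an explicit `response is None` check.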

@snova-nidhih marked this pull request as draft on October 31, 2024, 21:34