Error while evaluating with PPI with API model LLM judge #67

islambs231bi · 2024-07-27T18:55:42Z

I am using the ARES evaluator with the data provided in the docs to run an evaluation with prediction-powered inference (PPI).

ppi_config = {
        "evaluation_datasets": ["./aresdocs/nq_unlabeled_output_trimmed.tsv"],
        "few_shot_examples_filepath": "./aresdocs/nq_few_shot_prompt_for_judge_scoring.tsv",
        "llm_judge": "gpt-4o-mini",
        "labels": ["Context_Relevance_Label"],
        "gold_label_paths": ["./aresdocs/nq_labeled_output_trimmed.tsv"],
}

Models tried: OpenAI models (gpt4, gpt-4o-mini) and multiple Together models. All of them fail with the same error:

 File "python3.10/site-packages/ares/RAG_Automatic_Evaluation/LLMJudge_RAG_Compared_Scoring.py", line 729, in evaluate_model
    if total_references.nelement() > 0:
AttributeError: 'numpy.ndarray' object has no attribute 'nelement'
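
The traceback suggests that `total_references` is a NumPy array on the API-judge path, while `nelement()` is a `torch.Tensor` method (NumPy arrays expose the element count as the `size` attribute instead). As a rough sketch of a possible local workaround, and not the ARES maintainers' fix, a small helper could count elements for either type. The name `element_count` is hypothetical and is not part of ARES.

```python
def element_count(values):
    """Return the number of elements in a tensor-like, array-like, or sequence.

    Hypothetical helper for illustration only; it is not part of ARES.
    """
    # torch.Tensor exposes nelement(); numpy.ndarray does not.
    nelement = getattr(values, "nelement", None)
    if callable(nelement):
        return nelement()
    # numpy.ndarray exposes `size` as an int attribute
    # (on torch.Tensor, `size` is a method, so it is skipped here).
    size = getattr(values, "size", None)
    if isinstance(size, int):
        return size
    # Plain Python sequences fall back to len().
    return len(values)
```

Under that assumption, the failing check on line 729 could be written as `if element_count(total_references) > 0:` in a local copy of the file.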

If I use the default model with the provided checkpoint, as in the config below, it works successfully.

ppi_config = {
        "evaluation_datasets": ["./aresdocs/nq_unlabeled_output_trimmed.tsv"],
        "few_shot_examples_filepath": "./aresdocs/nq_few_shot_prompt_for_judge_scoring.tsv",
        "model_choice": "microsoft/deberta-v3-large",
        "labels": ["Context_Relevance_Label"],
        "gold_label_paths": ["./aresdocs/nq_labeled_output_trimmed.tsv"],
        "checkpoints": ["./aresdocs/ares_context_relevance_general_checkpoint_V1.1.pt"],
}

Do I need to provide the checkpoints every time?
