I was wondering whether there is any difference between Answer Relevance and Answer Faithfulness. Conceptually there is of course, but the code for training LLM judges and actually judging samples seems exactly the same. Would it not make sense for the answer relevance metric to not take into account the context during training and judging?
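To make the suggestion concrete, here is a minimal sketch of how the judge inputs could differ per metric. All names here (`build_judge_input`, the metric strings, the `Question`/`Document`/`Answer` labels) are hypothetical and not taken from the repository's code. The choice of which fields each metric sees is an assumption based on the definitions in the paper.

```python
def build_judge_input(metric: str, query: str, document: str, answer: str) -> str:
    """Assemble the text a judge sees for one sample (hypothetical helper)."""
    if metric == "context_relevance":
        # Is the retrieved document pertinent to the question? No answer needed.
        parts = [("Question", query), ("Document", document)]
    elif metric == "answer_faithfulness":
        # Is the answer grounded in the retrieved document? The document is required.
        parts = [("Question", query), ("Document", document), ("Answer", answer)]
    elif metric == "answer_relevance":
        # Is the answer relevant to the question? The document is left out (assumption).
        parts = [("Question", query), ("Answer", answer)]
    else:
        raise ValueError(f"Unknown metric: {metric}")
    return "\n".join(f"{label}: {text}" for label, text in parts)


sample = {
    "query": "Who wrote Hamlet?",
    "document": "Hamlet is a tragedy written by William Shakespeare.",
    "answer": "William Shakespeare wrote Hamlet.",
}
faithfulness_input = build_judge_input("answer_faithfulness", **sample)
relevance_input = build_judge_input("answer_relevance", **sample)
```

With inputs built this way, the two judges would be trained and evaluated on different text, so the two metrics could no longer collapse into the same score.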
I second this. From the paper: context relevance (is the retrieved information pertinent to the test question), answer faithfulness (is the response generated by the language model properly grounded in the retrieved context), and answer relevance (is the response also relevant to the question).
Context relevance is clear and is handled in synthetic data generation and in model training/testing. Answer faithfulness and answer relevance are both defined in the paper, but the code treats them exactly the same, except in a few places where only answer faithfulness appears. I'm wondering if the idea was to implement this but you ended up not implementing answer faithfulness. In the paper, no results table, chart, or example includes answer faithfulness.