LLM As Judge should describe programs as expected and actual #287

Open
jlewi opened this issue Oct 8, 2024 · 0 comments
Labels
good first issue Good for newcomers

jlewi commented Oct 8, 2024

Right now in its explanations, the LLM as judge is referring to program1 and program2.

This is likely because our prompt is using "" and ""

We should change that to "" and "" to try to get the judge to talk in terms of expected and actual.
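
As a rough illustration of the proposed change, here is a minimal Python sketch of a judge prompt that labels the two programs as expected and actual. Everything in it is an assumption made for illustration: the tag names `<expected>` and `<actual>`, the prompt wording, and the helper `build_judge_prompt` are hypothetical and are not taken from the project's real prompt.

```python
# Hypothetical sketch: tag names and prompt wording are assumptions,
# not the project's actual judge prompt.
JUDGE_PROMPT_TEMPLATE = """\
You are comparing two programs for equivalence.
Refer to them only as the expected program and the actual program.

<expected>
{expected}
</expected>

<actual>
{actual}
</actual>

Explain whether the actual program is equivalent to the expected program.
"""


def build_judge_prompt(expected: str, actual: str) -> str:
    """Fill the template with the expected and actual programs."""
    return JUDGE_PROMPT_TEMPLATE.format(
        expected=expected.strip(),
        actual=actual.strip(),
    )


# Made-up inputs, used only to show the shape of the resulting prompt.
prompt = build_judge_prompt(
    expected="echo hello",
    actual="printf 'hello\\n'",
)
print(prompt)
```

Because the labels the judge sees are the words "expected" and "actual", its explanation is more likely to use those terms rather than "Program 1" and "Program 2", though that would need to be confirmed against real judge output.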

Here's an example explanation issued by the judge:

The two programs perform distinct operations.

Program 1:

Uses jq to extract the .requestHtml field from a JSON file and stores the result in another file.
Then, it displays the content of the newly created file using cat.

Program 2:

Utilizes curl to make an HTTP POST request to fetch logs data and store the response in a JSON file.
Checks for successful execution and displays an error message if the curl command fails.

The two programs have no overlapping functionality and perform entirely different tasks. Therefore, they are not equivalent.

jlewi added the "good first issue" label Oct 8, 2024