Description
The formatted datasets (e.g., datasets/formatted_datasets/summarize/data.summarize.xxxx_xx_xx.json) should be JSONL files, where each line of the file is a valid JSON object.
Currently, these lines are not JSON but Python dictionary literals (with str and int keys and values).
To read these custom “Python/JSONL” files, the code calls the built-in eval function on each line.
This approach introduces a serious security vulnerability. A user of this framework could unknowingly run its scripts on a malicious dataset. For example, the text of a single sample, hidden among a million other datapoints, could be crafted to contain valid Python code. Because eval executes whatever expression it is given, loading that line results in arbitrary code execution on the user’s machine.
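The risk can be illustrated with a minimal sketch. The record below is hypothetical and its payload is deliberately harmless (it only calls `os.path.basename`); the point is that eval runs the embedded expression, while `json.loads` refuses the line outright.

```python
import json

# Hypothetical dataset line: it looks like a record, but the "text" value is an
# expression rather than a string. A real attack could call os.system here.
malicious_line = '{"id": 1, "text": __import__("os").path.basename("/tmp/payload")}'

# eval treats the line as Python source, so the embedded call is executed.
record = eval(malicious_line)
executed = record["text"]

# json.loads only accepts JSON syntax and raises instead of running anything.
try:
    json.loads(malicious_line)
    rejected = False
except json.JSONDecodeError:
    rejected = True

print(executed, rejected)
```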
Proposed Fix
To mitigate this risk, I suggest:
- Saving the lines in proper JSONL format.
- Using Python’s built-in json library to read and write these files.
- Completely avoiding the use of eval for parsing input.
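The suggestions above could be sketched roughly as follows. The function names `write_jsonl`, `read_jsonl`, and `convert_legacy_line` are illustrative assumptions, not the names used in the repository, and the migration helper assumes the existing lines contain only Python literals.

```python
import ast
import json


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Parse each non-empty line with json.loads; no code is evaluated."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON: {exc}") from exc
    return records


def convert_legacy_line(line):
    """One-time migration of a Python-dict line to a JSON line.

    ast.literal_eval accepts only literals (dicts, lists, strings, numbers,
    and so on) and raises ValueError on function calls or other expressions.
    """
    return json.dumps(ast.literal_eval(line), ensure_ascii=False)
```

One caveat for the migration: JSON object keys are always strings, so any int keys in the existing dictionaries would be read back as str after conversion.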
Affected code: llm-judge-eval/utils/utils_read_write.py, lines 27 to 41 at commit 852fef6.