'--psl-read-only' and '--psl-filename=' args do not work. #134

Open
Matthew-Grayson opened this issue May 22, 2023 · 1 comment

🐛 Summary

Trustymail saves new PSLs to './public_suffix_list.dat' even when a different file name is specified with --psl-filename=. Additionally, an existing PSL is overwritten even when --psl-read-only is set.
This behavior has been confirmed both in WSL and in a Docker container.

Because of this, trustymail is difficult to run in bulk and throws errors when PSL fetches from publicsuffix.org fail.

To reproduce

Steps to reproduce the behavior:

  1. Open terminal

  2. Run trustymail --psl-filename=psl.dat cisa.gov

  3. See PSL is saved to 'public_suffix_list.dat' instead of 'psl.dat'

  4. Run trustymail --psl-read-only cisa.gov

  5. Note last modified time of 'public_suffix_list.dat'

  6. Change last modified time by running python3 -c "from os import utime; utime('public_suffix_list.dat', (1330712280, 1330712280))".

  7. Confirm that the last modified time was changed

  8. Repeat step 4

  9. See that the last modified time changed, signaling that PSL is overwritten
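
Steps 4 through 9 can also be checked with a short script. This is a minimal sketch, not part of trustymail: the helper name `psl_was_rewritten` is made up for illustration, and it assumes the PSL file already exists at the given path.

```python
import os
import subprocess

# Same timestamp used in step 6 (an arbitrary date in 2012).
OLD_TIMESTAMP = 1330712280


def psl_was_rewritten(command, psl_path):
    """Backdate psl_path, run command, and report whether the mtime changed.

    command is an argument list, for example
    ["trustymail", "--psl-read-only", "cisa.gov"].
    """
    # Step 6: force a known, old last-modified time.
    os.utime(psl_path, (OLD_TIMESTAMP, OLD_TIMESTAMP))
    before = os.stat(psl_path).st_mtime

    # Step 8: run the scan again; a non-zero exit code is not treated as fatal.
    subprocess.run(command, check=False)

    # Step 9: a changed mtime means the file was written to.
    after = os.stat(psl_path).st_mtime
    return after != before
```

Calling `psl_was_rewritten(["trustymail", "--psl-read-only", "cisa.gov"], "public_suffix_list.dat")` should return `False` if the read-only flag is honored.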

Expected behavior

Trustymail should never fetch a new PSL when --psl-read-only is specified.
Trustymail should read from and save to the path specified by --psl-filename=.
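
To make the expected behavior concrete, here is a minimal sketch of a loader that honors both options. It is not trustymail's code: `load_psl`, the `fetch` callable, and the 24-hour freshness window are all assumptions made for illustration.

```python
import os
import time

# Hypothetical freshness window; the real threshold may differ.
MAX_AGE_SECONDS = 24 * 60 * 60


def load_psl(filename, read_only, fetch, now=None):
    """Return the PSL text stored at filename, refreshing it only if allowed.

    fetch is a hypothetical callable that downloads the latest PSL and
    returns it as a string. It is never called when read_only is True.
    """
    now = time.time() if now is None else now
    missing = not os.path.exists(filename)
    stale = missing or (now - os.path.getmtime(filename)) > MAX_AGE_SECONDS

    if stale and not read_only:
        # Refresh is allowed: download and save to the caller-chosen path.
        text = fetch()
        with open(filename, "w", encoding="utf-8") as f:
            f.write(text)
        return text

    # Read-only or still fresh: use the file exactly as it is on disk.
    # This raises FileNotFoundError if the file is missing in read-only mode.
    with open(filename, encoding="utf-8") as f:
        return f.read()
```

Both the path and the read-only flag arrive as arguments, so nothing in the function can fall back to a hard-coded 'public_suffix_list.dat'.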

Any helpful log output or screenshots

Here's an image of a container running trustymail against 22 dummy domains. The container spawns with an out-of-date PSL. The arguments specify that the file is read-only, to keep trustymail from making unnecessary fetch requests. Trustymail rapidly produces 11 fetch requests, most of which fail, causing their respective child processes to exit with an error. When a fetch succeeds, the PSL is overwritten, as evidenced by its last modified time changing.

[Image: Screenshot 2023-05-22 at 11 47 13 AM]

Matthew-Grayson commented May 23, 2023

It seems like these two values are global constants initialized in __init__.py and can't be overwritten. Instead of making them global constants, why not use docopt's default value functionality in cli.py, then import the values into domain.py for use by the get_psl definitions? Here's an example from docopt.org.
[Image: example from docopt.org]
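
One way a module-level value can appear impossible to override is Python's import-time name binding. The sketch below illustrates that general pitfall with made-up names (`pkg`, `PSL_FILENAME`, and a plain dict standing in for the parsed arguments). Whether this is the actual cause in trustymail has not been verified here.

```python
import types

# Hypothetical stand-in for a package __init__.py holding a module-level default.
pkg = types.ModuleType("pkg")
pkg.PSL_FILENAME = "public_suffix_list.dat"

# Equivalent of `from pkg import PSL_FILENAME` in another module:
# the current value is bound to a new name at import time.
PSL_FILENAME = pkg.PSL_FILENAME


def psl_path_from_import():
    """Uses the name copied at import time; later changes to pkg are not seen."""
    return PSL_FILENAME


def psl_path_from_argument(filename):
    """Uses whatever the caller passes, e.g. the parsed --psl-filename value."""
    return filename


# Later, the CLI parses its arguments and rebinds the package attribute.
args = {"--psl-filename": "psl.dat"}  # stand-in for the parsed argument dict
pkg.PSL_FILENAME = args["--psl-filename"]

print(psl_path_from_import())                          # public_suffix_list.dat
print(psl_path_from_argument(args["--psl-filename"]))  # psl.dat
```

Passing the parsed values down as function arguments, as in `psl_path_from_argument`, avoids depending on when each module was imported.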

Matthew-Grayson mentioned this issue May 24, 2023
12 tasks
cisagovbot pushed a commit that referenced this issue Jul 30, 2024