The second one is fixed by spoofing a browser user agent, i.e., it's Wiley (the publisher) trying to block automated downloads. I tested it using wget, but we should be able to do the same thing in Python.
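For reference, here is a rough sketch of what the Python equivalent of the wget test might look like, using only the standard library. The `BROWSER_UA` string and the helper names `build_request` and `download` are made up for illustration (they are not part of the retriever), and I haven't confirmed which user agent strings Wiley actually accepts, so treat this as a starting point rather than a tested fix.

```python
import shutil
import urllib.request

# Assumption: an illustrative browser-like User-Agent string. Any common
# browser string may work; which ones the publisher accepts is untested here.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=BROWSER_UA):
    """Build a request that sends a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, dest, user_agent=BROWSER_UA):
    """Stream the response body for `url` into the file at `dest`."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

Something like `download("https://esajournals.onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1002%2Fecy.1792&file=ecy1792-sup-0001-DataS1.zip", "ecy1792-sup-0001-DataS1.zip")` would then mirror the wget call. If the retriever already downloads through `requests` or similar, passing the same header there should have the same effect.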
As you mentioned earlier, the first one is a mess. Not only does it render as HTML, but the data itself isn't in the HTML; it's rendered by JavaScript, so I think you'd basically have to cut and paste the text out of the browser. I don't have any good thoughts on this one other than to email the data providers and ask them to provide a better option. We might be able to scrape it out somehow, but I don't think it's worth it for one dataset.
Example packages:
1: Package file: https://github.com/weecology/retriever-recipes/blob/main/scripts/usda_agriculture_plants_database.py
Sample URL: https://plants.sc.egov.usda.gov/csvdownload?plantLst=plantCompleteList
2: Package file: https://github.com/weecology/retriever-recipes/blob/main/scripts/aquatic_animal_excretion.py
Sample URL: https://esajournals.onlinelibrary.wiley.com/action/downloadSupplement?doi=10.1002%2Fecy.1792&file=ecy1792-sup-0001-DataS1.zip