
request limits information #36

Open
sastoudt opened this issue Jul 30, 2019 · 4 comments

@sastoudt

It would be helpful to have information about rate limits in the documentation.

"We throttle API usage to a max of 100 requests per minute, though we ask that you try to keep it to 60 requests per minute or lower. If we notice usage that has serious impact on our performance we may institute blocks without notification."

Not all functions have a maxresults argument that helps control this, so users may unknowingly trigger a block.

@jebyrnes

I had the same question. I'm trying to use this for a classroom assignment, but I have so many students that I can't download their info without triggering the limit. Is there a straightforward way around it?

@sastoudt
Author

Two strategies I've used as workarounds:

  1. Since the recommended rate is 60 requests per minute, I break my requests into chunks of 60 and put a Sys.sleep(10) in between every chunk. I usually do this in a simple for loop (the horror, I know); see the first sketch below.
  2. If the observations are research grade, I just download the full data from GBIF and then subset what I need. The data is big, so I use data.table::fread to read it in quickly. Then I ditch most of the data I'm not interested in, which makes the size pretty manageable and lets me keep working with a plain data.frame; see the second sketch below.
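
Both sketches below are minimal illustrations, not rinat APIs: `make_request()` is a hypothetical stand-in for whatever single API call you're making, and the GBIF file name, column names, and species filter are placeholders.

```r
# Strategy 1: chunked, throttled requests.
# `make_request()` is a hypothetical stand-in for one API call;
# `inputs` is a placeholder vector of request inputs.
make_request <- function(x) x  # replace with a real single-request call

inputs <- 1:300
chunks <- split(inputs, ceiling(seq_along(inputs) / 60))  # chunks of 60

results <- list()
for (i in seq_along(chunks)) {
  results[[i]] <- lapply(chunks[[i]], make_request)
  Sys.sleep(10)  # pause between chunks, as described above
}
results <- unlist(results, recursive = FALSE)
```

A stricter variant is Sys.sleep(60) between chunks, which guarantees at most 60 requests per minute even when the requests themselves return quickly.

```r
# Strategy 2: read a large GBIF export quickly, keep only what you need,
# then drop back to a plain data.frame. The file name, columns, and
# species filter are all placeholders.
library(data.table)

gbif <- fread("gbif_occurrences.csv",
              select = c("species", "decimalLatitude",
                         "decimalLongitude", "eventDate"))
keep <- as.data.frame(gbif[species == "Sciurus griseus"])
```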

Hope that helps in the meantime.

@LDalby
Contributor

LDalby commented Sep 11, 2019

Just curious, what is the use case that will trigger this?

@sastoudt
Author

I had a bunch of observation IDs and wanted to grab all of their metadata (such as date/time uploaded to iNaturalist, date/time first identified, etc.). I was looping through the IDs one at a time with the get_inat_obs_id function, which only takes one ID at a time. That is one of the functions without an internally enforced rate limit, so the loop processed faster than recommended. There is probably a better way to do this, so feel free to share ideas!
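
Here is a sketch of a throttled version of that loop. `obs_ids` is a placeholder vector; get_inat_obs_id() is the rinat function named above, which takes a single observation ID.

```r
library(rinat)

obs_ids <- c(12345, 67890)  # placeholder observation IDs
meta <- vector("list", length(obs_ids))
for (i in seq_along(obs_ids)) {
  meta[[i]] <- get_inat_obs_id(obs_ids[i])  # metadata for one observation
  Sys.sleep(1)  # one request per second stays at or under 60 per minute
}
```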
