API (rinat) with for-loop has error that breaks loop in R

I’m trying to use the API to download some data using the rinat package. I’m using the following code to get observations from one species per day, then binding the data together for one month.

library(rinat)

days <- 1:31
firstDay <- min(days)
obsMnt <- data.frame()
for (d in days) {
  # Request one day of research-grade observations for the taxon
  obsDay <- get_inat_obs(taxon_id = 48662,
                         quality = "research",
                         maxresults = 10000,
                         year = 2019,
                         month = 1,
                         day = d,
                         bounds = NULL)
  # Start the monthly table on the first day, then append each later day
  if (d == firstDay) {
    obsMnt <- obsDay
  } else {
    obsMnt <- rbind(obsMnt, obsDay)
  }
}

This loop seems to work sometimes (more frequently when I reduce the number of days), but often gets one of the following 2 errors:

Error in if (total_res == 0) { : argument is of length zero

or

No encoding supplied: defaulting to UTF-8.
Error in if (!x$headers$`content-type` == "text/csv; charset=utf-8") { : 
  argument is of length zero

When the loop produces an error, the loop breaks and I only get the number of observations up until the point that it breaks. I’m not sure if this is an error in my code or in the API. I appreciate any advice on how to solve this!

this thread probably shouldn’t be classified as a bug report since there’s not much iNat staff can do to address issues in rinat or in your own code.

is there a particular reason you want to use rinat? as far as i can tell, that package seems to hit the deprecated iNat API. you could get data directly from the current iNat API or maybe use a package like spocc, which seems to leverage the current API.
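if you want to try the current API directly, here is a rough sketch of building the request URL for one month of observations. the endpoint and parameter names (taxon_id, quality_grade, year, month, per_page, page) are what i understand the v1 API to accept, so treat them as assumptions and check them against the API documentation. build_inat_url is just a made-up helper name for illustration.

```r
# Build a query URL for the observations endpoint of the current iNat API.
# The endpoint and parameter names are assumptions based on the public
# v1 API documentation; build_inat_url is a hypothetical helper.
build_inat_url <- function(taxon_id, year, month, per_page = 200, page = 1) {
  params <- c(
    taxon_id = taxon_id,
    quality_grade = "research",
    year = year,
    month = month,
    per_page = per_page,
    page = page
  )
  # c() coerces every value to character, so this pastes "name=value" pairs
  query <- paste0(names(params), "=", params, collapse = "&")
  paste0("https://api.inaturalist.org/v1/observations?", query)
}

url <- build_inat_url(taxon_id = 48662, year = 2019, month = 1)
print(url)
```

you could then fetch that URL with whatever JSON reader you prefer and step through pages by increasing page, keeping in mind the rate limits.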

That’s a good point, thanks for your feedback. I switched the classification to general. I didn’t realize that rinat seems to hit the deprecated iNat API. I’ll try spocc and see if that works.

A little tip I give for debugging loops in R is to manually set the iterator variable (“d”, here) to 1, and then go line by line and run the code inside the for-loop. After running each line you can print the result to console and make sure it looks like what you’re going for.

Although, also looking at your code snippet and the error message, it does look more like the issue is the function(s) from rinat, since you don’t have any code reading “if (total_res == 0)”. I reckon that is coming from the rinat function(s) and is probably the result of something about the arguments being provided not being quite right.
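For what it’s worth, "argument is of length zero" is the error R gives whenever the condition inside an `if` evaluates to a zero-length value, which typically happens when code compares a field that turned out to be NULL. Here is a minimal sketch of that mechanism. The `fake_response` list is made up for illustration and is not rinat’s actual response object; I’m only assuming the response came back without the header the package expected.

```r
# Hypothetical stand-in for an HTTP response whose headers are missing,
# e.g. because the server refused the request.
fake_response <- list(headers = list())

# Looking up a missing list element returns NULL ...
content_type <- fake_response$headers$`content-type`

# ... and comparing NULL to a string yields logical(0), not TRUE or FALSE.
comparison <- content_type == "text/csv; charset=utf-8"

# `if` cannot branch on a zero-length condition, so it raises an error,
# which is captured here so its message can be inspected.
err_msg <- tryCatch(
  {
    if (!comparison) message("unexpected content type")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(err_msg)
```

So the error probably says less about your arguments than about the response not containing what the package’s internal check was looking for.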

Disclaimer: I haven’t used the package rinat before. :)

-Glen

This way you are making 31 requests to get the data. Would it be possible to make only one API request (possibly with paging) and then loop through the observations?

I’ve found that if you make too many requests in rapid succession with rinat, the API starts throwing up errors. It might be better to just do one big call that gets all the observations of your taxon within your desired time frame and then filter that down in R.
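As a rough sketch of the "filter it down in R" step, the code below splits one month of observations into per-day data frames. It uses a small made-up data frame in place of a real request, and it assumes the result has a date column called `observed_on`; check `names()` on your own data, since I haven’t verified the column name. `split_by_day` is just an illustrative helper.

```r
# Split one month's worth of observations into per-day data frames.
# `date_col` is an assumed column name; check names() on your own result.
split_by_day <- function(obs, date_col = "observed_on") {
  obs_dates <- as.Date(obs[[date_col]])
  split(obs, format(obs_dates, "%d"))
}

# Small made-up data frame standing in for the result of a single request
# covering the whole month (one call instead of 31).
obs_month <- data.frame(
  id = 1:5,
  observed_on = c("2019-01-01", "2019-01-01", "2019-01-02",
                  "2019-01-15", "2019-01-15"),
  stringsAsFactors = FALSE
)

obs_by_day <- split_by_day(obs_month)
# Number of observations on each day that has any
print(sapply(obs_by_day, nrow))
```

Days with no observations simply don’t appear in the resulting list.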

Yes, that’s a good point about making one big request. I normally end up doing one big request, but sometimes it’s handy to use loops. This example could easily be done in one big request, but I think it’s useful for figuring out the problem because it’s small and doesn’t take long to reproduce.

Thanks for the tip! I actually tried that approach and each iteration seems to work when it’s run independently. I also tried experimenting with different numbers of days and the loop seems to break between roughly the 6th and 20th day, but not consistently at a particular day. It seems like my best options are to either make a single rinat request or use the spocc package instead.

for what it’s worth, i don’t think using the spocc package will address API throttling by iNat, although i still think it’s better for you to get data from the current API (either directly or via a package like spocc), as opposed to getting data from the deprecated API (via rinat).

you can read more about throttling by reading the “Rate Limits” section of this page: https://www.inaturalist.org/pages/developers. if you make too many API requests within a given period of time, iNat will basically start refusing requests until your trailing request rate falls below a given threshold. so if you’re getting throttled, then you could see this kind of issue.

within your own loop, you could add a Sys.sleep() to delay execution for, say, 1 second between each iteration. but both rinat and spocc (as far as i can tell) also do their own internal looping when a set of observations requires multiple requests for multiple pages of records, and neither package seems to do anything to stop itself from exceeding the recommended API request rate limit when getting multiple pages.
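here is a rough sketch of what pausing and retrying could look like in your own loop. fetch_with_retry is a made-up helper, and the max_tries and wait_secs values are only illustrative, not iNat recommendations. the flaky_fetch function just simulates a request that fails once, so the example runs without touching the network.

```r
# Call `fetch_fn(...)`, pausing and retrying if it throws an error.
# `max_tries` and `wait_secs` are illustrative values, not iNat recommendations.
fetch_with_retry <- function(fetch_fn, ..., max_tries = 3, wait_secs = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch_fn(...), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait_secs)
  }
  NULL  # every attempt failed; the caller decides what to do
}

# Made-up fetcher that fails on its first call, to show the behaviour
# without making any real requests.
calls <- 0
flaky_fetch <- function(day) {
  calls <<- calls + 1
  if (calls == 1) stop("simulated throttling")
  data.frame(day = day, n = 1)
}

obs_day <- fetch_with_retry(flaky_fetch, day = 1, wait_secs = 0.1)
print(obs_day)
```

in the real loop you would pass the package function and its arguments instead of flaky_fetch, check the result with is.null() before binding it, and keep a Sys.sleep() between days. this only spaces out the calls you make yourself, not the paging the packages do internally.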
