Observation search API not consistently returning the number of records specified in per_page

geechartier · November 11, 2020, 1:02pm

Platform (Android, iOS, Website): API v1

I am accessing the API from a PHP script that runs the observation search with a per_page value of 200, plus other parameters depending on user-entered values. This is part of a WordPress plugin that uses the retrieved records to create WordPress posts for each observation and related taxa. To avoid calling the API too many times, this is designed to happen once after initial configuration of the plugin and then once per day to import updates only.

The problem is that the API is not consistently returning 200 observations per call. This results in missing observations because it skips to the next 200. I have just completed a full run of the initial load script and it created 6705 observation posts, when it should have created 9213.

Below is a link to the log file created by my script showing the API calls and how many observations were returned by each. You will see that it is quite slow and there are two warnings about closed connections. These have nothing to do with this issue and are caused by me running on my laptop with a sometimes poor internet connection.

Log file

lawnranger · November 11, 2020, 2:44pm

How does the PHP script count the observations: simply by the length of the JSON results array, or after processing the data? If you output the raw JSON for the responses with fewer than 200 results, is the JSON cut off or malformed, suggesting an interruption?

I can successfully run the same query and return all 9213 observations, in chunks of 200 except for the last 13, as expected. No processing except to check that each id is unique.
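
A check like the one described above could be sketched roughly as follows. This is in Python rather than PHP, and it is not the script either poster actually ran. `check_pages` and `fake_fetch_page` are hypothetical names; the only thing assumed about the API is that each response carries a `total_results` count alongside its `results` array. A real `fetch_page` would make the HTTP request; the stand-in here just fabricates ids so the sketch runs offline.

```python
from typing import Callable, Dict, List, Tuple


def check_pages(
    fetch_page: Callable[[int, int], Dict],
    per_page: int = 200,
) -> Tuple[int, List[Tuple[int, int, int]]]:
    """Walk every page, count unique ids, and note pages of unexpected size.

    Returns (unique_id_count, [(page, got, expected), ...]).
    """
    seen_ids = set()
    odd_pages = []
    page = 1
    while True:
        response = fetch_page(page, per_page)
        total = response["total_results"]
        results = response["results"]
        # Every page should be full except the last, which holds the remainder.
        expected = max(0, min(per_page, total - (page - 1) * per_page))
        if len(results) != expected:
            odd_pages.append((page, len(results), expected))
        seen_ids.update(record["id"] for record in results)
        if page * per_page >= total:
            break
        page += 1
    return len(seen_ids), odd_pages


def fake_fetch_page(page: int, per_page: int, total: int = 9213) -> Dict:
    """Hypothetical stand-in for the real HTTP call; fabricates sequential ids."""
    start = (page - 1) * per_page
    stop = min(start + per_page, total)
    return {
        "total_results": total,
        "results": [{"id": i} for i in range(start, stop)],
    }


unique_count, odd_pages = check_pages(fake_fetch_page)
print(unique_count, odd_pages)
```

Because the loop is driven by `total_results` rather than by the size of each page, a short page in the middle of a run is reported instead of silently ending the import early.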

geechartier · November 16, 2020, 10:20am

Sorry for my slow response. When it told me I would be notified of responses, I was expecting email notification.
My script is counting the elements in the array but that is after parsing the data returned by the iNaturalist API. I wrote my own parser, which opens the API as a stream and gets the data in 64 KB chunks. I did this to get around “Allowed memory size of 134217728 bytes exhausted” errors when fetching in a single chunk all the JSON returned for 200 observations. Having checked it again and output the chunks it is receiving, all the chunks for incomplete retrievals end with a special character when viewed in Notepad++. The character is not consistent, nor are the positions of the characters in different runs of the script. I have seen xC3, xD0, and xE4. Note that the output correctly contains UTF-8 characters (e.g. Chinese).
Your results suggest that this is not an issue with the iNaturalist API but an issue with how I am handling the response. So I will try to resolve that.
Thank you for your help and time testing it out.
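
The bytes mentioned above (xC3, xD0, xE4) are all values that UTF-8 uses as the first byte of a multi-byte character, so one plausible reading is that the data stopped, or a chunk boundary fell, in the middle of a character. The thread does not confirm this, and the cause later turned out to be hosting limits. The sketch below is in Python rather than PHP and only illustrates the general technique of decoding a byte stream incrementally so that split characters are reassembled and a genuinely truncated stream is reported. `decode_chunks` is a hypothetical helper name and the sample payload is made up.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode UTF-8 bytes that arrive in arbitrary chunks.

    The incremental decoder holds back an incomplete trailing character
    until the next chunk supplies the remaining bytes.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Made-up sample: "中" encodes as the three bytes E4 B8 AD.
payload = '{"place": "中国", "name": "Ångström"}'.encode("utf-8")

# Split into 5-byte chunks so that some boundaries land inside a character.
chunks = [payload[i:i + 5] for i in range(0, len(payload), 5)]
print("".join(decode_chunks(chunks)))

# A stream cut off right after the lead byte E4 is detected as truncated.
try:
    list(decode_chunks([payload[:12]]))
except UnicodeDecodeError as error:
    print("truncated stream:", error.reason)
```

A hand-written parser that decodes or inspects each chunk in isolation would see a stray lead byte at the end of such a chunk, which matches the symptom described, whatever the underlying cause.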

lawnranger · November 16, 2020, 10:39am

If memory is an issue, could you reduce the number of results per page? Of course the number of pages would increase, so keep the limit of 60 requests per minute in mind.

geechartier · November 16, 2020, 10:48am

The 60 per minute limit was why I set it at 200. I could set it to 100 and introduce an artificial delay to throttle it, if necessary.
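
The artificial delay mentioned above could look something like this minimal sketch, again in Python rather than PHP. `Throttle` is a hypothetical helper, not part of any iNaturalist client; it spaces calls evenly so that no more than the configured number happen per minute. The clock and sleep functions are injectable only so the behaviour can be demonstrated without actually waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space out calls so that at most max_per_minute happen in any minute."""

    def __init__(
        self,
        max_per_minute: int = 60,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = now + self.min_interval


# Demonstration with a fake clock, so nothing really sleeps.
fake_now = [0.0]
slept = []


def fake_clock() -> float:
    return fake_now[0]


def fake_sleep(seconds: float) -> None:
    slept.append(seconds)
    fake_now[0] += seconds


throttle = Throttle(max_per_minute=60, clock=fake_clock, sleep=fake_sleep)
for _ in range(3):
    throttle.wait()  # in real use, the API request would follow each wait()
print(slept)
```

At 100 results per page, 9213 observations take 93 requests, so a full import throttled this way needs at least about a minute and a half of waiting on top of the transfer time.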

geechartier · November 24, 2020, 5:05am

Just adding a quick update to wrap up this issue. The cause of the problem was resource limits on my hosting platform, not the API. Thankfully, my hosting provider temporarily increased a number of limits for me so that the function could complete.

bouteloua · November 24, 2020, 10:57am

Thanks for following up! Glad you got it figured out.
