429 Error from /observations/histogram API when calling at 60 calls/minute

iorek · May 3, 2025, 3:37am

I’ve been using the APIs over the last week and noticed I’m consistently getting “429 Too Many Requests” errors when rate limiting to 60 calls/minute (which is the documented recommendation). I’m also well within the other limits (running fewer than 1000 calls/day).

Seems similar to these issues:
https://forum.inaturalist.org/t/discrepancy-between-documented-rate-limit-observed-rate-limit/8612/1
https://forum.inaturalist.org/t/api-inaturalist-org-v1-observations-id-limited-to-30-observations-not-200/54865

I’m going through pyinaturalist, which should in theory rate limit me, but I’ve added my own rate limiting as well.

App version number, if a mobile app issue (shown under Settings or About):
Using API via pyinaturalist (0.20.1)

URLs (aka web addresses) of any relevant observations or pages:
Primary API being used is get_observation_histogram() which is a frontend for /observations/histogram

Description of problem (please provide a set of steps we can use to replicate the issue, and make as many as you need.):

Code I’m using can be found at https://github.com/ewbing/inat_species - currently the rate limit is set to 30 calls/minute (which works without error). If you change it (via the constants) to 60 calls/minute, it throws errors.

I’ve double checked the APIs are within limits via timing the program.

pisum · May 3, 2025, 4:23am

you’re going to be in the best position to troubleshoot your own code. without looking at it, i would just guess that there’s something wrong with your code. i would doubt that there’s actually a bug in the system.

sort of sounds like you might be making two requests per iteration (2 x 30 = 60), but again, you’re in the best position to troubleshoot your own code.

benarmstrong · May 3, 2025, 6:04am

pyinaturalist rate limits to 60 requests per minute by default, and that has been reliable for a long while. Increasing the number of requests per iteration wouldn’t cause that limit to be exceeded. @jcook and I reviewed @iorek’s code and don’t think it’s at fault.

benarmstrong · May 3, 2025, 8:27am

I tested and reproduced this with your current version (git hash 8f85dc9) both with your custom rate limiting and also simplified to use pyinaturalist’s built-in rate limiting.

First, I tested with 30 calls per minute via your own ratelimit wrapper to establish that it does not fail with the rate reduced. Then I edited the CALLS constant to set it to 60 calls per minute. Finally, I unwrapped the API calls to just let pyinaturalist do its own rate limiting (as mentioned in my response to pisum above, it defaults to 60 per minute).

I confirmed that with your own rate limiting set to 30 calls per minute, there are no 429 errors, but at 60 calls per minute it starts to raise 429 errors after running for a while. Furthermore, with the patch below to let pyinat rate limit on its own to 60 calls per minute, it also eventually starts to raise 429 errors.

Steps to test:

$ git clone https://github.com/ewbing/inat_species
$ cd inat_species
$ python -m venv .venv
$ . .venv/bin/activate
$ pip install --quiet pyinaturalist ratelimit
$ pip freeze | grep pyinat
pyinaturalist==0.20.1
$ python inat_species_data.py
…
Data written to inat_species_summary.csv
Total species processed: 635
2025-05-03 05:37:10.580241
Total time taken: 0:21:11.584464
$ sed -i 's/CALLS = 30/CALLS = 60/' inat_species_data.py
$ python inat_species_data.py
…
Fetching histogram for Eumetopias jubatus (ID: 41755)...
API call: get_observation_histogram with params: {'taxon_id': 41755, 'place_id': 51347, 'quality_grade': 'research', 'date_field': 'observed', 'dry_run': False}
Error fetching histogram for taxon 41755: 429 Client Error: Too Many Requests for url: https://api.inaturalist.org/v1/observations/histogram?quality_grade=research&taxon_id=41755&place_id=51347&date_field=observed&interval=month_of_year
WARNING: Still no histogram data for Eumetopias jubatus (ID: 41755)
Fetching histogram for Mephitis mephitis (ID: 41880)...
…

Here’s how I modified your code to let pyinaturalist do the rate limiting on its own:

diff --git a/inat_species_data.py b/inat_species_data.py
index f2c3fa0..d128d95 100644
--- a/inat_species_data.py
+++ b/inat_species_data.py
@@ -8,7 +8,6 @@ histogram data, and month with most observations.
 import csv
 import argparse
 from datetime import datetime
-from ratelimit import limits, sleep_and_retry
 from pyinaturalist import (
     get_observation_species_counts,
     get_observation_histogram,
@@ -16,25 +15,12 @@ from pyinaturalist import (
 )

 # Define rate limit: e.g.30 calls per minute
-CALLS = 30
-RATE_LIMIT_PERIOD = 60  # seconds
 PER_PAGE = 200  # Number of results per page
 PAGE_SIZE = PER_PAGE  # Number of taxa to retrieve per page
 MAX_PAGES = 5  # Maximum number of pages to fetch
 DRY_RUN = False  # Set to True for testing without actual API calls


-@sleep_and_retry
-@limits(calls=CALLS, period=RATE_LIMIT_PERIOD)
-def rate_limited_api_call(func, **kwargs):
-    """
-    Execute an API call with rate limiting.
-    All pyinaturalist API calls should go through this function
-    """
-    print(f"API call: {func.__name__} with params: {kwargs}")
-    return func(**kwargs)
-
-
 def get_species_for_place(args, place_id, quality_grade):
     """
     Fetch species counts for a given place_id.
@@ -72,8 +58,7 @@ def get_species_for_place(args, place_id, quality_grade):
         )
         print(f"First 10 IDs from {args.input} are: {filter_ids[:10]}")

-        counts_response = rate_limited_api_call(
-            get_observation_species_counts,
+        counts_response = get_observation_species_counts(
             place_id=place_id,
             quality_grade=quality_grade,
             per_page=PER_PAGE,
@@ -119,8 +104,8 @@ def fetch_phyla_ids():
     """
     try:
         for page in range(1, MAX_PAGES + 1):
-            response = rate_limited_api_call(
-                get_taxa, rank_level=60, per_page=PAGE_SIZE, page=page
+            response = get_taxa(
+                rank_level=60, per_page=PAGE_SIZE, page=page
             )
             results = response.get("results", [])
             if not results:
@@ -260,7 +245,7 @@ def get_histogram_for_species(taxon_id, place_id, quality_grade="research"):
             "date_field": "observed",
             "dry_run": DRY_RUN,
         }
-        histogram_data = rate_limited_api_call(get_observation_histogram, **params)
+        histogram_data = get_observation_histogram(**params)

         # Initialize histogram with zeros for each month (0-indexed)
         histogram = [0] * 12

See https://pyinaturalist.readthedocs.io/en/stable/user_guide/advanced.html#rate-limiting for pyinaturalist’s rate limiting documentation. You could further patch this to pass a custom session object, rate-limited to 30 calls per minute to do the same job as your own rate limiting, thereby simplifying your code a bit. I suspect that would work just as well.

pisum · May 3, 2025, 12:37pm

at which iteration is it running into 429 errors? i ran 75 GET v1/observations/histogram requests at 1 req/sec through my own thing in Javascript, and i didn’t see any errors.

at 1.5 req/sec, i started to see errors at iteration 61. at 2 req/sec, i started to see errors at iteration 61. at 3 req/sec, i started to see errors at iteration 61. so with that sort of trend, i’m surprised that the system would start throwing errors at 1 req/sec if 75 requests ran fine.

is it possible that you were doing other stuff on the same machine while your code was running, and maybe even if the code is rate limiting things properly, the other stuff caused you to exceed the limit in total?

or i’m not sure exactly how your rate limiting algorithm works, but maybe if your algorithm allows clumping of requests within a period of time rather than spacing them out evenly over the period, that could theoretically cause issues if, say, one request takes a while and causes the rest to clump up and so cause the remaining requests to exceed iNat’s limit? for example, if the algorithm allows 60 req over 60 seconds, but it ends up doing 59 of the requests in the last second, maybe that could be problematic?
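To make the clumping concern concrete, here is a minimal sketch of a limiter that spaces request starts evenly instead of allowing bursts inside a window. It is not claimed to be what pyinaturalist or the ratelimit package does internally; `EvenPacer`, its arguments, and the injectable `clock`/`sleep` hooks are hypothetical names used only for illustration.

```python
import time


class EvenPacer:
    """Hypothetical helper: keep request *starts* at least 60/per_minute seconds apart."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock          # injectable so the pacing can be checked without waiting
        self._sleep = sleep
        self._next_start = None      # earliest time the next request may start

    def wait(self):
        """Block until the next request may start; return the start time."""
        now = self._clock()
        # Loop in case sleep() returns slightly early.
        while self._next_start is not None and now < self._next_start:
            self._sleep(self._next_start - now)
            now = self._clock()
        # Schedule from the actual start, so a slow request never lets
        # later requests "catch up" in a burst.
        self._next_start = now + self.interval
        return now
```

Calling `pacer.wait()` immediately before each API call would keep starts at least one interval apart, so a slow response delays later requests rather than letting them bunch together.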

iorek · May 3, 2025, 10:58pm

I just did a run and I’m seeing it start at the 190th request - and just about 190 seconds in. Eyeballing it, it seems to go in bursts of 5 with my rate limiting - I haven’t tried yet with the updates @benarmstrong made.

Full error is:
429 Client Error: Too Many Requests for url: https://api.inaturalist.org/v1/observations/histogram?quality_grade=research&taxon_id=50878&place_id=51347&date_field=observed&interval=month_of_year

iorek · May 3, 2025, 11:00pm

I’ll try implementing it as a direct call to the API to make sure it’s not an issue in pyinaturalist

pisum · May 4, 2025, 2:50am

okay… i tested with more requests, and i finally am able to repeat the problem you noted.

i tried sending sets of requests with various incremental delay between initiation of requests, and the below table summarizes what i saw. (it doesn’t seem like it matters when the requests actually complete, only when they are initiated).

| delay (ms) | equiv req/min | 1st failed req (250 reqs) | 1st failed req (500 reqs) |
| --- | --- | --- | --- |
| 1000 | 60.0 | 120 (avg) | untested |
| 1002 | 59.9 | 150 | untested |
| 1010 | 59.4 | no errors | 418 |
| 1050 | 57.1 | no errors | untested |
| 1125 | 53.3 | no errors | untested |
| 1250 | 48.0 | no errors | untested |
| 1500 | 40.0 | no errors | untested |

at 250 requests in each set, i encountered failures only when delay <=1002ms. at 1000ms, i experienced the first failure on average at the 120th request (118, 117, 125). at 1002ms, i experienced the first failure at the 150th request.

raising my set size to 500 requests, i retested at 1010ms and experienced my first failure at the 418th request.
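The test procedure described above can be sketched roughly as follows. This is only an illustration in Python (the original test was done in Javascript and its code is not shown in this thread); `first_failure` and the `send` callback are hypothetical, with `send(i)` assumed to issue request number `i` and return its HTTP status code. Request starts are scheduled from a fixed start time, so the delay is measured between initiations rather than completions.

```python
import time


def first_failure(send, total, delay_ms, clock=time.monotonic, sleep=time.sleep):
    """Start `total` requests `delay_ms` apart; return the 1-based index of
    the first one answered with HTTP 429, or None if none failed."""
    start = clock()
    for i in range(total):
        # Schedule against the fixed start time so response latency
        # does not stretch the spacing between initiations.
        target = start + i * delay_ms / 1000.0
        remaining = target - clock()
        if remaining > 0:
            sleep(remaining)
        if send(i + 1) == 429:
            return i + 1
    return None
```

Note that with a blocking `send`, a response slower than `delay_ms` would still push the next start back; truly independent initiation times would need asynchronous requests.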

without doing more testing, it seems like iNat might just be doing some strange rounding or truncating somewhere when trying to figure out whether the rate has exceeded 1 req/sec.

so maybe there is a minor bug in the system after all. i would guess the problem is isolated to /v1/observations/histogram, since i know i’ve sent sets of hundreds of requests at 1 req/sec to /v1/observations, /v1/observations/species_counts, and /v1/observations/observers without issue in the past.

i also would guess it would be easy enough to work around the problem by targeting something like 55 req/minute instead of 60 req/minute when issuing more than 100 requests in a set.
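For reference, the conversion between a per-request delay and the equivalent rate is simple arithmetic. The sketch below (helper names are made up for illustration) reproduces the "equiv req/min" column of the table and gives the delay implied by a 55 req/minute target.

```python
def req_per_min(delay_ms):
    """Equivalent requests per minute for a fixed delay between request starts."""
    return 60_000 / delay_ms


def delay_ms_for(per_minute):
    """Delay between request starts (in ms) needed to hit a target rate."""
    return 60_000 / per_minute


print(round(req_per_min(1002), 1))   # 59.9
print(round(req_per_min(1010), 1))   # 59.4
print(round(delay_ms_for(55)))       # 1091
```

So a 55 req/minute target works out to roughly 1091 ms between request starts, which sits between the 1050 ms and 1125 ms rows that showed no errors in sets of 250 (neither was tested at 500).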

jeanphilippeb · May 4, 2025, 5:09am

I am not surprised at all. I confirm that the limitation is becoming increasingly strict in practice.

I slow down to 20 (or 15) requests to the API per minute in order to avoid the 429 error (and I let my software run day and night).

It also depends on what you request. The most sensitive is requesting Computer Vision data.

pisum · May 4, 2025, 12:14pm

i did a set of ~250 requests with incremental delay of 1 req/sec on /v1/observations without issue. so i believe my earlier guess that the problem is isolated to /v1/observations/histogram still holds.

iorek · May 4, 2025, 8:19pm

Thanks @pisum and everyone else for the excellent analysis. I’m obviously not going to sweat it over 55 vs 60 calls per minute. But given that the documented maximums are 100 requests/minute I’m going to leave this open to reconcile the actuals with the doc and maybe determine why this API is behaving differently.

pisum · May 20, 2025, 3:33pm

i was doing this again this morning, and i’m running into the 429 errors on /v1/observations. so maybe the problem now affects more than just /observations/histogram?
