I tested and reproduced this with your current version (git hash 8f85dc9), both with your custom rate limiting and with the code simplified to use pyinaturalist’s built-in rate limiting.
First, I tested with 30 calls per minute via your own ratelimit wrapper to establish that it does not fail at the reduced rate. Then I edited the CALLS constant to set it to 60 calls per minute. Finally, I unwrapped the API calls to let pyinaturalist do its own rate limiting (as mentioned in my response to pisum above, it defaults to 60 per minute).
I confirmed that with your own rate limiting set to 30 calls per minute, there are no 429 errors, but at 60 calls per minute it starts to raise 429 errors after running for a while. Furthermore, with the patch below, which lets pyinaturalist rate limit on its own at 60 calls per minute, it also eventually starts to raise 429 errors.
Steps to test:
$ git clone https://github.com/ewbing/inat_species
$ cd inat_species
$ python -m venv .venv
$ . .venv/bin/activate
$ pip install --quiet pyinaturalist ratelimit
$ pip freeze | grep pyinat
pyinaturalist==0.20.1
$ python inat_species_data.py
…
Data written to inat_species_summary.csv
Total species processed: 635
2025-05-03 05:37:10.580241
Total time taken: 0:21:11.584464
$ sed -i 's/CALLS = 30/CALLS = 60/' inat_species_data.py
$ python inat_species_data.py
…
Fetching histogram for Eumetopias jubatus (ID: 41755)...
API call: get_observation_histogram with params: {'taxon_id': 41755, 'place_id': 51347, 'quality_grade': 'research', 'date_field': 'observed', 'dry_run': False}
Error fetching histogram for taxon 41755: 429 Client Error: Too Many Requests for url: https://api.inaturalist.org/v1/observations/histogram?quality_grade=research&taxon_id=41755&place_id=51347&date_field=observed&interval=month_of_year
WARNING: Still no histogram data for Eumetopias jubatus (ID: 41755)
Fetching histogram for Mephitis mephitis (ID: 41880)...
…
Here’s how I modified your code to let pyinaturalist do the rate limiting on its own:
diff --git a/inat_species_data.py b/inat_species_data.py
index f2c3fa0..d128d95 100644
--- a/inat_species_data.py
+++ b/inat_species_data.py
@@ -8,7 +8,6 @@ histogram data, and month with most observations.
import csv
import argparse
from datetime import datetime
-from ratelimit import limits, sleep_and_retry
from pyinaturalist import (
get_observation_species_counts,
get_observation_histogram,
@@ -16,25 +15,12 @@ from pyinaturalist import (
)
# Define rate limit: e.g.30 calls per minute
-CALLS = 30
-RATE_LIMIT_PERIOD = 60 # seconds
PER_PAGE = 200 # Number of results per page
PAGE_SIZE = PER_PAGE # Number of taxa to retrieve per page
MAX_PAGES = 5 # Maximum number of pages to fetch
DRY_RUN = False # Set to True for testing without actual API calls
-@sleep_and_retry
-@limits(calls=CALLS, period=RATE_LIMIT_PERIOD)
-def rate_limited_api_call(func, **kwargs):
- """
- Execute an API call with rate limiting.
- All pyinaturalist API calls should go through this function
- """
- print(f"API call: {func.__name__} with params: {kwargs}")
- return func(**kwargs)
-
-
def get_species_for_place(args, place_id, quality_grade):
"""
Fetch species counts for a given place_id.
@@ -72,8 +58,7 @@ def get_species_for_place(args, place_id, quality_grade):
)
print(f"First 10 IDs from {args.input} are: {filter_ids[:10]}")
- counts_response = rate_limited_api_call(
- get_observation_species_counts,
+ counts_response = get_observation_species_counts(
place_id=place_id,
quality_grade=quality_grade,
per_page=PER_PAGE,
@@ -119,8 +104,8 @@ def fetch_phyla_ids():
"""
try:
for page in range(1, MAX_PAGES + 1):
- response = rate_limited_api_call(
- get_taxa, rank_level=60, per_page=PAGE_SIZE, page=page
+ response = get_taxa(
+ rank_level=60, per_page=PAGE_SIZE, page=page
)
results = response.get("results", [])
if not results:
@@ -260,7 +245,7 @@ def get_histogram_for_species(taxon_id, place_id, quality_grade="research"):
"date_field": "observed",
"dry_run": DRY_RUN,
}
- histogram_data = rate_limited_api_call(get_observation_histogram, **params)
+ histogram_data = get_observation_histogram(**params)
# Initialize histogram with zeros for each month (0-indexed)
histogram = [0] * 12
See https://pyinaturalist.readthedocs.io/en/stable/user_guide/advanced.html#rate-limiting for pyinaturalist’s rate limiting documentation. You could patch this further to pass a custom session object rate-limited to 30 calls per minute, which would do the same job as your own rate limiting and simplify your code a bit. I suspect that would work just as well.
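A minimal sketch of the custom-session idea, untested against inat_species_data.py. It assumes that pyinaturalist's ClientSession accepts a per_minute argument and that the API functions accept a session keyword, as the rate-limiting page linked above describes. The helper names make_session and fetch_phyla_page are made up for illustration; PER_PAGE and rank_level=60 are taken from the script.

```python
PER_MINUTE = 30  # the rate that produced no 429 errors in the test runs
PER_PAGE = 200   # same page size the script already uses


def make_session(per_minute=PER_MINUTE):
    """Return a pyinaturalist session throttled to `per_minute` requests.

    Hypothetical helper: assumes ClientSession takes a `per_minute` keyword.
    """
    # Imported inside the function so the sketch loads without pyinaturalist.
    from pyinaturalist import ClientSession

    return ClientSession(per_minute=per_minute)


def fetch_phyla_page(session, page):
    """Fetch one page of phylum-rank taxa (rank_level=60) through `session`.

    Hypothetical helper: assumes get_taxa accepts a `session` keyword.
    """
    from pyinaturalist import get_taxa

    response = get_taxa(
        rank_level=60, per_page=PER_PAGE, page=page, session=session
    )
    return response.get("results", [])
```

To use it, create one session with make_session() at startup and pass that same object as session=... to get_observation_species_counts, get_taxa and get_observation_histogram, so all three share a single 30-per-minute budget.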