How to download all mammals (taxa only, not observations)?

Hey everyone,

I’ve already read some solutions on the forum, but I couldn’t identify one that works for me.

I’m trying to build a structured dataset for a project that uses iNaturalist’s taxonomy.
Specifically, I’d like to download all mammals, but not the observations — just the taxa data according to my scheme below.

I’ve looked at the API and export options, but I’m not sure what the best way is to get a full list of mammal taxa. All I know is that iNaturalist has the exact data I need.

For context, here’s the data scheme I’m working with (called a GeoHook):

interface GeoHook {
  // Reference & media
  common_name: "",  // Common name
  p2699: "",        // URL of canonical Wikipedia article

  // Classification
  p910: [],         // Main category (Animal)
  p279: [],         // Subclass/type (Mammal)

  // Geography
  p9714: [],        // Geographic extent (geo range, bounding box or location)
  p2974: [],        // Environment type (habitat, urban, ocean, forest, desert, mangrove, arctic, tundra etc.)
};

So basically the data behind Map, About and Taxonomy on the species cards.

That would be around 5K wild species (I don’t need the domesticated ones for now).

Does anyone know the most efficient way to do this via the iNaturalist API or another export endpoint?

Thanks in advance!

some relevant discussion: https://forum.inaturalist.org/t/is-there-a-tool-code-snippet-that-allows-downloading-of-taxonomy-data-from-the-site/14268

Hi everyone. I am one step further in creating a JSON or CSV file of all mammals with geo bounding boxes.

THE GOAL

const geoHook = {
  // Reference
  "id": 42223,
  "common_name": "White-tailed Deer",
  "wikipedia_url": "http://en.wikipedia.org/wiki/White-tailed_deer",

  // Geography
  "p9714": "min_lat: …, max_lat: …, min_lon: …, max_lon: …", // Geographic extent (geo range, bounding box(es) or location(s))
  "p2974": "urban, ocean, forest, desert, mangrove, arctic, tundra etc.", // Environment type
};

THE FIRST THREE (id, common_name, wikipedia_url) YOU’LL FETCH EASILY WITH THIS PYTHON SCRIPT

import json
import time

import requests

url = "https://api.inaturalist.org/v1/taxa"
all_mammals = []
page = 1

while True:
    params = {
        "taxon_id": 40151,  # Mammalia
        "rank": "species",
        "per_page": 200,
        "page": page,
    }

    response = requests.get(url, params=params)
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    results = response.json()["results"]

    # Extract only the desired fields
    for species in results:
        mammal = {
            "id": species["id"],
            "common_name": species.get("preferred_common_name"),
            "wikipedia_url": species.get("wikipedia_url"),
        }
        all_mammals.append(mammal)

    print(f"Page {page}: {len(results)} species loaded")

    # A short page means this was the last one
    if len(results) < 200:
        break

    page += 1
    time.sleep(1)  # pause between requests to go easy on the API

print(f"\nTotal of {len(all_mammals)} mammal species found")

# Save as JSON file
with open("mammals.json", "w", encoding="utf-8") as f:
    json.dump(all_mammals, f, indent=2, ensure_ascii=False)

# Show first 5 entries
print("\nExample entries:")
for mammal in all_mammals[:5]:
    print(mammal)
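
Since the goal is a JSON or CSV file, here is a minimal sketch of writing the same three fields as CSV using only the standard library. The function name `write_mammals_csv` and the second sample row are made up for illustration; only the White-tailed Deer record comes from this thread.

```python
import csv
import io

FIELDS = ["id", "common_name", "wikipedia_url"]

def write_mammals_csv(rows, fileobj):
    """Write dicts holding the three fields above as CSV rows."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        # Missing values (None) become empty cells
        writer.writerow({k: "" if row.get(k) is None else row[k] for k in FIELDS})

sample = [
    {"id": 42223, "common_name": "White-tailed Deer",
     "wikipedia_url": "http://en.wikipedia.org/wiki/White-tailed_deer"},
    # Hypothetical record with no common name or Wikipedia link
    {"id": 0, "common_name": None, "wikipedia_url": None},
]

buffer = io.StringIO()
write_mammals_csv(sample, buffer)
print(buffer.getvalue())
```

To write an actual file instead of an in-memory buffer, pass a handle opened with `open("mammals.csv", "w", newline="", encoding="utf-8")`.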

BUT HOW DO I GET THE GEOGRAPHY PART???

I could run every fetched ID through a script like https://api.inaturalist.org/v1/observations?taxon_id=42069&geo=true, fetch only the GeoJSON points, and combine them into bounding boxes. But that feels like unnecessary effort, and I don’t think it would work anyway: how would I know which observation points belong in which square, like on the iNaturalist map for a species?

"geojson": {
        "type": "Point",
        "coordinates": [8.3248510361, 46.5978927612]
      },
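
For reference, turning a set of such GeoJSON points into one bounding box is only a few lines. This is a rough sketch, not anything iNaturalist provides: the helper name `bounding_box` is made up, and only the first sample point comes from the snippet above (the other two are hypothetical). Note that GeoJSON stores coordinates as [longitude, latitude].

```python
def bounding_box(points):
    """Return (min_lat, max_lat, min_lon, max_lon) for GeoJSON Point dicts."""
    if not points:
        raise ValueError("need at least one point")
    # GeoJSON order is [lon, lat], not [lat, lon]
    lons = [p["coordinates"][0] for p in points]
    lats = [p["coordinates"][1] for p in points]
    return (min(lats), max(lats), min(lons), max(lons))

points = [
    {"type": "Point", "coordinates": [8.3248510361, 46.5978927612]},
    {"type": "Point", "coordinates": [7.5, 47.25]},  # hypothetical
    {"type": "Point", "coordinates": [9.0, 46.0]},   # hypothetical
]

bbox = bounding_box(points)
print(bbox)
```

A plain min/max like this gives a misleading box for species whose observations straddle the antimeridian (±180° longitude), and a single box says nothing about gaps inside the range.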

I BASICALLY WANT THE DATA TO RENDER THESE OBSERVATION SQUARES ON MY MAP (I am aware that I should not render the exact points, to avoid becoming the number 1 map for hunters :sweat_smile:)
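
On the question of which point belongs in which square: one generic approach is to snap every point to a fixed latitude/longitude grid and count points per cell. The sketch below shows that idea only. It is not how iNaturalist builds its map squares, the names `grid_cell` and `bin_points` are made up, the 0.5° cell size is an arbitrary assumption, and only the first sample point comes from this thread.

```python
import math
from collections import Counter

def grid_cell(lon, lat, cell_size=0.5):
    """Return the south-west corner (lon, lat) of the cell containing a point."""
    return (math.floor(lon / cell_size) * cell_size,
            math.floor(lat / cell_size) * cell_size)

def bin_points(points, cell_size=0.5):
    """Count GeoJSON Point dicts per grid cell, keyed by south-west corner."""
    counts = Counter()
    for p in points:
        lon, lat = p["coordinates"][:2]  # GeoJSON order is [lon, lat]
        counts[grid_cell(lon, lat, cell_size)] += 1
    return counts

points = [
    {"type": "Point", "coordinates": [8.3248510361, 46.5978927612]},
    {"type": "Point", "coordinates": [8.1, 46.9]},  # hypothetical
    {"type": "Point", "coordinates": [9.7, 47.2]},  # hypothetical
]

cells = bin_points(points)
for (sw_lon, sw_lat), count in sorted(cells.items()):
    print(f"square from ({sw_lon}, {sw_lat}) to ({sw_lon + 0.5}, {sw_lat + 0.5}): {count} points")
```

Each key is the south-west corner of a square, so a square can be drawn from that corner to the corner plus `cell_size` without ever exposing the exact points.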

It’s very hard to tell from your post what you’re actually looking for – observations or taxa? points or bounding boxes or polygons? all taxa or only observed taxa?

It would probably help to start by describing what you’re actually trying to accomplish instead of jumping right to the implementation.

I would suggest looking at https://forum.inaturalist.org/t/in-pursuit-of-mappiness-part-1/21864 and https://forum.inaturalist.org/t/create-a-live-map-service-for-arcmap-and-or-google-earth-with-inat-data-and-links/1512 and see if those help.

yes, the orange squares are observation locations

Is this an LLM response to a homework assignment? iNaturalist has never had habitat data so I’m not sure where this is coming from.

If you’re looking for the documented range of a species and the habitat types, you’re probably better served by the IUCN red list, where you can get both of those things:

Not only do iNat observation locations not represent the full range for almost any mammal, especially at fine scales, but you will also run into significant data use limitations if you try to download the coordinates for all ~5 million RG mammal observations.
