Downloading CSV of Species Checklist Limited by Taxa

I am trying to download a CSV of all plant species (not observations) in North America. Really, all I need is the Scientific name and iNat taxon ID, although a few other fields like common name and native/introduced status might be nice too.

I found this thread:

https://forum.inaturalist.org/t/can-i-export-a-species-list-and-not-individual-observations/9162/4

and tried to follow the instructions there, but am getting stuck. I was able to navigate to this page:

https://www.inaturalist.org/check_lists/313169-North-America-Check-List?iconic_taxon=47126

And that “iconic_taxon” condition restricts the search to only plants. There are 30,627 records. I’ve noticed, however, that the download CSV link doesn’t include the taxon restriction. I.e. the link is exactly the same as for the generic North America species list, here:

https://www.inaturalist.org/check_lists/313169-North-America-Check-List

which contains 86,651 species. I tried clicking the link and it timed out. I tried waiting a long time and coming back, and the file still had not been created. I’m wondering if it’s even going to be created at all.

Is there any way to export just the restricted list? I also don’t want to waste the computational resources of iNaturalist or put unnecessary load on the servers.

Or is there any other way to download this list?

The reason I want this is that I’m looking to link up individual plant records on bplant.org with the taxon pages on iNaturalist, and it would obviously be much better to do via an automated import than manually.

I am also open to other ways to solve the same problem, such as looking up the entries one by one. GBIF has the iNaturalist taxon ID, which I think would be sufficient for my purposes, but I don’t know how to limit my search there to the relevant species, nor do I know how to export from GBIF. (If it’s easier to do this through them, I can put in the time to learn it; I just couldn’t figure out at a glance how to export or how to filter by region there.) Also, if it’s easier to get all GBIF data than the relevant iNat data, I could just work with their bigger dataset and filter it myself, as I’m going to have to match up the records anyway. It seems the limiting factor here is the computational resources needed to prepare the CSV for download, so a larger, pre-built file would be sufficient for my purposes if one exists.

You can download the species list from GBIF, here’s the link: https://www.gbif.org/occurrence/download?country=US&country=CA&dataset_key=50c9509d-22c7-4a22-a47d-8c48425ef4a7&taxon_key=6# .

You can also do this by using the “species counts” API endpoint, https://api.inaturalist.org/v1/observations/species_counts?place_id=97394&taxon_name=plantae .

The API link you sent is not working as expected. It’s including things that aren’t plants, and not including all plants for the region. In fact, it is only showing 36 total results.

I can try GBIF, I’ll need to create an account there and figure it out. Thanks for the suggestion!

Looks like “taxon_name=plantae” in that link should be: …&iconic_taxa=Plantae

This query returns 500 results per page out of 44,083 total results (versus 30,627 on the checklist!?), so you would need to request each of the 89 pages.


Oops, well, it should only be pulling plants, but I forgot to mention that it only pulls 500 species per query, so you’ll have to repeat the query over multiple pages to get the full list, which can be done in RStudio or similar.
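For anyone who would rather script the paging in Python than RStudio, here is a rough sketch of the loop. It assumes the response JSON has a “total_results” count and a “results” list whose entries carry a “taxon” object with “id” and “name”; the function names, the one-second pause, and the use of “per_page” and “page” as query parameters are my own choices and have not been checked against the live API for all 89 pages.

```python
import json
import math
import time
import urllib.parse
import urllib.request

API_URL = "https://api.inaturalist.org/v1/observations/species_counts"


def fetch_page(page, place_id=97394, iconic_taxa="Plantae", per_page=500):
    """Fetch one page of species counts and return the decoded JSON."""
    query = urllib.parse.urlencode({
        "place_id": place_id,        # North America, taken from the link above
        "iconic_taxa": iconic_taxa,  # restricts results to plants
        "per_page": per_page,
        "page": page,
    })
    with urllib.request.urlopen(f"{API_URL}?{query}") as response:
        return json.load(response)


def extract(data):
    """Pull (taxon_id, scientific_name) pairs out of one page of results."""
    return [(r["taxon"]["id"], r["taxon"]["name"]) for r in data["results"]]


def collect_taxa(fetch=fetch_page, per_page=500, pause=1.0):
    """Walk every page and return all (taxon_id, scientific_name) pairs.

    `fetch` is injectable so the loop can be tried without network access.
    """
    first = fetch(1)
    # e.g. 44,083 results at 500 per page -> 89 pages
    pages = math.ceil(first["total_results"] / per_page)
    taxa = extract(first)
    for page in range(2, pages + 1):
        time.sleep(pause)  # be gentle on the iNaturalist servers
        taxa.extend(extract(fetch(page)))
    return taxa
```

Calling collect_taxa() with the defaults would make the live requests; the resulting pairs can then be written out with the standard csv module.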


Not sure what your intended use case is, but just a heads-up – the standard place checklists in iNaturalist are not guaranteed to be comprehensive for any taxonomic group. They reflect the taxa of RG observations that have occurred within the place (at least while auto-listing functionality remains in place – see Phase 4 of https://www.inaturalist.org/blog/43337-upcoming-changes-to-lists), plus any manual list curation done by community members or staff (e.g., California). I may be wrong, but I doubt anyone has tried to curate the North America checklist to be complete.


It’s fine if it’s not complete. I am not going to be using iNat to determine which plants do or don’t occur in North America; for that I’m using other sources like USDA and BONAP. I mainly want to link the records up with iNat so that people can reach the iNat taxon page in one click from the bplant page, and I want to minimize the number of manual lookups I need to do, so I think this is going to be sufficient!

The issue is that I don’t have a list that seems to be working at all. I.e. everything I’ve tried either doesn’t get generated, or contains things I don’t want, like animals.

So, I managed to get the data from GBIF, and it doesn’t help me because it doesn’t include the taxon ID from iNaturalist. It has a ton of fields, but none of them correspond to the necessary key, which is part of the local URL on iNat.

For example, here is the URL for Quercus rubra:

https://www.inaturalist.org/taxa/49005-Quercus-rubra

Note the ID “49005”. That’s what I need, because it’s necessary to generate a URL to link to iNat from another website. I pretty much only need a list of those and the scientific names. The ID is sufficient to generate the URL, since iNat autocorrects if the scientific name is wrong; I only need the names to link them up with my own records.
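To make the goal concrete, here is a small sketch of the linking step, assuming I end up with a list of (taxon ID, scientific name) pairs. The function names are made up for illustration, the case-insensitive exact match on names is a simplification, and leaving the name off the URL relies on iNat resolving the page from the ID alone, which I have only inferred from the autocorrect behavior.

```python
BASE_URL = "https://www.inaturalist.org/taxa"


def taxon_url(taxon_id, scientific_name=None):
    """Build a taxon page URL from the ID, optionally with the name slug."""
    url = f"{BASE_URL}/{taxon_id}"
    if scientific_name:
        # "Quercus rubra" -> "Quercus-rubra", mirroring the URL shown above
        url += "-" + "-".join(scientific_name.split())
    return url


def link_records(my_names, inat_taxa):
    """Map each of my scientific names to an iNat URL where a match exists.

    `inat_taxa` is an iterable of (taxon_id, scientific_name) pairs.
    Names with no match are simply left out for manual lookup later.
    """
    ids_by_name = {name.lower(): taxon_id for taxon_id, name in inat_taxa}
    return {
        name: taxon_url(ids_by_name[name.lower()], name)
        for name in my_names
        if name.lower() in ids_by_name
    }
```

For example, taxon_url(49005, "Quercus rubra") reproduces the Quercus rubra link above.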

The GBIF data has a ton of different IDs, and to be honest I don’t know what any of them refer to, but none of them are this iNat taxon ID, so that data unfortunately can’t do what I need it to.

Thanks! Fidgeting with this corrected version of what @whimbrelbirder came up with, I was able to get it working! And this gets me everything I need in a format I can work with pretty easily, and the number of records is manageable.

Thank you to everyone too, and apologies if I’ve seemed cranky through all of this. I am unfortunately in a lot of pain, unrelated to iNaturalist, and it is causing me to be irritable.

I’m really excited about what this is going to facilitate…hopefully it will also help iNat gain some extra traffic and visibility, because it’s going to result in a ton of links to iNat’s site from pages that people are already viewing. iNat already represents a really valuable resource that I think more people would benefit from consulting!