Downloading CSV of Species Checklist Limited by Taxa

cazort · November 12, 2020, 8:47pm

I am trying to download a CSV of all plant species (not observations) in North America. Really, all I need is the Scientific name and iNat taxon ID, although a few other fields like common name and native/introduced status might be nice too.

I found this thread:

https://forum.inaturalist.org/t/can-i-export-a-species-list-and-not-individual-observations/9162/4

and tried to follow the instructions there, but am getting stuck. I was able to navigate to this page:

https://www.inaturalist.org/check_lists/313169-North-America-Check-List?iconic_taxon=47126

And that “iconic_taxon” condition restricts the search to only plants. There are 30,627 records. I’ve noticed, however, that the download CSV link doesn’t include the taxon restriction. I.e. the link is exactly the same as for the generic North America species list, here:

https://www.inaturalist.org/check_lists/313169-North-America-Check-List

which contains 86,651 species. I tried clicking the link and it timed out. I tried waiting a long time and coming back, and the file still had not been created. I’m wondering if it’s even going to be created at all.

Is there any way to export just the restricted list? I also don’t want to waste the computational resources of iNaturalist or put unnecessary load on the servers.

Or is there any other way to download this list?

The reason I want this is that I’m looking to link up individual plant records on bplant.org with the taxon pages on iNaturalist, and it would obviously be much better to do via an automated import than manually.

I am also open to other ways to solve the same problem, such as looking up the entries one by one. GBIF has the iNaturalist taxon ID, which I think would be sufficient for my purposes, but I don’t know how to limit my search there to the relevant species, nor do I know how to export from GBIF. If it’s easier to do this through them, I can put in the time to learn it; I just couldn’t figure out at a glance how to export or how to filter by region there. Also, if it’s easier to get all the GBIF data than the relevant iNat data, I could just work with their bigger dataset and filter it myself, since I’m going to have to match up the records anyway. It seems the limiting factor here is the computational resources needed to prepare the CSV for download, so a larger, pre-formed file would be sufficient for my purposes if one exists.

whimbrelbirder · November 12, 2020, 10:48pm

You can download the species list from GBIF, here’s the link: https://www.gbif.org/occurrence/download?country=US&country=CA&dataset_key=50c9509d-22c7-4a22-a47d-8c48425ef4a7&taxon_key=6# .

You can also do this by using the “species counts” API endpoint, https://api.inaturalist.org/v1/observations/species_counts?place_id=97394&taxon_name=plantae .

cazort · November 12, 2020, 11:01pm

The API link you sent is not working as expected: it’s including things that aren’t plants and not including all plants for the region. In fact, it is only showing 36 total results.

I can try GBIF, I’ll need to create an account there and figure it out. Thanks for the suggestion!

lawnranger · November 12, 2020, 11:42pm

Looks like the taxon_name=plantae parameter in that query is the problem.

Should be: …&iconic_taxa=Plantae

This query returns 500 results per page out of 44,083 total results (30,627 on the checklist!?), so you would need to request each of the 89 pages.
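As a rough sketch of that arithmetic (not the exact query, which is elided above): assuming the place_id=97394 value from the earlier link and the API's page and per_page parameters, the corrected URL and the number of pages could be worked out in Python as follows. The helper names species_counts_url and page_count are made up for illustration.

```python
import math
from urllib.parse import urlencode

BASE_URL = "https://api.inaturalist.org/v1/observations/species_counts"


def species_counts_url(place_id, iconic_taxa, page=1, per_page=500):
    """Build a species_counts query URL for one page of results.

    The parameter names (place_id, iconic_taxa, page, per_page) are
    assumed from the URLs discussed in this thread.
    """
    params = {
        "place_id": place_id,
        "iconic_taxa": iconic_taxa,
        "page": page,
        "per_page": per_page,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def page_count(total_results, per_page=500):
    """Number of pages needed to cover total_results rows."""
    return math.ceil(total_results / per_page)


print(species_counts_url(97394, "Plantae", page=2))
print(page_count(44083))  # 44083 / 500 rounds up to 89
```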

whimbrelbirder · November 13, 2020, 2:50am

Oops, well, it should only be pulling plants, but I forgot to mention that it only pulls 500 species per query, so you’ll have to repeat the query over multiple pages to get the full list, which can be done in RStudio or similar.
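For anyone who would rather do the paging in Python than in RStudio, here is a minimal sketch. It assumes each response is JSON with a total_results field and a results list whose entries carry a taxon object with id and name; check a real response before relying on that. The function names fetch_page and collect_taxa are hypothetical, and the one-second pause is just a guess at a polite request rate.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.inaturalist.org/v1/observations/species_counts"


def fetch_page(params):
    """Fetch one page of JSON from the API (performs a network request)."""
    with urlopen(f"{BASE_URL}?{urlencode(params)}") as response:
        return json.load(response)


def collect_taxa(place_id, iconic_taxa, fetch=fetch_page, per_page=500, delay=1.0):
    """Return (taxon_id, scientific_name) pairs gathered across all pages.

    `fetch` is injectable so the paging logic can be tried offline.
    The response shape (total_results, results[].taxon.id/name) is assumed.
    """
    taxa = []
    page = 1
    while True:
        data = fetch({
            "place_id": place_id,
            "iconic_taxa": iconic_taxa,
            "page": page,
            "per_page": per_page,
        })
        results = data.get("results", [])
        for row in results:
            taxon = row["taxon"]
            taxa.append((taxon["id"], taxon["name"]))
        # Stop on an empty page or once every reported result is covered.
        if not results or page * per_page >= data.get("total_results", 0):
            break
        page += 1
        time.sleep(delay)  # pause between requests to keep server load low
    return taxa
```

Calling collect_taxa(97394, "Plantae") would then walk through every page; the resulting pairs can be written out with the csv module.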

jdmore · November 13, 2020, 9:07am

Not sure what your intended use case is, but just a heads-up – the standard place checklists in iNaturalist are not guaranteed to be comprehensive for any taxonomic group. They reflect the taxa of RG observations that have occurred within the place (at least while auto-listing functionality remains in place – see Phase 4 of https://www.inaturalist.org/blog/43337-upcoming-changes-to-lists), plus any manual list curation done by community members or staff (e.g., California). I may be wrong, but I doubt anyone has tried to curate the North America checklist to be complete.

cazort · November 17, 2020, 4:39pm

It’s fine if it’s not complete. I am not going to be using iNat to determine which plants do or don’t occur in North America; for that I’m using other sources like USDA and BONAP. I mainly want to link the records up with iNat so that people can reach the iNat taxon page in one click from the bplant page, and I want to minimize the number of manual lookups needed to do this, so I think this is going to be sufficient!

The issue is that I don’t have a list that seems to be working at all. I.e. everything I’ve tried either doesn’t get generated, or contains things I don’t want, like animals.

cazort · November 17, 2020, 5:00pm

So, I managed to get the data from GBIF, and it doesn’t help me because it doesn’t include the taxon ID from iNaturalist. It has a ton of fields, but none of them correspond to the necessary key, which is part of the local URL on iNat.

For example, here is the URL for Quercus rubra:

https://www.inaturalist.org/taxa/49005-Quercus-rubra

Note the ID “49005”. That’s what I need, because it’s necessary to generate a URL to link to iNat from another website. I pretty much only need a list of those and the scientific names: the ID is sufficient to generate the URL, since iNat autocorrects if the scientific name is wrong, and I only need the names to link them up with my own records.
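To illustrate, a small Python sketch of generating such a link from the ID, following the pattern of the Quercus rubra URL above. The function name taxon_url is made up, and the slug format is inferred from that single example rather than from any documented rule.

```python
def taxon_url(taxon_id, scientific_name=None):
    """Build an iNaturalist taxon page URL from a taxon ID.

    The optional name slug only mirrors the example URL in this thread;
    the numeric ID is the part that identifies the taxon.
    """
    base = f"https://www.inaturalist.org/taxa/{taxon_id}"
    if scientific_name:
        slug = "-".join(scientific_name.split())
        return f"{base}-{slug}"
    return base


print(taxon_url(49005, "Quercus rubra"))
# https://www.inaturalist.org/taxa/49005-Quercus-rubra
```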

The GBIF data has a ton of different IDs, and to be honest I don’t know what any of them refer to, but none of them are this iNat taxon ID, so that data unfortunately can’t do what I need.

cazort · November 17, 2020, 5:43pm

Thanks! Fidgeting with this corrected version of what @whimbrelbirder came up with, I was able to get it working! And this gets me everything I need in a format I can work with pretty easily, and the number of records is manageable.

Thank you to everyone too. Apologies if I’ve seemed cranky through all of this; I am unfortunately in a lot of pain, unrelated to iNaturalist, and it is causing me to be irritable.

I’m really excited about what this is going to facilitate…hopefully it will also help iNat gain some extra traffic and visibility, because it’s going to result in a ton of links to iNat’s site from pages that people are already viewing. iNat already represents a really valuable resource that I think more people would benefit from consulting!

system · January 16, 2021, 5:43pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
