Is there a tool / code snippet that allows downloading of taxonomy data from the site?

This is new to me - but I am keen to learn. Does iNaturalist have a page that describes how to use the iNat API? Or do you need some knowledge of programming to use it?

Do I place your script in one of the boxes in the get_taxa API, and if so, which box? Do I just change the taxon number to the one I want, e.g. 47792, and use empty quotes for rank?

Will it output all the vernacular names for each species or just one per species?

The 10 000 limit will not be a problem as there are only about 3 000 Odonata species. I would like the output in .csv.

I am sorry for all the questions

Typically you would use a programming language to request the data from the API then filter/format/output the data to your needs.

The script above is written in Python, so you will need Python installed on your computer to run it, or you can use something like Google Colab (Google account required) to run it online.

Yes, taxon id 47792 will give you all Odonata. Empty quotes for rank will include all ranks from order to species, about 7300 results (including species[‘rank’] in the output may be helpful).

In the script above, “…species[‘name’]…” will output the Latin name; adding “species[‘preferred_common_name’]” will output the preferred (i.e. one) vernacular name if available. The language can be set with the ‘locale’ and/or ‘preferred_place_id’ parameters; otherwise, I think, it is determined by the default language on your computer.
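For anyone following along, here is a minimal sketch of that request, format, and output flow. It is not the script referred to above. The helper names (build_taxa_url, taxa_to_rows, fetch_results, write_csv), the per_page value of 200, and the output filename are assumptions made for illustration, and main() only fetches the first page of results.

```python
import csv
import json
import urllib.parse
import urllib.request

API_URL = "https://api.inaturalist.org/v1/taxa"


def build_taxa_url(taxon_id, rank="", page=1, per_page=200, locale=None):
    """Build a /v1/taxa query URL; an empty rank means no rank filter."""
    params = {"taxon_id": taxon_id, "page": page, "per_page": per_page}
    if rank:
        params["rank"] = rank
    if locale:
        params["locale"] = locale
    return API_URL + "?" + urllib.parse.urlencode(params)


def taxa_to_rows(results):
    """Turn the 'results' list into [name, rank, common name] rows."""
    return [
        [
            taxon.get("name", ""),
            taxon.get("rank", ""),
            taxon.get("preferred_common_name") or "",  # may be missing
        ]
        for taxon in results
    ]


def fetch_results(url):
    """Download one page of JSON and return its 'results' list."""
    with urllib.request.urlopen(url) as response:
        return json.load(response).get("results", [])


def write_csv(rows, path):
    """Write a header line plus the rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "rank", "preferred_common_name"])
        writer.writerows(rows)


def main():
    # Network call: first page of Odonata (taxon id 47792), all ranks.
    rows = taxa_to_rows(fetch_results(build_taxa_url(47792)))
    write_csv(rows, "odonata_page1.csv")  # hypothetical filename
```

Calling main() would write odonata_page1.csv to the working directory; getting every page would need a loop over the page parameter.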

Set “all_names=true” to get all the vernacular names, but you would then need a way to sort each language into the correct ‘column’ in the csv file.

[EDIT] This might not make much sense if you’re just starting with programming, but hopefully it gives you some pointers for when you get the hang of it.


Thank you for the advice. I am playing around on Colab. It will be a while till I have it figured out.

How do you get it to output the .csv file? or when I run it where do I find the outputs?

I have set all_names to true. What can I add to get the common names? I assume that I will have to add something like - csvwriter.writerow([species[‘name’], species[‘common_name’]]) - and then add several common name columns so that I can recall all the common names?

In Colab, there are three icons on the left; the bottom one looks like a folder. Click on it. Any files created will appear there, and you will then be able to download them.

After setting all_names to true, all the names will be in species[‘names’], and you would need to loop through them all… hold on, I’ll post some code and send you a link.
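In the meantime, a rough sketch of that loop might look like the following. It assumes each entry in species[‘names’] is an object with a “name” and a “locale” field, and the function names and the sample taxon are made up for illustration (the sample is hand-written, not real API output).

```python
def names_by_locale(taxon):
    """Group a taxon's name strings by locale, e.g. {'en': [...], 'sci': [...]}."""
    grouped = {}
    for entry in taxon.get("names", []):
        locale = entry.get("locale", "")
        grouped.setdefault(locale, []).append(entry.get("name", ""))
    return grouped


def rows_with_locale_columns(taxa, locales):
    """One row per taxon: scientific name, then one column per locale.

    Several names in the same locale are joined with '; ' so that each
    language still fits in a single CSV column.
    """
    rows = []
    for taxon in taxa:
        grouped = names_by_locale(taxon)
        row = [taxon.get("name", "")]
        for locale in locales:
            row.append("; ".join(grouped.get(locale, [])))
        rows.append(row)
    return rows


# Hand-written sample in the assumed shape (not real API output).
sample_taxa = [
    {
        "name": "Anax imperator",
        "names": [
            {"name": "Anax imperator", "locale": "sci"},
            {"name": "Emperor Dragonfly", "locale": "en"},
            {"name": "Blue Emperor", "locale": "en"},
            {"name": "Große Königslibelle", "locale": "de"},
        ],
    }
]

rows = rows_with_locale_columns(sample_taxa, ["en", "de", "fr"])
```

Each row can then be passed to csvwriter.writerow(), with a header row such as ["name", "en", "de", "fr"] written first.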

Thank you, I can download the .csv and I am getting the long list of the names now!

Thank you for your help, and I look forward to your link!

if you had used the page i wrote above, you would have just opened https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=47792 in your web browser and clicked on export (near the top of the page).

https://api.inaturalist.org/v1/docs/#!/Observations/get_observations


besides the API, there are 2 ways to get a full taxonomy download nowadays, though these are both monthly snapshots, as opposed to live data.

Option 1 (includes common names):

Option 2 (no common names):


It looks like the options discussed here just get you the list of taxa in iNaturalist. Is there a way to download the taxonomy—meaning both the taxa and synonyms?

not sure what problems you’re encountering. getting data from the API via /v1/taxa will definitely provide synonyms (at least as they are defined within the system). just for example:


Thanks! I’d tried the DWCA taxonomy export and the AWS metadata, neither of which has synonyms. Didn’t see them in your earlier link, either, but it looks like the “is_active=any” fixes that.

Do you happen to have any tricks for dealing with the 10,000 record export limit? What I’m trying to get to is the iNaturalist taxonomy for vascular plants in the continental United States, but due to the export limit it looks like I have to search through and find the list of all taxa that collectively cover the entirety of the vascular plants but individually have no more than 10,000 names. This seems like it’ll end up being several hundred taxa and thus several hundred queries & export files, which is a bit unwieldy.

i think synonyms should exist only on inactive taxa, since the iNat synonyms seem to be the result of taxon changes. so you could get active taxa from DWCA or AWS and then get only the inactives from the API. this is roughly 60000 taxa, which could probably be filtered down and/or segmented by a handful of logical groupings: https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?is_active=false&taxon_id=211194.
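if you end up scripting this instead of using the export page, a rough sketch of paging through one segment while staying under the 10,000 record limit could look like this. the function name, the per_page default of 200, and the idea of passing in your own fetch_page function are assumptions for illustration, not anything the API prescribes.

```python
def iter_all_results(fetch_page, per_page=200, max_results=10000):
    """Yield results page by page until a short page or the cap is reached.

    fetch_page(page, per_page) is supplied by the caller and must return
    the list of results for that page (for example, by requesting
    /v1/taxa with page=... and per_page=... and reading 'results').
    """
    page = 1
    seen = 0
    while seen < max_results:
        results = fetch_page(page, per_page)
        for item in results:
            yield item
        seen += len(results)
        if len(results) < per_page:
            break  # a short (or empty) page means there is nothing left
        page += 1


# Stand-in for a real request, so the paging logic can be tried offline.
fake_data = list(range(450))


def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    return fake_data[start:start + per_page]


collected = list(iter_all_results(fake_fetch))
```

a segment that hits the cap exactly is a sign that it needs to be split into smaller groupings, since anything beyond the cap is not returned.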


Thanks! That makes sense. Still a bit obnoxious, but much more manageable.


you’re welcome. we all do the best we can with what we get.

@pisum Thank you - this is very helpful. Do you have an idea why the following call
https://api.inaturalist.org/v1/taxa?q=astragalus&is_active=any&per_page=500
does not include the ranks of sections and subsections?

the q= parameter initiates a search by string. if you want to look for all descendant taxa of the genus Astragalus, you need to search using taxon_id=49370 as the filter parameter in place of q=astragalus
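to make the difference concrete, here is a small sketch that builds both request URLs. the taxa_url helper is just a made-up convenience wrapper around Python's urlencode; the parameters themselves are the ones from the call above.

```python
from urllib.parse import urlencode

BASE = "https://api.inaturalist.org/v1/taxa"


def taxa_url(**params):
    """Build a /v1/taxa URL from keyword arguments (hypothetical helper)."""
    return BASE + "?" + urlencode(params)


# String search: matches taxa by name text.
by_string = taxa_url(q="astragalus", is_active="any", per_page=500)

# Taxon filter: restricts results to genus Astragalus (id 49370) and its descendants.
by_taxon = taxa_url(taxon_id=49370, is_active="any", per_page=500)
```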


Thanks again!


Synonyms can exist on any taxon. As you say, they are generated as a result of a taxon change, i.e. the name of the ‘old’ inactive taxon becomes a synonym on the ‘new’ active taxon. I think they are sometimes added by the auto importer and can also be added manually by curators. They are just names in the Scientific Names (sci) lexicon with the ‘currently accepted?’ flag set to false.

what i was saying above was that i think as it’s technically recorded within the system, the synonyms exist only on the inactive taxon records, referring to the new replacement taxon / taxa.

do you have an example of this?

If the taxa endpoint is queried with the ‘all_names’ parameter set to ‘true’, each taxon in the “results” array will have a “names” attribute with an array of one or more name objects. All names with “locale”: “sci” belong to the ‘Scientific Names’ lexicon; the “is_valid” attribute is “true” for the current name and “false” for synonyms. For example: https://api.inaturalist.org/v1/taxa?q=dracopis&is_active=true&all_names=true
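As a sketch of how those synonyms could be pulled out of a response, assuming each name object also carries a “name” field with the name string (the function name and the sample taxon are made up for illustration, and the sample is hand-written rather than copied from the API):

```python
def scientific_synonyms(taxon):
    """Return scientific names on a taxon that are flagged as not valid."""
    return [
        entry.get("name", "")
        for entry in taxon.get("names", [])
        if entry.get("locale") == "sci" and entry.get("is_valid") is False
    ]


# Hand-written sample in the shape described above (not real API output).
sample_taxon = {
    "name": "Rudbeckia amplexicaulis",
    "names": [
        {"name": "Rudbeckia amplexicaulis", "locale": "sci", "is_valid": True},
        {"name": "Dracopis amplexicaulis", "locale": "sci", "is_valid": False},
        {"name": "Clasping Coneflower", "locale": "en", "is_valid": True},
    ],
}

synonyms = scientific_synonyms(sample_taxon)
```

Running this over every taxon in “results” would give one synonym list per taxon.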

ok. if you want to get the alternate / inactive scientific names like that, then i guess you would have to get the full taxonomy via the API and parse out all the alternate names that way. it may be easy or hard, depending on how many taxa you need to get.

so then to get something similar from the DWCA export, there would need to be (1) a separate names file for sci that (2) includes both active and inactive sci names (which would require modifications to the export, probably).

i still need to find a case where the alternate scientific names are manually added (as opposed to being done as a result of taxon changes) to see what exactly happens to the taxon records in such a case…