Taxonomy download with synonyms

Platform(s): Website, iNaturalist Taxonomy DarwinCore Archive.

URLs: https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip

Description of need: At present, the DWCA taxonomy download is a taxon list without synonymies. To work with iNaturalist data, and in particular to combine iNaturalist data with other data that follow other taxonomies, the full iNaturalist taxonomy including synonymies would be much more useful. That information is currently difficult to get by other means (see recent discussion here).

Feature request details: Changing the DWCA taxonomy download (or, I suppose, providing an additional download and leaving the existing one unchanged) so that it corresponds to an “active=any” rather than an “active=true” query and includes the “synonym_id” field would resolve the issue.

I think this should be very easy to implement, even if not used by a large number of people.

An update: as of submitting the feature request, I had figured out how to download the iNaturalist taxonomy, but I hadn’t yet done so. If others are curious, the “how” follows.

First, the easy part: download all “active” names at https://www.inaturalist.org/taxa/inaturalist-taxonomy.dwca.zip.

To get all the “inactive” names, use this helpful tool created by @pisum: https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=47604&is_active=false. On iNaturalist, navigate to the higher-level taxon whose taxonomy you want to download, e.g. https://www.inaturalist.org/taxa/47604-Asteraceae. Copy the taxon_id from that page’s URL (47604 in this example) and paste it into the tool URL above. If there are more than 10,000 results, you won’t be able to download them all in one go. In that case, open the iNaturalist pages for all taxa at the next rank down within that taxon, repeat the process with each of their taxon_ids, and continue moving down the taxonomic hierarchy as needed.
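For anyone who would rather plan the descent than click through it, here is a minimal sketch of the "split a taxon when it exceeds 10,000 results" logic described above. It makes no network calls: `count_inactive` and `child_ids` are hypothetical callables that you would have to supply yourself (for example from counts you looked up by hand), and `plan_downloads` is just an illustrative name, not part of any iNaturalist tool.

```python
from typing import Callable, Iterable, List

LIMIT = 10_000  # per-query ceiling mentioned in the post above


def plan_downloads(
    taxon_id: int,
    count_inactive: Callable[[int], int],   # hypothetical: inactive names under a taxon
    child_ids: Callable[[int], Iterable[int]],  # hypothetical: taxa one rank down
    limit: int = LIMIT,
) -> List[int]:
    """Return taxon IDs whose inactive names should each fit in one download."""
    children = list(child_ids(taxon_id))
    # Small enough, or nothing below it to split into: download this taxon as-is.
    if count_inactive(taxon_id) <= limit or not children:
        return [taxon_id]
    plan: List[int] = []
    for child in children:
        plan.extend(plan_downloads(child, count_inactive, child_ids, limit))
    return plan


# Toy data, invented purely for illustration.
toy_counts = {1: 25_000, 2: 9_000, 3: 16_000, 4: 7_000, 5: 9_000}
toy_children = {1: [2, 3], 3: [4, 5]}
print(plan_downloads(1, toy_counts.__getitem__, lambda t: toy_children.get(t, [])))
```

Note that a taxon with no children is returned even if it is over the limit, so such a case would still need handling by other means.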

Downloading the iNaturalist taxonomy for land plants took me a total of 65 iterations of the copy/paste taxon_id and export CSV process. Having done it… I think I’d rather not do it again.

i think this could be done more efficiently. knowing that https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=47604&active=false brings back roughly 30,000 records, you could aim to get 3 sets of just under 10,000 records using the id_above and id_below parameters:

this is more what i was thinking when i talked about “a handful of logical groupings” in the earlier post. sorry i didn’t provide more detail, since that may have saved on some extra work.
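A rough sketch of the id_above / id_below idea, in case it helps: given a few boundary IDs, build one tool URL per ID window. The boundary values in the usage line are made up for illustration, `window_urls` is a hypothetical helper name, and the code assumes that id_above and id_below are exclusive bounds. It uses is_active, the parameter name used in the other URLs in this thread.

```python
from typing import List, Optional, Sequence
from urllib.parse import urlencode

TOOL_URL = "https://jumear.github.io/stirfry/iNatAPIv1_taxa.html"


def window_urls(taxon_id: int, boundaries: Sequence[int]) -> List[str]:
    """Build one URL per ID window: ids < b1, b1 <= ids < b2, ..., ids >= bn."""
    edges: List[Optional[int]] = [None, *sorted(boundaries), None]
    urls = []
    for low, high in zip(edges, edges[1:]):
        params = {"taxon_id": taxon_id, "is_active": "false"}
        if low is not None:
            # Assumption: id_above is exclusive, so step back one to keep `low`.
            params["id_above"] = low - 1
        if high is not None:
            # Assumption: id_below is exclusive, so `high` falls in the next window.
            params["id_below"] = high
        urls.append(f"{TOOL_URL}?{urlencode(params)}")
    return urls


# Hypothetical boundaries; real ones would be chosen so each window stays under 10,000.
for url in window_urls(47604, [500000, 900000]):
    print(url)
```

Two boundaries give three windows, which matches the "3 sets of just under 10,000" target; finding boundaries that actually split the records evenly is still a matter of trial and error.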

It could also be done more efficiently by me typing the tags correctly… accidentally writing “active” rather than “is_active” means I was actually just downloading the same data as in the DWCA file. Ho hum. Doing it correctly I end up with 65 downloads.

Also, for what it’s worth, that URL was merely illustrative: there isn’t a single iNaturalist taxon that corresponds to “land plants” (a.k.a. Embryophyta). Rather, at the highest level it would be this set:

https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=56327&is_active=false
https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=311249&is_active=false
https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=64615&is_active=false
https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=211194&is_active=false

Also, I wasn’t aware of the “id_above” and “id_below” tags. Those might come in handy, thanks!

note that you can get multiple comma-separated taxa in one set: https://jumear.github.io/stirfry/iNatAPIv1_taxa.html?taxon_id=56327,311249,64615,211194&is_active=false
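A small sketch of building that kind of combined URL from a list of taxon IDs, so the IDs don't have to be pasted by hand. `combined_url` is a hypothetical helper name; the only things taken from the thread are the tool URL and the taxon_id / is_active parameters.

```python
from typing import Iterable
from urllib.parse import urlencode

TOOL_URL = "https://jumear.github.io/stirfry/iNatAPIv1_taxa.html"


def combined_url(taxon_ids: Iterable[int], is_active: bool = False) -> str:
    """Join several taxon IDs into one comma-separated taxon_id query."""
    params = {
        "taxon_id": ",".join(str(t) for t in taxon_ids),
        "is_active": str(is_active).lower(),  # "true" or "false"
    }
    # safe="," keeps the commas literal instead of encoding them as %2C.
    return f"{TOOL_URL}?{urlencode(params, safe=',')}"


print(combined_url([56327, 311249, 64615, 211194]))
```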

That bit of functionality I have used before, just wasn’t occurring to me in this context. :-)

I am working from a database. My selection is lichenized and lichenophilic fungi, which is a scattered group within the Fungi kingdom. A SQL script lets me avoid searching by taxon ID and keeping track of the number of returned results.
So, yes, importing synonyms into the taxonomic tree is a good idea.