I wanted to get a list of the complete taxonomy for all iNaturalist plant taxa. Supposedly I can get it from the “iNaturalist Taxonomy DarwinCore Archive” on the developers page.
However, while this file does contain a row for every little piece of the taxonomy between family and genus (i.e., tribes and subtribes) and between genus and species (i.e., subgenera and sections), it does NOT provide any information about which tribes and subtribes the genera belong to, nor which subgenera and sections the species belong to.
So I can see that there’s a section called Erianthera somewhere in the genus Penstemon, but I don’t know which subgenus it’s in, and I don’t know which species it contains.
As far as I can tell, the only way to get this information would be to scrape the API, and I’d rather not do that! Is there any chance you might provide a truly “complete” taxonomy export?
That’s true, one COULD rebuild the taxonomy by working backwards through a chain of parents, but that requires some serious programming or a series of crazy spreadsheet formulas.
As long as this file is being made available, it would be nice if it was truly complete. We’re being given family and genus without having to do any recursive calculations; why not the other ranks too?
(Maybe this could be moved to “Feature Requests”?)
i don’t see how scraping the API would get you information any faster than just transforming the DWCA taxon file to whatever format you like. it’s not clear to me either why jwidness’s suggestion wouldn’t get you what you need, if you’re just dealing with one particular section.
maybe if you explained what exactly you’re trying to do and what tools / languages you’re using, you might get to what you need faster. it’s not clear to me that adding tribes, subgenera, etc. as columns to the DWCA export would help you get the information you need any faster than other possible methods out there.
I’m trying to find a database of taxon hierarchies for all vascular plants; it would be a very useful thing to have around.
Let’s say you want to know which subfamily Goodyera oblongifolia belongs to (if any), and also any tribes or subtribes in between. If the full hierarchy were spelled out in the file, you could just look at the “subfamily” column.
Without that, here are the steps you’d need to take:
1. Look up the ID for Goodyera oblongifolia’s genus (Goodyera): 50718
2. Find 50718’s parent taxon ID: 790290
3. Check whether 790290 is a subfamily, since that’s what we’re looking for
4. 790290 isn’t a subfamily (it’s a subtribe), so find 790290’s parent ID: 544791
5. Check whether 544791 is a subfamily
6. 544791 isn’t a subfamily (it’s a tribe), so find 544791’s parent ID: 738348
7. Check whether 738348 is a subfamily
8. Yes, 738348 is subfamily Orchidoideae. Now we finally have the full hierarchy, assuming we saved all the bits and pieces along the way.
(If the taxon in question was NOT in a subfamily, we’d have to go one more step up to make sure we’d found the family.)
I can write a program to do all this if I need to, I’m just saying it’d be helpful if iNat could produce a file that doesn’t have so many missing pieces.
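For anyone who does want to script it, the walk described in the steps above can be sketched in a few lines of Python. This is only a rough illustration: the `TAXA` dictionary is a hand-made stand-in for the real file (its layout and the `lineage` function are assumptions, not anything iNaturalist provides), and it holds only the IDs and ranks quoted in the steps. Names not given in this thread are left as `None` rather than guessed.

```python
# Toy taxon table keyed by taxon ID. IDs and ranks are the ones quoted in the
# steps above; the dictionary layout itself is a hypothetical stand-in for the
# real export, and names the thread does not give are left as None.
TAXA = {
    50718:  {"parent_id": 790290, "rank": "genus",     "name": "Goodyera"},
    790290: {"parent_id": 544791, "rank": "subtribe",  "name": None},
    544791: {"parent_id": 738348, "rank": "tribe",     "name": None},
    # The subfamily's own parent is omitted from this toy table.
    738348: {"parent_id": None,   "rank": "subfamily", "name": "Orchidoideae"},
}

def lineage(taxon_id, taxa, stop_rank=None):
    """Follow parent links upward and return [(rank, id, name), ...].

    Stops after reaching stop_rank (if given), or when a taxon has no
    parent in the table.
    """
    chain = []
    seen = set()
    current = taxon_id
    while current is not None and current in taxa:
        if current in seen:
            raise ValueError(f"cycle in parent links at taxon {current}")
        seen.add(current)
        row = taxa[current]
        chain.append((row["rank"], current, row["name"]))
        if row["rank"] == stop_rank:
            break
        current = row["parent_id"]
    return chain

# Walk from the genus up to its subfamily, keeping every rank in between.
for rank, taxon_id, name in lineage(50718, TAXA, stop_rank="subfamily"):
    print(rank, taxon_id, name)
```

The loop saves every intermediate rank as it goes, so the tribe and subtribe come along for free once the subfamily is found.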
how are you planning to interface with this “database”? are you reading it directly with your human eyes? or are you telling a machine to retrieve the information and present it to human eyes in some particular format?
i guess the flow that you’re describing is only difficult if you’re trying to read the “database” directly with your human eyes. if it’s going to be retrieved by a machine and laid out by a machine for human consumption, it may take some initial work to tell the machine how to do this, but once that’s done, i don’t see what the layout you’re describing gains you. in fact, it’s just taking up extra storage at that point.
A lot of people would prefer to access information in spreadsheets, not databases.
What I don’t understand is, why provide SOME pieces of the hierarchy but not all of them? If it’s true that all you need is the parent taxon ID, then the file could be 4 columns: ID, parent ID, name, and rank. Why are order and family included for every taxon?
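To make the trade-off concrete, here is a minimal Python sketch of how a four-column file (ID, parent ID, name, rank) could be expanded into a spreadsheet-friendly table with one column per rank. Everything in it is assumed for illustration: the column names `id`, `parent_id`, `name` and `rank`, the `widen` and `to_csv` helpers, and the sample rows, which are invented placeholders rather than real iNaturalist records.

```python
import csv
import io

# Hypothetical four-column export. The rows are invented placeholders.
NARROW_CSV = """id,parent_id,name,rank
1,,Family A,family
2,1,Subfamily B,subfamily
3,2,Genus C,genus
4,3,Genus C species d,species
5,1,Genus E,genus
"""

# Which ranks should become columns in the wide table.
RANK_COLUMNS = ["family", "subfamily", "genus", "species"]

def widen(narrow_text, rank_columns):
    """Return one dict per taxon with a column for each requested rank."""
    rows = {row["id"]: row for row in csv.DictReader(io.StringIO(narrow_text))}
    wide = []
    for row in rows.values():
        record = {"id": row["id"], "name": row["name"], "rank": row["rank"]}
        record.update({column: "" for column in rank_columns})
        current, steps = row, 0
        # Climb the parent chain; the step counter guards against cycles.
        while current is not None and steps <= len(rows):
            if current["rank"] in rank_columns:
                record[current["rank"]] = current["name"]
            current = rows.get(current["parent_id"])
            steps += 1
        wide.append(record)
    return wide

def to_csv(records, rank_columns):
    """Serialize the wide records as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    fieldnames = ["id", "name", "rank", *rank_columns]
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

print(to_csv(widen(NARROW_CSV, RANK_COLUMNS), RANK_COLUMNS))
```

Ranks that a taxon does not have (for example, a genus placed directly in a family) simply come out as empty cells, which is the same thing a wider export would have to do.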
but is that how you plan to use this data? if not, why ask for someone to make a change to satisfy a hypothetical need when effort could be spent addressing actual needs?
i don’t know, but i would just guess that the ones that are there are the ones that are always going to be there for any species. the ones that you’re asking for are only going to be useful for some.
but inclusion / exclusion is always going to be some subjective decision based on some perceived need. if you look at the Darwin Core Taxon definition from the Darwin Core Maintenance Group, they include tribe and subfamily but not, say, suborder. why draw that particular line?
I was looking for the same list and haven’t found a solution yet. My computer knowledge is limited, and I don’t want to spend weeks figuring out how to make such a list.
I am looking for a list to complete our national checklist, which does not provide or deal with this kind of information.
I would be quite happy to get such a list with “just three clicks” and no programming knowledge.
Thanks @adamschneider for bringing up this topic. Happy to know that I’m not alone with this need.
I think a lot of the reason is that beyond the 7 standard ranks, there is little consistency. Zoology and botany use different sub-hierarchies (sometimes with the same name used at different levels), and the implementation of these intermediate ranks in iNat is very ad-hoc, with curators adding them for some taxa and not others seemingly at whim (but in reality probably in response to taxon flags). See this thread for a discussion of some of the complexities and problems: https://forum.inaturalist.org/t/implement-the-botanical-rank-series-between-subsection-and-complex/17735.
So, Adam, I don’t think you’d get a very satisfying result for your purpose even if the full hierarchy was more easily available.
why would you need things like tribes and sections on a national checklist?
how are you planning to merge your checklist and this other list? i would think that if you have the skill to merge the two lists, it’s more or less the same skill needed to get what you want from this list already.
I don’t need it FOR the checklist; rather, the checklist is the basis of my own research, to which I want to link more information.
It’s quite helpful to have more information about the relationships between different species, for example when they are grouped together in an aggregate.
There is no list available at the moment, so I don’t need to plan anything about “how to”. I will keep separate lists for different topics, because nobody wants to share a neat “all in one” list.
Nomenclature is different in every country, and I want to try, at least for my own purposes, to bring them closer together. Our national experts won’t do so because it’s tricky, but that’s no reason not to at least try to come close.
It would be much easier if I didn’t first have to sort the iNat nomenclature into a neat list before I can start on my own list. I just want to use a list and do my work. If the information I need is missing on iNat, then I will either find it elsewhere or not, and do a lot of manual work next winter. iNat could have been a place with easy access to that information instead.
I tried to do it with a download of the country’s observations, but this will not give a complete list at all. That’s frustrating. There was a link to a site with all the nice nomenclature, but with about 300’000 rows to download in packages of 10’000 rows. No, thanks…
My knowledge is not enough to use the API. I can’t even follow your topics where you try to explain a “how to”. O:) I need a simple Excel sheet to work with, which I can filter by my country (or other countries/regions of interest) to get its taxonomy, without going through it one by one by iNat ID.
that is effectively a simple version of the dynamic place list that we were promised many moons ago when dynamic (user) life lists were implemented.
or if you’re talking about grouping things in aggregate, maybe you’re thinking about representing these in some sort of visualization like the one below?
Cool, that’s a list I can work with. Many thanks for it! That’s what I was looking for. It hadn’t occurred to me that I could construct the link on that site for my needs.
How did you create the graphic? That would be a very nice tool additionally.
i adapted something i made long ago: https://observablehq.com/@robin-song/inaturalist-observations-by-taxon. this was intended to take a user id, get the user’s dynamic life list data, and then generate one of 4 possible visualizations – the sunburst chart that you saw earlier, a zoomable version of the sunburst chart, an icicle chart, or a zoomable version of an icicle chart. by default, it produces a sunburst chart. (i think the icicle chart is broken now, but the other charts still work.)
as is, the page will not give you data for plants of Switzerland because it expects to work on user ID. however, you can adapt it like i did:
scroll down to the “Get data from iNaturalist…” section of the page:
click on the three dots (see the red circle in the image above) next to the line that begins with “apiResponse”, and select “Edit” from the menu that appears
then modify the line that is highlighted and referenced by the red arrow in the image above. the part after the ? defines the parameters. so to get, say, the plants of Switzerland, you would use let reqURL = 'https://api.inaturalist.org/v1/observations/taxonomy?place_id=7236&iconic_taxa=Plantae&verifiable=true';
after your modification, click the run button (see the red square in the image above).
then scroll back up to the top of the page, and you should see an updated sunburst chart.
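For anyone who would rather build that request URL outside the notebook, the same query string can be assembled in Python. This is just a sketch: the endpoint and the three parameters are the ones quoted in the steps above, while the `taxonomy_url` helper is a made-up name for illustration, and nothing here actually sends a request.

```python
from urllib.parse import urlencode

# Endpoint quoted in the steps above.
BASE_URL = "https://api.inaturalist.org/v1/observations/taxonomy"

def taxonomy_url(**params):
    """Build the request URL; everything after the ? is the parameter list."""
    return f"{BASE_URL}?{urlencode(params)}"

# Plants of Switzerland, using the parameter values from the example above.
req_url = taxonomy_url(place_id=7236, iconic_taxa="Plantae", verifiable="true")
print(req_url)
```

Swapping in a different `place_id` should give the equivalent request for another place, assuming the endpoint treats that parameter the same way for other places.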
If you just need to pull the full taxonomy for a specific list of species, that’s easy to do with the API. Here’s a little script I wrote to do that via a web form: https://github.com/carnegiemnh/iztools/blob/main/taxonomylookup.php. You just paste in your list of species, click submit, and it returns a table with the full taxonomy for each species. You can even configure which ranks it returns by changing the $ranks variable.
when I download the file and open it in the browser (Firefox), and paste a short list of species names in the box, and execute, there is no result. Only the empty first row of the table.
You have to be running PHP on whatever server you load the page from. You can try it out at https://iztools.toolforge.org/taxonomylookup.php, although I can’t guarantee that page will stay up for very long. Note that if you have a long list of species it may take several minutes to process due to iNaturalist API rate limits.
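The reason a long list takes so long is that each lookup has to be spaced out. A rough Python sketch of that kind of pacing is below. It is not how the linked PHP script works internally (that was not checked here), the one-second spacing is an assumption rather than a documented limit from this thread, and `lookup_all` and `fake_fetch` are hypothetical names. The `fetch` argument stands in for whatever function actually calls the API for one species name.

```python
import time

def lookup_all(names, fetch, min_interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch(name) for each name, at least min_interval seconds apart.

    min_interval=1.0 is an assumed pacing, not an official figure.
    sleep and clock are injectable so the pacing can be tested without waiting.
    """
    results = {}
    last_call = None
    for name in names:
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results[name] = fetch(name)
    return results

def fake_fetch(name):
    # Placeholder for a real API call; it just echoes the name back.
    return {"query": name}

# Demo with no delay so it finishes instantly.
print(lookup_all(["Goodyera oblongifolia"], fake_fetch, min_interval=0))
```

With a one-second spacing, a list of a few hundred species would take several minutes, which matches the behaviour described above.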