Add ancestor rank_level to GET/observations response (API)

Platform: API

URL: https://api.inaturalist.org/v1/observations

Description of need:
The GET/observations call on the iNaturalist API returns an array of ancestors for the Observation’s identified Taxon. However, the response does not say what the rank level of each of these taxa is, so one needs to fetch each individual taxon via the GET/taxa/{id} call.

Feature request details:
A lot of additional calls could be avoided if more information about the ancestors’ rank levels were included in the response.

Other fields like the icon, name and preferred name could also be nice to see.

Would be nice to get for /v1/observations/species_counts as well (if that’s not for free with this change).

With this URL, you can get detailed info for up to 30 observations in 1 API request:

https://api.inaturalist.org/v1/observations/16764950 [ ,… + up to 29 other IDs ]

not sure what you’re trying to do exactly with ancestor taxon rank levels. (i can only assume it has something to do with your website.) i also don’t understand why it’s such a burden to get from /v1/taxa/{id}…

maybe it would help to know that you can fetch more than one taxon per request. for example: https://api.inaturalist.org/v1/taxa/1,3.

I wanted it on the species_counts response because I was playing with grouping by levels other than species.

i don’t really understand how having rank_level in the species_counts response helps you that much though. it’s just a couple of extra steps to collect an array of (ancestor) taxa from the species_count response and then retrieve them from /v1/taxa/{id}

That’s a ton of extra requests, depending on the set. Much easier if the taxon rank is already available.

why would that be a ton of requests? you can request multiple taxa per request.

You can request up to 500 results per page for species_counts. As an example, there are 1459 distinct ancestor taxon IDs for a set of 500 of my observations; that would be 49 taxa requests to get the rank for each to do grouping.
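To make the arithmetic above concrete, here is a rough sketch of batching ancestor IDs into /v1/taxa/{id} requests at 30 IDs each. The function name `taxa_batch_urls` is made up for illustration, and the limit of 30 is taken from this thread rather than checked against the API; the code only builds URLs and does not send anything.

```python
from typing import Iterable, List

TAXA_URL = "https://api.inaturalist.org/v1/taxa/"


def taxa_batch_urls(taxon_ids: Iterable[int], batch_size: int = 30) -> List[str]:
    """Build one /v1/taxa/{id} URL per batch of comma-separated taxon IDs."""
    # Drop duplicate IDs while keeping first-seen order.
    ids = list(dict.fromkeys(taxon_ids))
    return [
        TAXA_URL + ",".join(str(i) for i in ids[start:start + batch_size])
        for start in range(0, len(ids), batch_size)
    ]


if __name__ == "__main__":
    # 1459 distinct ancestor IDs (stand-in values) at 30 per request.
    print(len(taxa_batch_urls(range(1, 1460))))
```

With 1459 distinct IDs this comes out to 49 URLs, which is where the number above comes from.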

if you have 1459 ancestor taxa to look for, then you still should be able to get rank_level more efficiently than 49 requests to /v1/taxa/{id} (at 30 records per request). rank_level is also included in the response from /v1/taxa (which will return up to 500 taxa per page and allow input of strings of 500 comma-separated taxa in parameters). although it doesn’t accept an exact_taxon_id parameter, you can use a combination of parent_id and taxon_id to get specific taxa for each step in the ancestry chain.

so for example, suppose you have the following ancestry chains:

in this case, the 3rd item of the chains above are 211194 and 2, and you can get those 2 items by doing a GET request via https://api.inaturalist.org/v1/taxa?per_page=500&parent_id=47126,1&taxon_id=211194,2.

this can be streamlined a bit by getting the taxa from the first few steps (up to 30 taxa total) from /v1/taxa/{id}. so for example, since life + kingdom usually contain fewer than 10 taxa, you would be able to carve out the taxa from at least the first 2 links in the ancestry chains to be queried via /v1/taxa/{id}

finally, you know that the last link in each chain is the main taxon itself. so those last links can be ignored because you already can get their rank_level values from the /v1/observations and /v1/observations/species_counts responses.

i don’t know what the maximum number of links in the ancestry chains of all taxa in the system is, but just doing a quick scan, i didn’t see a chain with more than 15 links (16, if you include the main taxon itself).

so assuming max 15 ancestors, if you go with the kind of approach described above, then you could get your 1459 ancestor taxa in probably 14 or fewer requests, consisting of:

  • a request to /v1/taxa/{id} for at least the first 2 steps (life + kingdom)
  • a request to /v1/taxa?per_page=500&parent_id={parent_id}&taxon_id={id} for each remaining step in the ancestry chains.
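The two kinds of request listed above could be planned roughly as follows. This is only a sketch: `plan_requests` and `direct_steps` are names invented here, the main-taxon IDs 900001 and 900002 are placeholders, and the first link 48460 is assumed for illustration. It does not split a step that exceeds the 30-ID or 500-ID limits mentioned earlier, and it only builds URLs.

```python
from typing import Dict, List

BASE = "https://api.inaturalist.org/v1/taxa"


def plan_requests(chains: List[List[int]], direct_steps: int = 2) -> List[str]:
    """Return the request URLs that cover every ancestor in the chains.

    Each chain lists taxon IDs from the root down to the main taxon.
    """
    direct_steps = max(direct_steps, 1)  # the root has no parent to filter on
    direct: Dict[int, None] = {}  # dicts are used as ordered sets
    steps: Dict[int, Dict[str, Dict[int, None]]] = {}
    for chain in chains:
        ancestors = chain[:-1]  # last link is the main taxon; its rank_level is already known
        for depth, taxon_id in enumerate(ancestors):
            if depth < direct_steps:
                direct[taxon_id] = None
            else:
                step = steps.setdefault(depth, {"parent": {}, "taxon": {}})
                step["parent"][ancestors[depth - 1]] = None
                step["taxon"][taxon_id] = None
    urls = []
    if direct:
        urls.append(BASE + "/" + ",".join(map(str, direct)))
    for depth in sorted(steps):
        parents = ",".join(map(str, steps[depth]["parent"]))
        taxa = ",".join(map(str, steps[depth]["taxon"]))
        urls.append(f"{BASE}?per_page=500&parent_id={parents}&taxon_id={taxa}")
    return urls


if __name__ == "__main__":
    # Placeholder chains; only 47126, 1, 211194 and 2 come from the example above.
    for url in plan_requests([[48460, 47126, 211194, 900001], [48460, 1, 2, 900002]]):
        print(url)
```

The number of URLs is one for the direct step plus one per remaining depth, so it grows with the length of the longest chain rather than with the number of taxa.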

Okay, sure. Even that many requests though feels like a smell to me in building an extension feature (which is what I wanted to do, if that was unclear). It’s not that it’s impossible, just that it’s inefficient. I’d rather just have an optional parameter on the first call that allows for the response to contain an array of full models for each taxon in the hierarchy rather than just a list of IDs. Thanks for the helpful suggestion, though.

hmmm… if you’re building an extension and you want to be efficient, then seems to me like you should cache the taxa locally. i don’t know how you’re planning to present the full hierarchy exactly, but it seems like if you started with the hierarchy collapsed at the highest nodes, then you could just get the lower node information as the parent nodes were expanded.

Indeed; taxon data likely change infrequently and should be aggressively cached. Nonetheless, the point of the feature request stands–it would be ideal to be able to opt in to getting the full taxon models in the initial response.
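For what it's worth, the caching idea could look something like the sketch below. `TaxonCache` and the injected `fetch` callable are hypothetical; `fetch` stands in for whatever code actually calls the taxa endpoint and is assumed to return a dict keyed by taxon ID. There is no expiry here, which a real extension would probably want.

```python
from typing import Callable, Dict, Iterable, List


class TaxonCache:
    """In-memory taxon cache that only asks the fetcher for IDs it has not seen."""

    def __init__(self, fetch: Callable[[List[int]], Dict[int, dict]]):
        self._fetch = fetch  # hypothetical: wraps the real API call
        self._taxa: Dict[int, dict] = {}

    def get_many(self, taxon_ids: Iterable[int]) -> Dict[int, dict]:
        wanted = list(dict.fromkeys(taxon_ids))  # dedupe, keep order
        missing = [t for t in wanted if t not in self._taxa]
        if missing:
            self._taxa.update(self._fetch(missing))
        # IDs the fetcher could not resolve are simply left out of the result.
        return {t: self._taxa[t] for t in wanted if t in self._taxa}


if __name__ == "__main__":
    def fake_fetch(ids: List[int]) -> Dict[int, dict]:
        print("fetching", ids)
        return {i: {"id": i} for i in ids}

    cache = TaxonCache(fake_fetch)
    cache.get_many([1, 2])
    cache.get_many([2, 3])  # only 3 is fetched this time
```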

Thanks, I didn’t realise that I can load 30 at once. Although it isn’t ideal, it is a better workaround than loading them one by one. I’ll make the code change on my side accordingly (when I have time to work on it again - it’s been a bit busy on my side lately).

Your suggestion seems to be the best workaround, but it remains a workaround. I think having the information available in the original response would be much better and eliminate the need for any workarounds.

i don’t think of getting detailed taxon information from a taxon endpoint as a workaround. really, if you wanted a way to more easily get taxon info, the request should be either to allow input of up to 500 taxa to /v1/taxa/{id} or to add an exact_taxon_id parameter to /v1/taxa.

I agree that the v1/taxa/{id} endpoint is what should be used to get details. My issue is that if I only care about, let’s say, the Family ancestor of the observation, then I have to load all ancestors (from the start) because I don’t know which one of those ancestors in the array is the Family one. If v1/observations returned a little bit more context about each ancestor then it would make things easier.
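To illustrate the problem, here is a rough sketch of what picking out the Family ancestor looks like today. The helper name, the sample IDs and the record shape are all made up, and I am assuming Family corresponds to rank_level 30, which is worth checking against a real taxon response. The point is that `taxa_by_id` has to be filled in for every ancestor before the lookup can work.

```python
from typing import Dict, List, Optional


def find_ancestor_at_level(
    ancestor_ids: List[int],
    taxa_by_id: Dict[int, dict],
    rank_level: int,
) -> Optional[int]:
    """Return the first ancestor ID whose fetched record has the given rank_level."""
    for taxon_id in ancestor_ids:
        taxon = taxa_by_id.get(taxon_id)
        if taxon is not None and taxon.get("rank_level") == rank_level:
            return taxon_id
    return None


if __name__ == "__main__":
    FAMILY_LEVEL = 30  # assumption, see note above
    ancestor_ids = [101, 102, 103]  # placeholder IDs from an observation
    taxa_by_id = {  # placeholder records, as if fetched from the taxa endpoint
        101: {"rank_level": 50},
        102: {"rank_level": 40},
        103: {"rank_level": FAMILY_LEVEL},
    }
    print(find_ancestor_at_level(ancestor_ids, taxa_by_id, FAMILY_LEVEL))
```

If the observations response carried rank_level next to each ancestor ID, the `taxa_by_id` lookup (and the requests behind it) would not be needed for this case.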