Search and filter identifications

Not on iNat, but the API can get you there. Here’s a tool @pisum made for visualizing it better:
https://jumear.github.io/stirfry/iNatAPIv1_identifications.html?&rank=genus&current=true&user_id=someplant&taxon_id=576771


Thank you!


I think @pisum’s tool would work well for this too:

https://jumear.github.io/stirfry/iNatAPIv1_identifications.html?current=any&user_id=mftasp&d1=2020-11-01&d2=2020-11-30&per_page=2

then look at the “total records” at upper left in the results. This URL is for November 2020, but just edit the dates in the URL for any other time period.

This should also accept a place_id= parameter if you want to limit it to observations from a particular place (or comma-separated places, if you are really lucky…;-)

EDIT: yep, just tested, you can use multiple place_id numbers separated by commas.

Note: I changed per_page to a small number to speed the return of results, since you are mainly interested in the “total records” number. The default is 30 per page which takes a few seconds, and higher numbers go even slower (presumably since it is querying the API to build the page display). But if you are interested in seeing details in the results, just change the number. (Heads up, the max of 200 took at least a minute to return results.)
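For anyone who would rather hit the API directly than go through the page above, here is a minimal Python sketch that builds the equivalent request URL. It assumes the API v1 identifications endpoint accepts the same current, user_id, d1, d2, per_page, and place_id parameters that the tool passes along, and that the count comes back in a total_results field. The function name is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL for the API v1 identifications endpoint
API_BASE = "https://api.inaturalist.org/v1/identifications"


def identification_count_url(user_id, d1, d2, place_ids=None, per_page=1):
    """Build a URL whose JSON response should carry the total for a date range.

    per_page is kept small because only the total is of interest here.
    """
    params = {
        "current": "any",
        "user_id": user_id,
        "d1": d1,
        "d2": d2,
        "per_page": per_page,
    }
    if place_ids:
        # Several places go into one comma-separated place_id value
        params["place_id"] = ",".join(str(p) for p in place_ids)
    # safe="," keeps the commas readable instead of encoding them as %2C
    return API_BASE + "?" + urlencode(params, safe=",")


url = identification_count_url("mftasp", "2020-11-01", "2020-11-30")
```

Fetching that URL and reading total_results from the JSON should give the same number the tool shows as “total records”, if the assumptions above hold.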


if you just need a count of identifications for an identifier, the most efficient endpoint probably will be the Identification Identifiers endpoint.

ex. https://jumear.github.io/stirfry/iNatAPIv1_identifications_identifiers.html?current=any&user_id=mftasp&d1=2020-11-01&d2=2020-11-30

also, when searching for identifications using any of the Identifications API endpoints, make sure you add &own_observation=false to the parameter list in your API request, if you’re looking for identifications where the identifier was not the observer.


I am developing a tool for reviewing and identifying observations (primarily intended for reviewing the “Unknown” observations, but any search query may be used to collect the observations to review). Having your question in mind, I included the generation of a daily and monthly report of the identifications made by the user. This could match your need, provided you do all your identifications with this tool. Coming soon…

See here:
https://forum.inaturalist.org/t/amount-of-unknown-records-is-decreasing/8594/456


I like this idea. I think the identifications page should be formatted the same way as the species page, where it lists all your IDs by species, starting with the most observed. Then you could click on a species and be taken to all the individual observations which you put your ID on.


Did you ever figure out how to download identifications as a table?

what is an example of a use case for this?

The specific case I’m thinking of at the moment is: Guy Nesom revised Heterotheca section Chrysanthe. I’ve been going through iNaturalist observations based on Nesom’s revision. I’ve also been writing identification keys for states and other regions. I’m thinking I’ll write a short publication basically getting the keys out into the world. I would like to be able to provide a data file of the observations I have identified as each species, as well as maps of those observations. However, I have no way to export the set of observations I have identified as a given taxon.

I can imagine various similar cases in which I might want to do the same for either my identifications or the identifications of others. For my own purposes, the basic minimal requirement to meet the cases I can readily think of at the moment would be: provide values for taxon_id and user_id, return a table with taxon_id, user_id, and observation_id. Then I can use the export tool to get a table with coordinates and whatnot, and join the taxon_id & user_id to it using observation_id. Cases where the community ID on an observation is very different from the taxon_id I’m interested in would not be easily handled, but those are rare enough that I’m not bothered.
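The join described above could look roughly like this in Python. This is only a sketch: it assumes the export file has an id column holding the observation ID, and the sample rows, coordinates, and the names join_identifications, ident_taxon_id, and ident_user_id are all made up for illustration.

```python
import csv
import io


def join_identifications(export_rows, observation_ids, taxon_id, user_id):
    """Keep exported rows whose observation ID was identified, and tag each
    with the taxon_id and user_id that were used in the identification query."""
    wanted = {int(i) for i in observation_ids}
    joined = []
    for row in export_rows:
        # Assumption: the export's "id" column is the observation ID
        if int(row["id"]) in wanted:
            joined.append({**row, "ident_taxon_id": taxon_id, "ident_user_id": user_id})
    return joined


# Tiny invented stand-in for a downloaded export file
sample_export = io.StringIO(
    "id,latitude,longitude\n"
    "101,35.1,-106.6\n"
    "102,32.3,-106.8\n"
    "103,34.0,-107.0\n"
)
rows = join_identifications(
    csv.DictReader(sample_export), [101, 103], taxon_id=77264, user_id="aspidoscelis"
)
```

With a real export you would open the downloaded CSV instead of the in-memory sample, and the list of observation IDs would come from the identifications query.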

So far, I’ve basically figured out that the information I’m interested in is indeed accessible through the iNaturalist API, but that figuring out what to do with it is beyond my current skills. The data I get back in R from something like httr::GET("https://api.inaturalist.org/v1/identifications?user_id=aspidoscelis&taxon_id=77264") has a more complicated structure than I know what to do with.

Also, for what it’s worth, I think this is a basic requirement anyone who uses iNaturalist in taxonomy is going to have. In ye olden days, you’d indicate which specimens you consulted by writing out a list like this:

Mexico. Coahuila: Encina & al. 1634 (MEXU 1372693); Palmer s.n. (YU 20021); Pinkava & Reeves R-4329 (HUAP 27834, MEXU 1403971); Wynd & Mueller 318 (US 1639759). Nuevo León: Briones 1883 (BRIT 432135); Copeland s.n. (MICH 1208316); Dorr & al. 2575 (UC 1513612); Estrada 16202 (BRIT 432136); Fryxell & Kirkpatrick 2469 (VT 286371); Gastony & Yatskievych 86-24 (IND 3412); Hinton & Hinton 21460 (MO 3605676); Hinton 21140 (MO 3605321); Kimber s.n. (PH 737306); Knobloch 2017 (MSC 267092); McCulloch 76-71-Mc (MSC 267090); Palmer s.n. (YU 20022); Pennell 16954 (HUAP 27834, MEXU 1403971); Rodríguez 88 (MEXU 821157); Storer 68 (MICH 1208287). San Luis Potosí: Gastony & Yatskievych 86-27 (IND 3409); Pringle s.n. (HUAP 27834, MEXU 1403971). Tamaulipas: Bartlett 10183 (MICH 1208222); Bartlett 10313 (MEXU 88785, MICH 1208286, US 1490578); Bartlett 10658 (MICH 1208223); Bartlett 10707 (MICH 1208291); Bartlett 10802 (MICH 1208219, US 1490603); Briones 1234 (MEXU 844575); Knobloch 2245 (F 633209, MSC 267094); Runyon 717 (BRIT 432123); Walker & Baker 2088 (WIS 113330); Windham & al. 500 (UT 99958); Yatskievych & Gastony 86-44 (IND 136927).

So, phrased more generically, the question for digital biodiversity data in taxonomy is: What’s the best way of filling the role of a specimen list, and how do we ensure it isn’t nearly as tedious as typing in hundreds of accession numbers?

Heterotheca subaxillaris is one of my favorite plants in my local area. i love introducing people to it, especially the smell when you rub leaves.

there’s not an easy direct way to do this right now, but you can get maybe 85% of the way there using an observation search that includes the ident_user_id filter parameter, possibly in conjunction with the ident_taxon_id parameter.

for example, the below query would give you all observations where the observation ID is H. subaxillaris to which you have contributed an identification (your ID may or may not be H. subaxillaris):
https://www.inaturalist.org/observations?taxon_id=77398&ident_user_id=aspidoscelis

and the below query would give you all observations where someone contributed an identification = H. subaxillaris (not necessarily you) and where you contributed an ID (not necessarily H. subaxillaris):
https://www.inaturalist.org/observations?ident_taxon_id=77398&ident_user_id=aspidoscelis

neither of the queries above necessarily returns results where you made an identification of H. subaxillaris, but a lot of those results should fall in that category. and the advantage of this approach is that it’s relatively uncomplicated, and you can view things on a map relatively quickly and export the result relatively easily, since the observation search provides both those functions with just a few extra steps.

if you need to be more precise about getting only results where you made an identification of H. subaxillaris, then that’s a more complicated process. generally, you would have to get a set of your identifications and then get the associated observations. this can be done in various ways.

myself, i would probably do it mostly programmatically with some sort of scripting language + the API. in general, the main thing would be to GET /v1/identifications, extract json.results[i].observation.id, and page through the results based on json.total_results vs json.page * json.per_page (or until no more results are fetched). but if your preferred scripting language is R, and you’re having trouble going down that path, then i would suggest next trying a tool like (Microsoft) Power Automate to help you extract (scrape) the identifications data from either the Identification page or the page that i made that puts the API results in a human-friendly format.
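a rough Python sketch of that paging loop, in case it helps to see the shape of it. it assumes each result carries the observation under observation.id and that the response reports a total_results count alongside page and per_page. the function names, the one-second pause, and the per_page of 200 are choices made for illustration, not anything the API requires.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.inaturalist.org/v1/identifications"


def fetch_page(params):
    """Fetch one page of identifications and return the decoded JSON."""
    with urlopen(API_URL + "?" + urlencode(params)) as response:
        return json.load(response)


def collect_observation_ids(user_id, taxon_id, fetch=fetch_page, per_page=200, pause=1.0):
    """Page through identifications and collect the observation IDs."""
    ids = []
    page = 1
    while True:
        data = fetch({"user_id": user_id, "taxon_id": taxon_id,
                      "per_page": per_page, "page": page})
        results = data.get("results", [])
        if not results:
            break  # nothing more was fetched
        ids.extend(item["observation"]["id"] for item in results)
        # Stop once the pages fetched so far cover the reported total
        if page * per_page >= data.get("total_results", 0):
            break
        page += 1
        time.sleep(pause)  # be gentle with the API between requests
    # One observation can carry several identifications, so de-duplicate
    return sorted(set(ids))
```

calling collect_observation_ids("aspidoscelis", 77264) would make real requests. the fetch argument is there so the loop can be tried against canned data first.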

in the couple of posts noted below, i’ve described an approach that will use Power Automate to scrape a few hundred observation IDs from the Favorites page and then query those in the Observation search page. the same kind of approach could be applied, replacing the Favorites page with the Identifications page as the source:

for more than several hundred observations (but fewer than 10,000), you could scrape/extract the observation ids (either using a script or via something like Power Automate) into a spreadsheet and then compare those against a downloaded/exported set of https://www.inaturalist.org/observations?ident_taxon_id=77398&ident_user_id=aspidoscelis.

if you have additional questions, or need more help either writing an R script or a set of actions for a Power Automate flow to handle data extraction, let me know.

as i understand it, this is where downloading data from a source like GBIF shines, since it provides you a DOI for your export set. (research grade observations from iNat which are properly licensed eventually get pushed to GBIF.) unfortunately, i’m not aware of similar functionality in iNat, and i assume the best thing to do in this system would be to capture everything in a project and then reference all observations in the project, or else make a giant query that references a comma-separated list of observation IDs, and then use a link-shortening service to provide a shortened reference to your giant link.
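for the giant-query option, a small Python sketch of building the links. it assumes the observation search accepts an id parameter holding a comma-separated list of observation IDs. the chunk size of 200 is an arbitrary guess at keeping each URL to a manageable length, and the function name is made up.

```python
OBS_SEARCH = "https://www.inaturalist.org/observations"


def observation_id_urls(observation_ids, chunk_size=200):
    """Split a set of observation IDs into one search URL per chunk."""
    unique_ids = sorted(set(observation_ids))
    urls = []
    for start in range(0, len(unique_ids), chunk_size):
        chunk = unique_ids[start:start + chunk_size]
        # Assumption: id= takes a comma-separated list of observation IDs
        urls.append(OBS_SEARCH + "?id=" + ",".join(str(i) for i in chunk))
    return urls
```

each resulting link could then be run through a link shortener, or used directly for export.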

… that said, i’m not a researcher. so i have no idea what real researchers do exactly.


Thanks!

Using taxon_id + ident_user_id gets me more like 1/3 of the way there, unfortunately. Using ident_taxon_id + ident_user_id gets pretty close, because most of these are new names that have rarely been used by other iNaturalists. However, that also makes it a pretty fragile solution. With both of these options I’d have to think pretty carefully on a species-by-species basis about how badly I should expect them to perform and what to do about it.

Poking at the API in R a bit more, it looks like if I’m just trying to get the observation_id out, I can get it to work. Then I just add the taxon_id and user_id after the fact. Earlier I was thinking I should pull all the relevant fields out of the json result, but I’m used to thinking in tables and don’t know how to handle a massive hierarchy of nested lists, especially when I have no idea what the data structure is beyond what I can see from poking it with a stick. I don’t know how to handle paging well, either, but I think I can get something to work in the least elegant way possible. :-)

For what it’s worth, I think GBIF was designed with a “get a pile of data and hope someone upstream did good QA / QC” workflow in mind. If I wanted a bunch of point data for a species that is taxonomically unambiguous and easy to ID, I expect it’d work great. If I want to look at and identify the observations, iNaturalist has a very good UI designed with a different target audience in mind, and GBIF is basically just a dark closet where hope goes to die.


Also, as it happens, Heterotheca subaxillaris is in Heterotheca section Heterotheca rather than Heterotheca section Chrysanthe. So, though it’s a lovely species, for my purposes it’s by-catch. :-)

assuming that observation id is json.results[i].observation.id, then if you want observation taxon and observation user, they would be json.results[i].observation.taxon.name (or json.results[i].observation.taxon.id) and json.results[i].observation.user.login (or json.results[i].observation.user.id), respectively.
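if the nested lists are the sticking point, here is a small Python sketch of flattening just those fields into a table. it assumes the paths named above (observation.id, observation.taxon, observation.user) and guards against a missing taxon. the column names, function names, and sample records are invented for illustration.

```python
import csv
import io

FIELDS = ["observation_id", "observation_taxon_id", "observation_taxon_name",
          "observer_id", "observer_login"]


def flatten_results(results):
    """Turn nested identification results into flat rows, one per result."""
    rows = []
    for item in results:
        obs = item.get("observation") or {}
        taxon = obs.get("taxon") or {}  # may be absent on some observations
        user = obs.get("user") or {}
        rows.append({
            "observation_id": obs.get("id"),
            "observation_taxon_id": taxon.get("id"),
            "observation_taxon_name": taxon.get("name"),
            "observer_id": user.get("id"),
            "observer_login": user.get("login"),
        })
    return rows


def to_csv(rows):
    """Render the flat rows as CSV text; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample shaped like the structure described above
sample = [
    {"observation": {"id": 1, "taxon": {"id": 10, "name": "Example taxon"},
                     "user": {"id": 9, "login": "someone"}}},
    {"observation": {"id": 2, "taxon": None, "user": {"id": 9, "login": "someone"}}},
]
table = to_csv(flatten_results(sample))
```

the same flatten_results call could be applied to json.results from each page as it is fetched.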

identification taxon name is harder to get without some extra logic.

I don’t think I did! But maybe I can.

Sorry, I meant the user_id and taxon_id associated with the identification, not with the observation. But since those are things I’m basing the query on, well, I already know what the values are…


Thanks to @pisum pointing me in the right direction, I think I’ve figured it out–but if you ever come up with an easy method, let me know!
