Search and filter identifications

Heterotheca subaxillaris is one of my favorite plants in my local area. i love introducing people to it, especially the smell when you rub the leaves.

there’s not an easy direct way to do this right now, but you can get maybe 85% of the way there using an observation search that includes the ident_user_id filter parameter, possibly in conjunction with the ident_taxon_id parameter.

for example, the below query would give you all observations where the observation ID is H. subaxillaris to which you have contributed an identification (your ID may or may not be H. subaxillaris):
https://www.inaturalist.org/observations?taxon_id=77398&ident_user_id=aspidoscelis

and the below query would give you all observations where someone contributed an identification = H. subaxillaris (not necessarily you) and where you contributed an ID (not necessarily H. subaxillaris):
https://www.inaturalist.org/observations?ident_taxon_id=77398&ident_user_id=aspidoscelis

neither of the queries above necessarily returns results where you made an identification of H. subaxillaris, but a lot of those results should fall in that category. and the advantage of this approach is that it’s relatively uncomplicated, and you can view things on a map relatively quickly and export the result relatively easily, since the observation search provides both those functions with just a few extra steps.
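if you'd rather build those two search URLs in a script than by hand, a minimal sketch might look like the following. the helper name `observation_search_url` is made up for illustration; the parameter names (`taxon_id`, `ident_taxon_id`, `ident_user_id`) are the ones used in the queries above.

```python
from urllib.parse import urlencode

BASE = "https://www.inaturalist.org/observations"


def observation_search_url(ident_user_id, taxon_id=None, ident_taxon_id=None):
    """Build an observation search URL (hypothetical helper, for illustration)."""
    params = {}
    if taxon_id is not None:
        # filter on the observation's own taxon
        params["taxon_id"] = taxon_id
    if ident_taxon_id is not None:
        # filter on the taxon of any identification on the observation
        params["ident_taxon_id"] = ident_taxon_id
    # filter on who contributed an identification
    params["ident_user_id"] = ident_user_id
    return f"{BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(observation_search_url("aspidoscelis", taxon_id=77398))
    print(observation_search_url("aspidoscelis", ident_taxon_id=77398))
```

the two printed lines should match the two example links above.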

if you need to be more precise about getting only results where you made an identification of H. subaxillaris, then that’s a more complicated process. generally, you would have to get a set of your identifications and then get the associated observations. this can be done in various ways.

myself, i would probably do it mostly programmatically with some sort of scripting language + the API. in general, the main thing would be to GET /v1/identifications, extract json.results[i].observation.id from each result, and page through the results by comparing json.total_results against json.page * json.per_page (or simply continuing until no more results are fetched). but if your preferred scripting language is R, and you’re having trouble going down that path, then i would suggest next trying a tool like (Microsoft) Power Automate to help you extract (scrape) the identifications data from either the Identification page or the page that i made that puts the API results in a human-friendly format.
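the paging idea can be sketched in Python roughly as below. treat it as a starting point under a few assumptions rather than a finished tool: the request parameters (`user_id`, `taxon_id`, `current`, `per_page`, `page`) and the response fields (`total_results`, `page`, `per_page`, `results[i].observation.id`) are my understanding of the v1 API and are worth checking against the API docs; the function names and the one-second pause between requests are just illustrative choices.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.inaturalist.org/v1/identifications"


def fetch_page(params):
    """GET one page of identifications and return the decoded JSON."""
    with urlopen(f"{API_URL}?{urlencode(params)}") as resp:
        return json.load(resp)


def observation_ids_for_identifications(user_id, taxon_id, fetch=fetch_page,
                                        per_page=200, pause=1.0):
    """Collect observation IDs for one user's identifications of one taxon.

    `fetch` is injectable so the paging logic can be tried without the network.
    """
    ids, seen = [], set()
    page = 1
    while True:
        data = fetch({
            "user_id": user_id,      # assumed parameter: who made the ID
            "taxon_id": taxon_id,    # assumed parameter: taxon of the ID
            "current": "true",       # assumed parameter: skip withdrawn IDs
            "per_page": per_page,
            "page": page,
        })
        results = data.get("results", [])
        if not results:
            break  # nothing more was fetched
        for ident in results:
            obs_id = ident["observation"]["id"]
            if obs_id not in seen:  # several IDs can sit on one observation
                seen.add(obs_id)
                ids.append(obs_id)
        if data["page"] * data["per_page"] >= data["total_results"]:
            break  # the last page has been reached
        page += 1
        time.sleep(pause)  # be gentle with the API between requests
    return ids


if __name__ == "__main__":
    print(observation_ids_for_identifications("aspidoscelis", 77398))
```

passing the fetch function in as an argument is only there to make the loop easy to test with canned pages; a plain loop around `fetch_page` would work just as well.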

in the couple of posts noted below, i’ve described an approach that will use Power Automate to scrape a few hundred observation IDs from the Favorites page and then query those in the Observation search page. the same kind of approach could be applied, replacing the Favorites page with the Identifications page as the source:

for more than several hundred observations (but fewer than 10,000), you could scrape/extract the observation IDs (either using a script or via something like Power Automate) into a spreadsheet and then compare those against a downloaded/exported set of results from https://www.inaturalist.org/observations?ident_taxon_id=77398&ident_user_id=aspidoscelis.
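if a script is easier than a spreadsheet, the comparison step could look something like this. it assumes the exported file is a CSV with an `id` column holding the observation ID (check your export's header row); the function names and the file name in the usage example are placeholders.

```python
import csv


def ids_in_export(csv_path, id_column="id"):
    """Read observation IDs from an exported CSV (column name is an assumption)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return {int(row[id_column]) for row in csv.DictReader(f)}


def split_by_export(my_ids, export_ids):
    """Return (IDs found in the export, IDs missing from the export), sorted."""
    mine = set(my_ids)
    return sorted(mine & export_ids), sorted(mine - export_ids)


if __name__ == "__main__":
    # placeholder file name and IDs, for illustration only
    exported = ids_in_export("observations_export.csv")
    matched, missing = split_by_export([101, 102, 103], exported)
    print(len(matched), "matched;", len(missing), "not in export")
```

the rows of the export whose IDs land in the first list are the observations where you made the identification of interest.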

if you have additional questions, or need more help either writing an R script or a set of actions for a Power Automate flow to handle data extraction, let me know.

as i understand it, this is where downloading data from a source like GBIF shines, since it provides you a DOI for your export set. (research grade observations from iNat which are properly licensed eventually get pushed to GBIF.) unfortunately, i’m not aware of similar functionality in iNat, and i assume the best thing to do in this system would be to capture everything in a project and then reference all observations in the project, or else make a giant query that references a comma-separated list of observation IDs, and then use a link-shortening service to provide a shortened reference to your giant link.
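for the "giant query" option, a small sketch of assembling the link is below. it assumes the observation search accepts a comma-separated list of observation IDs in an `id` parameter; the helper name is made up, and very long URLs may hit browser or server length limits, so this is probably only practical for a modest number of observations.

```python
from urllib.parse import urlencode


def observations_url(observation_ids):
    """Build one search URL listing every observation ID (hypothetical helper)."""
    id_list = ",".join(str(i) for i in observation_ids)
    # safe="," keeps the commas readable instead of percent-encoding them
    query = urlencode({"id": id_list}, safe=",")
    return "https://www.inaturalist.org/observations?" + query


if __name__ == "__main__":
    print(observations_url([1, 2, 3]))
```

the resulting link is what you would then hand to a link-shortening service.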

… that said, i’m not a researcher. so i have no idea what real researchers do exactly.
