How to search for first observations in my state (mine)

pisum · May 27, 2021, 7:36pm

i’m not a SQL-explaining mood today. so maybe some other day i’ll explain the SQL process for this specific case a little more. i wrote something previously (https://forum.inaturalist.org/t/getting-the-inaturalist-aws-open-data-metadata-files-and-working-with-them-in-a-database/22135#work-with-the-data-in-a-sqlite-database-6) that explains how to load data to a SQLite database and run some queries. so for those who are interested in databases, you could look at that for now.

(tip: if you don’t actually need to save the database, you can create a new in-memory database rather than creating and saving a new database.)

you can’t do a distinct count in SQL queries in Excel’s version of SQL (or at least not without some convolutions, i think). so i’m not going to explain how to do that.

it’s possible to do something similar in basic Excel + Power Pivot, but it’s more manual work than just importing data and running a query. i’ll explain the key steps below:

export data from iNaturalist. here are the settings i’m using in my example:
- Query: quality_grade=any&identifications=any&place_id=33&taxon_id=47157&hrank=species&verifiable=true
- Columns: id, user_id, user_login, created_at, scientific_name, common_name, taxon_id, taxon_species_name
open up the data in Excel:

image1920×269 29.8 KB
add a column to identify when a row has user_login=becksnyc. (below, I added a “becksnyc” header in cell I1, added a formula =IF(C2=I$1,1,0) in cell I2, and then copied the formula down the column for all rows in the dataset.)

image1920×271 30.8 KB
save your data as an .xlsx file at this point. (you need it saved to properly initiate a Power Pivot.)
now select your data, and go to Insert > Pivot Table
when prompted, make sure you check the “Add this data to the Data Model” box at the bottom, and click OK. (checking the box creates adds the data as a table in Power Query and creates a Power Pivot against that data. otherwise, you would get just a plain Pivot Table, which doesn’t handle Distinct Count aggregations.)
in your Pivot Table field thing, drop taxon_species_name into Rows, and drop becksnyc, id, and user_id into Values. then change the Value Field settings for id to summarize by Count instead of Sum. for user_id, summarize by Distinct Count instead of Sum

image1412×870 60.6 KB
at this point, the data that shows up in the Pivot Table is basically what you’ll need to get the 2 things you were looking for in your original post. if sum(becksnyc)=count(id), then that means the species was observed only by you. distinct count(user_id) tells you how many users (including you) observed the species. so just perform additional manipulation of the data to get your desired result.

EDIT: actually, now that i think about it again, you don’t need count(id) in the Pivot Table. if sum(becksnyc)>0 and distinct count(user_id)=1, then that means you’re the only one to have observed the species.

Topic		Replies	Views
URL search for new species of Lepidoptera this year General	6	250	July 2, 2021
How can I find certain observations? General	6	381	October 22, 2022
How to find solo observations in an area? General question	11	456	December 27, 2024
Accurate species counts General	4	354	March 6, 2024
Way to search for species with less than a certain number of observations General	5	209	August 10, 2024

How to search for first observations in my state (mine)

Related topics