How to search for first observations in my state (mine)

i’m not a SQL-explaining mood today. so maybe some other day i’ll explain the SQL process for this specific case a little more. i wrote something previously (https://forum.inaturalist.org/t/getting-the-inaturalist-aws-open-data-metadata-files-and-working-with-them-in-a-database/22135#work-with-the-data-in-a-sqlite-database-6) that explains how to load data to a SQLite database and run some queries. so for those who are interested in databases, you could look at that for now.

(tip: if you don’t actually need to save the database, you can create a new in-memory database rather than creating and saving a new database.)

you can’t do a distinct count in SQL queries in Excel’s version of SQL (or at least not without some convolutions, i think). so i’m not going to explain how to do that.

it’s possible to do something similar in basic Excel + Power Pivot, but it’s more manual work than just importing data and running a query. i’ll explain the key steps below:

  1. export data from iNaturalist. here are the settings i’m using in my example:
    • Query: quality_grade=any&identifications=any&place_id=33&taxon_id=47157&hrank=species&verifiable=true
    • Columns: id, user_id, user_login, created_at, scientific_name, common_name, taxon_id, taxon_species_name
  2. open up the data in Excel:
  3. add a column to identify when a row has user_login=becksnyc. (below, I added a “becksnyc” header in cell I1, added a formula =IF(C2=I$1,1,0) in cell I2, and then copied the formula down the column for all rows in the dataset.)
  4. save your data as an .xlsx file at this point. (you need it saved to properly initiate a Power Pivot.)
  5. now select your data, and go to Insert > Pivot Table
  6. when prompted, make sure you check the “Add this data to the Data Model” box at the bottom, and click OK. (checking the box creates adds the data as a table in Power Query and creates a Power Pivot against that data. otherwise, you would get just a plain Pivot Table, which doesn’t handle Distinct Count aggregations.)
    image
  7. in your Pivot Table field thing, drop taxon_species_name into Rows, and drop becksnyc, id, and user_id into Values. then change the Value Field settings for id to summarize by Count instead of Sum. for user_id, summarize by Distinct Count instead of Sum
  8. at this point, the data that shows up in the Pivot Table is basically what you’ll need to get the 2 things you were looking for in your original post. if sum(becksnyc)=count(id), then that means the species was observed only by you. distinct count(user_id) tells you how many users (including you) observed the species. so just perform additional manipulation of the data to get your desired result.

EDIT: actually, now that i think about it again, you don’t need count(id) in the Pivot Table. if sum(becksnyc)>0 and distinct count(user_id)=1, then that means you’re the only one to have observed the species.

2 Likes