'Needs ID' pile, and identifications

Use pisum’s URL from the tables at the start of this topic and add date parameters to the URL.
https://jumear.github.io/stirfry/iNat_obs_counts_by_iconic_taxa.html
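
For example, the same counts can be pulled straight from the iNat API that the page above wraps. A minimal sketch, assuming Python with the requests library; the cutoff date is hypothetical:

  import requests

  API = "https://api.inaturalist.org/v1/observations"

  def count(params):
      # per_page=0 asks the API for just the total_results count, no records
      r = requests.get(API, params={**params, "per_page": 0})
      r.raise_for_status()
      return r.json()["total_results"]

  cutoff = "2020-12-31"  # hypothetical cutoff date
  needs_id = count({"verifiable": "true", "quality_grade": "needs_id", "created_d2": cutoff})
  verifiable = count({"verifiable": "true", "created_d2": cutoff})
  print(f"Needs ID share: {needs_id / verifiable:.1%}")

Note that this counts the current status of observations created before the cutoff, not their status as of that date.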

Thank you, but unfortunately I am so unskilled at using the API that I cannot figure out how to reach into the past, so to speak, and use it to figure out what the ratio of Needs ID observations to Verifiable was for a date in the past. I can use the Explore tab in iNat proper to figure out the current ratio for observations made before a certain date (and the ratio is less than 39%, for the couple of dates I checked), but I would expect that ratio to decline over time anyway, as identifiers get around to older observations.

Maybe we should ask @pisum to contribute their considerable expertise with the API to this question?


Me too. But I pull up 10 obs at a time. If it’s the dubious ones, I glance through, pick out what I can, and mark all as reviewed.
For the new, or the few, I will try my best to move each ID on as far as I can.


Hmm, hard one. I’m not sure you can pull the past status of observations, only their current status? But I haven’t tried it yet. It might be easier to hunt down the links for previous Year in Review pages, although I don’t know how many years staff have been making those.

Good point. Hey, @tiwane, any chance you could weigh in here on how the ratio of Needs ID to Verifiable observations has changed over the years?

To me, it seems pretty stable over the last 5 years. It does fluctuate by up to 6%, but there’s no wild upward swing. Here are the numbers from the Year in Review each year:
2017, 36.7% Needs ID
2018, 40.18%
2019, 40.57%
2020, 37.74%
2021, 42.02%

Note: I am not completely sure whether these numbers show observations that were Needs ID at the time, or observations from that year that still need ID even now. I suspect it’s the latter.


It seems that 2014 is the first year for which the Year in Review link functions. Interestingly, the Needs ID percentage does seem to be lower more than five years back, in the high 20s or low 30s rather than the high 30s or low 40s:
2016, 32.89%
2015, 29.71%
2014, 28.63%

Edit: here’s a graph. However, as I said above, I am not sure if these are observations from that year that still need ID even now or observations that were Needs ID at the time.
[graph: Needs ID percentage from each Year in Review]


it is possible to answer these kinds of questions only if you have a snapshot of a metric you’re trying to compare, from the point in time that you’re trying to compare against. for needs ID vs verifiable, i’m only aware of a few snapshots of data i’ve captured (ex. https://forum.inaturalist.org/t/needs-id-pile-and-identifications/26904/4).

iNat staff will have to say if they’re collecting more systematic snapshots of any of these measures.

with the year end / new year coming soon, it may be worth thinking about a few such metrics you’re particularly interested in comparing over time, and taking snapshots of those so that they can be referenced in the future.
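
for example, a snapshot could be as simple as appending today’s counts to a CSV. a rough sketch, assuming Python with the requests library; the filename and the choice of metrics are assumptions:

  import csv, datetime, os, requests

  def total(url, **params):
      # per_page=0 returns just the total_results count
      r = requests.get(url, params={**params, "per_page": 0})
      r.raise_for_status()
      return r.json()["total_results"]

  obs = "https://api.inaturalist.org/v1/observations"
  row = {
      "date": datetime.date.today().isoformat(),
      "verifiable": total(obs, verifiable="true"),
      "needs_id": total(obs, verifiable="true", quality_grade="needs_id"),
  }

  new_file = not os.path.exists("inat_snapshots.csv")  # hypothetical filename
  with open("inat_snapshots.csv", "a", newline="") as f:
      w = csv.DictWriter(f, fieldnames=row.keys())
      if new_file:
          w.writeheader()
      w.writerow(row)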


https://www.inaturalist.org/stats/ will show that, but only for the past 12 months:

[chart from the stats page: observers and identifiers by month]
The number of identifiers seems much more static than the number of observers.


Ah, thanks, @pisum! That’s what I thought, but I wasn’t sure. That’s why I pulled in @tiwane, because I assume iNat is keeping track of such things behind the scenes.

And thanks to you, too! If the number of identifiers has not grown at the same rate as observers over the past decade, then maybe the underlying issue is that we need to recruit more identifiers.

Perhaps I shall badger my botanist friends in the new year…


Did you conclude that based on what I said? I don’t think the info I provided supports that conclusion. The graph from https://www.inaturalist.org/stats/ is one year, not a decade, and the graph I made from the Year in Review numbers was just the number of Needs ID observations, not the number of observers or identifiers.


As far as I know we haven’t done much delving into this, but I’m definitely not the stats guy.


@ddubois2 You got me curious about this too, so I’ve attempted some number crunching. My quoted statement above, although it feels true to me as I look through plant observations, probably isn’t true.

Observers in my county:
17,660 who have at least one observation in my county
13,830 who have at least one verifiable observation in the county
(Therefore I assume 3,830 people have only casual observations.)
8,764 who have at least one observation in the county that Needs ID.
(Therefore 5,066 people have only research grade observations in this county, wow.)

Observations in the county:
317,785 total
277,201 verifiable
73,684 Needs ID

The top 500 observers of verifiable observations individually range from 59 verifiable observations to 17,603. Collectively they have 192,756 verifiable observations, or 69.5% of total verifiable observations.

The top 500 observers of Needs ID observations, which are not necessarily the same people as above (259 overlap, but the rest don’t), individually range from 18 Needs ID obs to 4,665. Collectively they have 46,802, or 63.5% of the Needs ID observations.

If I consider only the 259 people who are on both lists, they have 63.7% of verifiable observations and 59.0% of Needs ID observations.
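
For anyone who wants to check the shares above against the raw counts, a quick sketch in Python:

  verifiable, needs_id = 277_201, 73_684
  top500_verifiable, top500_needs_id = 192_756, 46_802
  print(f"{top500_verifiable / verifiable:.1%}")  # 69.5%
  print(f"{top500_needs_id / needs_id:.1%}")      # 63.5%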


Nope, I agree that what you provided from https://www.inaturalist.org/stats/ was limited to one year, and that the Year in Review numbers were just the number of Needs ID observations.

What I did conclude from the one-year stats was that it was possible that the ratio of identifiers to observers has decreased over the years.

So who should we ask for those sorts of stats?

It’s certainly interesting how the observer:identifier ratio fluctuates from about 2:1 to 5:1 just depending on the time of year.


i thought a bit more about whether this can be answered without snapshots of earlier data. it won’t reflect things exactly right, but you can estimate data from past points in time using some API calls. here are the results (for verifiable observations), as retrieved on 2021-12-28:

[table of estimated observer and identifier counts over time]
i’ll let you interpret the data for yourselves, though i will caution against using this data alone as the basis for saying whether or not growth in identifiers is sufficient.

in case anyone wants to try to replicate or re-run my numbers above, i created this in Excel. assuming you have Excel and your version / setup allows you to run the LAMBDA and WEBSERVICE functions, you could create the same thing by creating these custom named functions (go to Formulas > Defined Names > Define Name…):

  • getCountFromJSON
    get the numeric value associated with the first element (total count) in the API response (in JSON format)
    =LAMBDA(j,VALUE(MID(j,FIND(":",j)+1,FIND(",",j)-FIND(":",j)-1)))
  • getObserverCount
    Get number of Observers based on verifiable observations created up to a given datetime
    =LAMBDA(d,getCountFromJSON(WEBSERVICE("https://api.inaturalist.org/v1/observations/observers?per_page=0&verifiable=true&created_d2="&d)))
  • getIdentifierCount1
    Get number of Identifiers based on identifications up to a given datetime (for others’ verifiable observations)
    =LAMBDA(d,getCountFromJSON(WEBSERVICE("https://api.inaturalist.org/v1/identifications/identifiers?per_page=1&quality_grade=research,needs_id&current=any&own_observation=false&d2="&d)))
  • getIdentifierCount2
    Get number of Identifiers based on identifications up to a given datetime (for verifiable observations)
    =LAMBDA(d,getCountFromJSON(WEBSERVICE("https://api.inaturalist.org/v1/identifications/identifiers?per_page=1&quality_grade=research,needs_id&current=any&d2="&d)))

It’s understandable, with the summer urge to observe and the stable identifiers who work year-round.


For me, the big turn-off (concerning IDing plants) is the sheer number of observations of garden plants, pot plants, plants in botanical gardens, etc. Why people think what they grow in their garden is relevant for iNaturalist (and similar platforms) is beyond me.
