Accuracy Experiment: Increase in Uncertain IDs over time

I’ve been studying the results of the most recent Observation Accuracy Experiment (0.6). There is a lot to unpack there. Some discussion has taken place under the blog post about the same experiment, but the detailed outcomes of such an experiment are difficult to explore or follow in blog-post comments. So I’m posting this topic separately.

My Observation: In the graph showing the accuracy results based on the date of observation, there appears to be a gradual increase in the proportion of both “Incorrect” and “Uncertain” IDs over the 15-year span of iNat data. This is evident for “All” observations (pink and gray portions of each bar):


and particularly notable for the subset of “Verifiable” observations:

These bar charts are useful up to a point, but since they are normalized to a 100% y-axis, they leave me wondering what the trend really represents and why it might be happening.

Clearly, everyone is aware that there has been an exponential increase in the overall number of observations on iNaturalist during this time frame, so we should expect the numbers of accurate, uncertain, and misidentified observations to increase in absolute terms. But I have to ask: what factors are contributing to the increased proportion of Incorrect and Uncertain IDs over time?

The simplest explanation is that the increasing taxonomic and geographic coverage of iNat observations necessarily comes with the addition of taxonomically more challenging groups and observations from less-well-documented regions (e.g. outside of North America and Europe).

But I’m wondering what other factors might contribute to this trend. I may do some simple statistical analyses of these trends and post them to a journal article. I’ll be interested to hear other theories.

7 Likes

Hypothesis, based on the Year ‘accuracy’ values found under the ‘Research Grade Results’ tab (whereby a few early experiments saw a ‘100% Correct’ result, e.g. in 2015): the ‘accuracy’ is partly influenced by the design of each yearly experiment. As the design evolves (gets more refined, better targeted at a changing base of increasingly regional+knowledgeable reviewers, etc.), I would expect the ‘accuracy’ to drop.

2 Likes

Another “simple” contributing factor might be that, as iNaturalist becomes better known, more and more observers are not experienced naturalists and thus more likely to make inaccurate, uncertain, or misidentified observations.

9 Likes

Does your graph show results for ONE experiment with many years of observations included?

2 Likes

Does anyone else see a label missing for one of the bars under Continent?

1 Like

Those are not my graphs; those are from the Observation Accuracy Experiment 0.6 page from staff.
https://www.inaturalist.org/observation_accuracy_experiments/10
And, yes, those results are for the most recent experiment (0.6). Their random sample of 10,000 observations included observations dating back to 2012. The accuracy experiments themselves only date back to the past few years. I was just examining the results of the most recent such effort.

3 Likes

So this is then simply an age effect, no? Plot observation age instead of observation year, and accuracy increases with age, as you might expect.

Now comparing observations of the same age (maybe one year old for example) across different years, that would be interesting to look at.
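The re-plotting suggested above is easy to sketch. This is a minimal illustration with made-up sample data (the tuples and verdict labels are hypothetical, not from the experiment): group each sampled observation by its age in whole years rather than its calendar year, then compute the proportion judged correct per age bin.

```python
from datetime import date
from collections import defaultdict

# Hypothetical sample: (observation_date, verdict) pairs, where verdict is a
# reviewer's judgement in the experiment ("correct", "uncertain", "incorrect").
sample = [
    (date(2013, 6, 1), "correct"),
    (date(2013, 7, 15), "correct"),
    (date(2019, 5, 20), "uncertain"),
    (date(2023, 8, 2), "incorrect"),
    (date(2024, 4, 30), "uncertain"),
]

def accuracy_by_age(sample, today=date(2025, 1, 1)):
    """Group verdicts by whole years of observation age instead of
    calendar year, and return the proportion correct per age bin."""
    bins = defaultdict(lambda: {"correct": 0, "uncertain": 0, "incorrect": 0})
    for obs_date, verdict in sample:
        age_years = (today - obs_date).days // 365
        bins[age_years][verdict] += 1
    return {age: counts["correct"] / sum(counts.values())
            for age, counts in sorted(bins.items())}
```

On real experiment data, a bar chart over these age bins (rather than over observation years) would show directly whether accuracy climbs with age.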

8 Likes

Maybe we are reading too much into this.

Look at the results by year.
2015 looks to have great results. But it has only 24 observations.
2016 seems to have terrible results. But it has only 45 observations.
More recent years have 500 observations.
2026 seems to have poor results. But it has only 5 weeks of data which means the average observation has been in the system only 3 weeks or so. How often do observations reach research grade within 3 weeks versus after 3 weeks?
2025 is a little low, so maybe this just means it takes several months for many observations to reach RG.

Now look at 2017 through 2024. The numbers are close enough that this is essentially a flat line.

By the way, that unlabeled continent is Antarctica. It has the results for 1 observation, and a sample size of 1 is meaningless.

Then look at North America. It has the highest chance of a correct result. I’d also guess North America has the highest number of users. Maybe all this proves is that more users give you better results.

The more useful data is also somewhat obvious to some.

100 or fewer observations of a taxon give you the lowest chance of a correct ID.
Greater than 100K observations gives you the highest chance of being correct.

Fungi and Protozoa have lower chances of correct IDs.

I guess I’m surprised that mammal and bird results are so low.

7 Likes

It is my choice to ID the obs that need ‘disambiguation’.
From that, I would add to your list: more observers who dip in, sometimes literally for one day. CNC and GSB leave a residue of obs which are difficult, if not frankly impossible, to ID, from observers who left iNat years ago. If the residue includes wrong IDs, then it is a mission to keep @mentioning yet another identifier until we can convince the CID algorithm to reach the more-than-two-thirds threshold. And against Ancestor Disagreement - that is not a dicot, oh yes it IS !!
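For anyone unfamiliar with why one wrong early ID is such a mission to overturn, here is a deliberately simplified sketch of the headline fraction in iNat’s Community ID: the leading taxon needs strictly more than 2/3 of the identifications behind it. (The real algorithm also walks the taxonomic tree and treats ancestor disagreements specially; this sketch captures only the threshold arithmetic.)

```python
from fractions import Fraction

def needs_more_identifiers(agreeing, disagreeing):
    """Simplified sketch of the Community ID threshold: the leading
    taxon must have MORE than 2/3 of the IDs, so exactly 2/3 fails."""
    total = agreeing + disagreeing
    return Fraction(agreeing, total) <= Fraction(2, 3)

# One wrong early ID means two agreeing IDs are not enough (2/3 is not > 2/3):
needs_more_identifiers(2, 1)  # True: still need to @mention another identifier
needs_more_identifiers(3, 1)  # False: 3/4 > 2/3, so the CID can flip
```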

Damned difficult dicots, with disagreement to whittle down the volume, only for the Western Cape ditto. I can clear 1, 2 or 3 pages a day depending - so another month, or two, for 66 pages ?

The backlog of Unknowns or Needs ID grows and grows. New taxon specialists trickle in. But we are not waving, we are drowning.

2 Likes

This is a really good point I had overlooked. The older an observation, the more opportunity the community has had to address it. I was just starting to look at observations (all of them, not just the experiment sample) in 2-year windows (2012-13, 2014-15, etc.) and compare their taxonomic and geographic characteristics. I wonder if there is some way to not only look at the age of the observation but also the “age” of the community ID across a large sample of observations. That would probably take some tricky API work if it could be done at all.
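The API work might be less tricky than it sounds if the goal is just "how long has the current community ID stood?". A rough proxy is the time since the most recent still-current identification on each observation. The sketch below operates on records shaped like the iNaturalist v1 observations response (the `created_at` and `current` field names follow that API, but treat them as assumptions to verify); fetching the records themselves from `api.inaturalist.org/v1/observations` is left out.

```python
from datetime import datetime

# Hypothetical record shaped like one result from the v1 observations API.
sample_results = [
    {
        "id": 1,
        "created_at": "2013-04-02T10:00:00+00:00",
        "identifications": [
            {"created_at": "2013-04-02T10:05:00+00:00", "current": True},
            {"created_at": "2018-09-01T08:00:00+00:00", "current": True},
        ],
    },
]

def community_id_age_days(obs, as_of):
    """Days since the most recent still-current identification: a rough
    proxy for how long the community ID has stood unchanged."""
    current = [datetime.fromisoformat(i["created_at"])
               for i in obs["identifications"] if i["current"]]
    if not current:
        return None
    return (as_of - max(current)).days

as_of = datetime.fromisoformat("2025-01-01T00:00:00+00:00")
community_id_age_days(sample_results[0], as_of)  # days since the 2018 ID
```

Aggregating this over a large random sample would give an "age of community ID" distribution to set alongside the observation-age analysis.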

I have begun to look at the overall composition of the observation realm in 2-year increments with this type of general taxonomic breakout. It’ll take some time to compile some stats.

1 Like

The years in those charts refer to the year of the set of observations, not the year of a given accuracy experiment. The accuracy experiments have only been running for the past few years; I don’t know when the earliest one was. Certainly the design of the experiment has been modified over time; the blog post for Exp 0.6 describes a few of the recent modifications and future plans. The pool of (all) observations from which the 10K sample is randomly selected is more or less fixed for previous calendar years (notwithstanding some portion which are later additions); each experiment draws a new random sample of 10K from that pool.

2 Likes

Agreed.
Still, both the experiment design and the user/identifier base evolve. Better-focused experiments + larger pool of experts = more inaccuracies caught? (I’ve noticed a broadly similar trend while reviewing all RG obs at the region scale.)

1 Like

Qualitatively, I can assert for taxa I have been working on in recent years (e.g. moths) that we have seen the addition of several experts of continental and worldwide renown for selected groups, including multi-published authors, checklist editors, etc., etc.

[How about if we, the iNat community, demand that every university that graduates anyone in the biological sciences with any connection to species identification (whether directly or indirectly) be required to add as part of their degree plan a mandatory 5-year stint as an Identifier on iNaturalist! Perhaps we could get some entity (governmental, financial, or environmental) to prorate a reduction in their student loans for every 100 IDs contributed! :wink: ]

4 Likes

Obviously Chuck is joking; hence the wink. We know that “duress users” cause a lot of headaches, especially during CNC, GSB, etc.

But it’s also true that iNaturalist is becoming essential for anyone working with wild organisms. For bio‑sci professionals, IDing on iNat builds important professional skills, as well as carrying increasing weight on a résumé. Learning to navigate observations, sampling bias, community review, statistics, creating public evidence of taxonomic competence via curation, connecting with other taxon specialists . . . the list goes on and on!

4 Likes

From personal experience with this, I will say that Uncertainty can be misread. I am aware of some perfectly good RG observations where frequent identifiers of the species, when recruited for the Accuracy Experiment, confirmed only at genus because the subspecies was not their local one. A bit of checking effort would have given a different outcome.

4 Likes

I’m personally not that concerned about an incorrect ID so long as it has been withdrawn. I’m not clear on whether this data accounts for that.

And also, yes, there are certain areas of the world, particularly in North America, where there are a ton of people who will look at observations - the PNW is one such area.

1 Like

But joking aside. Somewhere there is a comment from Marina - she explained that every Russian university student is required to do a unit of biology. And those iNat obs were ‘university student’ quality - not the childish, grudging ‘I HAVE to do this’ quality. Another university might require a language credit - which is where I learnt my first German, with a friend who needed the credit for her chemistry degree.

We certainly have some academics who require their students to use iNat, and monitor their obs.

1 Like

Yes, Diana, but in South Africa, the students either descend on the closest botanical garden (Stellenbosch), photograph every tree on campus en masse (Pretoria), or obsessively document their neighbours’ hedges (Makhanda). Students in Russia traipse weeks across the Urals or Kamchatka every summer. Very serious lot. :laughing: [Off topic, I know.]

4 Likes

We had a group of Russian students come thru Cape Town. On their way to Antarctica ! Who needs a botanical garden :grin:

But the ‘problem’ students are, how shall I say this tactfully ? A reflection of their lecturers (and their use / abuse of iNat). Bit wary of SU entomology students, given their annual footprint here.

3 Likes

Yes, I do as well.