Add a flag for frequent computer vision identification errors

tonyrebelo · April 7, 2019, 5:27pm

Is there any way for us to flag recurrent or common problems with the AI ID? Quite a few Melaleuca (Bottlebrushes - Myrtaceae) are being identified as Banksia ericifolia (Proteaceae), even though they don't look much alike; the AI seems to think that they do.
(On second thoughts, perhaps I should visit the Melaleuca viminalis observations and check whether they are incorrectly identified and thus affecting the training of the AI?)

bouteloua · April 7, 2019, 6:15pm

Are you requesting a warning flag on taxon pages for taxa that have computer vision identifications that are frequently manually corrected? A label on IDs? Or are you asking for a way to manually flag taxa that get misIDed frequently to alert other identifiers? (Or something else?)

What are the short-term goals of the proposed flagging system: to keep misIDs from occurring in the first place, to attract identifiers to help fix them, both, or something else?

Some ways people have dealt with it (on an individual basis) are to assemble a list, whether mental or written, e.g. @treegrow’s List of Computer Vision Traps, or to use the existing flagging system to discuss and attract identifiers to help out, e.g. Trombidium holosericeum.

tonyrebelo · April 15, 2019, 7:08am

I am requesting a way to add a flag when, as an identifier, I note a common recurring misidentification that should not be occurring. Not an ambiguous or understandable one (such as indistinguishable species, or user confusion), but one most likely due to bad training or inadequate material used in training (or perhaps a bad algorithm).
How it is used after flagging - for instance to alert future users - is not an issue I had thought about or entertained.
I was more thinking about it being used for future machine recognition training sessions - including checking the current IDs before training (to prevent “tautologous” IDs), checking that there are adequate pictures for training in both taxa, and checking for possible glitches in the training algorithm.

tiwane · April 17, 2019, 7:47pm

I can ask our computer vision team about this.

jdmore · April 18, 2019, 7:22pm

I wonder if something like this could be done just based on ID statistics already in the data? Analyze how often CV identifications end up not being the community ID for each taxon in CV, pick a threshold along the resulting “bell curve” (or whatever shape it manifests), and “flag” in some way(s) those taxa with especially unreliable CV results. Maybe a little “caution” flag when displayed among CV suggestions. And maybe a filter parameter to support/inform the manual flagging option that @tonyrebelo has in mind? Just thinking out loud here…
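A rough sketch of that statistics-based idea, for illustration only. It assumes a hypothetical export of `(cv_taxon, community_taxon)` pairs; the function names, the `min_count` cutoff, and the 0.5 threshold are all made up here and do not reflect how iNaturalist stores or exposes its data.

```python
from collections import defaultdict

def cv_mismatch_rates(records, min_count=20):
    """Map each CV-suggested taxon to the share of its suggestions
    that did not end up as the community ID.

    `records` is a hypothetical iterable of (cv_taxon, community_taxon)
    pairs. Taxa with fewer than `min_count` records are skipped so that
    a handful of observations cannot dominate the rate.
    """
    totals = defaultdict(int)
    misses = defaultdict(int)
    for cv_taxon, community_taxon in records:
        totals[cv_taxon] += 1
        if cv_taxon != community_taxon:
            misses[cv_taxon] += 1
    return {t: misses[t] / n for t, n in totals.items() if n >= min_count}

def flag_unreliable(rates, threshold=0.5):
    """Return, in alphabetical order, the taxa whose mismatch rate
    is at or above the (arbitrary, illustrative) threshold."""
    return sorted(t for t, r in rates.items() if r >= threshold)
```

In practice the threshold would presumably be chosen by looking at the distribution of rates across all taxa rather than fixed in advance.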

mtank · May 6, 2019, 8:01am

As you say, this should be in the data already, i.e., 51% chance it’s this, 49% it’s that, between the two similar organisms. It shouldn’t be too hard to detect and flag those situations. That is even without taking into account the eventual Community ID. As usual, the devil is in the details. What are the criteria that determine when the warning appears, and how does that work if there are 3 or more similar species, like in a mimicry complex, etc., or under other odd circumstances?

Edit …or do you just present the confidence percentages and let the identifier decide what they mean?
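One possible answer to the "3 or more similar species" question, sketched under assumptions: the scores are a hypothetical dict of taxon name to confidence, and the 0.05 margin is an arbitrary illustrative value. Instead of comparing only the top two suggestions, it collects every taxon whose score is close to the best one.

```python
def close_contenders(scores, margin=0.05):
    """Return the taxa whose score is within `margin` of the top score,
    best first. `scores` is a hypothetical {taxon: confidence} mapping."""
    if not scores:
        return []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top_score = ranked[0][1]
    return [taxon for taxon, score in ranked if top_score - score <= margin]

def needs_caution(scores, margin=0.05):
    """True when more than one taxon is effectively tied for first place."""
    return len(close_contenders(scores, margin)) > 1
```

With this shape, a 51%/49% pair and a three-way mimicry complex are handled by the same rule, and the list of contenders could be shown to the identifier alongside the warning.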

zygy · May 6, 2019, 6:33pm

@mtank - One simple and unobtrusive way you could make the confidence data available would be to add it as a title attribute to the “Visually similar” span in the suggestion interface, so that if someone hovered over it, it would show “89% confidence” or whatever:
<span class="subtitle vision" title="89% confidence">Visually Similar</span>
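For illustration, markup like that could be generated as follows. This is only a sketch: the `vision_span` helper is hypothetical, and it assumes the confidence arrives as a fraction between 0 and 1, which may not match what the site actually has available.

```python
from html import escape

def vision_span(confidence):
    """Build the "Visually Similar" span with the confidence in its title.

    `confidence` is assumed to be a fraction between 0 and 1.
    """
    pct = round(confidence * 100)
    # Escape the attribute value so the generated markup stays well-formed.
    title = escape(f"{pct}% confidence", quote=True)
    return f'<span class="subtitle vision" title="{title}">Visually Similar</span>'
```

The tooltip stays out of the way for people who do not want it, since the visible text of the suggestion is unchanged.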

agmcmll · May 8, 2019, 10:07pm

I love the idea of seeing some indication of certainty from the computer vision. I am often amazed at how well it does, and occasionally amused at how poorly. When entering observations I routinely ID with a suggestion even when I know what the species is, and occasionally I see it getting the result wrong. It would be interesting to see just what the odds were for each of the alternate suggestions.

bouteloua · October 21, 2019, 11:24pm

I’m going through some old feature requests, just wanted to note that this one was suggested:
https://forum.inaturalist.org/t/computer-vision-should-tell-us-how-sure-it-is-of-its-suggestions/1230

bouteloua · October 26, 2019, 7:41pm

I checked with the staff on this one and it doesn’t look like they’re going to move forward with adding a different kind of flag or alert system for computer vision training errors. I’d recommend adding species here: https://forum.inaturalist.org/t/computer-vision-clean-up-wiki/7281 and using the existing @ mention and flagging system to get help with people reidentifying incorrect observations, which in turn helps retrain future versions of computer vision.

Ken-ichi had a nice post yesterday partly about computer vision and the thresholds for accuracy/number of training photos. Might be a good place to ask some follow-up questions: https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507
