Computer Vision should tell us how sure it is of its suggestions

alexinoui · July 10, 2020, 8:48am

I’m new to this community, so my opinion won’t carry much weight here, but I think it might be a bad idea.
We have some Formicidae screenshots; they are a good example of why this is a bad idea.

Currently the CV is very bad at suggesting and identifying ants. I don’t recall the algorithm ever suggesting the right species as its favorite choice. I even had “we are pretty sure it is a [genus]” on an insect from a totally different order. Also, almost every black ant is suggested as “Camponotus” because they are all black, with 6 legs, 2 antennae, 1 head, 1 thorax, 1 “wasp waist” and a big gaster. Most don’t have striking patterns like butterflies or birds can have, for example.

Such a feature would give a false sense of confidence, as stated by other members. But that is already the case with “We are pretty sure it is”.

But worse, a red highlight would give the impression that it is unlikely the correct ID is actually among these “uncertain” suggestions.

Simply put: it’s a confusing system, especially for beginners.

pisum · July 10, 2020, 3:09pm

yes. i think that’s exactly why showing the underlying scores would be enlightening – because the #1 choice in a list of bad choices is not the same as the #1 choice in a list of good choices. so if you can see the actual scores, you have better insight into whether you’re being presented good choices or if you’re being presented bad choices.

just for example, here are 3 of my own ant observations, along with the actual computer vision scores:

https://www.inaturalist.org/observations/43079992 (needs ID as Myrmicine Ants, possibly Bicolored Pennant Ant):

we’re pretty sure it’s in this family:
- (99.9) Ants (Formicidae)
our top suggestions (combined score, vision score):
- (40.2, 53.4) Dolichoderus genus
- (10.2, 4.5) Bicolored Pennant Ant (Tetramorium bicarinatum)
- (5.6, 7.4) Acorn Ants (Temnothorax genus)
- (4.6, 6.1) African Big-headed Ant (Pheidole megacephala)
- (4.4, 1.9) Graceful Twig Ant (Pseudomyrmex gracilis)
- (4.2, 1.8) Florida Carpenter Ant (Camponotus floridanus)
- (3.2, 4.3) Arboreal Bicolored Slender Ant (Tetraponera rufonigra)
- (3.2, 1.4) Crematogaster laeviuscula

https://www.inaturalist.org/observations/20826798 (research grade Red Imported Fire Ant / Solenopsis invicta):

we’re pretty sure it’s in this family:
- (100) Ants (Formicidae)
our top suggestions (combined score, vision score):
- (50.0, 30.5) Forelius genus
- (8.3, 15.4) Asian Weaver Ant (Oecophylla smaragdina)
- (6.0, 11.1) Mediterranean Acrobat Ant (Crematogaster scutellaris)
- (5.7, 3.5) Argentine Ant (Linepithema humile)
- (5.5, 10.1) Azteca genus
- (3.5, 2.1) Crematogaster laeviuscula
- (2.6, 4.9) Tropical Fire Ant (Solenopsis geminata)
- (2.6, 4.8) Yellow Crazy Ant (Anoplolepis gracilipes)

https://www.inaturalist.org/observations/44126816 (research grade as Eastern Black Carpenter Ant / Camponotus pennsylvanicus):

we’re pretty sure it’s in this genus:
- (77.9) Carpenter Ants (Camponotus)
our top suggestions (combined score, vision score):
- (64.3, 61.3) Shimmering Golden Sugar Ant (C. sericeiventris)
- (7.0, 1.8) Eastern Black Carpenter Ant (C. pennsylvanicus)
- (6.9, 7.3) Giant Turtle Ant (Cephalotes atratus)
- (4.0, 4.3) Eciton genus
- (2.6, 2.8) Bullet Ant (Paraponera clavata)
- (2.0, 2.2) Diacamma genus
- (1.9, 2.0) Giant Forest Ant (Dinomyrmex gigas)
- (1.8, 1.9) Hairy Panther Ant (Neoponera villosa)

if using sessilefielder’s red-to-green gradient, remember that:

so most of the “top suggestions” above would be red to yellow, whereas the “we’re pretty sure” suggestion would be more green. hopefully in such cases, that would push most folks to select the green rather than the yellow or orange, if they were simply choosing blindly based on the system’s suggestions.
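
To make the color idea concrete, here is a minimal Python sketch. It assumes a plain linear gradient on a 0 to 100 score (red at 0, yellow at 50, green at 100), which is not necessarily the gradient sessilefielder proposed; `score_to_rgb` is a hypothetical name used only for illustration.

```python
def score_to_rgb(score):
    """Map a 0-100 score to an (R, G, B) tuple.

    Assumed gradient (illustrative only): red at 0, yellow at 50, green at 100.
    """
    s = max(0.0, min(100.0, score)) / 100.0  # clamp, then normalise to 0..1
    if s < 0.5:
        # red -> yellow: keep red at full, ramp the green channel up
        return (255, round(255 * s * 2), 0)
    # yellow -> green: keep green at full, ramp the red channel down
    return (round(255 * (1 - s) * 2), 255, 0)


# Family-level score and top species-level combined score from the first observation above
family_color = score_to_rgb(99.9)
top_species_color = score_to_rgb(40.2)
```

Under this assumed gradient, the 99.9 family-level score lands very close to pure green, while the 40.2 top suggestion sits between orange and yellow.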

i also think if people could see that, say, bird suggestions tend to be very green, while, say, spider suggestions tend to be very red, then they would also be much more careful about relying on the computer vision for spiders.

or if they see two equally green bird suggestions, they might pause for a moment to consider why both are equally green before just blindly selecting the first choice.

of course, computer vision suggestions will never be perfect. there will always be mistakes, but i think showing the computer vision scores will help reduce (rather than increase) the likelihood that the community will adopt those mistakes.

tpollard · July 11, 2020, 4:14pm

I would have assumed that the scores are the classification scores in the final “softmax” layer of the neural network. In that case, every taxon would be given a score from 0.0 to 1.0, and all scores would sum to 1.0, like probabilities.
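
For readers unfamiliar with softmax, a minimal sketch of the function follows. The input values are made-up raw network outputs (logits), not real iNaturalist numbers; the point is only that the outputs fall between 0 and 1 and sum to 1.

```python
import math


def softmax(logits):
    """Turn raw scores into values in (0, 1) that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three taxa
scores = softmax([2.0, 1.0, 0.1])
total = sum(scores)  # 1.0, up to floating-point rounding
```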

pisum · July 11, 2020, 5:37pm

i think that’s hard in this case because of the hierarchical nature of the taxa. suppose you had an observation of a blue jay, and the existing algorithm assigned a score of .90 for blue jay, .95 for bird, and 1.0 for animal, what kinds of scores would your assumed implementation assign those taxa?
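
One way the two views could be reconciled, sketched here purely as an assumption and not as a description of how iNaturalist computes its scores: let the softmax cover only the leaf taxa, then give each higher taxon the sum of the leaf scores beneath it. The tiny taxonomy, the leaf scores, and the name `roll_up` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical child -> parent links for a tiny taxonomy
PARENT = {
    "Blue Jay": "Birds",
    "Steller's Jay": "Birds",
    "Eastern Gray Squirrel": "Mammals",
    "Birds": "Animals",
    "Mammals": "Animals",
}


def roll_up(leaf_scores, parent):
    """Add each leaf score to the leaf itself and to every ancestor above it."""
    totals = defaultdict(float)
    for taxon, score in leaf_scores.items():
        node = taxon
        while node is not None:
            totals[node] += score
            node = parent.get(node)  # None once we pass the root
    return dict(totals)


# Made-up leaf scores that sum to 1, as a softmax output would
rolled = roll_up(
    {"Blue Jay": 0.90, "Steller's Jay": 0.05, "Eastern Gray Squirrel": 0.05},
    PARENT,
)
```

With these made-up numbers the roll-up gives roughly .90 for Blue Jay, .95 for Birds, and 1.0 for Animals, so only the leaf scores need to sum to 1.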

EDIT: nevermind – alex’s response below changes the way i have to look at things…

alex · July 12, 2020, 4:51am

Hi folks,

Alex here, I’m one of the people who trains the computer vision system for iNat.

As tpollard mentions, the softmax function outputs in a format that is shaped like a probability. However, the output of the softmax function is strongly influenced by its input distribution. Given the imbalanced nature of the iNat dataset (we have a lot more images for some taxa than for others), these scores should absolutely not be interpreted as statistical probabilities.

I know tpollard didn’t suggest that they should be interpreted this way. I just wanted to make sure that the format similarity didn’t encourage someone to think about the scores in a way that isn’t warranted.
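
A toy illustration of that caution follows. It assumes, purely for the sake of the example, that training on imbalanced data nudges each taxon's raw score by the log of its image count; real models are not this simple, and the counts and evidence values are invented.

```python
import math


def softmax(logits):
    """Turn raw scores into values in (0, 1) that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two taxa that the photo supports equally well (hypothetical evidence)
evidence = [1.0, 1.0]
# Hypothetical training image counts: one common taxon, one rare taxon
train_counts = [10_000, 100]

# Toy assumption: imbalance shows up as a log-count bias on each raw score
biased_logits = [e + math.log(n) for e, n in zip(evidence, train_counts)]

balanced = softmax(evidence)     # equal evidence, no bias: an even split
skewed = softmax(biased_logits)  # same evidence, but the common taxon dominates
```

Both outputs are shaped like probabilities, yet the second one mostly reflects how many training images each taxon had, not how well the photo matches.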

Thanks,
alex

dan_johnson · September 13, 2021, 2:07pm

If the CV thinks it’s 75% sure, the system should probably recommend identifying at a higher level. Guessing species IDs when not sure is not helpful, in my opinion. It frequently leads to wrong IDs, which often have to be overridden by multiple identifiers if the original ID is not corrected, as is all too often the case.
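
The "recommend a higher level" rule could look something like the sketch below. The 0.9 threshold, the taxon names, the scores, and the function name `most_specific_confident` are all hypothetical; nothing here reflects the actual iNaturalist logic.

```python
def most_specific_confident(lineage_scores, threshold=0.9):
    """Return the most specific taxon whose score meets the threshold.

    lineage_scores: list of (taxon, score) pairs ordered from most specific
    (species) to most general (family, order, ...). Returns None if no
    taxon in the lineage is confident enough.
    """
    for taxon, score in lineage_scores:
        if score >= threshold:
            return taxon
    return None


# Made-up scores: the species is only 75% sure, so the genus is recommended
lineage = [
    ("Example species", 0.75),
    ("Example genus", 0.96),
    ("Example family", 0.99),
]
recommendation = most_specific_confident(lineage)
```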

As seen in cicadas, a more frequent scenario would be that the CV is 95% sure of the top pick, yet the user picks something else presumably because they are matching unimportant details (like color rather than certain pattern elements). If they see their pick has a 1% chance of being correct, they would probably be less likely to select it.

dan_johnson · September 13, 2021, 2:21pm

It would improve things a lot if, when the CV is pretty sure an ID is a particular species, it would say so. Currently for the cicada Neotibicen superbus, which is easily identified, it just says it’s pretty sure it’s Neotibicen. See for example here: https://www.inaturalist.org/observations/94684158. Does the CV ever say it’s sure of a species? It definitely should in my opinion.

chrisangell · September 14, 2021, 3:08am

No it doesn’t, and I think that’s a good thing for two related reasons.

- The CV doesn’t know all species. It doesn’t even know all species with records on iNat! It only knows the species which had at least 100 observations when the model was last trained. This means that the AI’s calculated level of certainty may be totally inappropriate if a picture matches only one species in the training data set but would also match 10 or 100 similar species that didn’t make it in (e.g., they may be rarely observed or hard/impossible to ID from photos).
- “Pretty Sure” suggestions don’t take location into account at all. This means that these recommendations can be based on a match with a species that doesn’t live anywhere near you. Usually it’s pretty good for North America and Europe, but sometimes it’s still wrong. I can only imagine that places with relatively few observations, like South America and Africa, are much worse. I’m sure that leaving “Pretty Sure” at genus level greatly reduces the number of geographically inappropriate IDs on iNat.
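
The first point can be illustrated with a small sketch. Only the 100-observation cutoff comes from the post above; the taxon names, observation counts, and raw scores are invented, and `softmax_over` is a hypothetical helper.

```python
import math

MIN_OBS = 100  # cutoff mentioned above: taxa need at least 100 observations


def softmax_over(logits, taxa):
    """Softmax restricted to the given taxa; returns {taxon: score}."""
    m = max(logits[t] for t in taxa)  # subtract the max for numerical stability
    exps = {t: math.exp(logits[t] - m) for t in taxa}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


# Hypothetical observation counts: two lookalikes fall below the cutoff
obs_counts = {"Lookalike A": 5000, "Lookalike B": 40, "Lookalike C": 12, "Unrelated D": 800}
# Hypothetical raw scores: the photo matches all three lookalikes equally well
logits = {"Lookalike A": 3.0, "Lookalike B": 3.0, "Lookalike C": 3.0, "Unrelated D": 0.0}

trained = [t for t, n in obs_counts.items() if n >= MIN_OBS]
closed_set = softmax_over(logits, trained)    # what a model limited to trained taxa sees
open_set = softmax_over(logits, list(logits))  # if every lookalike were included
```

In this toy setup "Lookalike A" scores above 0.95 when its untrained lookalikes are invisible to the model, but only about a third once they are included.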

Inappropriate CV-based IDs are a perennial problem here. They can flood CV-included taxa with clearly incorrect junk, and trigger-happy agreers can easily push them into Research Grade, where they get forwarded to GBIF’s database, and, critically, no longer show up to IDers by default. I understand your desire for “Pretty Sure” species suggestions, but I don’t think we’re close to being ready for that. I’d much rather have correct but vague IDs than precise but wrong ones.

DianaStuder · September 14, 2021, 10:23am

What disappoints me is that - Seen Nearby - can mean - a single obs - with ONE ID!

That skews the distribution map, and cascades into - Seen Nearby, so mine is also that.

Computer vision is a suggestion. It is still up to us to evaluate and decide.

fffffffff · November 8, 2021, 7:27pm

Old wrong IDs are not helping with that. I see new observations getting new wrong IDs (and checking the CV, yes, it shows the species as seen nearby) when this “nearby” is actually three observations across the whole USA, all in different states, none in the state where the observation in question is. All are wrong, but the seen nearby feature doesn’t help when it uses one faraway observation as a sign of a species being present in an area (it can work that way with endemics too, or species found on one coast only).

DianaStuder · November 9, 2021, 12:22pm

That ‘Seen Nearby’ needs some quality control.
At least one or two Research Grade community consensus IDs before brightly suggesting ‘this is seen nearby’. Not! Actually.

alesbucek · December 3, 2021, 1:40am

100% agreement. ID suggestions based on “Seen nearby” should be more critical and require a research grade observation or multiple observations with a community ID. I think multiple community-IDed observations is better because it is often difficult to reach research grade (this is from my experience with insects; for different groups of organisms this will likely be different).
From my recent experience cleaning up an ID mess for insects which live only in northern America but were observed tens of times across the tropics of the world, these IDs were never even community IDs, always just a single auto-suggested ID given by the observer. Hence, excluding observations without community IDs from “seen nearby” suggestions could be good enough.
@DianaStuder @fffffffff this seems like a relatively easy thing to implement with a large effect on the accuracy of IDs. Shall we formulate it as a feature request? Or was it already formulated as a feature request?

EDIT: I actually thought that “Seen nearby” is a parameter used for ID suggestions, which does not seem to be the case. But I still agree it should be more restrictive about what is considered evidence of nearby presence.
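
The stricter rule proposed above could be sketched as follows. The `Observation` fields and the function name are hypothetical and do not correspond to the real iNaturalist data model or API; the rule simply requires either one research grade observation or at least two observations with a community ID.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    taxon: str
    research_grade: bool    # hypothetical flag
    has_community_id: bool  # hypothetical flag: more than one identifier agrees


def counts_as_seen_nearby(nearby, taxon, min_community_obs=2):
    """Decide whether nearby observations are good enough evidence for a taxon."""
    matching = [o for o in nearby if o.taxon == taxon]
    if any(o.research_grade for o in matching):
        return True
    # Otherwise require several observations backed by a community ID
    community = sum(1 for o in matching if o.has_community_id)
    return community >= min_community_obs


# A single observation with only the observer's auto-suggested ID is not enough
lone_guess = [Observation("Example species", False, False)]
is_nearby = counts_as_seen_nearby(lone_guess, "Example species")  # False
```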

DianaStuder · December 3, 2021, 3:19pm

Yes to a feature request.

When that ‘seen nearby’ rings my alarm bells, I search the distribution map - and try - to clear what I can.

williampaulwhite17 · August 30, 2022, 10:26pm

I’m not sure it’s worth the time and effort. I see enough really bad AI-generated IDs that I don’t trust them much. If anything, I’d like to see more of a push for real disclaimers about the AI when it pops up recommendations.

https://www.inaturalist.org/observations/132769331

The AI said it wasn’t particularly confident but the suggestions were all toads (top one was a narrowmouth toad).

regnierda · June 29, 2024, 2:08am

It looks like the upcoming app has something that might be a score when you’re viewing suggested computer vision IDs?

I’m referring to the dots under each suggestion. It seems that when something is presented as a “Top ID Suggestion” it has 5 green dots, but when it’s presented as a “Nearby ID Suggestion” it has 1 green dot.

I’m not sure if those are just static images though. I looked through a half dozen or so of my own observations and none of them had suggestions bearing 2, 3, or 4 green dots. All suggestions contained either 5 or 1 green dots.

I don’t see the “score” present on any of my “CV-aided” identifications, though I didn’t make any from the new app either. I do, however, see the CV icon on those identifications.

pisum · June 29, 2024, 2:22am

i like what the new design is trying to do, but unless something has changed recently, i think the underlying data is bad: https://forum.inaturalist.org/t/chrome-extension-showing-computer-vision-confidence/35171/46. (for example, notice how the two Taraxacum species suggestions in your screenshot get 1 dot, just like the beetle suggestion is 1 dot. i suspect no species suggestions will ever get more than 1 dot until the problem is fixed, and no higher-than-species suggestion will get less than 5 dots.)

DianaStuder · June 29, 2024, 7:37am

I do still use that extension. The colours have changed, but the info remains useful to consider.

jnstuart · June 29, 2024, 2:26pm

In this particular example, a little editing of the photo would probably have helped: crop it to better show the scorpion, maybe sharpen and increase the contrast. The CV is good but we can help it do better by giving it better photos.

regnierda · June 29, 2024, 2:31pm

I thought about mentioning that too. :3 Note that the comment is almost two years old though. When I look at the linked observation today, CV suggests the same genus as the Community ID. The next suggestion is the same species as the Community ID. The remainder are all arachnids. Seems like it’s come a long way since that comment was made.

hedaja · July 6, 2024, 6:05pm

the dots definitely reflect the confidence level.
And they don’t just have either 1 or 5 dots; there is a gradient depending on the confidence (anything else would have been strange).

The confidence will also already be displayed in the live ID window as well.
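
For illustration only, a confidence-to-dots gradient might be bucketed like the sketch below. The equal-width buckets, the 0 to 100 scale, the minimum of one dot, and the name `confidence_dots` are all assumptions; the app's actual mapping is not described in this thread.

```python
import math


def confidence_dots(score, max_dots=5):
    """Map a 0-100 confidence score to a number of dots between 1 and max_dots.

    Assumed mapping: equal-width buckets, with at least one dot always shown.
    """
    s = max(0.0, min(100.0, score))  # clamp to the 0-100 range
    return max(1, math.ceil(s * max_dots / 100))


# Family-level score and top species-level score from pisum's first example
family_dots = confidence_dots(99.9)
species_dots = confidence_dots(40.2)
```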
