Add a "visually identifiable" scale parameter to set user expectations on confidence of computer vision IDs

One of the challenges that experts constantly point to is that the AI/ML here on iNat sometimes gives a false sense of confidence in an identification.

Problem statement: many times, I and other newcomers think we can identify an organism to species level because the AI/ML makes a suggestion, a visual review of the suggestion looks accurate, and the map of sightings and time-of-year indicators agree. I later find out that the organism can only truly be identified under a microscope or by dissection. This fact seems to be common knowledge among the experts. If it is common knowledge, we should be able to signal this difficulty to users.

Possible solution: add a “visually identifiable” parameter at the appropriate taxonomic level, seeded by experts. This could indicate whether an organism can truly be identified from a photo, or requires some degree of magnification of specific parts, up to requiring a microscope or dissection.
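To make the idea concrete, here is a minimal data-model sketch in Python. The level names, field names, and taxon ID below are all hypothetical, not anything iNat currently stores:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class IdentifiabilityLevel(Enum):
    """Expert-seeded rating of how much scrutiny a species-level ID needs."""
    PHOTO = "identifiable from an ordinary photo"
    MAGNIFICATION = "requires magnified views of specific parts"
    MICROSCOPE = "requires microscopic examination"
    DISSECTION = "requires dissection"

@dataclass
class TaxonIdentifiability:
    taxon_id: int                    # the taxon the rating applies to
    level: IdentifiabilityLevel      # expert-seeded difficulty rating
    notes: Optional[str] = None      # e.g. which structures must be examined
    source: Optional[str] = None     # who or what seeded the rating

# Example: a curator seeds a rating for a made-up taxon ID.
rating = TaxonIdentifiability(
    taxon_id=123456,
    level=IdentifiabilityLevel.MICROSCOPE,
    notes="Genitalia must be examined; field photos are rarely sufficient.",
    source="curator consensus",
)
print(rating.level.value)  # -> "requires microscopic examination"
```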

A launch-and-iterate approach could start with only this parameter, which would help set user expectations (perhaps combined with a simple UI flag indicating the difficulty of the ID). Over time, the AI/ML algorithm could be adjusted to consider this new parameter in conjunction with location, time of year, etc. For example, you could infer that an observation is very likely species X, even if species X and X’ look identical, because X’ only occurs 2,000 km away in a different habitat. As a final iteration, this could be turned into a confidence level for the ID.
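As a rough sketch of how that iteration might work, a suggestion’s raw vision score could be down-weighted by the difficulty rating and partly restored when no look-alike occurs nearby. All the weights below are made-up placeholders, not values iNat uses, and would need tuning against real identification outcomes:

```python
# Hypothetical penalty weights keyed by the expert-seeded difficulty level.
PHOTO_ID_PENALTY = {
    "photo": 1.0,          # fully identifiable from an ordinary photo
    "magnification": 0.8,  # needs magnified views of specific parts
    "microscope": 0.5,     # needs microscopic examination
    "dissection": 0.3,     # needs dissection
}

def adjusted_confidence(vision_score: float, level: str, seen_nearby: bool) -> float:
    """Downweight a raw computer-vision score for taxa that are hard to ID from photos."""
    score = vision_score * PHOTO_ID_PENALTY[level]
    # If the only look-alikes occur thousands of km away, "seen nearby"
    # earns back some of the lost confidence.
    if seen_nearby:
        score = min(1.0, score * 1.2)
    return score

print(adjusted_confidence(0.9, "microscope", seen_nearby=True))  # ~0.54
```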

I feel it would be very hard to add this to each species, as “visually identifiable” can mean many things, different experts will think different things, and possibly many groups won’t get one “tag” at all.

4 Likes

This is definitely an intriguing idea. In some cases, such as species complexes, I support it.
However, it would require a lot of additional work to create and maintain the additional data.

Also, it is worth noting that it IS possible for the AI, or for certain people but not others, to identify from a regular photo something that is traditionally considered “impossible to ID”.

4 Likes

Interesting suggestion.

I like the idea of marking which species cannot be identified from a photo alone. That seems good, and not horribly difficult to implement, although it would take time.

The following bit, I’d remove though:

Species ranges are not nearly as well known as people, even experts, assume they are, and there is constant movement of species around the globe as a result of human activity, both intentional and unintentional.

In my experience, range maps are a good guideline for getting an idea of what the species most likely is, but they should remain only that: rough guidelines, not determining factors for species identifications (with a few specific exceptions).

5 Likes

I think one issue is that a species could have different status with regard to visual identification in different parts of its range. Neotamias merriami (Merriam’s Chipmunk), for example, has portions of its range where it is the only chipmunk, portions that it shares with other chipmunks from which it can be visually distinguished, and portions that overlap with the range of Neotamias obscurus, from which it’s generally agreed that it can’t be visually distinguished in the field.

So I think that were something like this to be pursued, it would have to be something like a relationship between two taxa rather than a label attached to an individual taxon.
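Sketching that as data (with made-up IDs and field names, nothing iNat actually stores), a pairwise relation could carry an optional overlap region:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SeparabilityRelation:
    """Whether two taxa can be told apart from photos, optionally only
    within the region where their ranges overlap."""
    taxon_a_id: int
    taxon_b_id: int
    separable_from_photo: bool
    overlap_region: Optional[str] = None  # e.g. a place ID or polygon reference
    notes: Optional[str] = None

# Stand-in IDs for the chipmunk example above (not real iNat taxon IDs).
RELATIONS = [
    SeparabilityRelation(
        taxon_a_id=111,  # Neotamias merriami
        taxon_b_id=222,  # Neotamias obscurus
        separable_from_photo=False,
        overlap_region="zone of range overlap",
        notes="Generally agreed not distinguishable in the field where both occur.",
    ),
]

def photo_id_reasonable(taxon_id: int, region: str) -> bool:
    """True unless some look-alike in this region is flagged as inseparable."""
    for r in RELATIONS:
        if taxon_id in (r.taxon_a_id, r.taxon_b_id) and not r.separable_from_photo:
            if r.overlap_region is None or r.overlap_region == region:
                return False
    return True

print(photo_id_reasonable(111, "zone of range overlap"))  # False
print(photo_id_reasonable(111, "rest of range"))          # True
```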

4 Likes

What exactly do you mean by this? As stated, it sounds a bit mysterious - or at least, unscientific.

Why? You sometimes can’t ID an insect from a photo using the key, but an expert can skip some of the key steps and still ID the species. Not sure how good the AI is now.

For instance, there are species of plants that I can ID when they are days old and show zero of the traditional identifying characteristics (which would be used in a key).
Also, the AI might pick up on subtle morphological differences that human experts have not noticed yet.

3 Likes

But then the identification isn’t based on evidence, and doesn’t use a methodology that can be repeated by others. So it’s not scientific. That’s why we have keys - so that they can be used by anyone and reliably produce the same results. Anything else is just an opinion.

What is the evidence for this? How could AI analysis of variable-quality photos be better than keys developed by experts who were able to examine real specimens with a microscope?

@naturesarchive, just FYI, you can (and arguably should) upvote your own feature requests. :)

I think @naturesarchive was suggesting it as something added to the suggestion (like the current “visually similar/seen nearby” text, or the feature request for a confidence indicator), rather than saying it would prevent users from selecting the suggestion at all.

If you meant that “impossible to ID” might convince people it cannot be IDed from a photo, though, I apologize for misinterpreting you. Maybe instead of “impossible to ID”, the wording could be something like “often difficult to ID based on photos alone”…but more succinct, because even though there’s a bit of white space on the suggestions, it’s still limited real estate.

3 Likes

You would also need to deal with the issue that a species may be easy to identify at one life stage and very difficult at another. An adult dragonfly may be distinctive, but its nymph may not be; a flowering tree in summer vs. a bare tree with no foliage in winter, etc.

3 Likes

I know what you mean here, and it is something I have thought about a lot. Hypothetically, it is totally possible that a computer could find and analyze visual cues an average human could not distinguish or would not find remarkable enough to note/remember. I’ll just make up an example here: what if yellow daisy A is consistently a slightly different shade of yellow than yellow daisy B, but the average botanist can’t tell and uses the seed ridges (not photographed by a typical observer) to tell them apart? The catch-22 is that the computer is trained by humans, so if no humans are putting IDs on the organism, the computer can’t either. If a paper wanted to study this, the authors would have to generate their own set of several hundred photos of daisies A and B, from plants they had already verified via seed ridges, and train the AI on those.

2 Likes

The biggest obstacle with this suggestion is the enormous amount of work it would take to implement. Just keeping the taxonomy structure accurate and up-to-date requires constant work from a large team of dedicated curators. The pool of people who can determine with certainty whether certain taxa are visually identifiable (never mind the problems with making that determination at all, or with determining who is qualified to make that determination!) is comparatively tiny, and would never be able to complete the work.

3 Likes

Everything is an opinion. If you study a group for 40 years, you may see differences between genera without needing one clue or another that you would find in keys. Keys are optional, and not everything is found in keys.

3 Likes

Another difficulty is that sometimes something can be identified by a photo in one part of the world but not in another. For example, a crow in Iowa is going to be an American Crow, and a crow in Alaska is going to be a Northwestern Crow, but if you take a picture of a crow in the overlap zone it’s essentially impossible to tell the two apart.

5 Likes

Another way to adjust the confidence level would be to reword:
“iNat is pretty sure this is Genus species.”
If you say so …

4 Likes

Maybe change “iNat is pretty sure this is …” to “Our robot thinks this looks like …” – that would clarify to new users 1) that it’s not a consensus opinion of iNat staff/users, and 2) that nobody’s really very sure about it. :smiley:

11 Likes

I’ve encountered this problem a fair bit when IDing spiders. Here in the UK, for example, members of the genus Eratigena cannot be reliably IDed without microscopic examination of the spider’s epigyne (if female) or pedipalp (if male). Yet iNat provides suggestions to species level with “Visually similar/seen nearby”. New users, or those unfamiliar with spider identification, often select the suggested species-level ID when in fact it is inappropriate to do so from a photograph alone. Other iNatters who might not be familiar with IDing spiders pass through and confirm the ID. As a result, I suspect there is a lot of bad data in the results for Eratigena spp.

I think this is a great solution. One system would be to use simple symbols, e.g. eye = IDable with the naked eye (I imagine that would suffice for a large swath of taxa); magnifying glass = IDable with a hand lens/jeweller’s loupe or macro photo; microscope = microscopic examination/dissection required. This is the system deployed in Bee et al., Britain’s Spiders (2017), and it works very well.
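As a rough sketch of how such a symbol scheme could be attached to suggestions, here is a tiny mapping in Python; the level names and badge glyphs are just placeholders, not anything iNat implements:

```python
# Hypothetical mapping from an expert-seeded difficulty level to a badge,
# mirroring the eye / hand lens / microscope scheme described above.
ID_BADGE = {
    "naked_eye": "👁",   # IDable by eye or from an ordinary photo
    "hand_lens": "🔍",   # needs a hand lens, loupe, or macro photo
    "microscope": "🔬",  # needs microscopic examination or dissection
}

def badge_for(level: str) -> str:
    """Badge to show next to a computer-vision suggestion; empty if unrated."""
    return ID_BADGE.get(level, "")

print(f"{badge_for('microscope')} Eratigena sp. (often not identifiable from photos alone)")
```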

5 Likes