Better use of location in Computer Vision suggestions

DianaStuder · December 2, 2020, 9:49pm

There are two competing streams for ID on iNat.

A garden plant, not marked as Casual, so iNat offers visually similar and nearby - which is not necessarily helpful or useful. Especially to someone new to iNat.
A (genuine) wild plant - where nearby is more useful than visually similar.

I see weird things on the distribution maps - which I try to tweak where I can.

dlevitis · December 3, 2020, 3:54am

One simple thing that may help in the interim is removing the language “We’re pretty sure this is…”

To inexperienced users, seeing that from the AI is difficult to distinguish from “This is…”

Pretty much every time I upload a pile of images I get a wildly incorrect “We’re pretty sure this is…”

Maybe something like, “Our top suggestion” would encourage fewer blind acceptances.

All that said, I greatly appreciate how quickly iNat has improved, and can hardly imagine how impressed we all would have been with the AI not so many years ago.

thebeachcomber · December 3, 2020, 5:14am

100% agree with this; I think a lot of people associate any kind of AI or computer software with infallibility, so when given a list of 10 suggestions they assume that one of them has to be correct

alisonnorthup · December 3, 2020, 5:52pm

@loarie, thanks for sharing about this update. This seems to me like a reasonable solution, and I’m looking forward to seeing the implementation. It will help a lot! I like the idea of having the button to override location restrictions - that may also help for IDs of common (mostly introduced) species in parts of the world that haven’t accumulated many local observations yet. Just hopefully random people won’t be pressing that button too frequently.

chauncey · December 4, 2020, 4:09pm

Thank you. It is funny to me when I see an ID for Sierra Juniper (J. grandis) in Virginia or in Poland as well as a suggested ID of J. chinensis in the middle of Idaho.

loarie · March 4, 2021, 6:57am

Hi all - we just released a minor change today on the web to remove non-nearby suggestions from the computer vision suggestions by default. We released something similar on Android a week or two ago and a similar change on iOS is coming soon.

We hope this will reduce the frequency of people choosing suggestions that don’t occur nearby and are thus likely wrong. But this is a minor change and doesn’t really get at the heart of what’s driving these suggestions, so you may still see weird things like common ancestor suggestions (ie “we’re pretty sure it’s in this genus”) that aren’t seen nearby or may not be optimally calculated. We realize we’re probably due to revisit our whole approach to how we present suggestions (which is coming up on being 4 years old!) - ie the common ancestor and 10 suggestions, and how we combine distribution (‘seen nearby’) with the computer vision probabilities (‘visually similar’). We’ve started some experiments to help guide this work over the coming months, so please bear with us.

But let me explain how this change works with a very simple example using the clade of Giant Southern Pill Millipedes. This order has 5 families with very distinct geographic distributions.

But the computer vision model that’s in production right now only includes two taxa in this clade (ie the only possible ‘visually similar’ suggestions). They are Family Zephroniidae and the principal genus in the Family Sphaerotheriidae (green circles below). Note: if we could snap our fingers and have a new model trained up on the current database today, it would also include Arthrosphaeridae and the only genus in the Procyliosomatidae (Procyliosoma) - which points to a parallel solution to these issues, which is to continue growing the training dataset!

In any case, I hope you can see that this setup, where several geographically disjunct taxa look similar but only a few of them are available as choices in the model, is ripe for the kind of issues discussed in this thread, where non-nearby taxa are continually suggested by the computer vision algorithm and clicked on. In fact, this is what we’re seeing: identifications of Zephroniidae and Sphaerotherium are being suggested far outside their ranges, in places like southern India.

For example, this observation in South Africa was ID’d as the Southeast Asian family Zephroniidae as a result of non-nearby computer vision suggestions.

But with the new change you’ll see that the suggestions now exclude such non-nearby taxa by default. The remaining pill millipede suggestion, the Visually Similar / Seen Nearby Genus Sphaerotherium is in fact the correct choice:

If you click on the new ‘Include suggestions not nearby’ toggle, you’ll now see these non-nearby suggestions as before, including the suggestion of Zephroniidae which led to this mis-ID.

Similarly, this observation from Sri Lanka is now properly ID’d as Arthrosphaera.

But it has an early ID resulting from a computer vision suggestion of the non-nearby Zephroniidae. Again, with this change Zephroniidae is no longer suggested, although the correct ID Arthrosphaeridae/Arthrosphaera isn’t suggested either because it’s not available as a suggestion in the current model:

Again if you click on “Include suggestions not seen nearby” you’ll see these non-nearby suggestions of Sphaerotherium and Zephroniidae as before:
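As a rough illustration of the behaviour described in these two examples, here is a minimal Python sketch of hiding non-nearby suggestions by default. It is not iNaturalist's actual code: the `Suggestion` class, the `seen_nearby` flag, the `visible_suggestions` function and the scores are all hypothetical. Only the taxon names, the 10-suggestion list and the default-off toggle come from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    taxon: str
    score: float       # hypothetical visual-similarity score
    seen_nearby: bool  # hypothetical flag: taxon has observations near this location


def visible_suggestions(candidates, include_not_nearby=False, limit=10):
    """Rank candidates by score and hide non-nearby taxa unless the toggle is on."""
    ranked = sorted(candidates, key=lambda s: s.score, reverse=True)
    if not include_not_nearby:
        ranked = [s for s in ranked if s.seen_nearby]
    return ranked[:limit]


# South African observation: only Sphaerotherium is seen nearby (scores are made up).
south_africa = [
    Suggestion("Zephroniidae", 0.55, seen_nearby=False),
    Suggestion("Sphaerotherium", 0.40, seen_nearby=True),
]
default_view = [s.taxon for s in visible_suggestions(south_africa)]
toggled_view = [
    s.taxon for s in visible_suggestions(south_africa, include_not_nearby=True)
]

# Sri Lankan observation: neither taxon in the model is seen nearby, so the
# default view has no pill millipede suggestion at all.
sri_lanka = [
    Suggestion("Zephroniidae", 0.50, seen_nearby=False),
    Suggestion("Sphaerotherium", 0.45, seen_nearby=False),
]
sri_lanka_default = visible_suggestions(sri_lanka)
```

In this sketch the toggle never changes the ranking, it only decides whether the non-nearby rows are shown.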

Even though hiding these non-nearby suggestions by default was a very easy fix to make, we were hesitant to do it because we suspect these changes may lead people astray in settings where distributions don’t help much with identifications, such as gardens. For example, if someone is looking at a garden plant that is commonly observed enough globally to be a suggestion in the computer vision model but isn’t represented by nearby observations (maybe it’s rarely planted in that part of the world), we are now hiding the correct visually similar suggestion by default. The user will have to know to click on “Include suggestions not seen nearby” to reveal it. We expect that in these use cases the user might click on an incorrect visually similar/seen nearby native plant instead of the correct visually similar garden plant. So hopefully we aren’t just replacing one set of mis-identifications with another.

As mentioned above though, we suspect this will be the last tweak to our existing setup before a more major revamp of how we go about mashing up distribution data and computer vision suggestions and rolling probabilities up and down the tree to come up with a suggestion or set of suggestions. So please let us know if you think this minor tweak is an improvement to the status quo or not. And stand by for hopefully deeper tweaks to how iNat makes suggestions in the coming months.

deboas · March 4, 2021, 11:22am

This looks very promising @loarie, thanks to you and the iNat team! Let’s see how it plays out. Is there any way to compare its effect systematically, such as proportion of computer vision suggestions with later disagreements before and after? Disagreements would have to be within x months because older observations have had more time to accumulate additional IDs.
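One way such a comparison might be framed, sketched in Python under stated assumptions: each record is a computer-vision-based ID with the date it was added and the date of the first later disagreement, if any. The field names (`cv_id_date`, `disagreed_on`), both functions and the sample records are hypothetical and made up for illustration. Only the 4 March 2021 web release date comes from this thread.

```python
from datetime import date, timedelta

RELEASE = date(2021, 3, 4)  # web release date mentioned in this thread


def disagreement_rate(records, window_days, as_of):
    """Share of CV-based IDs that drew a disagreement within the window.

    Records younger than the window (relative to as_of) are skipped, so old
    and new observations get the same amount of time to attract disagreements.
    """
    window = timedelta(days=window_days)
    eligible = [r for r in records if r["cv_id_date"] + window <= as_of]
    if not eligible:
        return None
    hits = sum(
        1
        for r in eligible
        if r["disagreed_on"] is not None
        and r["disagreed_on"] - r["cv_id_date"] <= window
    )
    return hits / len(eligible)


def split_by_release(records, release=RELEASE):
    """Split records into those added before and on/after the release date."""
    before = [r for r in records if r["cv_id_date"] < release]
    after = [r for r in records if r["cv_id_date"] >= release]
    return before, after


# Made-up records, purely to show the mechanics.
records = [
    {"cv_id_date": date(2021, 1, 10), "disagreed_on": date(2021, 2, 1)},
    {"cv_id_date": date(2021, 1, 20), "disagreed_on": date(2021, 6, 1)},  # too late to count
    {"cv_id_date": date(2021, 3, 10), "disagreed_on": None},
    {"cv_id_date": date(2021, 3, 15), "disagreed_on": date(2021, 3, 20)},
    {"cv_id_date": date(2021, 3, 20), "disagreed_on": None},
    {"cv_id_date": date(2021, 6, 25), "disagreed_on": None},  # too recent, skipped
]
before, after = split_by_release(records)
before_rate = disagreement_rate(before, window_days=60, as_of=date(2021, 7, 1))
after_rate = disagreement_rate(after, window_days=60, as_of=date(2021, 7, 1))
```

A fixed window is only one way to handle the unequal ageing of observations, and the rates from this toy data say nothing about the real effect of the change.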

Marking this as “solution” although it’s great to see there are some even greater changes in the pipeline.

matthias55 · March 4, 2021, 5:52pm

I agree, looks promising thanks for doing it. To address Loarie’s concern above that “captive/cultivated” species may be misIDed at a greater rate with this change, what about having the default toggle to include visually similar taxa when the observer checks the box that the observation is captive/cultivated?

From my experience, when I have an observation that is captive/cultivated, the CV doesn’t work as well anyway, and I know to discount suggestions that are far away geographically. But I can easily see how a new user might not know this, and I would expect the error rate to go down with the change that you made.

Finally - unrelated question: How many taxa make up the current iteration of the CV model? Would it be possible to add an icon showing that a taxon is included in the CV, maybe on the taxon page under taxonomy? Would that be useful?

teellbee · March 4, 2021, 7:05pm

There have been a few odd AI behaviors noted today:

https://forum.inaturalist.org/t/did-something-change-with-the-ai/20834/14

deboas · March 5, 2021, 12:00pm

It looks like the cause of that has since been identified and fixed, and was unrelated to this thread on use of location :)

triplett · March 6, 2021, 11:33pm

It is not just cultivated plants where “Include suggestions not nearby” is useful. Recently I posted an observation of a bug which the AI suggested was a European Firebug. I rejected this because the species did not occur in Australia according to national databases. Turned out it was indeed a European Firebug, which has been naturalized here since 2018. Very embarrassing - I should have investigated further since there were already many local observations, and it was not likely they were all wrong. But newly invasive species will unfortunately now be missed unless you click the “Include suggestions not nearby” button, which most people will not do. Perhaps species which are a close match should be included in the suggestions but with a warning that they are outside of the previously known range? It could also activate a siren and flashing lights in the office of the local bio-security authorities. The early discovery of one invasive species would by itself justify the existence of iNaturalist.

jdmore · March 7, 2021, 12:20am

In that case, I would think that the species would already be detected as being “seen nearby” - unless that has to wait until the next iteration of the AI model.

alexb0000 · March 13, 2021, 9:17pm

I think this is a really good change, and the benefit it will have with native species outweighs the drawbacks with cultivated and introduced species. In the reptile world it’s really common for things to get misidentified since a lot of species within a genus look really similar, and a lot of people never come back and fix their initial identification so it ends up taking three correct identifications to get it to research grade instead of just one agreeing with the AI.

I would think common garden plants will probably have nearby observations just about everywhere, so that should help with there already being nearby observations. It’s just going to be getting that first observation of a cultivated/invasive species that’s a little harder.

alisonnorthup · April 13, 2021, 1:38pm

It looked like this change happened but has now been undone? Is there another thread where that is discussed?

tiwane · April 13, 2021, 8:37pm

Working for me on web, iOS (version 3.2, which isn’t at full release yet), and Android.

tiwane · April 13, 2021, 9:28pm

Aaand, version 3.2 of our iOS app has been fully released. So current versions of the iNat Android app, and iOS app (plus the website) show only “Seen Nearby” computer vision results by default. See @loarie’s explanation here. I’m going to close this request.
