Hi all - we just released a minor change today on the web to remove non-nearby suggestions from the computer vision suggestions by default. We released something similar on Android a week or two ago and a similar change on iOS is coming soon.
We hope will reduce the frequencies of people choosing suggestions that donât occur nearby and are thus likely wrong. But this is a minor change and doesnât really get at the heart of whats driving these suggestions so you still may see weird things like common ancestor suggestions (ie âweâre pretty sure its in this genusâ) that arenât seen nearby or may not be optimally calculated. We realize weâre probably due to revisit our whole approach to how we present suggestions (which is coming up on being 4 years old!) - ie the common ancestor and 10 suggestions and how we combine distribution (âseen nearbyâ) with the computer vision probabilities (âseen nearbyâ). Weâve started some experiments to help guide this work over the coming months, and please bear with us.
But let me explain how this change works with a very simple example using the clade of Giant Southern Pill Millipedes. This order has 5 families with very distinct geographic distributions
But the computer vision model thatâs in production right now only includes two taxa in this clade (ie the only possible âvisually similarâ suggestions). They are Family Zephroniidae and the principle genus in the Family Sphaerotheriidae (green circles below). Note: if we could snap our fingers and have a new model trained up on the current database today it would also include Arthrosphaeridae and the only genus in the Procyliosomatidae (Procyliosoma) - which points to a parallel solution to these issues which is to continue growing the training dataset!
In any case I hope you can see that this set up where we have several geographically disjunct choices that look similar but only a few of them available as choices in the model is ripe for the kind of issues discussed in this thread where non-nearby taxa are continually being suggested by the computer vision algorithm and clicked on. In fact this is what weâre seeing with identifications of Zephroniidae and Sphaerotherium being suggested far outside their ranges in places like southern India etc.
For example, this observation in South Africa was IDâd as the Southeast Asian family Zephroniidae as a result of non-nearby computer vision suggestions.
But with the new change youâll see that the suggestions now exclude such non-nearby taxa by default. The remaining pill millipede suggestion, the Visually Similar / Seen Nearby Genus Sphaerotherium is in fact the correct choice:
If you click on the new âInclude suggestions not nearbyâ toggle, youâll now see these non-nearby suggestions as before, including the suggestion of Zephroniidae which led to this mis-ID
Similarly,
this observation from Sri Lanka is now properly IDâd as Arthrosphaera.
But it has an early ID resulting from a computer vision suggestion of the non-nearby Zephroniidae. Again with this change Zephroniidae isnât suggested even though the correct ID Arthrosphaeridae/Arthrosphaera isnât suggested because its not available as a suggestion in the current model:
Again if you click on âInclude suggestions not seen nearbyâ youâll see these non-nearby suggestions of Sphaerotherium and Zephroniidae as before:
Even though hiding these non-nearby suggestions by default was a very easy fix to make we were hesitant to do it because we suspect these changes may lead people astray when theyâre in settings where the distributions donât help a lot with identifications such as gardens. For example, if someone is looking at a garden plant that is commonly observed enough globally to be a suggestion in the computer vision model but isnât represented by nearby observations (maybe its rarely planted in that part of the world) we are now hiding the correct visually similar suggestion by default. The user will have to know to click on âInclude suggestions not seen nearbyâ to reveal it. We expect that in these use cases the user might click on an incorrect visually similar/seen nearby native plant instead of the correct visually similar garden plant. So hopefully we arenât just replacing one set of mis-identifications with another.
As mentioned above though, we suspect this will be the the last tweak to our existing setup before a more major revamp to how we go about mashing up distribution data and computer vision suggestions and rolling probabilities up and down the tree to come up with a suggestion or set of suggestions. So please let us know if you think this minor tweak is an improvement to the status quo or not. And stand by for hopefully deeper tweaks to how iNat makes suggestions in the coming months.