Mass biogeographic idiocy of the AI

I often see it mentioned on here how amazing the Computer Vision AI is at making identifications. I’ve even seen it suggested that the AI will ultimately replace the need for human identifications, which makes me laugh. On the other hand, I don’t see nearly enough complaints about the rampant misidentifications that plague this site, many of which stem from this very same AI… so I’ll give an especially egregious example.

Parentia is a genus of dolichopodid flies with 70+ taxa. Judging from the map of iNat observations, one would assume that this is a common and cosmopolitan group, but, nay. The genus is largely endemic to the SW Pacific (Australia, New Zealand, New Caledonia, Fiji), with a couple of species described from southern Africa that may or may not belong. Yet the AI insists on using this taxon for specimens observed practically worldwide, despite there being nearly identical genera that are far more appropriate. I’d love to know what pattern recognition is being used by the AI here, seeing as these genera are distinguished primarily by genitalia differences that aren’t visible in photos.

In my opinion, the AI, as it is currently implemented, is the greatest flaw of this site. I’m an entomologist. I enjoy identifying insects, but I rarely touch the insect observations on here due to the overwhelming number of misidentifications. Consider how much effort it would take to curate just the Parentia observations. Hundreds of extralimital observations that need to be bumped to subfamily or family-level, all to correct a mistake created by an overconfident AI. This problem will only continue to get worse as more and more observations build up here, which is going to make the task of curation that much more onerous to other experts who might otherwise want to contribute their expertise.

My suggestion: the AI needs to provide a thorough breakdown of the confidence for its IDs, and suggestions should be limited to those that meet an acceptably high threshold (say, 90%). That means that if a species or genus-level suggestion doesn’t meet that 90% threshold, then a family or order-level ID is suggested. And if that doesn’t meet the 90% threshold, then a class or phylum or kingdom-level ID is provided instead.
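
As a rough sketch of that fallback rule (not a description of how the site's model actually works): assume the model returns one score per candidate, that the scores behave like probabilities summing to 1, and that each candidate's lineage is known. The function name `suggest`, the candidate labels, and the scores below are all made up for illustration.

```python
from collections import defaultdict

# Ranks ordered from finest to coarsest.
RANKS = ["species", "genus", "family", "order", "class", "phylum", "kingdom"]


def suggest(leaf_scores, lineages, threshold=0.9):
    """Return (rank, taxon, score) for the finest rank whose best taxon
    reaches the threshold, or None if no rank does.

    leaf_scores: hypothetical per-candidate scores, assumed to sum to 1.
    lineages: maps each candidate to a dict of rank -> taxon name.
    """
    for rank in RANKS:
        # Pool the scores of all candidates that share a taxon at this rank.
        totals = defaultdict(float)
        for leaf, score in leaf_scores.items():
            totals[lineages[leaf][rank]] += score
        taxon, score = max(totals.items(), key=lambda kv: kv[1])
        if score >= threshold:
            return rank, taxon, score
    return None


def lineage(species, genus, family):
    # Every candidate in this example is a fly, so the upper ranks are shared.
    return {
        "species": species, "genus": genus, "family": family,
        "order": "Diptera", "class": "Insecta",
        "phylum": "Arthropoda", "kingdom": "Animalia",
    }


# Placeholder candidates and invented scores: three look-alike dolichopodids
# and one unrelated fly.
lineages = {
    "Parentia sp.": lineage("Parentia sp.", "Parentia", "Dolichopodidae"),
    "lookalike A sp.": lineage("lookalike A sp.", "lookalike genus A", "Dolichopodidae"),
    "lookalike B sp.": lineage("lookalike B sp.", "lookalike genus B", "Dolichopodidae"),
    "muscid sp.": lineage("muscid sp.", "muscid genus", "Muscidae"),
}
leaf_scores = {
    "Parentia sp.": 0.45,
    "lookalike A sp.": 0.30,
    "lookalike B sp.": 0.20,
    "muscid sp.": 0.05,
}

# No species or genus reaches 0.9, so the suggestion falls back to family.
result = suggest(leaf_scores, lineages)
print(result)
```

With these invented numbers the top genus only reaches 0.45, so the suggestion stops at Dolichopodidae instead of committing to Parentia. The 0.9 default is just the 90% figure from above; the right value would have to be tuned against real observations.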

Do what you love. :)


there are plenty of other forum posts listing and compiling issues with the computer vision

they’re definitely not being ignored or remaining unseen


I see the same problem with Orthoptera, and I’m sure it’s ubiquitous in any group where the diversity in species is much greater than the diversity in appearance. It’s very time consuming to untangle all the mistakes and hard to convince people that the observation is out of range when they can see what look like confirmatory records all around. It would be nice for the AI to know and consider range, but the problem isn’t that the AI is overconfident; the problem is observers being over-eager in accepting the suggestions. I’ve even heard that some people take the selection knowing that it’s probably wrong because egregious errors will get attention from experts. Others just don’t really understand the process and think they’re doing what they’re supposed to. Maybe when you select from the computer vision, you need a pop-up asking you to confirm that you’re familiar with the species and you know that it occurs in your area. But that would annoy the high-volume identification users who are the ones fixing the mis-IDs.


Before submitting feature requests or starting new discussions about large topics like computer vision, please search the forum for existing topics. Here are a few where you can weigh in. Thanks!