I often see it mentioned on here how amazing the Computer Vision AI is at making identifications. I’ve even seen it suggested that the AI will ultimately replace the need for human identifications, to which I laugh. On the other hand, I don’t see nearly enough complaints about the rampant misidentifications that plague this site, many of which stem from this very same AI… so I’ll give an especially egregious example.
Parentia is a genus of dolichopodid flies with 70+ taxa. Judging from the map of iNat observations, one would assume that this is a common and cosmopolitan group, but, nay. The genus is largely endemic to the SW Pacific (Australia, New Zealand, New Caledonia, Fiji), with a couple of species described from southern Africa that may or may not belong. Yet the AI insists on suggesting this taxon for specimens observed practically worldwide, despite there being nearly identical genera that are far more appropriate. I’d love to know what pattern recognition the AI is using here, seeing as these genera are distinguished primarily by genitalia differences that aren’t visible in photos.
In my opinion, the AI, as it is currently implemented, is the greatest flaw of this site. I’m an entomologist. I enjoy identifying insects, but I rarely touch the insect observations on here due to the overwhelming number of misidentifications. Consider how much effort it would take to curate just the Parentia observations: hundreds of extralimital observations that need to be bumped to subfamily or family level, all to correct a mistake created by an overconfident AI. This problem will only get worse as more and more observations build up here, which will make the task of curation that much more onerous for other experts who might otherwise want to contribute their expertise.
My suggestion: the AI needs to provide a thorough breakdown of the confidence score for its IDs, and its suggestions should be limited to those that meet an acceptably high threshold (say, 90%). That means that if a species or genus-level suggestion doesn’t meet that 90% threshold, then a family or order-level ID is suggested. And if that doesn’t meet the 90% threshold, then a class or phylum or kingdom-level ID is provided instead.
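For what it’s worth, the fallback I’m describing could be sketched roughly like this in Python. This is only an illustration, not how the site’s Computer Vision actually works: the function name `suggest_id`, the scores, and the lineage table are all made up for the example, and it assumes the model returns per-genus scores that sum to about 1, so they can be added up at each higher rank.

```python
from collections import defaultdict


def suggest_id(leaf_scores, lineages, threshold=0.90):
    """Return (taxon, score) for the most specific taxon whose summed
    score meets the threshold, or None if nothing qualifies.

    leaf_scores: hypothetical model output, e.g. {"Parentia": 0.40, ...}
    lineages:    for each scored taxon, its ancestors ordered from
                 kingdom down to the taxon itself
    """
    totals = defaultdict(float)  # summed score per taxon at every rank
    depth = {}                   # position in the lineage (0 = kingdom)

    for leaf, score in leaf_scores.items():
        for level, taxon in enumerate(lineages[leaf]):
            totals[taxon] += score
            depth[taxon] = level

    candidates = [t for t, s in totals.items() if s >= threshold]
    if not candidates:
        return None

    # Prefer the deepest (most specific) rank that clears the threshold.
    best = max(candidates, key=lambda t: (depth[t], totals[t]))
    return best, totals[best]


# Made-up scores for a single photo of a long-legged fly.
scores = {"Parentia": 0.40, "Sciapus": 0.35, "Amblypsilopus": 0.20, "Musca": 0.05}

dolichopodid = ["Animalia", "Arthropoda", "Insecta", "Diptera", "Dolichopodidae"]
lineages = {
    "Parentia": dolichopodid + ["Parentia"],
    "Sciapus": dolichopodid + ["Sciapus"],
    "Amblypsilopus": dolichopodid + ["Amblypsilopus"],
    "Musca": ["Animalia", "Arthropoda", "Insecta", "Diptera", "Muscidae", "Musca"],
}

taxon, score = suggest_id(scores, lineages)
print(taxon)  # Dolichopodidae
```

In this toy case no genus comes anywhere near 90%, so the suggestion backs off to the family, which is where an extralimital "Parentia" would end up after a human curated it anyway. If nothing reaches the threshold even at kingdom level, the function returns `None` and no suggestion would be offered.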