See this observation: https://www.inaturalist.org/observations/265715372. CV suggests this is Procambarus clarkii. Any knowledgeable identifier would know it is not that species: on adults, which this clearly is, P. clarkii usually has red tubercles on the claws, or sometimes yellowish ones, not black ones. This is pretty easily identified as a member of the subgenus Ortmannicus, which includes P. acutus and P. zonangulus, two species that are indistinguishable in the photos provided. Although I have IDed it as the latter, which is not in the CV model, P. acutus is, and I think CV should identify it as that one. Instead, it only offers it as the second choice.

P. clarkii has 15,987 research-grade observations and P. acutus has 893, which seems like more than enough to train CV to distinguish the two correctly. I have occasionally noticed that incorrectly identified observations seem to stem from incorrect CV suggestions, so I ran a test on 30 research-grade P. acutus observations: CV correctly suggested 76.6% of them, while 20.0% were incorrectly suggested to be P. clarkii and 3.3% another species. To me, all of these observations are readily identifiable. Is a 23.3% rate of incorrect CV suggestions reasonable? I would hope it would do better.
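As a caveat on my own numbers: with a spot check of only 30 observations, the margin of error around that 23.3% figure is wide. Here is a quick back-of-the-envelope Wilson interval in Python (my own rough calculation, nothing official from iNat):

```python
from math import sqrt

n = 30             # research-grade P. acutus observations tested
wrong = 6 + 1      # 6 suggested as P. clarkii + 1 as another species
p = wrong / n      # observed error rate, ~23.3%
z = 1.96           # ~95% confidence

# Wilson score interval for a binomial proportion
centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
print(f"error rate {p:.1%}, 95% CI roughly {centre - half:.1%} to {centre + half:.1%}")
# -> error rate 23.3%, 95% CI roughly 11.8% to 40.9%
```

So the true error rate could plausibly be anywhere from roughly 12% to 41%; a larger sample would pin it down better, but even the low end seems high to me for such a distinctive species.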
None of us can pinpoint exactly why computer vision struggles in specific cases, but FWIW, in the first image (the one computer vision uses to suggest a taxon), much of the organism is in shadow, which limits the contrast between colors.
If you use CV on one of the brighter images that clearly shows the claws’ colors, the more closely related P. acutus is suggested first.
@bouteloua’s comments highlight that CV suggestions are not only at the whim of the “black box” that is CV, but also of the inputs from the OP. In-focus, well-lit, close-cropped first images will always yield better CV results. It’s an ongoing process of user education.
And then there’s Seek identifying this as a crayfish…
I had to try this one myself! Luckily it seems to ID it correctly in iNaturalist. I wonder what’s going on with Seek. Maybe it has an older CV model.
CV can be good at pointing you in the right general direction for IDs, but it is often pretty bad at picking up the finer details that are frequently necessary to distinguish crawfish. In my experience, CV often simply suggests the most common crawfish species in the area as the top result (in my current area, that is Cambarus bartonii), even for very clear photos of other species. I also imagine the vast number of misidentified crawfish on iNat (including P. clarkii and P. acutus being confused with each other), many of which are research grade, are skewing the CV training as well. That becomes a self-fulfilling cycle: incorrect CV suggestions lead to misidentified observations, which lead to worse CV. Ultimately we really need more human crawfish identifiers, because I don’t believe the AI will ever be very good at that task.
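To make the feedback-loop worry concrete, here is a toy sketch in Python. It is purely illustrative (a synthetic one-feature classifier with made-up numbers, not iNat’s actual training pipeline): flipping a growing share of the rare species’ training labels to the common species, as misidentified research-grade records would, drags the decision boundary toward the common species and lowers recall on the rare one.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_and_score(mislabel_rate):
    # Synthetic 1-D "claw colour" feature: common species centred at 0, rare species at 3
    n_common, n_rare = 20000, 1500
    x = np.concatenate([rng.normal(0.0, 1.0, n_common), rng.normal(3.0, 1.0, n_rare)])
    y = np.concatenate([np.zeros(n_common), np.ones(n_rare)])

    # Feedback effect: a fraction of true rare-species records carry the wrong label
    flip = (y == 1) & (rng.random(y.size) < mislabel_rate)
    y_train = np.where(flip, 0.0, y)

    # "Training": pick the threshold that minimises error against the noisy labels
    thresholds = np.linspace(0.0, 4.0, 401)
    errors = [np.mean((x > t).astype(float) != y_train) for t in thresholds]
    t_best = thresholds[int(np.argmin(errors))]

    # Evaluate on a clean sample of the rare species
    x_test = rng.normal(3.0, 1.0, 5000)
    return t_best, np.mean(x_test > t_best)

for rate in (0.0, 0.2, 0.4):
    t, recall = train_and_score(rate)
    print(f"{rate:.0%} of rare-species training labels wrong -> "
          f"threshold {t:.2f}, rare-species recall {recall:.0%}")
```

The exact numbers are meaningless, but the direction is the point: label noise that disproportionately hits the rarer species makes the model even more reluctant to suggest it, which then produces more misidentified observations.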
76.6% is worse than for birds but heaps better than for Australian grasses.
I still look at CV suggestions. It sometimes snaps me out of the “same genus as previous” bias, and it might offer the correct ID, which saves me typing.
I set the order of photos with expert identifiers in mind. CV will be trained on all the photos anyway.
It will take some consensus to say what percentage is good enough. Currently, CV is only spot-tested by identifiers. Once that testing is automated, CV could excuse itself from offering species-level suggestions and fall back to genus level or higher, as it already does when the visuals do not quite match.
I looked at the example. CV correctly identified the crayfish to genus. What I think should change is that the species-level suggestions listed below that should be labelled as “Here are our top guesses.”
The other option, for a lot of cases, is to introduce subgenera or sections. That would refine which correct IDs come out of CV.