Computer suggestion being ''too precise''

rupertclayton · January 23, 2024, 3:00am

This has been suggested (in various ways) many times before, and I think it has great potential, although it would likely require quite a bit of testing to work out the best way to flag a particular family, genus, etc. as the lowest level amenable to CV ID. But we also need to think through how and why we’re so sure that CV will never be able to offer accurate suggestions below particular levels.

I think it’s reasonable for taxon experts to say something like “even we cannot identify spiders in this subfamily any more precisely based on typical photos, because we know the distinguishing characters require microscopic examination of genitalia”. However, that does not automatically mean that CV cannot in principle distinguish some genera or species within the subfamily. Let’s imagine that CV is trained on a database of typical smartphone photos of spiders that have already been accurately IDed via microscopy (a big task). CV might then discover aspects of those photos that reliably predict particular genera and species, even though the photos don’t contain the microscopic detail the experts would need to key out those taxa. That’s because not every subtle character is contained in a species description or a key. Species differ in ways we haven’t yet recognized.

How serious is this issue? My feeling is that there’s a lot more information that may distinguish taxa than we appreciate. But unlocking that would require a large training dataset of accurately identified images that don’t usefully show any of the characters an expert would currently use to distinguish these taxa. And even if this were successful, we would have no way of confirming that the species-level CV identification of an unknown spider was actually correct, so the value would be very limited. Plus, CV doesn’t (yet) tell you why it favors a particular ID, so there’s no logic to interrogate.

So… I think the limit of precision for human identifiers and CV in making IDs from photos may be different in principle, but not in practice. Which means (I believe) that it would be reasonable for the community to establish effective limits of visual identification that could be used both to guide CV and to inform less familiar users—“Spiders in this subfamily can’t be distinguished without microscopy” or “Fungi in this complex can only be distinguished by DNA sequencing”.
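To make the idea concrete, here is a rough sketch of how a community-set limit might cap a CV suggestion. Everything in it is an assumption for illustration: the `Taxon` class, the `visual_id_limit` flag, the `limit_note` field and the `cap_suggestion` function are hypothetical names, and the genus and species are placeholders. It is not how iNaturalist's CV or taxonomy actually works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Taxon:
    name: str
    rank: str
    parent: Optional["Taxon"] = None
    # Hypothetical community-set flag: photos cannot resolve anything finer.
    visual_id_limit: bool = False
    # Hypothetical explanation shown to less familiar users.
    limit_note: str = ""


def cap_suggestion(suggested: Taxon) -> Taxon:
    """Return the coarsest flagged taxon on the lineage, or the suggestion itself."""
    capped = suggested
    node: Optional[Taxon] = suggested
    while node is not None:
        if node.visual_id_limit:
            capped = node  # keep climbing: a flag higher up overrides this one
        node = node.parent
    return capped


# Placeholder lineage; only the subfamily name comes from the discussion above.
subfamily = Taxon(
    "Ageleninae",
    "subfamily",
    visual_id_limit=True,
    limit_note="Spiders in this subfamily can't be distinguished without microscopy",
)
genus = Taxon("Placeholder genus", "genus", parent=subfamily)
species = Taxon("Placeholder species", "species", parent=genus)

shown = cap_suggestion(species)
print(f"{shown.name} ({shown.rank}): {shown.limit_note}")
```

In this sketch a species-level guess inside a flagged subfamily is shown as the subfamily plus the explanatory note, while a suggestion with no flag anywhere on its lineage passes through unchanged.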

As a bonus, avoiding spurious accuracy for CV suggestions in some taxon groups might then allow CV to be opened up to offer finer IDs for others. Eventually, CV will suggest the right subfamily for Ageleninae spiders, the right hybrid name for Crocosmia × crocosmiiflora and the right subspecies for common flowering plants.

