Not sure if this is exactly a bug, but the auto-ID sometimes suggests monotypic genera. For example, this observation: https://www.inaturalist.org/observations/24639253 suggests the genus Sanguinaria. There is only one species of Sanguinaria. This seems to imply that it is more confident in the genus than in the species, which makes no sense.
It’s to do with the way Computer Vision technologies like this one work. You can only effectively train them at one level. In this case, it’s species level.
Where possible, the system then takes that species data, and mashes it together to try and predict a genus.
So what you’re seeing here is two different processes. The first is the raw confidence data for the observation, displayed as a list of species. The second is the result of a calculation of that data to work out whether it passes the required threshold to predict a genus.
When doing the genus calculation, the system only uses the relatively short list of possible matches (the species list you can see), so it knows there’s one possible match in the Sanguinaria genus, but doesn’t bother to see if there are any that didn’t make the list. Obviously though, it was quite confident about S. canadensis, because that one species was enough to pass the threshold for genus prediction.
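To make the two-step process above a bit more concrete, here is a rough Python sketch. It is only my guess at the general shape of the calculation, not iNaturalist's actual code: the `suggest_genus` function, the scores, and the 0.7 cutoff are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical shortlist of (species, genus, score).
# The scores are invented for illustration, not real model output.
shortlist = [
    ("Sanguinaria canadensis", "Sanguinaria", 0.82),
    ("Anemone quinquefolia", "Anemone", 0.07),
    ("Jeffersonia diphylla", "Jeffersonia", 0.05),
]

GENUS_THRESHOLD = 0.7  # assumed cutoff, not iNaturalist's real value


def suggest_genus(shortlist, threshold=GENUS_THRESHOLD):
    """Sum species scores per genus and return the best genus if it clears the threshold."""
    if not shortlist:
        return None
    totals = defaultdict(float)
    for _species, genus, score in shortlist:
        totals[genus] += score
    best = max(totals, key=totals.get)
    return best if totals[best] >= threshold else None


print(suggest_genus(shortlist))
```

Note that only the shortlist is consulted, so a single high-scoring species is enough to carry its genus over the threshold, which is the Sanguinaria situation described above.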
I believe there is a way for curators to mark any taxon as “complete” (containing all of its descendants) in iNaturalist. Seems like it wouldn’t be too hard to have the system check for this status, check the number of descendant species in the system, and always provide the species taxon if the suggested taxon is complete and monospecific.
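A minimal sketch of what that check might look like, in Python. The `Taxon` class, its `complete` flag, and `refine_suggestion` are hypothetical names for illustration; I don't know how iNaturalist actually stores this status.

```python
from dataclasses import dataclass, field


@dataclass
class Taxon:
    name: str
    rank: str
    complete: bool = False  # hypothetical flag: curators marked all descendants as present
    children: list = field(default_factory=list)


def refine_suggestion(taxon):
    """Return the sole species when a suggested genus is complete and monospecific."""
    if taxon.rank == "genus" and taxon.complete:
        species = [c for c in taxon.children if c.rank == "species"]
        if len(species) == 1:
            return species[0]
    return taxon


bloodroot = Taxon("Sanguinaria canadensis", "species")
sanguinaria = Taxon("Sanguinaria", "genus", complete=True, children=[bloodroot])
print(refine_suggestion(sanguinaria).name)
```

A genus that merely has one species entered so far, without the "complete" mark, is left at genus level, so the suggestion only changes when curators have confirmed the genus really is monospecific.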
If an identifier doesn’t happen to be aware that the genus is monospecific, however, and decides to be prudent and choose genus, then that still requires an extra ID to move it along to species level later. Might just save some superfluous ID steps…
This does seem like a bug to me. The list of suggestions strongly implies that an uncertain user should choose the monotypic genus. But choosing the species in that genus is always better. I understand that there is some reason why the computer vision algorithm happens to work this way, but that doesn’t make the end result any less of a mistake. Also, it seems straightforward to add post-processing to the computer vision algorithm’s current behavior to achieve the clearly better result of offering the species as the “pretty sure” choice.
I don’t think this is a particularly important bug, but I do think it’s clearly a bug from the UI point of view.
I was going to argue against this, but after a quick re-read, it made a lot of sense. Even the numbers seem pretty good. I would add the following:
Keep the species list underneath, but de-emphasise it (make it smaller, or greyed-out or something). I like the idea that if the AI gets it wrong or isn’t confident, an experienced identifier can still quickly spot the right species in the list and click it (many IDers like to do this rather than typing for efficiency)
Common Ancestor suggestion should have a ceiling. “We’re pretty sure it’s Life” shouldn’t be a serious prediction (which would happen in your model if a top-two of sea sponge and mushroom was <0.7). Maybe Class - above that should be obvious to almost everyone in most cases.
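Here is roughly how I picture the ceiling working, as a Python sketch. The rank list, the `capped_suggestion` function, and the shortened lineages are my own assumptions for illustration, not how the system is actually built.

```python
# Simplified rank order, coarsest first (an assumption; real taxonomies have more ranks).
RANKS = ["life", "kingdom", "phylum", "class", "order", "family", "genus", "species"]


def common_ancestor(lineage_a, lineage_b):
    """Deepest shared (rank, name) pair of two root-first lineages, or None."""
    shared = None
    for a, b in zip(lineage_a, lineage_b):
        if a != b:
            break
        shared = a
    return shared


def capped_suggestion(lineage_a, lineage_b, ceiling="class"):
    """Return the common ancestor only if it is at or below the ceiling rank."""
    shared = common_ancestor(lineage_a, lineage_b)
    if shared is None:
        return None
    rank, _name = shared
    if RANKS.index(rank) < RANKS.index(ceiling):
        return None  # coarser than the ceiling, so not worth suggesting
    return shared


# Illustrative, shortened lineages.
sponge = [("life", "Life"), ("kingdom", "Animalia"), ("phylum", "Porifera")]
mushroom = [("life", "Life"), ("kingdom", "Fungi"), ("phylum", "Basidiomycota")]
bloodroot = [("life", "Life"), ("kingdom", "Plantae"), ("phylum", "Tracheophyta"),
             ("class", "Magnoliopsida"), ("order", "Ranunculales"), ("family", "Papaveraceae")]
twinleaf = [("life", "Life"), ("kingdom", "Plantae"), ("phylum", "Tracheophyta"),
            ("class", "Magnoliopsida"), ("order", "Ranunculales"), ("family", "Berberidaceae")]

print(capped_suggestion(sponge, mushroom))
print(capped_suggestion(bloodroot, twinleaf))
```

With the ceiling at Class, the sponge/mushroom pair yields no common-ancestor suggestion at all instead of "Life", while two plants that share an order still get a useful one.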
I can imagine a researcher who is planning on splitting a monotypic genus might prefer the ability to only put it to genus until it gets published, but that seems like a real edge case, whereas the Sanguinaria situation has probably come up many, many times.
Nothing would prevent them from doing that. They just couldn’t use Computer Vision as a shortcut, and would have to type in (part of) the genus ID.
I do find that I am frequently adding IDs to observations where monospecific genera have been specified, just to get them to species level. Would be more efficient if I could just agree to an ID that was already at species level in those cases.
I heartily agree with this suggestion. The monotypic species issue is a point of needless complication, in my opinion, both for this specific issue and for people leaving monotypic species at the genus or even the family (or higher) level.
For what it’s worth, I support this idea a lot. Though it might take some effort to ensure iNat can detect “monotypic” genera correctly, as opposed to the genus having only one out of several species listed.
There are also identifiers who are aware of taxonomic revisions, and that a monotypic genus may have been split at any time since the last time they looked it up. Or who do not know that a formerly multi-species genus has become monotypic since the last time they looked it up. Cladistics seems to have made taxonomy a lot more unstable than it used to be. I sure was surprised to find out that cacao is considered to be in the mallow family now; all these years it was Theobromaceae for what had seemed like good reasons. My awareness of this makes me less confident of identifying something to species.
This really does need to be addressed. The taxon that comes to mind for me is Malosma laurina, the only member of the SoCal genus Malosma. I can’t tell you how many times I’ve IDed genus-level Malosma observations to the single species, and then had to write out a comment saying “this is actually the only species in this genus, so no need to ID to just genus level”. It’s just aching for a fix. Although, I’m told by a forum moderator that they aren’t taking CV-related feature requests, which seems odd. Maybe a lack of in-house expertise, or concern over having to take it offline for some short period of time to make changes?