CV not suggesting species-complexes

The current strategy of “roll back observations to keep any constituent species from getting to 100 photos” is not tenable. Sometimes there are species complexes that have a distinct member (whether visually or by range), and we’re supposed to nuke that low-hanging fruit just to help out the odds for the rest of the group? People generally want their stuff to be identified as well as possible, and this sounds like bad science and bad vibes.

I’ve noticed the CV suggestions actually getting worse recently for several common groups of North American robber flies because of this issue. Read on for details of this case study:

The Laphria canis complex is the second largest (by observation count) “taxon” of North American Laphria, with over 4000 observations. The eastern species generally aren’t identifiable from photos, but there’s one disjunct species out west which finally accumulated enough observations (77) to get itself included in the CV model. So now the model won’t recommend the complex to anybody. Instead it flails around suggesting a variety of species (bad) or sometimes the genus (fine, though suboptimal), which has increased our work triaging the scores of Laphria observations that come in each day under other names.

A similar and linked problem recently cropped up in the Laphria index complex. We’ve gotten enough observations from Canada (where only one species occurs) that one species is now in the CV model, while the other is nowhere near the threshold, for good reason. What’s more, this group is similar enough in gestalt to the L. canis complex that I’m now seeing the model suggest L. index even for photos that used to be safely pipelined into “canis complex”.

I could imagine that some might suggest tweaking such complexes to exclude any species that get themselves in the model on their own. But since iNat species complexes are by definition supposed to be monophyletic groups, that isn’t always doable.

I’m gauging “model suggested” from how many IDs I see come in with the CV tag. Even if the model still doesn’t give an above-the-fold “Pretty Sure” recommendation for complexes (not sure if that’s changed), it is at least including complexes in its “Top Suggestions” list, which is useful.

We need a better long-term solution: let the CV model recommend complexes even when they’re not leaves in the model, the way it can with genera.
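To make the request concrete, here is a rough sketch of what I mean. It is not how the iNat model actually works; I’m assuming the model could keep a class for the complex itself (trained on the photos that can only go to complex) alongside any member species that qualifies on its own, and that scores get summed up the tree before a suggestion is picked. The taxonomy, the taxon names, the scores, and the 0.8 cutoff are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent). Placeholder names, not real data.
PARENT = {
    "western species": "canis complex",
    "canis complex": "Laphria",
    "index-complex species": "index complex",
    "index complex": "Laphria",
}


def depth(taxon):
    """Number of steps from this taxon up to the root of the toy tree."""
    steps = 0
    while taxon in PARENT:
        taxon = PARENT[taxon]
        steps += 1
    return steps


def rolled_up_scores(class_scores):
    """Add each class's score to itself and to every ancestor above it."""
    totals = defaultdict(float)
    for taxon, score in class_scores.items():
        node = taxon
        totals[node] += score
        while node in PARENT:
            node = PARENT[node]
            totals[node] += score
    return dict(totals)


def best_suggestion(class_scores, threshold=0.8):
    """Pick the most specific taxon whose rolled-up score clears the cutoff."""
    totals = rolled_up_scores(class_scores)
    confident = [t for t, s in totals.items() if s >= threshold]
    if not confident:
        return None
    # Prefer the deepest taxon; break ties by the higher combined score.
    return max(confident, key=lambda t: (depth(t), totals[t]))


# Made-up scores for an eastern photo that can't go past complex:
eastern_photo = {
    "canis complex": 0.55,          # assumed complex-level class
    "western species": 0.30,
    "index-complex species": 0.15,
}
print(best_suggestion(eastern_photo))
```

With those made-up numbers no single species clears the cutoff, but the complex does once its own score and the western species’ score are combined, so the complex is what gets suggested rather than a wrong species or just the genus. A photo that scores high for the western species on its own would still get the species.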
