North American Sinea ID and the Sorcerer's Apprentice problem

Yeah, the CV (computer vision) falls flat on its face when it comes to cryptic species that can’t be distinguished from a typical iNat observation. It’s worst when one species is more common than the others, or when there are specific areas where one species is easy to ID because that area legitimately only has the one species. Without janitorial work from identifiers, that will absolutely create a positive feedback loop in other regions; I see it happening in my taxa all the goddamn time. My surface-level understanding of the design decisions behind the CV is that the way it generates suggestions just doesn’t really account for the fact that cryptic species exist.

I can conceive of some ways to modify the algorithm to be less terrible in this regard, some requiring more manual intervention from users than others:

  1. ID feedback - if a suggested ID keeps getting disagreed with back to a higher rank (genus, species complex, subfamily, etc), then the algorithm takes the hint and stops suggesting that species at the species level
  2. Manual flags - put in taxon-by-taxon instructions that say things like “hey, don’t suggest to species level in this region”
  3. Rank distribution - if a ton of observations are stuck at a higher taxon in general (especially if RGed at that higher taxon) while proportionally few are IDed to any species within it, have the CV take that into account and start suggesting genus/species complex/subfamily more often.
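
To make the three ideas above a bit more concrete, here's a rough Python sketch of how they could combine into one "species or parent taxon?" decision. To be clear, this is not how the CV actually works; the `TaxonStats` fields, the `suggest_rank` function, and both thresholds are made up for illustration, and real values would need tuning per taxon.

```python
from dataclasses import dataclass


@dataclass
class TaxonStats:
    """Hypothetical per-region counts for one species and its parent taxon."""
    species_ids: int        # observations currently IDed to this species
    parent_only_ids: int    # observations stuck at the parent (genus, complex, ...)
    cv_accepted: int        # times the CV's species suggestion was picked
    cv_rolled_back: int     # times identifiers later disagreed back to the parent


def suggest_rank(stats, block_species=False,
                 max_rollback_rate=0.3, max_parent_ratio=3.0):
    """Return "species" or "parent". Thresholds are arbitrary placeholders."""
    # Idea 2: a manual flag for this taxon/region overrides everything else.
    if block_species:
        return "parent"
    # Idea 1: species suggestions keep getting bumped back to a higher rank.
    if stats.cv_accepted > 0:
        rollback_rate = stats.cv_rolled_back / stats.cv_accepted
        if rollback_rate > max_rollback_rate:
            return "parent"
    # Idea 3: far more observations sit at the parent taxon than at species.
    if stats.parent_only_ids > max_parent_ratio * max(stats.species_ids, 1):
        return "parent"
    return "species"
```

For example, `suggest_rank(TaxonStats(20, 500, 0, 0))` comes back as `"parent"`, because 500 observations stuck at the higher taxon swamp the 20 species-level IDs.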

The problems with the CV get easier to contextualize and visualize if you consider that it probably works best for (and is at least in some way optimized for) charismatic vertebrate fauna & charismatic flora that are easy for even an amateur to ID.

Edit: And also, for taxa that are cryptic at some points of their development but not others, it would be wise to implement a seasonality-based variation in how specific the iNat devs let the CV get for those taxa.
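
A minimal sketch of what that seasonal gate might look like, again in Python. The `SPECIES_LEVEL_MONTHS` table, the taxon name in it, and the months are all placeholders I made up, not real phenology data, and a real version would presumably need to be per-region as well.

```python
# Hypothetical lookup: months (1-12) in which a taxon is distinguishable
# to species. The entry below is a placeholder, not real data.
SPECIES_LEVEL_MONTHS = {
    "Examplegenus": {6, 7, 8},
}


def finest_rank_allowed(genus, month):
    """Return the finest rank the CV may suggest for this genus and month."""
    months = SPECIES_LEVEL_MONTHS.get(genus)
    if months is None:
        # No seasonal restriction recorded for this genus.
        return "species"
    return "species" if month in months else "genus"
```

So `finest_rank_allowed("Examplegenus", 1)` would cap suggestions at genus in January, while any genus not in the table is left alone.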
