Computer Vision should take into account fraction identified to species

Fair, and with this model perhaps if the CV was being overconfident on a particular species then simply adding the unknown conflicting species to iNat’s taxonomy would improve its suggestions. I’m not sure how you’d guide it on how confident to be though.

Say you had one species of mushroom (e.g. a red Russula, per the comments on the blog quoted above) that a handful of observers in a certain area regularly observe and identify to species based on DNA evidence. However, the fungus identifiers believe there may be a visually identical mushroom that no one has confirmed in the area yet, so they won’t identify new observations to species when the only evidence is photos. Given these observations, the CV will be highly confident, based on “Visually Similar / Expected Nearby”, that new mushroom observations can be identified to species, while human identifiers will be very confident that they can’t be.

If there is another species in the mushroom genus with 0 observations anywhere in the world, should that kill the CV’s confidence for identifying any species in that genus?

An equivalent situation from the CV’s perspective might be a genus of birds that contains some highly identifiable species, plus a couple lost species with 0 observations that are probably extinct (e.g. Campephilus woodpeckers). I wouldn’t want to kill the CV’s ability to ID those distinctive extant species.

Something based on the fraction of observations identified to species would probably work here, since there would likely be many mushroom observations stuck at genus and very few woodpecker observations stuck at genus.
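To make that concrete, here is a rough sketch of how such a weighting might look. This is only my guess at one way to do it, not how iNat's CV actually works: the function names, the simple multiplicative scaling, and all the counts are hypothetical and made up for illustration.

```python
def species_level_fraction(species_level: int, stuck_at_genus: int) -> float:
    """Share of a genus's observations that identifiers took to species."""
    total = species_level + stuck_at_genus
    if total == 0:
        # No observations at all: assume (hypothetically) no species-level confidence.
        return 0.0
    return species_level / total


def adjusted_species_score(raw_score: float, species_level: int, stuck_at_genus: int) -> float:
    """Scale a raw CV species score by how often humans reach species in that genus."""
    return raw_score * species_level_fraction(species_level, stuck_at_genus)


# Made-up counts for illustration only, not real iNaturalist data.
# Mushroom genus: most observations are left at genus by identifiers.
mushroom = adjusted_species_score(0.9, species_level=40, stuck_at_genus=360)
# Woodpecker genus: nearly every observation is taken to species.
woodpecker = adjusted_species_score(0.9, species_level=990, stuck_at_genus=10)

print(f"mushroom: {mushroom:.3f}, woodpecker: {woodpecker:.3f}")
```

Note that species with 0 observations never enter the calculation, so under this sketch an unobserved lost species wouldn't drag down the score for the distinctive extant ones; only observations that identifiers actually left at genus would.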
