In cases where the CV is not "pretty sure" of anything, offer a suggestion of a higher taxon

Not sure if this should be described as a bug or a feature request, but it always seems strange to me that when the CV is not “pretty sure” of anything, it will nevertheless suggest something as specific as a species-level ID.

E.g.
Here it isn’t pretty sure of anything, yet the top two suggestions are species-level and both incorrect.

Here it is pretty sure it’s something, and that’s only a superfamily, but that’s correct.

Wouldn’t it be better if it always showed the lowest common denominator of the top suggestions, i.e. the most specific taxon they all share? Even if that ends up being a “Pretty sure it’s a Fly!”, that seems better than autosuggesting something species-level but uncertain.
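The “lowest common denominator” idea could be sketched roughly as below. This is only an illustration and not how the iNaturalist CV actually works: the function name `lowest_common_taxon` and the representation of each suggestion as a root-to-species list of taxon names are assumptions made for the example, and the lineages are hand-written sample data.

```python
from typing import Optional, Sequence


def lowest_common_taxon(lineages: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the most specific taxon shared by every lineage, or None.

    Each lineage is assumed to be ordered from the root (e.g. kingdom)
    down to the suggested taxon (e.g. species).
    """
    shared = None
    # Walk all lineages in parallel, one rank at a time, from the root down.
    for names in zip(*lineages):
        if all(name == names[0] for name in names):
            shared = names[0]  # still on a common branch
        else:
            break  # the suggestions diverge at this rank
    return shared


# Hypothetical top two suggestions: a hoverfly and a housefly.
top_suggestions = [
    ["Animalia", "Arthropoda", "Insecta", "Diptera", "Syrphidae", "Eristalis", "Eristalis tenax"],
    ["Animalia", "Arthropoda", "Insecta", "Diptera", "Muscidae", "Musca", "Musca domestica"],
]

print(lowest_common_taxon(top_suggestions))  # Diptera
```

With these two species-level suggestions, the sketch falls back to the order Diptera, which is the “Pretty sure it’s a Fly!” outcome described above.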

As with human identifiers, and the suggested “Identification etiquette”, ideally, I think the top AI autosuggest should represent the level it has an actual degree of certainty about. My hunch being that users, like myself at times, will just select the option at the top of the list if they have no idea, to keep things quick.

Species level IDs with very little certainty increase the chance of incorrect RG observations slipping through the net, creating the kind of feedback loops we see noted in the CV clean-up wiki.

It’s also more work for identifiers to correct an ID on a conflicting branch than it is to refine a coarser ID to a lower rank.

Alternatively
I think I noticed that the app actually states "we are not certain of anything", or something along those lines, in this instance. It would be good to at least have this kind of wording on the website as well as in the app, if the above is not possible or desired.

Yes. There needs to be a significant review of this functionality. Novice users have no idea how untrustworthy these AI suggestions are. Species- and genus-level IDs should be reserved for instances where there is strong evidence (something akin to a 95% confidence level). Otherwise, the standard should be to offer IDs at a much higher taxonomic level, or perhaps none at all if there is no obvious suggestion.
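One way to picture this kind of threshold rule is sketched below. It is purely hypothetical and says nothing about how the iNaturalist CV is implemented: the function name `most_specific_confident_taxon`, the `(lineage, score)` input format, and the treatment of model scores as if they were probabilities summing to at most 1 are all assumptions for the sake of the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Suggestion = Tuple[Sequence[str], float]  # (root-to-species lineage, score)


def most_specific_confident_taxon(
    suggestions: List[Suggestion], threshold: float = 0.95
) -> Optional[str]:
    """Return the deepest taxon whose summed score reaches the threshold."""
    # Add each suggestion's score to every ancestor on its lineage.
    totals: Dict[Tuple[int, str], float] = defaultdict(float)
    for lineage, score in suggestions:
        for depth, name in enumerate(lineage):
            totals[(depth, name)] += score

    confident = [key for key, total in totals.items() if total >= threshold]
    if not confident:
        return None  # no obvious suggestion at any rank
    # Keys are (depth, name), so max() picks the most specific rank.
    return max(confident)[1]


# Hypothetical scores for three species-level suggestions.
scored = [
    (["Animalia", "Arthropoda", "Insecta", "Diptera", "Syrphidae", "Eristalis", "Eristalis tenax"], 0.55),
    (["Animalia", "Arthropoda", "Insecta", "Diptera", "Muscidae", "Musca", "Musca domestica"], 0.41),
    (["Animalia", "Arthropoda", "Insecta", "Hymenoptera", "Apidae", "Apis", "Apis mellifera"], 0.04),
]

print(most_specific_confident_taxon(scored))  # Diptera
```

Here no single species reaches the threshold, but the two fly suggestions together do, so the sketch offers the order rather than a species. Whether raw model scores can be read as calibrated confidence is a separate question that the sketch does not address.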

I’d also argue for AI identifications not being factored into Research Grade observations, at least for those that fall below a certain confidence threshold.

Yeah, I wondered about that as well. This is presumably another factor in the kind of things we see making it into the CV clean-up wiki, creating a kind of confirmation bias in the system.

Another, even more extreme example: two autosuggested IDs taking it to RG! (Incorrectly, I would guess, judging from the blurry photo and the fact that both users seem to be new to the platform.)

Screenshot 2020-07-26 at 14.14.06

I disagree with the principle, because I may submit an ID using the AI after having verified thoroughly that this is a “good” ID (checking the details visible on the photo and comparing with the taxon page, checking the other species in the same genus already identified in a large region surrounding the observation, checking the distribution map provided on POWO).

Using the AI to find and select an ID to submit does not imply that the act of identifying is automated and of poor value.

In theory though, if autosuggested IDs were not factored in, that wouldn’t prevent you from then making the same ID again on the observation without use of autosuggest in order for it to have full power.

To my mind, the point of this option would be that it empowers identifiers over less experienced observers. As an observer who also uses the autosuggest, I wouldn’t be opposed to the onus being more on me to (for example) resubmit an ID than on identifiers to overcome my incorrect autosuggested IDs.

Ideally, though, there must be a way of creating an interface that supports both users who simply want to drop an observation into the correct ballpark and users who are using the AI with care or as an autofill. Something like a toggle during upload or in account preferences would perhaps also work in that regard.

I think this would be best, with the caveat that it must not interfere with the way the CV currently works and trains. Basically, the value of this request is similar to the general principle that the most specific ID isn’t necessarily the best or most warranted ID (e.g. species IDs aren’t always needed), which also applies when users are identifying without using the CV.