This is a good point, although I think the significance is reduced somewhat by the fact that taxa which the CV can easily identify are also easily identified by human identifiers.
It doesn’t make that much difference whether I use the CV for my first ID of an American Alligator observation, because 4 other identifiers are still going to easily confirm it either way (although maybe not as much as they used to). And when the CV does correctly guess a more difficult species that only one identifier on the platform knows how to identify, I don’t trust the RG designation nearly as much. I value the fact that usually 2 humans are involved, because it decreases the risk of incorrect RG obs that are more difficult to locate and correct.
Yeah, though I also find the very easy identifications a bit more tedious than the trickier ones, especially when I have to actually type in the correct ID and can’t just give it a quick agree (which is exactly the kind of identifying that this would add). I guess I just think a blanket approach is poorly suited when some taxa seem much more problematic than others even just comparing between kingdoms, and I just disagree with sentiments like
because, while it’s probably impossible to come up with a list of every single problematic and unproblematic fungus species, it seems pretty simple to, say, treat birds differently from fungi.
It may be that the average factors in total obs - we don’t know, as it’s not stated… but something else is definitely going on beyond that which is immediately visible. The average clearly isn’t just an average of the bars to the right as is. Either way it doesn’t matter, as it’s not intended to be used in the way you interpreted, as far as I can see…
I wasn’t trying to imply they were lying or mistaken in any way… it’s just that they are creating metrics for judging the model’s accuracy against another model; I think it’s more of an internal analysis than something meant for the purpose you’re using it for.
I agree it’s totally plausible on RG obs though, yes.
But your statements were:
The core part of the statement is true based on the data provided, if you include mention of RG… It just, yeah… becomes a bit meaningless in this context. Even if we accept that, as a whole, ~90% of the species level CV IDs are correct on Research Grade obs, it still might be that only 10% of the species level CV IDs are correct when looking at all obs!
We just don’t know based on the data provided.
In my experience (~50000 obs / ~80000 IDs, in a large range of habitats and taxa, across continents)… it just obviously varies wildly by taxon, location and image quality. In the areas where we need more obs and IDs the most, it’s naturally the weakest… as those are the same taxa and locations where we lack training data. By focussing on successful RG IDs as a false broader metric for accuracy, we risk reinforcing the existing observer/identifier bias, gaslighting those working in complex taxa and less documented locales… and, as a result, putting people off contributing in areas where we need them the most.
I agree with this to some extent.
I certainly agree the optimum would be to have different approaches to different taxa.
If it were a binary choice though, as it is now, I would rather see extra work placed on the shoulders of the easier taxa with more identifiers than on the shoulders of the frustrated few in more complex taxa and less documented locations.
The risk of genus only autosuggest is simply that it slows things down and adds a tiny bit more rigor. The risk of species level autosuggest is that it speeds things up and sends more incorrect content to RG, making the iNat dataset look embarrassingly bad to outsiders and discouraging more niche expertise from contributing or taking it seriously.
I also think identifiers prefer to take an obs from genus to species than to keep pulling it back from species to genus…
If you want to find the average across all 1000 randomly selected observations, you can’t take the values you got for each continent and average out those six numbers, especially when the numbers of observations per continent differ by orders of magnitude. I don’t think there’s anything “going on beyond that which is immediately visible” here. The averages on the left are just the averages from the full 1000 observations. As we’d expect a much higher proportion of those 1000 observations to come from North America and Europe than from Africa and South America (as that’s how the observations on the site are distributed), it makes sense that the average across the whole sample is much closer to the Europe/N. America numbers than the Africa/S. America ones.
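To make the averaging point concrete, here is a minimal sketch in Python. The per-continent counts are invented purely for illustration (they are not the figures from the chart being discussed), and the function names are my own. It just shows how the pooled accuracy over all observations can sit well above the plain mean of the six continent rates when the sample sizes are very uneven.

```python
# Hypothetical sample of 1000 observations: continent -> (observations, correct CV IDs).
# These counts are made up for illustration only.
samples = {
    "North America": (600, 552),
    "Europe": (250, 225),
    "Asia": (70, 56),
    "Oceania": (40, 32),
    "South America": (25, 15),
    "Africa": (15, 9),
}


def mean_of_continent_rates(samples):
    """Average the six per-continent accuracy rates, ignoring sample size."""
    rates = [correct / n for n, correct in samples.values()]
    return sum(rates) / len(rates)


def pooled_rate(samples):
    """Accuracy over the whole sample: total correct / total observations."""
    total = sum(n for n, _ in samples.values())
    correct = sum(c for _, c in samples.values())
    return correct / total


unweighted = mean_of_continent_rates(samples)
pooled = pooled_rate(samples)
print(f"mean of continent rates: {unweighted:.3f}")
print(f"pooled over all observations: {pooled:.3f}")
```

With these made-up counts the pooled figure lands close to the North America/Europe rates, simply because those continents contribute most of the observations.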
Given that >60% of verifiable observations are research grade, what you’re saying here is impossible. Even if every species level CV suggestion on every non-research grade observation was wrong, that would still mean at least 90% of 60% = 54% of species level CV IDs are correct when looking at all obs.
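The bound can be written out as a small calculation. This is only a sketch, and it rests on one assumption that is mine rather than anything stated in the data: that species level CV IDs are spread across RG and non-RG observations in the same proportion as the observations themselves. The function name is hypothetical.

```python
def overall_accuracy_bounds(rg_share, rg_accuracy):
    """Bounds on overall CV accuracy given only the RG share and RG accuracy.

    Assumes CV IDs are distributed across RG / non-RG observations in the
    same proportion as the observations (an assumption, not a known fact).
    """
    # Worst case: every CV ID on a non-RG observation is wrong.
    lower = rg_share * rg_accuracy
    # Best case: every CV ID on a non-RG observation is right.
    upper = rg_share * rg_accuracy + (1 - rg_share)
    return lower, upper


lower, upper = overall_accuracy_bounds(rg_share=0.60, rg_accuracy=0.90)
print(f"overall accuracy is between {lower:.2f} and {upper:.2f}")
```

If CV IDs are in fact concentrated on observations that never reach RG, the lower bound would drop, which is the part we can’t check from the published figures.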
All I’m saying is species level CV IDs overall probably save IDers a lot more time than most people realize, and there are barely enough IDers to keep up with the growing wave of observations as it stands. If there are some taxa where species-level CV suggestions are causing huge issues I don’t see why the solution shouldn’t just be aimed at those taxa.
There are a couple of options (risking that I will sound like a broken record):
1. Accept that observations of some genera will be misidentified to taxa that have the most observations
2. Spend your waking life correcting the misidentifications
3. Improve suggestions
Let’s just explore the last one.
The CV model is currently trained and tested on observation photos of the same taxonomic entity, meaning all tests are positive.
Testing against sister taxa or genera, or any other negative test cases, is never mentioned and might not be feasible.
The only other way to include RG observations, that the model is not trained on, is to train on the parent. This way the model would give a visual match rating for both.
CV is not aware of taxonomy nor geolocation. That is done by the suggestion logic of the apps. I think it is possible to suppress taxa suggestions when the genus comes in as a better visual match.
What if even the genus comes at a low visual score? Same as before, suggest the parent with a warning message.
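Roughly what I have in mind, as a sketch only. Nothing here reflects how the apps actually work: the `Suggestion` class, the `choose_suggestion` function, the score scale and the `min_score` cutoff are all hypothetical, and it assumes the model returns a visual match score for both the species and its genus.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suggestion:
    taxon: str
    rank: str
    warning: Optional[str] = None


def choose_suggestion(species, species_score, genus, genus_score, parent,
                      min_score=0.5):
    """Pick what to suggest from hypothetical visual match scores (0 to 1)."""
    # Neither the species nor the genus is a convincing visual match:
    # fall back to the parent taxon and warn the user.
    if max(species_score, genus_score) < min_score:
        return Suggestion(parent, "parent",
                          warning="Low visual match, suggesting the parent taxon.")
    # The genus matches at least as well as the species: suppress the species.
    if genus_score >= species_score:
        return Suggestion(genus, "genus")
    return Suggestion(species, "species")


# Example with invented scores.
print(choose_suggestion("Partula radiolata", 0.3, "Partula", 0.8, "Partulidae"))
```

The cutoff would obviously need tuning, and probably per taxon, but the point is that the decision lives in the suggestion logic rather than in the model itself.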
I saw a notification on how many new taxa were added to the model. I wonder how many genera are dropped as a result.
I am sure there are other ways to improve suggestions; I just can’t think of any that could be applied generally.
I’m going to ask a question here before the problem arises, so I can brace myself if the problem is incoming.
Basically, for a long time no Partula species has been able to reach 60 observations, but this year Partula radiolata has surpassed 60 observations.
If I understand correctly, this means that the CV will stop suggesting Partula, and basically it will think Partula radiolata is the only species in the genus. This wouldn’t be a problem, if it weren’t for the fact that P. radiolata is one of the more distinctive-looking species in the genus (negligible problem), and is also a geographical outlier, being one of the 5 species in the Mariana Islands and not one of the 59 species in the Society Islands (extraordinarily bad problem).
Does this mean that on a Society Islands Partula the CV will consult the geomodel, think “oh, radiolata doesn’t occur here so this can’t be Partula as Partula only has species radiolata” and then proceed to suggest random stuff?
It’s very possible. This is why it can be such a problem. Sometimes I just choose not to ID a species unless I can also ID another species in the genus because I know more damage will be done for minimal gains. Though it very much is case by case.
Ultimately we have these basic options.
1. Just deal with it.
2. Ignore any taxon in the genus nearing the threshold, purposely not IDing it.
3. Try as hard as possible to get other species into the CV, to teach the CV it’s not a monotypic genus restricted to x area.
4. Find literature to source the creation of a complex or other subgrouping that can be IDed within the genus, so the CV can learn that rather than another species.
5. Contact somebody, or go out yourself, and observe enough to get another species eligible. This can be complicated and time consuming if the taxon needs a microscope.
6. Continue to bring it up and try to push for a change from staff. To my knowledge staff have not actually responded to this issue, despite it being brought up multiple times by different people, sometimes with tags for staff attention.
None of these besides 3, 5 and 6 are close to being good solutions in my opinion. 6 is complicated, but ideally would be the best solution, as this specific problem really comes down to the way the CV was built to perform. This isn’t a bug, but a feature. Only staff, more specifically the engineers, can change features.
But yes, a hypothetical taxon with 1000 correct CV IDs a week at genus level can turn into 100s of incorrect IDs a week, if the CV learns one species in that genus endemic to one small part of the world. Essentially it unlearns the genus and learns the endemic species.
Thankfully snails in French Polynesia are quite rarely observed, so I can do option 1: filter for all observations of Stylommatophora (common land snails and slugs) from French Polynesia weekly and correct any that end up in the wrong bin.
This is only possible because observations of snails in French Polynesia are uncommon enough (many days have 0 new observations) that I can catch any misIDs as they pop up.
I don’t envy people who have to deal with this kind of situation on a more commonly-observed taxon.
Also, a question: does the CV unlearn the species and re-learn the genus if you kick back all the RG obs. of the “bad” species to genus? (And no, I don’t plan on kicking all the radiolata back to genus)
Yes, the CV is retrained roughly once a month and it checks if each taxon is eligible before training on it each round.
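As a very simplified sketch of what “checks if each taxon is eligible” means in practice: the 60-observation cutoff below is just the figure mentioned earlier in this thread, the real inclusion rules are more involved, and the function name and all counts are hypothetical.

```python
# Cutoff taken from the figure quoted in this thread; the real rules may differ.
ELIGIBILITY_THRESHOLD = 60


def eligible_taxa(counts, threshold=ELIGIBILITY_THRESHOLD):
    """Return the taxa with enough qualifying observations for this training round."""
    return {taxon for taxon, n in counts.items() if n >= threshold}


# Hypothetical counts before and after pushing a species' observations back to genus.
before = {"Partula radiolata": 61, "Partula hyalina": 12}
after = {"Partula radiolata": 0, "Partula hyalina": 12}

print(eligible_taxa(before))  # the species qualifies this round
print(eligible_taxa(after))   # nothing qualifies, so the species drops out next round
```

Because eligibility is re-evaluated at every retraining, the change only shows up once the next model is released, not the moment the observations are pushed back.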
Last winter I pushed ~10,000 Narceus americanus observations back to complex after it had been in the CV for years, and finally got it out. It’s frustrating because on the peripheries of the complex’s range (e.g. Texas or New England) they can theoretically be identified to species by range, and there is one identifier who still insists on doing so. But having the species in the CV would cause so many issues in the majority of the range, where the two species overlap, that I don’t think it’s worth it.
The line you’re quoting is part of the section on rules for which observations are included in training rather than which species are included (it is a bit confusing to follow). Unresolved flags on taxa don’t affect whether they’re included in training currently. If they did, these flags would solve the problem.
That is however included as a potential solution in this feature request:
At the rate we’re going, pretty soon people will be advocating for the CV just to suggest “Life” for everything. And then there will be complaints when rocks and clouds are misidentified.
I find this comment somewhat insulting. Advocating for a better, more accurate CV does not in any way, shape or form equate to that. I actually find it annoying when Chironomidae are IDed above family. It makes them practically lost and harder to find.