Once something is incorrectly in the CV model, there's no removing it without diligent effort

At some point, years ago, cicada exuviae identified as Hadoa texana got into the CV model via wrong identifications that weren't caught quickly enough (exuviae are notoriously difficult to ID). Since then, the CV has been suggesting the species for exuviae, despite periodic efforts to clear out newer batches of misidentifications, many of them research grade, that were undoubtedly due to the CV suggestions. It seems that clearing out the bad identifications only works if you can keep at it frequently enough that, by the time a new CV model is trained, there are no more of the wrong IDs left. But apparently, that has never happened. I have stopped doing the level of cicada identification that I used to do, and there really doesn't seem to be anyone else doing these specific IDs, so the wrong IDs keep piling up. It seems like there should be a way for the CV system to figure out that if its suggestions keep getting overturned year after year, it should stop making them.

27 Likes

This problem has been brought up over and over (recently a long thread about "common earthworm"), but your description of battling it is particularly crisp and insightful. We ought to be able to generate a list of "species" that the CV would be blocked from IDing until some quality control barrier is passed. One thought – I've been dealing with this issue with respect to the genus Sphagnum, peat mosses. For a long time the CV identified nearly every peat moss as Crome Sphagnum, Sphagnum squarrosum. I tried to back these off, but there are thousands of them. Meanwhile I and others (especially @schneidried) were identifying other species of Sphagnum, which isn't easy to do from photos, but we seem to have reached a threshold where the CV now recognizes (with some credibility) several species, and this has caused it to stop the rampant "Crome Sphagnum" IDs, so we have some possibility of getting them back under control eventually. I don't know if this would help with your exuviae, but if there is some species besides Hadoa texana that is reliably identifiable, getting it into the CV might help stop that tide.

13 Likes

I feel your pain. Most of my iNat activity over the past half decade has been spent battling the feedback loop of incorrect identifications getting added to the CV model, which then creates even more incorrect identifications, and so on. While I suspect iNat might need a paradigm shift to keep original observers from adding an incorrect identification, helping to educate other identifiers who are identifying incorrectly might be a more manageable approach, as there are often fewer of them and they often seem more open to new information. At least if you can keep people from adding a confirming ID, then maybe there is less of a chance of an observation getting to research grade. I have written journal posts about the species I deal with that people can link to, so I don't have to write the same explanations repeatedly. After that, I try to keep on top of the pool of research grade observations, which is usually much smaller, and make sure those are correct. It is at least some measure of quality control.

14 Likes

I’ve been working pretty hard to clean up springtails lately, and yeah, adding other species to the CV is really the only way to stem the tide. Of course, far fewer springtails get observed than cicadas so it’s not quite the same scale of problem. I also can’t imagine it’s easy to ID cicada shells in the first place, especially without microscopic analysis.

7 Likes

I feel this. There is quite a lot of chaos with cicada exuviae.

Lately it seems like I’m correcting an increasing number of wrong CV IDs (especially endemic plants and cicadas).

The problem is that people click on the suggestions without checking what they are clicking on. And the CV sometimes suggests very unlikely things (probably because the geomodels are not always correct, due to some wrong Research Grade observations). So besides people's decisions, the CV suggestions bear significant responsibility for this problem as well.

2 Likes

It feels like there's something in how the CV is trained where it's very heavily rewarded for labelling an observation as Examplegenus speciesname if there are any similar photos with that ID, but not penalized enough (at all?) for labelling a photo as E. speciesname even if there are hundreds of similar photos, all with many disagreeing IDs explicitly saying it's not E. speciesname.

5 Likes

This sounds pretty similar to the source of this thread, where the CV is suggesting species for unidentifiable Sinea nymphs: https://forum.inaturalist.org/t/north-american-sinea-id-and-the-sorcerers-apprentice-problem/68337

I think this isn't a training issue, but rather something inherent in not all species being in the model. In your hypothetical example it definitely gets penalized for those wrong IDs, but unless the species of the correct IDs is also in the model, it has no likelier alternative to suggest to the end user once the model is done training. It reduces how "certain" it is, for sure, but if there are no more likely suggestions, it'll still be #1 on the list.

So for a simplified example, if there are like 10 species that have similar-looking exuviae, but only 1 of those species is in the model (say that species has 100 observations and the rest only have 50 each), it’ll only be ~18% sure that that species is the correct ID for one of those exuviae photos, but ~18% is still the highest species-level suggestion.
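One way to arrive at that ~18% figure is to assume the model's confidence roughly tracks relative observation counts among the lookalikes. A minimal sketch, using the hypothetical numbers from the example above (the function name is made up for illustration):

```python
# Toy illustration of why a lone in-model species stays the top
# species-level suggestion even at low confidence. Numbers are the
# hypothetical ones from the example above, not real iNat data.

def top_species_share(in_model_obs, out_of_model_obs):
    """Share of the lookalike observation pool belonging to the one
    species the model actually knows about."""
    total = in_model_obs + sum(out_of_model_obs)
    return in_model_obs / total

# 1 species in the model with 100 obs; 9 lookalikes with 50 obs each,
# none of which the model is able to suggest.
share = top_species_share(100, [50] * 9)
print(round(share, 3))  # ~0.182, yet still the #1 species-level suggestion
```

Under this simplification, no amount of correct genus-level IDs changes which species tops the list; it only dilutes the score.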

1 Like

Hmm yeah, that's a good point. I guess the real issue is that the model should ideally be able to learn, at some point, that it should just not make species-level suggestions for certain taxa. If the model is 99% confident it's in Examplegenus but only 10% confident it's E. speciesname (even if that 10% is higher than for any other species), I think we'd be better off if the dropdown menu only had Examplegenus and not any species.

Maybe I’m not phrasing it very well but I just mean this feels like the sort of thing that should be solvable. We are feeding it giant amounts of data that says “you cannot assign a species to photos like this”, it should be able to use that, though I’m no expert.
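The rollup idea could look something like this sketch. The threshold and function are invented for illustration, not iNaturalist's actual logic:

```python
# Hypothetical sketch: only offer a species-level suggestion when the
# model's best species score clears a confidence floor; otherwise
# truncate the suggestion list at genus level. NOT iNat's real code.

SPECIES_CONFIDENCE_FLOOR = 0.5  # made-up threshold

def suggestions_to_show(genus, species_scores):
    """species_scores maps species name -> model score within the genus."""
    best_species, best_score = max(species_scores.items(), key=lambda kv: kv[1])
    if best_score >= SPECIES_CONFIDENCE_FLOOR:
        return [best_species, genus]
    # Confident about the genus but not about any one species: genus only.
    return [genus]

# 99% sure it's Examplegenus, but the best species is only at 10%:
print(suggestions_to_show("Examplegenus", {"E. speciesname": 0.10, "E. other": 0.08}))
# -> ['Examplegenus']
```

The hard part in practice is choosing the floor, since a sensible cutoff probably differs between, say, birds and exuviae.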

2 Likes

The CV training process has gotten more complex over time, but change is very slow because each change in how it works has unexpected ripple effects. There was an attempt to introduce hybrids a couple years ago and it went poorly, and only recently have there been slow experiments to reintroduce them. So its options in what taxa can be involved in training and suggestion are still fairly simple and constrained, and they’re manually selected by the developers. There are certain things that it’s predictably incapable of learning because they’re loopholes in the rules it has to follow.

You can read roughly the process used for selecting taxa in training here, but the most relevant detail is that it only trains on leaf taxa. If no species in a genus are included in the CV then it will train on a random sample of all observations within the genus. But if any species within the genus are eligible for inclusion, then it will only train on a random sample of observations of those species. It won’t train on any observations identified only to genus level.

So in this example, if they all have unidentifiable exuviae, then the CV won't know about the exuviae even if all 10 species are in the model, because the exuviae are all stuck at genus level and the CV has no idea they exist. There's a feature request to also train on non-leaf taxa here.
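The leaf-taxon rule described above can be sketched roughly like this. Function and data names are illustrative only; this is not iNat's code:

```python
# Sketch of the leaf-taxon selection rule: if any species in a genus is
# eligible for the model, only species-level observations are sampled,
# and genus-level observations are ignored entirely.

def training_pool(genus_obs, species_obs_by_species, eligible_species):
    """Return the observations the model would train on for this genus.

    genus_obs: observations identified only to genus level
    species_obs_by_species: dict mapping species -> its observations
    eligible_species: set of species meeting the inclusion criteria
    """
    if eligible_species:
        # Train only on eligible species; genus-level obs (e.g. all those
        # unidentifiable exuviae) never reach the model.
        return {sp: obs for sp, obs in species_obs_by_species.items()
                if sp in eligible_species}
    # No eligible species: fall back to genus-level observations.
    return {"genus-level": genus_obs}
```

So a genus full of exuviae photos stuck at genus level contributes nothing once a single species crosses the eligibility bar.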

So far the CV doesn't get any feedback about the mistakes it makes. A couple of options have already been proposed in this thread.

7 Likes

I don’t think it’s possible to come up with a list of problematic taxa, nor do I think the solution involves creative changes to the algorithm. Some taxa are only problematic in certain geographical areas where look-alike taxa are present. In other areas, the look-alike taxa are absent and so the algorithm apparently works well. This problem is getting worse (at least with plants) since taxon splits are rampant.

Likewise user education is not a viable solution. I can add informative comments and supplemental material until I’m blue in the face but it won’t make a difference. The only solution I can see is to stop giving certain users a list of species to choose from. As it is now, the platform encourages the kind of guessing that leads to incorrect Research Grade IDs.

Most users will never temper their guessing based on comments, articles, user guides, published flora, identification keys, or external web sites. That's okay; it's unreasonable to expect otherwise. However, those users should be prompted to guess a genus (or some coarser taxon), not a species. A list of species must be earned, or at least explicitly chosen by the user.

For example, the new phone app has an advanced mode that must be turned on but it misses a Golden Opportunity. Until advanced mode is enabled, the algorithm should refrain from displaying a list of species. Even if advanced mode remains “free”, this will reinforce the desired habit of reasonable guessing. Personally I don’t think advanced mode should be “free” but I guess that’s a conversation for another time.

4 Likes

While I agree some perfectly granular species-level list of problematic taxa is probably impossible, I don't see why we couldn't have a much coarser, imperfect list, or why this wouldn't be an improvement over the status quo. I don't see why they couldn't set things up so that, say, the CV will happily recommend species for birds but not for fungi (and you could even throw in exceptions for, say, Empidonax flycatchers and some very distinct common fungi if you'd like!). I worry about the load a measure as drastic as what you're suggesting would create. Identifiers are already quite overwhelmed, and the first CV suggestion is right close to 90% of the time overall. Doubling the number of IDs needed for many observations to reach research grade would eat up quite a bit of identifier time clicking "yes, that's a great blue heron", which seems like it could be a great deal of added strain on identifiers.

Also anecdotally a lot of casual users just use iNat for the CV and want to be able to take a photo of a butterfly and know what it is right away, they may not even submit it as an observation. If they were never given any species suggestions I imagine they would be quite disappointed.

4 Likes

I assume this is the reason why the app usually suggests a species ID at the top, whereas the web uploader always suggests genus level first. I have a pretty good intuitive sense of when I should distrust the app's species suggestion, but other than broad generalizations (insects should never be species except like… Monarch, common lady beetles) I'm not sure how they could be codified. I think if the CV incorporated some of the proposed suggestions (training on non-leaf taxa, knowing what ratio are ID'd to species, etc.) then it could figure it out on a case-by-case basis.

3 Likes

Let's compare two types of observations: the first has the (typical) leading ID at species level, while the second has the (atypical) leading ID at genus level. In my experience, the latter is correct significantly more often than the former. (I can't quantify this; your mileage will vary.) Now, it is well known that an incorrect leading ID is much more work for identifiers. So in my experience, I spend less time and hassle on leading IDs that are correct in the first place. I'd much rather review those types of observations than struggle with disagreeing IDs.

Yes, there's no doubt that genus-level CV IDs are on average more likely to be correct than species-level ones. But the issue is one of quantity, not quality. As a whole, ~90% of species-level CV IDs are correct, and all of these only need one supporting ID to leave the identification queue. If nothing was ever initially IDed to species, you'd be doubling the overall identification effort required for that (very large) proportion of observations on this site.

1 Like

It is probably worth mentioning that it's not necessarily going to double the number of IDs needed, as removing CV species-level suggestions shouldn't stop people from adding their own species-level IDs if they know what it is - for example, I would manually ID most of my plant observations to species, which would be a pain for me but wouldn't increase the requirement for others to ID. And if I can't ID to species myself, I don't use the CV species-level suggestions anyway, so that wouldn't change either!

Whilst I agree with some of your core points above, I don't know where you're getting this stat from. Based on the link, it seems to be a misreading of a page that graphs results showing 90% accuracy based on 1000 Research Grade observations, i.e. not at all the sort of observations under discussion here.

If you took 1000 observations regardless of whether or not they were at RG, and say 50% of the total were RG with 90% of that subset having a correct autosuggest, the other 50% could be Needs ID with a 100% incorrect autosuggest.

If one takes only the RG subset, that might give a result similar to the one graphed and lead to the conclusions on the page linked, but it would be very different from the statement you make, which would need to include observations not at RG to be a truer representation.


Also, even that 90% RG average seems strange to me, fwiw; at least it doesn't add up with the bars to the right, so I guess it's something independent of them in a way(?)

Crude and flawed as well, but if you do your own rough average by continent given the data there, it's a lot less: e.g. (70+75+80+80+90+90)/6 ≈ 81% accuracy. But yes, that's a really crude and somewhat pointless statistic anyway, given both the complexity of different continents and the different numbers of users on each, etc.

It would be good to see exactly how these graphs are generated. But their stated intention was to compare the accuracy of one model against another, not to be repurposed in the way I think you are interpreting them.

2 Likes

I doubt we can tweak the CV to solve some of the most widespread issues. I think it would be more effective to have identifiers slowly accumulate a list of taxa that are nearly always wrong and then stop the CV from suggesting those names.

6 Likes

Agreed. Especially in cases (like earthworms) where organisms may not even be identifiable to family without dissection, or where the majority of photos do not show any identifying characteristics, the CV's suggestions, operating entirely on visual similarity, are just guesses that cannot be backed up with anything solid.

There are also some taxa in the CV, like the pheretimoid earthworms and genus Aporrectodea, that are non-monophyletic, explicitly so for the pheretimoids, where Metaphire, Amynthas, and others are separated based on genital pore type, a character that has changed between some close relatives yet is conserved between distantly related taxa. Visually and genetically similar worms may belong to different genera as currently defined by morphology. That does not sound like a situation where the CV should be applying any IDs past family (except for extremely distinctive species that can be sight-ID'd, at least in parts of their range).

3 Likes

I don't see how this is true. Most of the taxon bars on the right are lower, but plants are higher and alone make up ~40% of all observations, whereas many of the lower bars (e.g. fish) represent an almost negligible proportion by comparison.

There are 15x as many North American observations as African observations, so weighting them equally is not going to give anything close to the true average accuracy. I have no reason to think the chart is dishonest; it seems like an extremely plausible result given what a high proportion of observations come from North America and Europe.

and yes, I agree only looking at RG observations is very imperfect and I also agree with points like

but my point is just that I think people are underestimating the huge amount of extra identifying work that a blanket "no species-level CV suggestions ever" policy would create, compared to just having one for some rough list of problematic taxa
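On the weighting point above, the difference is easy to see numerically. A quick sketch with made-up relative observation counts, chosen only so that North America has roughly 15x Africa's share; the per-continent accuracies are the figures quoted earlier in the thread:

```python
# Unweighted vs observation-weighted mean accuracy across regions.
# Accuracies are the per-continent figures quoted above; the counts
# are invented for illustration only.

accuracies = {"Africa": 0.70, "Asia": 0.75, "S. America": 0.80,
              "Oceania": 0.80, "Europe": 0.90, "N. America": 0.90}
obs_counts = {"Africa": 1, "Asia": 4, "S. America": 3,
              "Oceania": 2, "Europe": 10, "N. America": 15}  # relative, made up

unweighted = sum(accuracies.values()) / len(accuracies)
total = sum(obs_counts.values())
weighted = sum(accuracies[r] * obs_counts[r] for r in accuracies) / total

print(round(unweighted, 3))  # 0.808 - the "rough average by continent"
print(round(weighted, 3))    # 0.863 with these counts - much closer to 90%
```

With most observations coming from the high-accuracy regions, the weighted figure lands much nearer the headline number than the naive per-continent average does.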

2 Likes