iNaturalist's earthworm problem and how to fix it

I wasn’t referring to users manually entering the name (this would not show up with a CV symbol next to the ID). I was referring to the choices picked by the users from among the CV suggestions.

Users can choose any of the suggestions, or they can choose not to use them at all and add something generic instead. If the CV provides multiple options, only one of them (at most) can be correct – the others will be wrong; and yet I don’t think it would be an improvement to have the CV only ever suggest one taxon. It is quite possible that the CV is providing a reasonable top-level (broad) suggestion and users are not selecting this. If so, that’s not really the CV’s fault. That is a user issue – i.e., users are uncritically and/or randomly choosing suggestions or they are looking at the taxon images and imagining that they see some distinction that in fact is not relevant for identification. Either way, improving the CV will not fix this.

As I noted, I see users choose wrong suggestions fairly often even in cases where the observation is identifiable to species and the correct species is suggested by the CV. What this indicates is that for some taxa users are not skilled enough to evaluate the suggestions but they either believe they are able to do so or they do not know how to enter a broader ID of their own.

2 Likes

What I mean is that the top-level IDs are wrong, too. There are so many lumbricids identifiable from photos and so many non-lumbricids unidentifiable from photos that a majority of family-level IDs are just wrong. Nearly all of the 59.6% correct non-species IDs are to order Crassiclitellata (which contains basically all earthworms) or to Lumbricidae from places that actually have Lumbricidae. And the most common reason I identify to a taxon higher than the chosen CV ID is that the photos are just too bad to get any information from, even to family. It’s not just that users aren’t informed about earthworms or that the CV is causing all the issues; the real root of the issue is that nobody is taking photos of earthworms that allow the CV or me to make distinctions. Unlike me, though, the algorithm is really bad at giving vague IDs because not enough taxa are in the CV to give it options to pick from.

3 Likes

It isn’t clear to me how you know this from looking at the suggestions chosen by the users – this is why I was asking about your methodology when presenting these numbers.

There are a few other taxa that the CV is similarly bad at and that are similarly difficult to ID from photos (say, mites or ichneumonids) where it seems to give OK broad suggestions at the level of family or order. But this is not necessarily reflected in the IDs that users enter.

1 Like

You raise a fair point. I could pick a set of others’ earthworm observations, see what the top suggestions are, and score those results for a more accurate picture of what the CV suggests, not just what people choose.
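To make that scoring idea concrete, here is a minimal sketch of how such a tally could be kept. Everything in it is an assumption for illustration: the records are invented, the lineages are simplified, and `score` and `tally` are hypothetical helper names, not anything iNaturalist provides. It scores both the top CV suggestion and the user's chosen suggestion against an identifier's ID, so the two can be compared.

```python
from collections import Counter

# Simplified, illustrative lineages: each taxon maps to its ancestors
# (broadest first), ending with the taxon itself.
LINEAGE = {
    "Crassiclitellata": ["Crassiclitellata"],
    "Lumbricidae": ["Crassiclitellata", "Lumbricidae"],
    "Megascolecidae": ["Crassiclitellata", "Megascolecidae"],
    "Lumbricus terrestris": ["Crassiclitellata", "Lumbricidae", "Lumbricus terrestris"],
}

# Invented records: (top CV suggestion, suggestion the user chose, identifier's ID).
RECORDS = [
    ("Lumbricidae", "Lumbricus terrestris", "Lumbricidae"),
    ("Lumbricidae", "Lumbricidae", "Crassiclitellata"),
    ("Crassiclitellata", "Megascolecidae", "Crassiclitellata"),
    ("Lumbricus terrestris", "Lumbricus terrestris", "Lumbricus terrestris"),
    ("Lumbricidae", "Lumbricidae", "Megascolecidae"),
    ("Crassiclitellata", "Lumbricidae", "Lumbricus terrestris"),
]


def score(suggested, expert):
    """Classify one suggested taxon against the identifier's taxon."""
    if suggested == expert:
        return "exact"
    if suggested in LINEAGE[expert]:
        return "correct but broader"  # suggestion is an ancestor of the expert ID
    if expert in LINEAGE[suggested]:
        return "too specific"  # suggestion is finer than the evidence supports
    return "wrong"  # suggestion sits on a different branch


def tally(records):
    """Return separate tallies for top CV suggestions and for user choices."""
    top = Counter(score(t, e) for t, _, e in records)
    chosen = Counter(score(c, e) for _, c, e in records)
    return top, chosen


top, chosen = tally(RECORDS)
print("top suggestion:", dict(top))
print("user choice:   ", dict(chosen))
```

If the two tallies diverge, as they do in this made-up sample, that would point at how suggestions are being chosen rather than at what the CV offers first.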

But I think I am not explaining well enough why I’m keeping track of this data. The tally isn’t just for showing that the CV is bad or that observers pick species when they shouldn’t; it’s more to ask the question “Can earthworm identification ever be left to the combination of untrained observers and a trigger-happy AI, i.e. can I ever take a break from this?” Essentially, I would like to know if identifying earthworms on here is worth the time I put into it. It would be nice to know that I can leave things to the observers and the AI for a little while so that I can take a break and maybe concentrate on some other taxa I am knowledgeable about. So far, sadly, this does not seem to be the case.

3 Likes

As of October this species was still getting CV-assisted IDs from app submissions despite being out of the web CV since February:

Even without the CV there is a steady flow of species level IDs coming in because of people having learned the species either from iNat or other sources.

1 Like

Hi @thirty_legs, I only learnt this after seeing your ID and comment on one of the Indian earthworm observations that I had IDed to family level, iirc from the AI (I rarely touch them). Thank you for your efforts.

And I would be happy to help fix IDs of (south) Indian earthworms if I had some resource to guide me. I have read your post and I understand what is needed for better IDs, but I don’t know what level of ID can be made with confidence for observations in this region at present, i.e. I don’t know at what family/genus/order level I can add agreeing or disagreeing IDs now. I tried finding a decent source for Indian earthworms, but only a few pictures in scattered research papers exist online, so I am still totally clueless. (I believe a genus or family can sometimes be apparent with high confidence from habitus, habitat and other such clues, as I see in other taxa, but of course those IDs may also depend on distribution data and other evidence.)

If you ever manage to find or make ID guides for this region, I will hop in with you.

5 Likes

That’s reasonable and I sympathize with your struggles. I deal with similar frustrations with bees (which are somewhat more readily identifiable than earthworms but pose other rather stubborn problems for an image recognition algorithm).

I was persistent about this because, if the purpose is to troubleshoot problems with the CV so that changes can be made to reduce the number of wrong identifications happening in the first place, then it is important to correctly diagnose the source(s) of the problem; otherwise any solutions may not prove effective.

3 Likes

This may be due to the “lighter” CV model that the apps use for offline ID and that Seek uses as well. I think it gets updated less frequently, though I don’t know the schedule.

2 Likes

Never knew that was a thing. Wow, so the CV lite must be used a lot more often than the real CV. That’s a little gross.

India is tough. Many observations cannot be identified to order, since both Crassiclitellata and Moniligastrida are present there. I have some resources that might be useful, but the vast number of species in Acanthodrilidae and Megascolecidae are not identifiable without dissections.

I don’t know if this is correct. The “lite” CV model would be used by the new app when it doesn’t have enough internet signal to use the full model and by users of Seek. Other users would be using the “full” model. I would guess that use of the full model is higher than the offline version. Maybe staff have access to some data about that, not sure. You can tell if observations were made with Seek, though, so you could assess the impact of that at least.

1 Like

Oh, I missed the “offline” part. Yeah, that does make sense.

1 Like

You can filter by which app an observation was uploaded from. That doesn’t account for offline use of the new app, though.
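As a rough sketch of what assessing that impact might look like once observations are split by upload source: the snippet below computes, per source, the share of CV-assisted IDs that an identifier later had to correct. The source labels, the data, and the function name `corrected_share_by_source` are all invented for illustration, and, as noted above, a split like this cannot separate offline use within the new app.

```python
from collections import defaultdict

# Invented sample: (upload source, whether the CV-assisted ID was later corrected).
OBSERVATIONS = [
    ("Seek", True), ("Seek", True), ("Seek", False),
    ("iNaturalist app", True), ("iNaturalist app", False),
    ("website", False), ("website", False), ("website", True), ("website", False),
]


def corrected_share_by_source(observations):
    """Return {source: fraction of its CV-assisted IDs that were corrected}."""
    counts = defaultdict(lambda: [0, 0])  # source -> [corrected, total]
    for source, corrected in observations:
        counts[source][0] += int(corrected)
        counts[source][1] += 1
    return {source: c / n for source, (c, n) in counts.items()}


shares = corrected_share_by_source(OBSERVATIONS)
for source, share in sorted(shares.items()):
    print(f"{source}: {share:.0%} corrected")
```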