Problems with Computer Vision and new/inexperienced users

As I’m sure many of you are well aware, there are a great many problems with the Computer Vision suggestions for anything except the most common, easily recognisable species. I’m well aware that AI image recognition is still in its infancy, and the iNat AI is really one of the better ones out there, but it still just causes so many problems. This is especially so with new and inexperienced users who don’t really know how to use iNat yet. I’m not entirely sure what a solution to this would be (that’s why I’m discussing it here) but something about the wording and/or the interface just encourages people to pick the CV suggestion even when it’s very clearly wrong. I know that CV is often quite good for common and large species, but as someone who works with insects it just seems like the AI is always against us and is yet another problem without offering much benefit. Apologies if it has been brought up before but I couldn’t find any relevant forum posts.

I think a lot of the problem is that new users are immediately prompted with a CV identification suggestion when they go to upload a sighting, and they pick it thinking that it’s correct. Sometimes I cannot understand why people have chosen something beyond “it’s what the computer suggested so it must be correct” though. I came to iNat directly from another similar citizen science website after it shut down (BowerBird), and I had been identifying observations here for a while before that, so I didn’t really have the experience of joining without understanding how citizen science works and how iNat specifically works. It would be great to get some comments from people who have just joined, or comments from people about their experiences of first joining.

The fact that we do have a serious problem is very obvious to me at the moment - I used to curate and identify a large number of Australian taxa but I was forced to stop last year due to health reasons. I am only just starting to get back into this, reviewing all of the sightings that I missed, and the number of incorrect computer vision-suggested identifications is astounding. Most of these taxa are not groups that other people identify because frequently they do require a lot of technical knowledge and experience, but that means that nobody is there to pick up on the mistakes. Most of these just sit there at Needs ID until I come along and correct them, but sometimes another inexperienced or new user will use the CV suggestion as well and confirm the identification, taking it to research grade. I can only imagine what the situation is like for taxa that have no experts here to help correct mistakes, or to even find mistakes in the first place.

Here are a couple of examples of Orthoptera from inland Australia that help illustrate some of the problems:

Here is a subadult Raspy Cricket (Gryllacrididae), probably something like Hadrogryllacris:


The observer has suggested an ID of Gryllidea (Crickets) without using CV, which is not accurate but is an understandable mistake to make. However, the CV suggestions are very far from accurate:
None of these are even in the correct subfamily (none are in Gryllidea either though), only one of them has any sightings anywhere near here, and only one other has any iNat sightings in Australia at all! Hemiandrus is found in Australia, but we have no sightings of it here yet, so why is the AI so confident?

Here is another, of Macrazelota cervina:


The user initially IDed it as Arphiini, which is what CV suggested, but there are no Arphiini anywhere near Australia. Even now that it has been corrected, the AI continues to suggest Lactista in Arphiini:
Pycnostictus is at least in the same family and is found in this area, but the same could be said of more than 200 other species as well. And why does the AI suggest only nearby top suggestions for this sighting, but not for the other one? I have not changed any settings. Importantly also, why is the ‘pretty sure’ suggestion not a nearby taxon as well?

I think a lot of the problem here comes from the wording of the suggestions, and from how much people overestimate the ability of AI. Plenty of people I know will take Google Lens suggestions as if they are certainly correct, because what reason do they have to doubt it? In essence I strongly believe the wording needs to change. The current “We’re pretty sure this is in the genus [something]” quite drastically overstates the AI’s ability, and because of that people accept it as correct without checking and without even knowing that it’s probably not correct. There are millions of species and we can’t expect the AI to be able to differentiate every single one, but the current wording doesn’t reflect this. There is no indication that the AI can only identify common species, and there is no indication that it is frequently very wrong. This is confusing for new users I’m sure, because if you have a photo of an insect that you have never seen before, and the AI used by such a successful and well-known platform like iNat says that it’s “pretty sure” that your photograph is of something not even found on your continent, why would you doubt it? But even for experienced users it seems to be a problem - e.g. the one who used the CV suggestion of Arphiini in the above sighting has a couple of hundred sightings.
I’m not sure what a better-worded approach would be, but the one we currently have simply doesn’t work. Suggestions for better wording are most welcome!

For a while I thought this was the only problem, but here is a situation that still confuses me. A photograph of Gea heptagon showed up in a spider group on Facebook, and after a bit of discussion everyone was happy with the ID (it was initially IDed as G. theridioides). The exciting thing is that these are the first photographs of the species in Australia (as far as I know), so I asked the person who posted it if they would be happy to put it on iNat. They added the sighting which is fantastic, but, using the CV suggestion, they IDed it as Argiope keyserlingi. Why??? The user knew it was Gea, and they even wrote in the observation description that there was discussion on Facebook as to whether it was G. theridioides or G. heptagon. So why did they ID it as A. keyserlingi?? The two do look similar and they are closely related, and it would be an easy mistake to make for someone who didn’t know what they were looking at. I certainly don’t expect the AI to be able to ID it as G. heptagon, because all of the other iNat sightings of the species are from North America, where they look fairly different (and it’s the other side of the world anyway). Indeed the AI does not really have any other suggestions:


But the user knew what it was before they even made an iNat account, so why did they default to the CV suggestion?? Is there something in the “add observations” page that encourages people to use it even when they know it’s wrong? Even the more experienced users will go against their own gut feeling and choose the CV option, and then when I correct them they tell me that they shouldn’t have trusted the AI.

I’m honestly not sure of a way to remedy this, but it causes so many problems. It needs to explicitly and clearly state somewhere obvious that it’s a suggestion only, that the AI is frequently incorrect, and that users should not think that the computer knows better than they do. Perhaps a little pop-up window for new users the first few times they use the CV suggestion to identify their sightings, or perhaps even completely disable the CV suggestions until users have completed a little tutorial or something. I know that probably sounds a tad extreme, but it’s a huge huge problem. There’s an entire forum wiki devoted to cleaning up CV-suggested misidentifications, and that’s just the ones that we know about. For all those taxa that don’t get regularly checked, there are probably countless misidentified sightings and nobody who can fix them. Suggestions on how to fix these sorts of problems are welcome, but it just gets very overwhelming continually telling new users that the AI is usually wrong, and that they should check the suggestions before selecting them, and then going back through all of the CV misidentifications repeating the process again and again to try to get things in order (and that’s just in those few groups that I am knowledgeable on).

21 Likes

As I understand it, the way to fix this is to have more correctly identified research grade observations of said species that can be used to train the CV system.

If there aren’t enough observations of a species, then the CV system can’t be trained on it, and it will “recognize” it as something else that is visually similar.

More observations (properly taken to research grade) = better CV.

That said, there have been a lot of conversations about changing the wording of the CV system to highlight that it’s not 100% certain what the species is, or to have an initial CV identification marked as “provisional”, or some other approach.

The issue of new users placing an inordinate amount of trust in the CV system is a common topic of conversation, but I don’t think anyone has come up with a solution yet.

7 Likes

ZA transferred from iSpot a couple of years back because of some issues, but what I liked about iSpot was the more gradual approach to a correct ID:
I could already select between:

  • [I think it is]
  • [I am pretty sure it is] and
  • [I know it is]

Then my experience in IDing was assessed as well and gave me a rating. People with a good rating had more weight in swinging an ID.

How difficult would it be to build something like this into the decision to make an observation [Research] grade? This could improve the ability of the AI.
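To make the idea concrete, here is a rough sketch of how such a weighted tally could work. Everything in it is an assumption for illustration only: the three confidence weights, the reputation numbers, the two-thirds threshold and the function name are all made up, and this is not how iSpot or iNat actually compute anything.

```python
from collections import defaultdict

# Hypothetical weights for the three confidence levels (not real iSpot values).
CONFIDENCE_WEIGHT = {"think": 1.0, "pretty_sure": 2.0, "know": 3.0}


def weighted_consensus(ids, threshold=2 / 3):
    """Return the taxon whose share of the total weight exceeds `threshold`.

    `ids` is a list of (taxon, confidence, reputation) tuples, where
    `reputation` is a hypothetical non-negative score for the identifier.
    Returns None when no taxon reaches the threshold.
    """
    totals = defaultdict(float)
    for taxon, confidence, reputation in ids:
        # Each ID counts for its confidence weight scaled by reputation.
        totals[taxon] += CONFIDENCE_WEIGHT[confidence] * reputation

    grand_total = sum(totals.values())
    if grand_total == 0:
        return None  # no IDs, or only zero-weight IDs

    leader = max(totals, key=totals.get)
    if totals[leader] / grand_total > threshold:
        return leader
    return None
```

With this sketch, one "I know it is" from a highly rated identifier outweighs a single "I think it is" from a newcomer, while two equally weighted disagreeing IDs produce no consensus at all.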

11 Likes

There are thousands of wrong IDs as a result of this, especially in groups without many active experts on iNat to correct them. For example, there are over 1500 IDs of the land snail Succinea putris in North America, but this species is absent there; the AI confuses it with local species that nobody can ID on iNat, so Succinea putris from the Old World is the only AI suggestion for similar-looking snails. See discussion here:
https://www.inaturalist.org/observations/84909219
I’m sure there are hundreds of similar unnoticed cases. Months of work to clean it up. For sure, AI suggestions should be more cautious, and many more of them should be at family level, order level and even above. “We’re pretty sure this is…” and “here are our top suggestions” are both way too confident.

12 Likes

iNat staff are against any weighted ID system. We need a good onboarding system; the fault isn’t really in the current ID system, but in how people see it.

2 Likes

Explicit instruction or indication of some sort in the drop-down menu to encourage the selection of higher-order taxa when using the AI would be good, I think. Or the opposite: “Here are our CV species-level suggestions, only select if very confident.”

6 Likes

Thanks for the comments everyone. I don’t think anything really needs to change in terms of the functioning of iNat - e.g. weighted IDs or changes to the AI itself - but the wording especially just needs to be better. Maybe at some point we will get to a point where the AI can recognise almost every species, but we are currently a very long way from that and people don’t seem to realise it. Back last year we even had someone actively disagreeing with experts on other platforms because they thought the iNat AI was infallible, and nothing we said could convince them otherwise. At the moment there is just very little nuance with the wording of the CV suggestions, and people don’t realise that they shouldn’t use them. With common taxa I’m sure it’s pretty good, but with the groups I mostly identify the success rate is well under 10%.

10 Likes

It might not just be the wording that encourages people to blindly trust the CV. I believe we have to think about the average user of this platform. Since iNat is becoming more and more popular, it also attracts many people who are not really interested in “correctness” or “contributing to science”. I assume the average user just observes some living being and wants to know what it is. So they upload it, use the CV to get a name for what they saw, and they’re happy. Whether it’s wrong or correct might not matter to them; they just want a name. Most of them probably don’t even know that the IDs are used as scientific data. I might be very judgemental here and maybe also completely wrong, but I guess it’s safe to say that the people who are really invested in iNat and interested in correctness are a minority. At least that’s the impression that you get as someone who contributes identifications.

The only thing that came to my mind was also a pop-up window that appears whenever you are about to just blindly trust the CV. Something along the lines of: “Did you compare this species with those 5 other species that occur nearby and look similar?” Maybe even make it a bit annoying, so that you have to click through all of the suggested species and confirm that you have checked them. This way it would only affect the new/inexperienced users, who don’t enter an order, family, genus or species by themselves. This would also encourage them to enter more higher-level IDs (beetle, butterfly etc.) by themselves, because it’s faster than clicking through all of the suggested species.

8 Likes

It will affect all of us. We use CV to speed things up; I don’t wanna type in IDs for hundreds of obs when CV shows me a correct ID in its list.

7 Likes

Oh okay, I didn’t know that experienced users also use it. I guess it always depends on the workflow. I usually already give my files on my computer a name (even if it’s just “Coleoptera” or “Bird”), which is then automatically transferred to my observations while uploading.

2 Likes

If I don’t know much about the given type of organism and the AI tells me that it’s “pretty sure”, I pick it. Where exactly is the harm in that? The only problem is if someone blindly confirms it. But what is harmed by having a “needs ID” observation with a wrong ID? It says so in the label: it needs ID. It’s far better to pick something than to leave it as “animals” or “arthropods” or something that vague, because many IDers clearly watch only specific sets of taxa, and this way the observation has a far better chance of reaching the right person to have a look at it. The experts in those taxa will quickly be able to say “no, it’s not this” as well.

There is no problem with CV here. The only problem is people who refuse to accept that “needs ID” observations aren’t reliably IDed.

5 Likes

I’m a relatively casual user (400 obs). I wonder how many other users like me see iNat as a game. For me, observing new species is the fun of it. It’s basically like Pokemon Go but with real creatures.

As such, there’s a temptation to give a more ambitious ID than is actually possible. For example, I like the genus Euphrasia a lot. It’s a pretty complex genus where you probs need some expertise and good gear to properly get an ID down to species. But the CV just tells me “Euphrasia nemorosa – Seen nearby!” So it’s tempting to tap that and thus log a new species to my profile.

I’ve also done some ID’ing for other people’s obs, but I’m only confident on a very limited set of taxa. So as @opisska says, it would be cool to be able to say “I know it’s not X” without having to state what it is instead. (I guess that’s where you’re supposed to suggest a higher clade?)

11 Likes

The problem is that sometimes people blindly confirm. I will not name names, but there are several people that consistently confirm all of my IDs on things, because they see me as an expert. That’s good for confirming things when I’m very confident but I know nobody else has the skills to ID it, but it’s not so helpful when I would prefer to have confirmation from others. And I know a lot of people (new people especially) will agree with something because they think the computer is right, etc. Incorrectly identifying something is perfectly fine if it’s a group that is monitored a lot because as you say, the experts will quickly tell you your ID is wrong. But there are just so many taxa that don’t have any experts at all that this isn’t feasible. I’m currently going back through all of the sightings that I missed while I was ‘away’, and the number of incorrect Research Grade sightings that were caused by computer vision suggestions is still very very high (and some have been like that for almost two years now). If we had an expert in every group, this would be a viable strategy, but we just don’t have that many people identifying.

9 Likes

Well, it’s brought up in pretty much every topic about CV; this is something like the 10th one with the same theme of “CV isn’t always right”. Personally, it’s getting quite boring to read all the same things again and again. https://forum.inaturalist.org/t/dont-use-computer-vision/13327

3 Likes

But this just again shows that the pushback against using CV for initial IDs is misguided when the actual problem is the people who confirm them. If anything, there should be more education about giving IDs to observations by others, stressing that an ID should only be given to something if you are actually sure about it (except for the initial IDs on your own observations). In an ideal world, I could easily see a reliability score for IDers (and if your agreements get disputed often, you could lose the ability to ID, for example), but I can see how that’s not entirely practical for iNat as it is now, where there is a huge shortage of IDers compared to observations (except for a handful of really popular orders).

2 Likes

The problem is that your observation could also easily get stuck somewhere where it does not belong. As the iNaturalist suggestions also say: The “best approach” is to just ID as far as you confidently can on your own. If you just put “arthropod” in there, then there are enough people working on that group that will push it further into the right direction so that it will get to the level of where the “experts” are (if there are any for the specific group).

5 Likes

The iNat team are not 100% against a weighted “reputation” system, but it’s something that takes a LOT of time, work, and careful planning.

4 Likes

This is the first time I’ve heard that, so maybe yes, it’s just too hard to build one that functions correctly?

2 Likes

Because if you select a species-level identification when you do not know, you are still outsourcing your confidence, and one suggestion is still a contribution to research grade. If you use computer vision to get your observation to a higher-order taxon, this does not contribute to the research grade label, and it still narrows the observation to something less vague so that it reaches the right people.

If it reaches research grade, the blame is not entirely on the false confirmer, because you have also contributed to that. At the very least, if you don’t know yourself, you can still require the observation to get two other identifications.

5 Likes

I have always believed that a little pop-up window for new users about the limits of the CV would be a great idea. It would have to happen for at least the first 10 or so times for the message to sink in.

8 Likes