Problems with Computer Vision and new/inexperienced users

leafybye · August 24, 2021, 2:45pm

I don’t think it is unreasonable to suggest that the wording of suggestions be changed from “we’re pretty sure” to something else.

Also, why are some taxa never suggested, even though there are many nearby observations? For example, there is a species in my local area with about 70 observations, but every time I use CV on a photo of the species, it only suggests other species of the same genus. Basically, in the location in question, the species adapted to that habitat is never suggested, even though it is the only species of that appearance found in the habitat.

bkatzenberg · August 24, 2021, 2:54pm

Great post matthew_connors. I am more focused on plants than insects (which are so much harder, considering there are twice as many species and they tend to be a lot smaller and harder to photograph). From a user experience perspective, I’ve noticed it’s less satisfying in iNat to just select the genus, but it’s often the right thing to do, as the difference among species can be extremely difficult to discern. That said, the difference between a native and an invasive version of the same plant is important! Maybe more documentation in iNat on how to tell the difference?

lera · August 24, 2021, 2:57pm

Yes, “I know it’s not X” is one of the options for disagreeing at a higher taxon - you can even say “I know it’s not X but it is an organism”, by assigning Life

lera · August 24, 2021, 3:01pm

Both the original observer (you in this case) and the person that confirms are identifiers making an equal contribution to a Research Grade status. You should both be confident that you have assigned the correct identification. Otherwise just use a higher taxon. Of course - we all make mistakes. But no guessing, please. See also https://www.inaturalist.org/pages/help#id-agree

trscavo · August 24, 2021, 3:48pm

Because a “Needs ID” observation contributes to the training process just like a “Research Grade” observation. As far as I know, the training process does not distinguish between the two. If you want the CV to be more accurate, you have to train it with photos associated with accurate IDs, but “Needs ID” and “Research Grade” have nothing to do with it.

sam_e · August 24, 2021, 3:51pm

The computer vision training dataset contains just over 38,000 species as of the last update iirc, so many species are not included in the model. It says on the taxon page for each species whether it is included in the computer vision model or not so you can check your favourites.

egordon88 · August 24, 2021, 4:30pm

It would be nice if there was a “bulk” way to disagree with all those erroneous observations. I was looking at Salvia greggii, which has the reverse problem (American plant in Europe).

iNat map: https://www.inaturalist.org/taxa/168380-Salvia-greggii
Wild distribution (extends south into Mexico): http://bonap.net/MapGallery/County/Salvia%20greggii.png

It’s a lot of work to mark all those other IDs as wrong or cultivated, so it would be nice to have a warning for users like “The selected species is out-of-range. Please double check ID or mark as ‘not wild’ if it’s in a garden.”

fffffffff · August 24, 2021, 4:50pm

Well, most of those obs are old ones, from when the current system was working incorrectly; suggestions that are not seen nearby shouldn’t pop up at all now.

anasacuta · August 24, 2021, 4:59pm

I think there are two different problems here:

1. CV-followers: people who just follow the ID proposed by computer vision
2. ID-happy: people who love to “agree” with identifications

When the two come together, the result is bad RG observations, which in turn feeds a vicious cycle of bad IDs: both by badly training the CV and by misleading people who are going beyond the CV proposals (by comparing with other potential species in the local species pool).

Regarding problem 1: CV-followers

If my experience is representative, most observations are entered using a mobile phone, and CV is quite irresistible to use (even when we know the species well, if we can’t quite remember the name).

Discussing with some iNat friends, some systematically pick the proposed ID even though they know full well that it may well be incorrect. This is particularly the case of people who don’t have much time: they just want to put the observation out there (quickly), and then leave it to the community to sort the ID. Which is fair enough!

In my case, because IDing is a big part of the fun with iNat, I use a more complex (time-consuming) method. When I enter observations of species I am not that familiar with, I first see what CV proposes, then go one or two taxonomic steps upwards and select that (e.g. pick the family of the proposed species, rather than the proposed species). This is almost always correct. Then, later on, I use the “compare” tool on the web interface to see what species in that taxon are found in my region/country, and try to refine the ID (or just leave it at a higher taxonomic level, if too daunting and/or if I don’t have much time).

In my opinion, CV should more often present suggestions at a higher-taxon level, whenever there is evidence that the ID is not simple (strong uncertainty). Examples:

When the top suggestions are all over the place, with a wide diversity of taxa proposed, few (if any) of which are found in the region. For example, for an observation of Flexopecten flexuosus: [image]
When the proposed taxa have low levels of RG observations (e.g. Xylocopa violacea), which indicates they are often tough (or impossible) to identify.
When the proposed species is commonly mistaken for another (e.g. see similar species to Anomia ephippium).

Regarding problem 2: ID-happy

Many observers are ID-happy: they assume identifiers are experts, and will readily “agree” (thus bringing the observation to RG). This has already been discussed in other threads (e.g. here). I am not sure what the solution is, but in my case the result is that I now refrain much more from IDing than I used to, focusing on the species I am most certain of, because I don’t want to be responsible for a wrong RG observation.

I recently found another type of ID-happy people: people who follow me, and then systematically agree with my IDs. I got suspicious as they never add any information (they never refine my IDs), yet they can agree with me even on tricky species (sometimes agreeing in minutes with something that took me 1h to ID). They are very active users with thousands of identifications (so not exactly newbies).

I contacted two of them to ask (very politely) if they were aware of iNat’s guidelines regarding identifications (“An identification confirms that you can confidently identify it yourself compared to any possible lookalikes. Please do not simply ‘Agree’ with an ID that someone else has made without confirming that you understand how to identify that taxon.”). They were not. One thanked me for the information and said he would now focus on identifying species he knew well. The other was much less receptive. He told me iNat is also for non-experts (true) and that he is doing a service in supporting expert IDs by helping observations reach RG. He also told me that IDing allows him to follow an observation and thus learn if the ID is subsequently corrected (I recommended he faves the observation instead). The first one certainly seems to have changed his ID habits; I think the second has too (less sure).

I am not sure what the solution is here, and I don’t want to discourage people who are often young and very well-meaning (and may be on their way to become specialists!). But I suspect they are attracted by the game aspect of iNat - the pleasure of seeing the ID numbers go up, their names in the top identifiers for a region/taxon… I know this has already been quite discussed, but I think one good approach would be to modify the way top identifiers are presented. If the idea is to point users to people who can help them with IDs, I would recommend that rather than presenting as the “top identifier” the person who did most IDs, to present the person who did most improving IDs (i.e., only those IDs that added new taxonomic information, and that were subsequently supported by a community consensus). These IDs are a much more accurate reflection of expertise. And I would include the IDs made by observers in this (as some observers are quite knowledgeable and give the good ID from the start).

If non-specialist users wanted to maximise their numbers of improving IDs (thus, gamify this), the easiest way is to ID at high taxonomic levels, either by refining observations currently at a very high level (e.g. from Insect to Lepidoptera) and/or by correcting wrong higher-taxa IDs (e.g. from bee to fly). This would be a really useful contribution to iNat, and (I suspect) a better way of nudging them towards learning how to correctly ID species.
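
To make the leaderboard idea concrete, here is a rough sketch of how the ranking changes when only improving IDs are counted. The user names, the records and the category labels are made up for illustration; this is not how iNat actually computes its top identifiers.

```python
from collections import Counter

# Hypothetical identification records: (identifier, category).
# "improving" = added new taxonomic information later backed by consensus;
# "supporting" = agreed with an existing ID. Labels are illustrative only.
identifications = [
    ("alice", "improving"),
    ("bob", "supporting"),
    ("bob", "supporting"),
    ("bob", "supporting"),
    ("alice", "improving"),
    ("carol", "improving"),
]

def rank_identifiers(ids, only_improving):
    """Rank identifiers by ID count, optionally counting improving IDs only."""
    counts = Counter(
        user for user, category in ids
        if category == "improving" or not only_improving
    )
    return [user for user, _ in counts.most_common()]

print(rank_identifiers(identifications, only_improving=False))
print(rank_identifiers(identifications, only_improving=True))
```

In this toy data the user who only agrees tops the list by raw count, and drops off it entirely once only improving IDs are counted.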

cmcheatle · August 24, 2021, 5:28pm

The challenge with “seen nearby” is just that: it means seen nearby, not known to be nearby. It uses only entered iNat observations, so checklists, atlases and range maps, which can also be entered, add no value to the computer vision.

It is further hampered by filtering even the iNat records to a +/- 45 day window around the calendar date of the record. That is good for taxa with seasonality, but for something like a scallop, which I assume has no seasonality, it means that potentially as much as 75 percent of the possible ‘seen nearby’ records are excluded from counting towards displaying it as such.
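
A rough sketch of the arithmetic behind that 75 percent figure, assuming the records of a non-seasonal taxon are spread evenly over the year and the window covers the observation date plus 45 days on either side. The function name and the even-spread assumption are only for illustration, not anything from iNat's code.

```python
def excluded_fraction(half_window_days: int = 45, days_in_year: int = 365) -> float:
    """Fraction of evenly spread records that fall outside a +/- day window."""
    # The window is the date itself plus half_window_days on each side,
    # capped at the length of the year.
    window_days = min(2 * half_window_days + 1, days_in_year)
    return 1 - window_days / days_in_year

print(f"{excluded_fraction():.0%}")  # prints 75%
```

For a strongly seasonal taxon the even-spread assumption fails and far fewer records would be excluded, which is the case the window is designed for.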

wildskyflower · August 24, 2021, 5:32pm

One problem is that, because they have said the reputation system would only consider your history on the site and not external factors, there is no way to stop it rewarding people who ‘play the game’ over actual expertise. For example, a professor of entomology who only comes to the site every couple of months when tagged to resolve very hard IDs in their taxa of expertise could easily be outvoted by someone who (intentionally or not) runs up their reputation agreeing on easy IDs, and so might be discouraged from helping at all.

Also, it’s not obvious to me how various factors should count. For example, should your reputation go up or down when you withdraw an ID after being corrected? After all, it shows you are responsive to correction, which I think is almost always a positive, but it also shows you made a mistake. Or if you add a maverick ID to an RG observation of the taxon you know best, would that initially be counted as a mistake until someone else withdrew their ID, or do you get a boost for being daring? It seems unwise to directly reward adding maverick IDs, but at least right now the people who do it are disproportionately correct in my experience.

opisska · August 24, 2021, 5:49pm

If I am not mistaken, the CV never says “we are pretty sure” about a species-level ID. Every time I use it, I only see “we are pretty sure” for a Genus or higher, which is not a “contribution to research grade”. Or am I wrong, or simply just lucky that I have never seen that behavior? Because the original post criticizes the CV for being “too confident” but I don’t see it being confident at species level, so this again should not create the problem you describe.

opisska · August 24, 2021, 5:55pm

If that’s the case then it’s a really bad choice (which could be easily changed). I think iNaturalist seriously needs to accept a level of “uncertain” ID. I am not saying that it necessarily needs to be the current “needs ID” flag, but it seems the most logical. Otherwise you will have a hard time stopping users from adopting some flag in this sense (especially with the ever increasing user base). Many people will always want to label their observations with “it’s probably this”, if nothing else then for their own convenience in looking back at their own observations.

joedziewa · August 24, 2021, 5:57pm

If you disagree at a higher taxon than necessary, you will be counted as disagreeing with correct IDs.

ajadewitt · August 24, 2021, 6:00pm

New user here, adding primarily from the iPhone app. I know a lot of specifics, but I’m expert on nothing. For instance, I’ve been a bird-watcher for years, and know my limitations; I’m willing to identify a mockingbird on the wing, or a white-throated sparrow by call, but I don’t even try to sort out warblers.

In iNat, I’ve been impressed by some of the CV IDs I’ve gotten, but that doesn’t mean I let CV decide for me. When I just don’t know what something is, I’ll get pretty general: Birds, or angiosperms, or lepidoptera, or whatever. But I’m no taxonomy expert; I could use suggestions. I’m sure a lot of iNat users are the same.

What strikes me most is how few options the suggestion system offers. When it suggests “Hogwarts chromeback”, that’s only useful if I can confirm the ID myself. If I can’t confirm that suggestion, I’m left with manually typing a search.

What I’d find useful, at that point, is if the app also offered me steps up the tree from that suggestion. I might not know what the bug is, but I might know it’s a beetle, and could select Coleoptera if it was offered; if I wasn’t even sure of that much, I could tap to Insecta and leave it for the experts.

Such a series of stepped suggestions would reduce the temptation for what anasacuta calls CV-followers to simply click the top suggestion, making it clearer they should choose their level of knowledge. And it would save me a lot of typing on my phone while balancing a bicycle.
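
Something like the following is what I have in mind, as a minimal sketch. The parent links are a hand-written, simplified lineage (ranks are skipped) and the function name is made up; a real app would read the lineage from its own taxonomy data.

```python
# Hypothetical, simplified parent links for one lineage (illustration only).
PARENT = {
    "Coccinella septempunctata": "Coccinella",
    "Coccinella": "Coccinellidae",
    "Coccinellidae": "Coleoptera",
    "Coleoptera": "Insecta",
    "Insecta": "Arthropoda",
    "Arthropoda": "Animalia",
}

def stepped_suggestions(taxon: str, max_steps: int = 4) -> list[str]:
    """Return the suggested taxon followed by up to max_steps of its ancestors."""
    steps = [taxon]
    while taxon in PARENT and len(steps) <= max_steps:
        taxon = PARENT[taxon]
        steps.append(taxon)
    return steps

for name in stepped_suggestions("Coccinella septempunctata"):
    print(name)
```

The user could then tap whichever level matches what they actually know, for instance Coleoptera or Insecta, instead of typing a search.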

anasacuta · August 24, 2021, 7:35pm

@ajadewitt

What I’d find useful, at that point, is if the app also offered me steps up the tree from that suggestion. I might not know what the bug is, but I might know it’s a beetle, and could select Coleoptera if it was offered; if I wasn’t even sure of that much, I could tap to Insecta and leave it for the experts.

I use Android, so not sure if it is the same, but “stepping up the tree” is what I do. I pick a given taxon among the proposed ones > a page dedicated to the taxon appears (with lots of useful info: a map where I can see if it occurs in my location; a distribution of seasonality with the fraction of observations that are RG - tells me if the species is a tough one to ID) > scroll down to taxonomy and select a level above.

Not sure I can do it on a bike though

scubabruin · August 24, 2021, 8:00pm

How about placing some sort of icon next to any ID chosen from the AI? Then all users, especially identifiers, could very quickly assess observations where the computer vision suggestions were used. This could be an easy start to educating users about the AI suggestions. Specifically, if the “icon” is clickable, much like the VU or IN status on species, then a pop-up could appear further explaining why the ID has an icon attached to it.

stu_crawford · August 24, 2021, 8:02pm

The Computer Vision on iNaturalist is pretty amazing, and has a lot of utility. However, I think that it is being used inappropriately.

If someone does not know what a species is, and they use Computer Vision to identify it, the suggested ID should clearly state that it was a Computer Vision ID, and not an ID by someone who actually knew what it was. The Computer Vision ID is very useful, because it helps focus people, and could end up drawing the attention of an appropriate expert. But if it is confused with an ID from someone who actually knows how to identify that taxon, and has confirmed the necessary characteristics from the photo, then I think that it is actually detrimental.

As it currently is, the Computer Vision IDs are used to reinforce the Computer Vision algorithm, which is a circular logic that ends up reinforcing Computer Vision errors. I.e. because the Computer Vision thinks that a specimen is a given species, it ends up getting ID’ed as that species, which then reinforces the algorithm to ID other similar specimens as that species, even if it is incorrect.

Also, if the Computer Vision suggestion does look visually similar, people will tend to agree with the suggestion and it becomes Research Grade. Perhaps if they knew that it wasn’t a real ID, just a Computer Vision suggestion, they would be less likely to agree unless they actually knew what it was.

As has been mentioned previously, I think that it would also be very useful if each ID proposed by someone could be associated with a level of confidence. I do not want to propose any ID unless I am very confident in it, because I know that someone is likely to agree with it just because I proposed it. However, a tentative ID could be very useful in drawing the attention of an expert. Other people have different philosophies on it, and might suggest IDs that they are not as confident on, thinking that it still might be useful to others. Which is fine, but I think that it would be very helpful if everyone was forced to admit their confidence level in their ID.

Saying that the system is fine and the issue is just that people are using it wrong is a very bad argument. We are not going to change human behaviour, but we can change the interface to get the behaviour that we want.

wildskyflower · August 24, 2021, 8:05pm

There already is such an icon: see point 4 in the computer vision section of the Q&A to see what it looks like: https://www.inaturalist.org/pages/help

stu_crawford · August 24, 2021, 8:15pm

Thanks for pointing that out, I did not realize this. A quick perusal of some recent observations confirms that yes, all the outlandish suggestions that I have corrected recently are computer vision.

However, I think that this still fails to accomplish anything useful. Most people are not going to read the instructions and realize that this obscure icon means that it was computer vision. I have spent a lot of time on iNaturalist, and read many pages of instructions on various aspects of it, and I did not realize this. In particular, the key group of people whose behaviour regarding Computer Vision we want to change (casual and new users) are unlikely to recognize what the icon means.
