Don't use computer vision

It’s not just the Rest of the World. I’m constantly seeing computer vision IDs here in the Southeastern United States of species that don’t occur in the USA at all. Just the other day, I found computer vision identifying a beetle grub as a species endemic to New Zealand (an ID the user accepted without question).

Perhaps this would be worth the developers’ consideration: excluding computer vision suggestions that are more than x kilometers out of range. iNat has range information for all taxa, so if the computer vision’s suggestion would be far out of range, it could be dropped from the list of suggestions or bumped up a taxonomic level before being suggested.
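Just to make the proposal concrete, here is a rough sketch of how such a filter could work. Everything here is my own assumption, not how iNat actually works internally: the 500 km cutoff, the suggestion data structure, and the fallback to a parent taxon are all hypothetical.

```python
# Hypothetical sketch of range-based filtering of CV suggestions.
# Assumes each suggestion carries "distance_km": the distance from the
# observation to the nearest known occurrence of the suggested taxon.

MAX_RANGE_KM = 500  # assumed cutoff; the real "x" would need tuning


def filter_suggestions(suggestions, max_range_km=MAX_RANGE_KM):
    """Keep in-range species; bump far-out-of-range ones up a level."""
    filtered = []
    for s in suggestions:
        if s["distance_km"] <= max_range_km:
            # Plausible for this location: keep as-is.
            filtered.append(s)
        elif s.get("parent"):
            # Far out of range: suggest the parent taxon instead of
            # dropping the suggestion entirely.
            filtered.append({"taxon": s["parent"],
                             "distance_km": s["distance_km"]})
    return filtered
```

The point of the parent-taxon fallback is that a visually strong match usually still carries signal at genus or family level, even when the species itself cannot occur there.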

I’ve seen the tag-teamers too. But I’m more concerned about people who blindly agree with expert IDs on their observations, enabling the observation to go to Research Grade on the strength of one person’s say-so. That seems to be a pervasive problem.

10 Likes

Oh yes, that is indeed correct. There are many observations, though, that have location data but are given ID suggestions of a species not from that area. My guess is that there were not enough research-grade observations for that particular location, so the model then drew on research-grade photos from elsewhere.

2 Likes

Even if you do that, though (I geotag all my photos before upload), iNat still provides out-of-range suggestions. It just flags the “in-range” suggestions as “seen nearby.” I don’t think that’s a strong enough way of noting which suggestions are range-appropriate and which are not.

8 Likes

I think we are overlooking the consequences of not having AI helping suggest IDs - thousands of identifications that just say “Life”. The AI does narrow it down to the right order or family generally and that attracts the attention of reviewers.

Maybe we could change the app so that it won’t make a suggestion until after a location is entered?

9 Likes

Also, people do check what the AI suggests and sometimes pick the fourth suggestion or lower on the list when the first one was right. I believe that adding more wording before the first, general suggestion could help, like "it's OK to make a general ID" or "general is better than wrong," something like that.

7 Likes

I agree there are issues here that would be great to address sooner rather than later!

I think long term, the AI will win out of course… and win over even those who are currently rebuffing it, but short term, it would be good not to waste experts' time or to put them off contributing.

Maybe there could be a wiki or something collating ways to address this, if there isn't one already.
A few that I think would help:

  • there could be more use of family level ID in training data for inverts and lichen to strengthen AI
    (https://forum.inaturalist.org/t/cv-on-family-level-inverts/11527)

  • for more obscure species, sometimes the AI isn't pretty sure about anything… but offers up a list of species-level suggestions - it would be better to always have a "pretty sure" suggestion at a coarser level instead, I think… even if just "flies"… as this might encourage newcomers to choose that over the top species-level suggestion

  • For insects the diversity is such (27000 in UK vs 620 birds) that I think it would really help if the training data took into account lower numbers of observations as well.
    As I understood it, it's currently trained on genera/species with 50+ observations… but maybe this would be stronger if the limit was 50 photos instead? (50 observations with 2 photos is presumably not dissimilar to 20 observations with 5 photos from a training perspective, I would guess?)…
    Many species are not too hard to identify, but rare to observe…
    e.g. We have had 10 sightings globally of Cratomus megacephalus since the site began - a very distinctive looking wasp - but it might take another decade to reach this goal of 50 observations and have the AI be able to suggest it with the current limit

I could imagine down the line some sort of automated clean-up could happen also - so perhaps it's better if the experts simply don't attend to these issues and waste their time! …Or I've also thought about exploring the API to automate my own response to flies like Sarcophaga carnaria, which are misidentified every day… but wasn't sure if this would be considered inappropriate(?)
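For what it's worth, the read-only half of that idea is straightforward against the public API at api.inaturalist.org. Here is a minimal sketch that only builds the query URL for recent observations of a given taxon; the parameter names follow the documented `/v1/observations` endpoint, but treat the exact choices (ordering, page size) as assumptions, and note that bulk-responding via the API would be a separate question for staff.

```python
from urllib.parse import urlencode

# Public iNaturalist API endpoint for searching observations.
BASE = "https://api.inaturalist.org/v1/observations"


def build_query(taxon_name, per_page=30):
    """Build a URL listing recent observations identified as taxon_name."""
    params = {
        "taxon_name": taxon_name,      # search by scientific name
        "order_by": "created_at",      # newest submissions first
        "per_page": per_page,          # results per page
    }
    return BASE + "?" + urlencode(params)


url = build_query("Sarcophaga carnaria")
# Fetching this URL (e.g. with urllib.request or requests) returns JSON
# with a "results" array of matching observations to review by hand.
```

This only surfaces candidates for manual review; any automated commenting or identifying on other people's observations would be worth clearing with the site first.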

Is there any plan to reapply the AI in the future to existing observations to flag up possible errors?

I didn't realise location data was already taken into account - I thought it was in the pipeline. But that's good to know… It would be great to trigger it without having to input location first, though… as again, newcomers wouldn't see this as it stands.

6 Likes

I'm very surprised that the AI ID tool suggests anything it is not "very sure" about; only the suggestions labeled "we are very sure it's this taxon" should be displayed in the drop-down box, otherwise it becomes more of a problem than an aid. The rest of the ranked suggestions can be stored, but hidden from the user, for the machine learning process. Once that observation has a Research Grade ID, the tool can then score how good its hidden suggestions were and improve the learning process.
Does anyone know the exact criteria the AI tool uses for labeling certain suggestions "we are very sure it's this taxon"?
It would also be very important to label those IDs as "AI suggested" directly in the corresponding ID box of the observation, so curators and identifiers have a better understanding when tackling an ID.

1 Like

This is a valid opinion, but it’s not the only one. As an identifier, I do not mind (actually prefer) that the observer make a guess, including a computer-vision one, because

  • I do a lot of my IDing one species at a time, as I find it mentally easier to judge yes/no for “Smilax laurifolia” than to sort through all the possible species for a “Smilax” designation.
  • I think commenting back and forth is engaging and helps hook users into being more active.
  • I count explaining a correction as a form of mentoring. Sure it doesn’t always work, but when it does and the person is grateful to learn, that’s fun.
  • I learn a lot myself from thinking how to explain why this can’t be Rhexia alifanus but might be Rhexia nashii.
  • Sometimes it’s appropriate to explain why the computer vision probably isn’t very good for this taxon. That helps users understand how iNaturalist works and how it grows better from active participation.

It’s always interesting to see different people’s perspectives on using iNaturalist.

28 Likes

Well, I believe that the fact that some observations are tagged as “seen nearby” is a strong enough way of indicating the most plausible suggestions. Normally, “seen nearby” taxa are more likely to match the observed taxa than those that were not seen nearby.

The only problem I see here (besides some observers blindly accepting AI suggestions, but that is a whole separate problem not related to this) is that, as @robotpie said, there are cases where there are not enough research-grade observations to serve as training data for the AI. In those cases the AI will only suggest taxa according to visual similarity with the photo, resulting in absurd suggestions (like suggesting Brazilian birds for an observation in Northern Europe).

5 Likes

I am very new to using iNat. I am going to add a couple of my own experiences to this thread just in case they may be of help:

In my case, I am a newbie. I hold a M.Ed. in Teaching & Curriculum, but I’ve been photographing ‘nature’ for over a decade. I used iNat for about a month before I realized that entering the location of my observation will narrow down the AI suggestions to inform me if something has been “Seen Nearby.” Now I always enter location first - and my suggestions are almost always accurate.

Only once have I experienced frustration expressed by a qualified IDer and honestly, it discouraged me for a few days. He first stated that my suggestion would not be found in my location; next, after I made a second suggestion (found in my location), he told me I should not rely solely on an insect’s appearance for an accurate identification. For a few days I really pondered that issue b/c it seemed counter to the overall design of this site. Yes, in many cases a single image may not suffice, but as far as I know, unless I’m posting to a Project with specific criteria (e.g. DNA vouchering), don’t I have the right to post my observations simply for my own record-keeping? iNat isn’t only for professionals/research, correct?

Fortunately, I eventually returned to my normal behavior…I post my image, fill in as many details as I can and view the suggestions. I select the best identification and hope my fellow users will be happy to aid in confirmation/correction.

I can understand the frustration felt by professionals - or self-trained experts - and so, I agree too many people probably make bad guesses based on the AI. I think making location a mandatory field prior to suggestions would greatly reduce ‘bad’ guesses. I know that once I learned about the “Seen Nearby” filtering of suggestions I have made significantly more accurate guesses and have had my images go to Research Grade more often.

Overall, I have discovered many kind, helpful IDers and most are more than willing to answer questions and/or discuss identification-related concerns.

Edited to Add: While I understand it must be frustrating to IDers to wade through misidentifications, it is always more helpful when there is some sort of dialogue that helps me understand why I’ve mis-IDed something… it also aids in preventing repeated misidentifications. :wink:

20 Likes

I think that in addition to the “seen nearby” tag in the suggestions box, it would be nice if they also included a label for “out of range” or something to help caution users who are reporting something that isn’t anywhere near there. I keep finding Helicina orbiculata being suggested for any small white snail in the world when it’s endemic to the areas around the Gulf of Mexico, and I can’t help but feel if it were suggested but in red text said OUT OF RANGE, that would cut down on people choosing it over other snails.

6 Likes

One more thing I wanted to add… when I began using iNat (before I learned to enter an observation’s location before scrolling through Seen Nearby suggestions), my photos’ metadata sometimes produced weird suggestions based on keywording and file names. The ID field would fill in during a bulk upload, and if I didn’t catch the odd suggestion, it slipped through. Perhaps this sort of issue is another reason for so many misidentifications.

The AI gets updated about every 6 months - so patience is needed, as well as enough data.
https://www.inaturalist.org/blog/31806-a-new-vision-model

And earlier discussions about AI and the wrong location
https://forum.inaturalist.org/t/species-suggestions-for-the-wrong-continent/789

https://forum.inaturalist.org/t/provide-relevant-geographic-data-confidence-level-accuracy-scores-with-ai-suggestions/9226

11 Likes

Working with kids is often tedious. It requires patience and understanding–and isn’t for everyone! You don’t have to engage people in discussion if you don’t enjoy it. But I would encourage anyone who is unfamiliar with having online discussions to not be discouraged by the small minority that wish to argue. So what are you suggesting, actually? That AI be removed from iNaturalist? Or make it inaccessible to beginners? I don’t think any of us are actively going around telling beginners to use it, so telling us to stop doing that isn’t going to solve your problem.

4 Likes

Definitely ignore the small number of folks that scold you! I think many of the professionals (and some hard-core amateurs) are lacking social skills in general–detail-oriented people are often that way. And sometimes quick comments come across as scolding just because they’re short and to the point. I’ve been guilty of that myself (and my social skills could always use some improving!).

10 Likes

What if these are wrong? I am sure showing all of them is the best option for me.

It does also up-rank the nearby taxa, but it doesn’t remove non-nearby taxa.

I think removing them is an interesting idea.

The community guidelines say you shouldn’t add an ID if you’re not confident enough. If you checked the AI suggestions and thought it’s a plausible species, that’s a better situation than just clicking on the first one without thinking. Though I’ll tell you, with many groups, like insects, the website is lacking information; it can be a group of 10+ genera that look quite the same to a beginner in the group. So it’s always better to check which higher group the suggestions belong to and choose that, or choose one of the higher groups that looks right. Even if it’s wrong, it won’t get RG, so it’s kind of okay.

5 Likes

I ID a lot of Nandina domestica (hey, it’s a gift ;-). It’s in many landscapes and seems to be one of the first plants many new people see. iNat CV seems to suggest Nandina for any cultivated plant with similarly shaped leaves or leaflets, red berries, or reddish foliage. It also suggests it for many photos of wild plants with similar features that individually look right but don’t add up when considering the whole. That’s kind of the thing about AI: it works differently than the human brain.

I have a file on my computer that has a selection of my previous responses about the species. Usually one of them applies (maybe with a little tweak) to any given incorrect Nandina ID. I like to encourage new users by explaining why their plant isn’t what CV suggested and I sometimes educate them about checking maps and photos. A lot of them have no clue what to look for in an ID and they’re a large target audience for the app, so they need to be educated. I pay back the explanations I get with harder IDs by teaching newbies what sorts of things to look for.

BUT: I wonder if part of the problem with overzealous ID is that many of the incorrect IDs become Casual, so the misidentifications don’t get seen by the AI. Does that make sense?

9 Likes

Very good discussion! I’ve learned a lot! I NEVER thought about putting in the location first, it being what I always do last. I’ve been on iNat since 2016 and nobody ever told me, though in retrospect it makes perfect sense.
I loved it when AI was introduced to iNat and it has helped an awful lot.
However, only @jessicaleighallen mentioned one of the problems with batch loading. First and foremost, it is not unusual at all (in fact it is virtually guaranteed) that if you upload 75 images, for example, iNat will auto-fill some of them with IDs without being prompted. It is exceedingly easy after four or five hours in the field to get home and upload, let’s say, 75 images for 50 observations. Once all the images are uploaded I go in and ID them, but if the ID field is already filled I can easily miss it, even after I’ve looked over all the observations once or twice. Today, for example, I uploaded a Painted Turtle (Chrysemys picta) along with a number of other images. When I got done keying in the IDs, I uploaded everything and waited for some IDs to roll in, only to learn that iNat had put “Genus Chrysemys” into the field where Chrysemys picta should have been. Fortunately one of our members caught it right away (but to be honest I felt a bit foolish putting in a genus ID for something so easy to get to the species level).
One other issue is iNat deciding the ID and auto-filling it for you based on the file name. So if I name something “skunk cabbage horse trail#19 5-13-20”, iNat may fill in every single observation’s ID as “Domestic Horse,” pulling “horse” from the file name no matter what the subjects are, or it may auto-fill as “Eastern Skunk Mephitis mephitis ssp. nigra”. So I can no longer name my files before uploading to iNat.
I still think iNat’s AI is great and can live through these growing pains :sunglasses:

6 Likes