The AI algorithm - confusing pathogen/host or subject/background

victorengel · November 13, 2021, 11:02pm

Recently, I added an observation for a ladybug, and the AI suggested Hesperomyces. That seemed fungus-related by the ending of the name, but I decided to accept it anyway and then research the name. Searching the name indeed brings up many ladybug observations here or on the web in general. Most of the images of Hesperomyces observations are not cropped to accentuate the fungus, so it’s no surprise the AI is getting confused.

So the question is, how will the AI properly learn the distinction in a situation like this? I think proper cropping would help.

I think something similar is happening with Psychodidae. I’ve been monitoring this group for some time. It’s been stated here that the AI knows Clogmia albipunctata quite well, but I wonder if it’s not learning instead the environment around the flies.

Perhaps there should be more emphasis on properly cropping photos. Or maybe a tool could be added to the website or app for users to indicate the item in the observation. I see some other sites doing this.

cthawley · November 14, 2021, 12:42am

The AI definitely learns the background as well as the organism in some cases. I see it sometimes suggest anoles (lizards often perched on bark) when it is just the side of a tree.

Of course, as a lizard researcher, I’m occasionally fooled by “bark lizards” as well, so I can’t complain too much!

I think sometimes “background” identification will be helpful to the AI and sometimes not. But regardless, strategic cropping can help both the AI and human IDers. I think a basic cropping tool for upload would be good (also for workflow reasons).

wildskyflower · November 14, 2021, 1:35am

I’ve found that the vast majority of Acer glabrum observations contain evidence of Maple Erineum Mite, and consequently most pictures of Acer glabrum could be ID’d either way and the AI apparently learns they are interchangeable. As a result the AI suggests all Acer glabrum observations are Maple Erineum Mite, even if a particular one does not contain evidence of it.

Of course, in most cases a human IDer would have the same problem guessing which thing the observation was for unless the observer specified or added a coarse starting ID.

Megachile · November 14, 2021, 2:30am

It’s reasonably common for gall observations to be created by duplicating a plant observation, so in those cases the photos end up being literally the same for both taxa. Not that surprising that the CV learns to assign both names to similar pics.

DianaStuder · November 14, 2021, 11:04am

Computer vision is intended to work on actual photos from non-scientists.

But I often tweak the taxon photos - there I don’t want to see a nice picture of your dog, with an incidental plant off to the left somewhere. If the specialist describes field marks - I try to make sure that those are captured in the taxon photos, for next time.

CV is working with pixel colour and arrangement. It doesn’t ‘see a moth’ or see a ‘flower gone to seed’. The greige seeds on Ursinia are moth coloured. A strategic few seeds had fallen and CV and I could both ‘see’ a pair of moth wings. #ThisIsNotAMoth and needs the observer / identifier to step in.

victorengel · November 15, 2021, 4:52pm

I realize computer vision is intended to work on actual photos from non-scientists. I also think it’s reasonable to provide a cropping tool or a pointer tool for them to use to indicate their observation. I think that would greatly improve the CV results.

Here’s a simple suggestion that requires only a change in the UI: the software currently sets a suggestion apart if the CV is confident of the ID. My suggestion is to leave things as is in that situation, but if nothing can be identified confidently, simply add some text suggesting that the user crop the photo so the subject mostly fills the frame, perhaps with a link on how to crop. For a more user-friendly approach that requires a bit more work to develop (but still simple), add a crop tool that the user may use. The user does not have to use it. However, I’m guessing it will be used a lot because people generally want identifications.
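To make the suggestion above concrete, here is a minimal sketch of the UI logic. It is not iNaturalist's code: the function name, the message wording, the score format (a dict of taxon name to a score between 0 and 1), and the 0.6 threshold are all assumptions made up for illustration.

```python
# Hypothetical cutoff; the real system's notion of "confident" is not stated in this thread.
CONFIDENCE_THRESHOLD = 0.6

CROP_HINT = (
    "We're not confident about this one. Try cropping the photo so the "
    "subject mostly fills the frame, then ask for suggestions again."
)


def suggestion_message(scores, threshold=CONFIDENCE_THRESHOLD):
    """Return the text to show above the suggestion list.

    scores: dict mapping taxon name -> score in [0, 1] (assumed format).
    """
    if not scores:
        # Nothing was suggested at all, so the crop hint is the only useful advice.
        return CROP_HINT
    best_taxon, best_score = max(scores.items(), key=lambda item: item[1])
    if best_score >= threshold:
        # Confident case: behave as today and highlight the top suggestion.
        return f"We're pretty sure this is {best_taxon}."
    # No confident suggestion: add the crop hint instead.
    return CROP_HINT


print(suggestion_message({"Hesperomyces": 0.31, "Coccinellidae": 0.28}))
```

The confident case is left unchanged, so the hint would only appear where a suggestion currently fails.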

DianaStuder · November 15, 2021, 7:19pm

I think, last time we asked for cropping, the iNat response was - freely available in photo software. I do crop my photos before uploading - but others have a different workflow.

tiwane · November 15, 2021, 9:27pm

It’s possible to crop in our Android app, but not our iOS app. You can see this announcement for the future of the iNat mobile app. I can’t promise a cropping tool in it, but I’d be surprised if there wasn’t one - for photos taken within the app it especially makes a lot of sense. I think it’d be great to add a cropping tool to the web uploader, but as there are free cropping tools for desktop/laptop computers, it’s not a high priority.

Seek by iNaturalist does suggest cropping a photo if it can’t get a confident CV ID for it. I could see that eventually being added as some sort of onboarding in the future for the standard iNat app.

As to whether a cropping tool would help with either training the CV* model or getting accurate results from a submitted photo for taxa like this, I’m not sure.

*(We try to use “computer vision” rather than AI because it’s more accurate - our model isn’t really artificial intelligence.)

wildskyflower · November 15, 2021, 10:59pm

Out of curiosity, what size images is the CV trained/run on? Is it the full upload resolution or something less?

victorengel · December 5, 2021, 5:01pm

Although it may not be possible to crop in the iOS iNaturalist app, it is very simple to crop in Photos on iOS and then make the observation from that cropped version. That’s what I do when my observation comes from my iPhone.

tiwane · December 5, 2021, 5:49pm

I’m not the best person to ask, but I know we were using 300 pixel images for some time. @alex would be the person who knows for sure.

alex · December 5, 2021, 8:22pm

Images are resized & cropped to 299x299 before training.

wildwestnature · December 5, 2021, 9:30pm

iOS’s native photo editing program allows cropping and the addition of circles, arrows, and free-hand drawing to highlight photos of organisms (https://www.inaturalist.org/observations/96310626) --among other features. Nonetheless, since the iOS iNat app allows users to take photos (“Observe,” “Camera”), then we could see benefits to basic photo editing functionality to support that Camera feature.

tchakamaura · December 6, 2021, 7:26am

The real way to do this is in iNat when making the obs. Otherwise the photos have to go through an extra processing step to crop any that have insects in them, which is double the work to upload. Whereas in the browser you can crop the first photo in-place after duplicating. For anything you were not meaning to grab but did.

victorengel · December 31, 2021, 7:41pm

Who or what does this cropping? If it’s done by algorithm, is there some check to ensure the cropping is useful?

alex · January 14, 2022, 4:56am

Our vision system trains and classifies on 299x299 pixel images, so all photos must be in that format before the vision system sees them. So we either have to crop, pad, or distort the photos when resizing before training/suggestion.

At training time, since the model will see the same images many times, once for each “epoch,” we perform a different random crop into the photo for each epoch. The intuition here is that it allows us to use the same photos many times, increasing model accuracy, while discouraging the model from “overfitting” to or “memorizing” the training photo pixels. This step, along with random color shifting, results in a few percent improvement in overall accuracy.

At prediction/suggestion time, we do a center crop. It doesn’t help in all cases but it gives us another small overall bump in accuracy over alternatives like padding or distorting. I suppose the intuition here is that when humans take photos of really anything, they almost always centrally locate the subject. So a central crop very rarely loses important information, doesn’t distort the photographer’s perspective, and doesn’t predict on empty/padded pixels.
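For readers who want to see the difference between the two kinds of crop, here is a rough sketch that only computes crop boxes (left, top, right, bottom) in pixels. It is not the actual pipeline: the function names, the choice of square crops, and the `min_scale` parameter are assumptions for illustration, and the real random-crop policy is not described in this thread. The resulting box would then be resized to 299x299.

```python
import random

TARGET = 299  # model input size mentioned in this thread


def center_crop_box(width, height):
    """Largest centered square, as (left, top, right, bottom)."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def random_crop_box(width, height, rng, min_scale=0.5):
    """Random square whose side is at least min_scale of the shorter edge.

    min_scale is a hypothetical parameter, not a documented setting.
    """
    short = min(width, height)
    side = rng.randint(max(1, int(short * min_scale)), short)
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return (left, top, left + side, top + side)


def subject_kept(crop, subject):
    """True if the subject box lies entirely inside the crop box."""
    return (crop[0] <= subject[0] and crop[1] <= subject[1]
            and subject[2] <= crop[2] and subject[3] <= crop[3])


# Prediction time: one deterministic center crop of a 4000x3000 photo.
crop = center_crop_box(4000, 3000)
print(crop)

# A small subject near the left edge falls outside the center crop.
print(subject_kept(crop, (100, 1000, 400, 1300)))

# Training time: a different random crop for each epoch.
rng = random.Random(0)
for epoch in range(3):
    print(epoch, random_crop_box(4000, 3000, rng))
```

The off-center example shows the case a center crop cannot help with: a subject that sits near the edge of the frame can be cut out before the model ever sees it.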

DianaStuder · January 14, 2022, 2:40pm

Would it be useful if the observer offered a 300 by 300 crop of what we are focused on this time? I crop for my obs anyway, but could deliberately choose that size. (I crop square because I like that format.)

alex · January 19, 2022, 8:02pm

I wouldn’t make a crop just for the vision system; the downscaling happens automatically and seamlessly, and if you upload a higher resolution version of the photo, then the additional detail will be useful to humans. As for how to help the vision system understand what part of the photo to focus attention on, hopefully we’ll be able to address this at some point, perhaps with bounding boxes or attention-based vision models.

jasonhernandez74 · January 20, 2022, 2:44am

I wouldn’t know, because I crop my photos in my phone’s camera app before going anywhere near iNat.

system · March 21, 2022, 4:12am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
