AI matching to background color instead of organism

Take a look at the attached screenshot. The observation photo is clearly of a mite. However, the AI suggestions don’t include any mites, and instead include mostly aquatic invertebrates that don’t look anything like mites. After looking through the observations of the suggested organisms it quickly became apparent that the AI was latching onto the white backgrounds as the basis for its matches. Aquatic invertebrates are frequently photographed with a completely white background, while this is relatively rare for mite photos. Theoretically this is a problem that should fix itself given enough observation photos, but I wonder if there are any training tweaks that could improve the situation in the meantime.


[I moved this to general as it isn’t a bug]


when the prompt above the suggestions says we’re not confident, you should believe the prompt. “top” suggestions don’t always mean good suggestions. they could just be the least bad suggestions.


There is still a lot of work to be done by observers and identifiers before the AI will become totally reliable some day. Speaking from my own experience, the core problem with the iNat AI goes something like this:
99 times out of 100 the AI correctly identifies a Mediterranean goby to species level (something that most humans are not capable of), but every 100th time the AI mistakes a fish for a bird (something that would never happen to any human, anywhere in the world).
In general, this kind of issue currently prevents all so-called AI from working properly.


The CV simply can’t be trained to recognize all the various types of ex situ backgrounds.


This kind of machine learning will never be totally reliable, no matter how many observers and identifiers contribute. The larger and more correct the data set is, the better it will get, but perfection is an unattainable goal. (Of course, no person is perfect at this stuff either.)


Although there are certain things that can be tweaked in the algorithm to help minimise this problem to an extent, this is just a downside of the way this type of AI works. For shorthand, we say “the AI thinks the organism in your photo is X” but that misrepresents what’s really happening; more properly we could say that “the AI has determined that photos most resembling your photo are often marked as being photos of X”.

If there’s a tiny insect that is only ever seen on rose flowers, then almost every photo of that insect on iNat is also going to include a rose flower, and the AI has no way of determining if the flower or the insect is the thing of interest in the photo. An inevitable consequence will be that photos of roses (without insects) will be suggested as being photos of that insect. And indeed exactly this situation arises often on iNat, especially with pests and diseases that are closely associated with a particular plant, or other organisms that frequently associate. For instance oxpecker birds are nearly always seen sitting on large mammals like giraffes or rhinos, and so you will often find that the second AI suggestion for a photo of a giraffe or rhino is “oxpecker” even if there is no bird in the image.

As ever with using smart AI tools, recognising its limitations is important for using it effectively.


This sums up the definition of AI so beautifully.


This happens a lot with plant-bug pairs, a specific example being the Dogbane Leaf Beetle. Uploading a photo of dogbane will often suggest the DBLB. It’s pretty funny, IMO.


Totally! I always take the AI suggestions with a grain of salt, especially in cases where it says it’s not confident. I don’t expect the AI to be perfect, but I thought this might be a useful example for investigating ways to further improve the AI.


I had to smile when I saw this post.

AI recognises patterns. If you train the AI with the same background pattern, the background becomes part of the identification.

A story I heard was about an AI being used for tank recognition; unfortunately, all the training images were taken under a clear blue sky, and so the AI did not recognise tanks on a cloudy day (or something like that).

The solution seems to be to take more pictures with lots of different backgrounds, so that the background does not become part of the identification.
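Beyond collecting more varied photos, this is also the idea behind background augmentation in training pipelines. As a minimal sketch (not how iNat actually trains its model), assuming a binary foreground mask is available for each image, one could composite the subject onto randomized backgrounds so the network can’t rely on a consistent white backdrop:

```python
import numpy as np

def randomize_background(image, mask, rng=None):
    """Composite the masked subject onto a random flat-color background.

    image: HxWx3 float array with values in [0, 1]
    mask:  HxW boolean array, True where the organism is
    """
    rng = rng or np.random.default_rng()
    # Fill the whole frame with one random color, then keep the subject pixels.
    background = np.ones_like(image) * rng.random(3)
    return np.where(mask[..., None], image, background)

# Toy example: a white "photo" with a dark square subject in the middle.
img = np.ones((8, 8, 3))
img[2:6, 2:6] = 0.2
m = np.zeros((8, 8), dtype=bool)
m[2:6, 2:6] = True
augmented = randomize_background(img, m, rng=np.random.default_rng(0))
```

The subject pixels are untouched; only the surround changes from one training pass to the next, which in principle pushes the classifier to key on the organism rather than the backdrop.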


Could this problem be at least partially worked around by training a segmentation model and including its output in the input to the classifier?
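Mechanically, one simple version of this idea is to append the segmentation model’s per-pixel foreground probability to the RGB image as an extra input channel, so the classifier still sees the whole frame but knows where the subject is. A hypothetical sketch (assuming such a foreground map exists; this is not iNat’s actual pipeline):

```python
import numpy as np

def add_mask_channel(image, fg_prob):
    """Stack a segmentation output onto an RGB image as a fourth channel.

    image:   HxWx3 array (RGB)
    fg_prob: HxW array of foreground probabilities in [0, 1]
    Returns an HxWx4 array a classifier could accept as input.
    """
    return np.concatenate([image, fg_prob[..., None]], axis=-1)

img = np.zeros((4, 4, 3))
prob = np.full((4, 4), 0.9)  # segmentation says "mostly subject"
x = add_mask_channel(img, prob)
```

Unlike hard cropping, this keeps the background available to the classifier while explicitly marking which pixels belong to the organism.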


I’ve been thinking about this topic for quite a while now.
I think that AI-driven suggestions can definitely get better when only the important™ part of the image is used. But a segmentation where a sharp polygon is drawn around the object of interest can lead to information loss around the specimen. I suspect (though I still need proof of this) that context around the organism can definitely help with identifying a species. Imagine a butterfly that feeds on certain distinctive plants while avoiding several others. What do biologists think about that?

The question is: how much context is helpful, and how much context risks disturbing the species detection?


Could you please elaborate on what you mean by that? I’m quite new to this field and am currently looking into species and their contexts in images.
It would be much appreciated! Thanks! :)

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.