Computer vision proposes the same name for many records of orchids

chacled · July 10, 2024, 10:24am

After identifying more than 2,500 records of Spathoglottis, along with many approximate identifications, I see that many records of orchids of quite different species are given the same taxon name by the IA: Spathoglottis plicata. Many records of Phalaenopsis and Dendrobium, for example, are wrongly identified as Spathoglottis plicata. Since the flowers of that widespread terrestrial species (I mean wild or naturalized plants) are highly homogeneous throughout the tropical world, it seems strange that such different epiphytic genera as Phalaenopsis or Dendrobium could be confused with it. I wonder whether the IA may have been trained on erroneous pictures showing a mixture of horticultural varieties (many interspecific hybrids of Spathoglottis are produced and sold for cultivation), including the different, commonly cultivated genera mentioned above. Would it be possible to check this?

AdamWargon · July 10, 2024, 11:18am

@chacled,

I think that by “IA” you are referring to Artificial Intelligence (Intelligence Artificielle). In English the acronym is AI, which stands for “Artificial Intelligence”.

Your post is valuable, so I don’t want English readers to be confused!

tiwane · July 10, 2024, 5:47pm

Can you please provide some specific examples of observations where this has happened?

sedgequeen · July 10, 2024, 5:53pm

When a name is applied to the wrong photos, the AI (or CV) learns that those other species are examples of that name. It then suggests that name for photos that look like any of the species it has learned under it. People pick that name for all those other species, and the problem gets worse.

There’s good news! When you clean up the identifications of the observations labeled as that species, the AI begins to learn what the species really looks like and it improves. Then people pick that name mainly for the correct species and the AI learns more. The situation improves.

This takes time, though, and it takes a person watching observations with that name for a while.
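The loop described above can be sketched as a toy model. This is only an illustration, not how the iNat CV is actually trained: the function `simulate`, its parameters, and every count below are hypothetical. It assumes look-alike photos pick up the wrong name in proportion to how polluted the label pool already is, and that identifiers fix a fixed number of wrong labels per round.

```python
def simulate(wrong, right, rounds, new_lookalikes, new_true, corrections):
    """Toy model of label-error feedback; all quantities are hypothetical counts.

    Returns the fraction of wrongly labeled observations after each round.
    """
    history = []
    for _ in range(rounds):
        error_rate = wrong / (wrong + right)
        # Look-alike photos receive the wrong name in proportion to
        # how polluted the existing labels already are (an assumption).
        wrong += new_lookalikes * error_rate
        right += new_true
        # Identifiers fix up to `corrections` wrong labels per round.
        wrong = max(0.0, wrong - corrections)
        history.append(wrong / (wrong + right))
    return history


no_cleanup = simulate(200, 800, 5, 100, 50, corrections=0)
with_cleanup = simulate(200, 800, 5, 100, 50, corrections=40)
print([round(x, 3) for x in no_cleanup])
print([round(x, 3) for x in with_cleanup])
```

With these made-up numbers the error fraction creeps upward when nobody cleans up, and falls round after round once identifiers correct labels faster than new mistakes arrive.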

susanhewitt · July 10, 2024, 5:54pm

Here on iNat the software that helps ID things is not called an AI – it is called CV – Computer Vision.

richyfourtytwo · July 10, 2024, 6:26pm

To be honest, I’m starting to find these ‘corrections’ (not from you specifically @susanhewitt, I know it’s a position that iNat staff has taken) a bit tedious. Computer vision is just a field within AI, so calling the tools AI is entirely correct, just not as precise as it could be.

tiwane · July 10, 2024, 6:32pm

I’ll take responsibility for that; it was me who really emphasized it and corrected posts. Susan and others are just trying to hew to the example I set. We’re OK with “AI”, especially since it’s in such broad use now.

david99 · July 10, 2024, 10:31pm

This is an issue that’s widespread across many taxa, especially ones with a few commonly photographed taxa in a large genus. The CV might learn one taxon, then apply that suggestion to anything that looks like it without having “learned” that there are dozens of look-alikes. It’s slowly getting better as new species are added to the library of species that the CV can identify. The problem is that there’s a large legacy of incorrectly identified species in the system that need to be corrected, and not enough people to correct them as fast as they pile up.
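A minimal sketch of why this happens, under loud assumptions: the real CV is a neural network, while this toy uses a nearest-centroid rule, and the 2-D “features” and the helper `nearest_known_class` are invented for illustration. The point it shows is only that a model choosing among the taxa it was trained on has no way to answer “something I have never seen”.

```python
import math


def nearest_known_class(features, centroids):
    """Return the trained class whose centroid is closest to `features`."""
    return min(centroids, key=lambda name: math.dist(features, centroids[name]))


# Hypothetical 2-D feature centroids for the only taxa this toy model knows.
centroids = {
    "Spathoglottis plicata": (1.0, 1.0),
    "Danaus plexippus": (9.0, 9.0),
}

# A photo of an untrained look-alike still gets a trained name,
# because "unknown" is not one of the available answers.
print(nearest_known_class((2.5, 1.5), centroids))  # Spathoglottis plicata
```

Adding a centroid for the look-alike is the toy equivalent of a new species entering the CV: only then can the photo be assigned to anything else.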

The CV also doesn’t identify things in the same way that people do. Sometimes it focuses on unexpected parts of the organism, or even the background. For instance, sometimes a photograph of a plant will return an identification of an insect (like Monarch Butterfly) because many observations of that insect have a plant taking up most or all of the background.

thebeachcomber · July 10, 2024, 11:57pm

This can be ‘overruled’ by first adding a coarse ID to the observation; the CV will then make suggestions that match that ID. So adding an ID of ‘Plants’ or ‘Dicots’ or similar to that photo will force the suggestions to be plant species.
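The effect of a coarse ID can be pictured as a filter over the suggestion list. This is a sketch only: the dictionary fields (`name`, `ancestry`, `score`), the scores, and the helper `filter_by_coarse_id` are hypothetical and are not the iNaturalist API, and the thread does not say how iNat implements this internally.

```python
def filter_by_coarse_id(suggestions, coarse_taxon):
    """Keep only suggestions whose ancestry includes `coarse_taxon`,
    preserving the original score order."""
    return [s for s in suggestions if coarse_taxon in s["ancestry"]]


# Hypothetical ranked suggestions for a plant photo (scores are made up).
suggestions = [
    {"name": "Danaus plexippus", "ancestry": ["Animalia", "Insecta"], "score": 0.61},
    {"name": "Asclepias syriaca", "ancestry": ["Plantae", "Magnoliopsida"], "score": 0.27},
    {"name": "Spathoglottis plicata", "ancestry": ["Plantae", "Liliopsida"], "score": 0.12},
]

# With a coarse ID of "Plantae", the insect drops out of the list.
for s in filter_by_coarse_id(suggestions, "Plantae"):
    print(s["name"], s["score"])
```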

sgtbird08 · July 11, 2024, 12:12am

I’ve been poring over Graphocephala recently, a genus of leafhoppers. There are ~50 species in it, and many look like at least 1-3 other species in the genus and a few outside of it as well. Only 11 species have more than 100 observations and most have only a handful. As such, the CV misidentifies these organisms very frequently. It doesn’t help when people confidently agree with an incorrect identification instead of just supporting at genus/subgenus.

It’s tedious to be sure, but necessary. And hey, I’m really learning a lot about IDing this group as I go.

zoology123 · July 11, 2024, 1:40am

I have spent a good amount of time over the past 4 months attempting to fix Chironomidae on the site. When trying to correct the CV, there are quite a few steps one can take. One step I highly recommend to identifiers is targeting species that are not in the CV but could be: go out of your way to ID them, and encourage observers to observe more of them if they can. At least for Chironomidae this should be very effective, since most of the CV’s misidentifications happen because the CV only knows a tiny number of taxa out of a few thousand, maybe around 10 before the midge project started. In the last CV update, 6 chironomid species were added, along with a handful of genera.

chacled · July 12, 2024, 12:34pm

Sorry for mixing IA and AI.
Below are a few links to recent identifications. They include Phalaenopsis, Dendrobium, and Epidendrum:
https://www.inaturalist.org/observations/227849879

https://www.inaturalist.org/observations/227684844

https://www.inaturalist.org/observations/226509983

https://www.inaturalist.org/observations/226689670

https://www.inaturalist.org/observations/226977757

https://www.inaturalist.org/observations/227002482

https://www.inaturalist.org/observations/225333789

cromlyn · July 15, 2024, 1:55pm

One of my dormant projects is to help Cardiff Bristol Museum get their HUGE historic insect and arthropod collections digitised and added to the training dataset for the iNat CV.

Does anyone know if it is possible to add weightings to observations so things like type specimens can help tip the probabilities, despite there only being a few images?

bouteloua · July 15, 2024, 2:01pm

hi @cromlyn, iNaturalist observations are meant to represent your own nature observations, and collections by multiple different people such as those in a natural history museum aren’t really a fit for iNat. To answer your question, it is not possible to add weight to type specimens in the algorithm. Also, since computer vision is trained on photos, not organisms, a bunch of images of dead insects in collections might not be very helpful for identifying those same species in situ in nature.

egordon88 · July 15, 2024, 2:32pm

You may want to look at SCAN or https://ecdysis.org/ for the museum’s records.

cthawley · July 15, 2024, 2:45pm

Welcome to the forum!

I agree with @bouteloua that iNat isn’t a good fit for institutional collections, and staff have made clear that collections shouldn’t be uploaded wholesale to iNat or iNat used as a replacement for collections management software.

There are some previous threads that touch on this like:
https://forum.inaturalist.org/t/institutional-use-of-inaturalist/49050
https://forum.inaturalist.org/t/inat-development-for-museums-or-research-centres-improve-the-value-of-research-grade-data/7743
which also cover some of the issues with attempting to use iNat in this way as well as a couple of rarer use cases (like incidental observations from collections) where iNat usage might be appropriate.

Since institutional use of the CV isn’t the main focus of this thread, if it’s desirable to continue this conversation, we can split it into its own thread or reopen one of those (or another) older one and move it there.

chacled · July 20, 2024, 5:49pm

Three more recent examples (2 Phalaenopsis and one non-orchid):
https://www.inaturalist.org/observations/230441948
https://www.inaturalist.org/observations/230198323
https://www.inaturalist.org/observations/228396382

chacled · August 21, 2024, 12:03pm

Looking closely at the last eleven plants erroneously identified as Spathoglottis plicata by the CV, ten of them (more than 90%) are pictures of cultivated plants with a human-made background. It seems that the CV identified the “artificial” background as S. plicata! One possibility to improve the IDs would be to exclude pictures of cultivated plants from the CV training set.
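For anyone wanting to repeat this kind of tally on their own batch of misidentifications, here is a small sketch. The list `backgrounds` is a hypothetical hand-made record with one entry per photo; only the counts (ten cultivated out of eleven) come from the observation above.

```python
from collections import Counter

# Hypothetical hand-made tally: one entry per misidentified photo,
# using the counts reported above (10 cultivated, 1 other).
backgrounds = ["cultivated"] * 10 + ["other"] * 1

counts = Counter(backgrounds)
share = counts["cultivated"] / len(backgrounds)
print(f"{counts['cultivated']} of {len(backgrounds)} = {share:.1%}")  # 10 of 11 = 90.9%
```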

Below are the latest links not sent before:
https://www.inaturalist.org/observations/230708992
https://www.inaturalist.org/observations/230699022
22/07/2024
https://www.inaturalist.org/observations/230842597

23/07/2024
https://www.inaturalist.org/observations/231092690

1/08/2024
Curcuma : https://www.inaturalist.org/observations/233216010

4/08/2024
https://www.inaturalist.org/observations/233847321

14/08/2024
https://www.inaturalist.org/observations/235713260

15/08/2024
https://www.inaturalist.org/observations/235891000

20/08/2024
https://www.inaturalist.org/observations/236923386

system · October 20, 2024, 12:04pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
