I’ll admit, I’m not really familiar with the intricacies of AI training for species identification in iNaturalist.
Macrolepiota colombiana is a mushroom species originally described from Colombia, and based on an ongoing investigation of the genus, the species so far seems to be restricted to Colombia. M. colombiana is also morphologically distinctive, but because of the lack of study of Macrolepiota in the Neotropics, most tall and robust species of the genus have been identified as M. colombiana.
My question, then, is whether the AI continues to be trained as the observations identified as this species, which were probably used for the AI training, are being correctly re-identified.
Thanks in advance,
I don’t know if I am understanding the question correctly, but new computer vision models are released fairly regularly (roughly once a month). Species are included (or not) based on whether there are enough observations. So new identifications will change the training set for the next model.
There are a fair number of threads about the CV model on the forum. A couple that might be useful are:
but I’m sure there are many more. There are also some posts about the CV on iNat itself.
I think the problem you may be describing goes beyond simple computer vision issues: there’s a lack of study of Macrolepiota in the Americas (afaik) in general. Actually, a lot of American genera are just a massive mess (see Cantharellus, Russula, Neoboletus, Inocybe, Entoloma, etc.). Plus, the CV struggles with mushroom ID in general and only seems to do well on extremely distinctive species of fungi.
Couple this with IDers who trust the CV suggestions but may not have much knowledge of mycology, and a lack of experts in certain regions who could even help sort out the mess, and well… you end up with a messy situation.
Best advice is to try to correct what you can, and hopefully a comment here and there on posts can help tell people what they should be looking for. The CV is going to recommend things based on the dataset it has, and if the dataset is limited, it can only do so much. That’s when people have to step in, because ultimately the CV isn’t even AI; it’s mostly just pattern recognition.
It is the same observation I made on this thread …
For mycology the CV is mostly ‘garbage in, garbage out’. Correct identification of fungi often goes much deeper than macro-morphological visual similarity.
Yes, I’ve been trying to identify most of the observations. And I do understand how difficult the identification of macrofungi based on macromorphological characters is, but obviously some species are very characteristic and easily recognizable in the field based on macromorphology.
My question is more whether the AI (or computer vision) actually learns from the new, refined identifications of the species, or whether what it was trained on is fixed and can’t be refined.
If I’m understanding how the CV works correctly, once the IDs are fixed, that should help fix the CV (especially when the model gets updated, which, as noted, happens fairly frequently).
(Though I don’t think it will ever stop trying to ID pieces of trash, piles of leaves, or rotted mushrooms as Grifola frondosa)
About once a month we get an iNat blog post. It includes links to the species added that month. I asked to have that date added to the taxon page, but tiwane said it can’t be done.
It is motivation to reach the required 100 photos (about 60 observations) for the next CV update.
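As a rough back-of-the-envelope sketch (not an official iNat tool), the “100 photos ≈ 60 observations” figure implies an average of about 1.7 photos per observation. Assuming that ratio holds, you can estimate how many more observations a taxon might need before the threshold is reached; the function name and the averaging assumption here are mine, not iNaturalist’s:

```python
import math

# Threshold cited in the thread, and the photos-per-observation ratio
# implied by "100 photos = about 60 obs". Both are assumptions here.
PHOTOS_NEEDED = 100
AVG_PHOTOS_PER_OBS = 100 / 60  # ~1.67 photos per observation

def observations_still_needed(current_obs_count, avg_photos=AVG_PHOTOS_PER_OBS):
    """Estimate how many more observations are needed to reach ~100 photos."""
    photos_so_far = current_obs_count * avg_photos
    remaining_photos = max(0.0, PHOTOS_NEEDED - photos_so_far)
    # Round up to whole observations.
    return math.ceil(remaining_photos / avg_photos)

print(observations_still_needed(45))  # → 15
print(observations_still_needed(60))  # → 0 (threshold reached)
```

Of course, the real photo count per observation varies, so this is only a motivational estimate, not a guarantee of inclusion in the next model.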
The photos have to come from Research Grade observations, right?
RG observations are preferred, but others can be used. There is a good thread on this here: