Species Photos Required to Train Vision

SteveMcBill · October 22, 2021, 9:38pm

How would one go about finding which species need observing/recording in order to provide photos for future training of Computer Vision?

Many thanks for any insights

Cheers,
Steve

fffffffff · October 22, 2021, 9:47pm

On the species page there’s a badge if the species is included in CV.

So if there’s no badge, the species probably needs more photos. Either way, new photos are always welcome!

susanhewitt · October 22, 2021, 10:23pm

Also we need a lot of good identification work, because there may already be photos of a species, but if they are not identified, then the images can’t yet be fed to the CV.

trscavo · October 22, 2021, 11:41pm

I don’t think that’s true. An observation need not be Research Grade to be included in the training. An observation doesn’t need a community ID either (one ID is sufficient). It doesn’t even need to be verifiable (e.g., non-wild observations are eligible).

fffffffff · October 22, 2021, 11:50pm

If it’s not IDed to the correct taxon, or is left at family level or higher, it’s not useful for CV, and that’s the case for tons of taxa and thousands of observations.

sbushes · October 23, 2021, 12:55am

Aside from the badge @fffffffff mentions, there is also a feature request to make it a bit easier to get a broader overview in future:
https://forum.inaturalist.org/t/add-label-on-drop-down-menu-for-species-in-computer-vision-training-model-to-give-quick-overview/24874

pisum · October 23, 2021, 3:20am

right now the /v1/taxa/{id} API endpoint returns a “vision” field in its results, but the /v1/taxa endpoint does not. if the latter also returned that field, then it would be relatively easy to get/view that information for a bunch of taxa. as it stands, you’d have to either go in with a list of taxon IDs, or query to get a list of IDs and then query /v1/taxa/{id} with that list. it’s possible to do this, but i’m not motivated enough to do it myself.
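A rough sketch of the two-step lookup described above, assuming you already have a list of taxon IDs. The API root, the comma-separated ID form of /v1/taxa/{id}, and the batch size of 30 are assumptions here rather than anything confirmed in this thread, and the function names are made up for illustration. Only the “vision” field comes from the post above.

```python
import json
import urllib.request

# Assumed public API root; adjust if your setup differs.
API_ROOT = "https://api.inaturalist.org/v1"


def chunked(ids, size=30):
    """Yield successive batches of taxon IDs (the batch size is an assumption)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def vision_flags(payload):
    """Map taxon ID -> 'vision' value from one /v1/taxa/{id} response body.

    A taxon whose record has no 'vision' key maps to None.
    """
    return {t["id"]: t.get("vision") for t in payload.get("results", [])}


def _http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def fetch_vision_flags(taxon_ids, fetch=_http_get_json):
    """Look up the CV flag for each taxon ID, one request per batch.

    Assumes the endpoint accepts several comma-separated IDs in the path.
    `fetch` can be swapped out, e.g. for offline testing.
    """
    flags = {}
    for batch in chunked(list(taxon_ids)):
        url = f"{API_ROOT}/taxa/{','.join(str(i) for i in batch)}"
        flags.update(vision_flags(fetch(url)))
    return flags
```

Calling `fetch_vision_flags(ids)` with your own list of IDs would then give a dict you can filter for taxa not yet in the model. Please be gentle with the request rate if you run it over a long list.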

sbushes · October 23, 2021, 10:33am

@pisum
Nice, thanks for the tip. Maybe I will take a look myself.

@brian_d
The number of photos/observations a species requires to be included is given on the help page here:

“as of the model released in March 2020, taxa included in the computer vision training set must have at least 100 observations, at least 50 of which must have a community ID. Photos for training are randomly selected from among the qualifying iNaturalist observations (that is, it is not only the first image of an observation that may be used for training)”
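As a small illustration of the rule quoted above, here is a sketch that checks a taxon's counts against the March 2020 thresholds and reports how far short it falls. The function and constant names are hypothetical, and the counts have to come from your own observation searches. Later models may use different thresholds.

```python
# Thresholds as quoted from the help page for the March 2020 model.
MIN_TOTAL_OBS = 100
MIN_COMMUNITY_ID_OBS = 50


def meets_threshold(total_obs, community_id_obs):
    """True if both counts reach the quoted 'at least' thresholds."""
    return total_obs >= MIN_TOTAL_OBS and community_id_obs >= MIN_COMMUNITY_ID_OBS


def shortfall(total_obs, community_id_obs):
    """Return (missing total observations, missing community-ID observations)."""
    return (
        max(0, MIN_TOTAL_OBS - total_obs),
        max(0, MIN_COMMUNITY_ID_OBS - community_id_obs),
    )
```

For example, a taxon with 80 observations, 20 of which have a community ID, would still need 20 more observations and 30 more community IDs under this rule.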

mrtnlowr · October 23, 2021, 10:50am

It wouldn’t be too difficult to modify this code I wrote for querying conservation status to also provide the taxon CV flag. It uses the algorithm outlined here:

Cheers!

susanhewitt · October 23, 2021, 4:54pm

I did not say the photos of a species had to be Research Grade, just that they had to be identified.

matthias22 · October 23, 2021, 5:18pm

Would it be possible for experts to review the photo set that is used for training in order to weed out misidentifications? I think that could speed up the process significantly. I am talking about groups that are hard to ID and still largely inaccessible because of lack of expertise in the community (e.g., I work on wasps).

matthias22 · October 23, 2021, 5:42pm

Another interesting question is how life stages and sexes are being handled. In some cases it is pretty straightforward such as caterpillars and butterflies. I assume two separate sets of images are being used to train CV for each of the two. It gets more complicated with the sexes or with species that gradually transform from juveniles into adults. Depending on the species, sexual dimorphism can vary from extreme to imperceptible, with every possible intermediate condition. I assume a priori decisions have to be made on how many separate entities CV will be trained for. For species with extreme sexual dimorphism it is obvious that there have to be separate sets of images to train CV. But where is the cut-off for species with moderate sexual dimorphism? Many insects fall into this category. To the untrained eye males and females may look quite similar yet taxonomists often use different diagnostic characters for each sex to get to a species ID.

fluffyinca · October 23, 2021, 7:55pm

I was under the impression that they are not done separately. From what I’ve read on various forum threads, it sounds like the training model randomly selects photos from among the eligible observations of that taxon, regardless of life stage or sex. But I could be wrong. Doing them separately would make sense in cases like the ones you described.

susanhewitt · October 24, 2021, 1:34pm

The life stages and sexes are not fed separately to the CV. It does not learn in the same way that a human learns, so variations like that do not confuse it.

sbushes · October 24, 2021, 4:59pm

Realised it’s pretty straightforward to see which species are included just by using the observations portal, of course. We can see we need to find some more Proteroiulus fuscus to gear up in European Blaniulidae, for example! Only Blaniulus guttulatus has > 50 RG and > 100 total obs in this family in Europe at present, likely leading to oversuggestion of Blaniulus sp.


system · December 23, 2021, 5:00pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
