Searching for taxa that need observations to be included in computer vision?

Is there a way to search for taxa that need additional observations so they can be included in a future round of improving the computer vision/artificial intelligence, besides searching the taxon pages one by one? I’m looking for ways to improve the usefulness of the observations I add to iNat.

11 Likes

I was thinking about this recently while playing with the API via pyinaturalist for the first time. It would be quite useful to have a parameter (for search URLs and for the API) that reports, or lets you filter on, (1) whether a taxon is currently included in the computer vision model and (2) when it was added to the CV model.

3 Likes

this isn’t a perfect solution, but you can use this link:
https://elias.pschernig.com/wildflower/leastobserved.html?user=elias105
and replace the username with that of a prominent observer in your region (could be yourself :) ), and then look through the species with <100 total observations. Those are the ones that need additional observations to be included in the CV.

In the US, this is likely to be galls, leafminers, other plant pathogens (rust fungi, etc), certain small insects or other arthropods that live in the soil or are otherwise hard to find or photograph, and of course, genuinely rare species.

3 Likes

it is possible to retrieve a taxon via /v1/taxa/{id} in the API, and the response will include a vision field that tells you whether the taxon is included in the current vision model. here’s an example of a request to look up a couple of taxa (click to view the response): https://api.inaturalist.org/v1/taxa/196822,196823.
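For anyone who wants to try that lookup from a script, here is a minimal sketch using only the Python standard library. It assumes the response is JSON with a `results` list and that each taxon carries `id` and a boolean `vision` field, as described above. The helper names (`vision_flags`, `fetch_taxa`) are made up for illustration and are not part of any iNaturalist library.

```python
import json
import urllib.request

API_BASE = "https://api.inaturalist.org/v1"


def vision_flags(payload):
    """Map taxon id -> value of its `vision` field (None if the field is absent)."""
    return {taxon["id"]: taxon.get("vision") for taxon in payload.get("results", [])}


def fetch_taxa(taxon_ids):
    """Fetch taxon records from /v1/taxa/{id}, passing the ids comma-separated."""
    ids = ",".join(str(i) for i in taxon_ids)
    with urllib.request.urlopen(f"{API_BASE}/taxa/{ids}") as resp:
        return json.load(resp)
```

Calling `vision_flags(fetch_taxa([196822, 196823]))` should then give one entry per taxon from the example request, assuming the field layout above holds.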

the only difficulty here is that you have to specify a list of taxon IDs, rather than searching for, say, all taxa with a particular term or in a particular genus. to do that kind of search, you have to use /v1/taxa in the API, but the results will not include the vision field. you could request that this be updated to include vision in the results. alternatively, you could get a list of taxa from /v1/taxa (ex. https://api.inaturalist.org/v1/taxa?q=moose&rank=species&per_page=500&only_id=true) and then look those up via /v1/taxa/{id}.
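A rough sketch of that two-step approach (search for IDs, then look them up in batches) might look like the following. The assumptions: with `only_id=true`, each search result is an object with an `id` key; the lookup response has `name` and `vision` on each taxon; and 30 IDs per lookup request is a safe batch size. None of these are confirmed in this thread, and all function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.inaturalist.org/v1"


def search_url(term, rank="species", per_page=500):
    """Build a /v1/taxa search URL that asks for ids only."""
    query = urllib.parse.urlencode(
        {"q": term, "rank": rank, "per_page": per_page, "only_id": "true"}
    )
    return f"{API_BASE}/taxa?{query}"


def ids_from_search(payload):
    """Pull the taxon ids out of a /v1/taxa search response."""
    return [row["id"] for row in payload.get("results", [])]


def lookup_urls(taxon_ids, batch_size=30):
    """Split ids into batches and build one /v1/taxa/{id} URL per batch.

    batch_size=30 is an assumption, not a limit stated in the thread.
    """
    urls = []
    for start in range(0, len(taxon_ids), batch_size):
        batch = taxon_ids[start:start + batch_size]
        urls.append(f"{API_BASE}/taxa/" + ",".join(str(i) for i in batch))
    return urls


def vision_by_name(term):
    """Network driver: search by term, then read `vision` from each batch lookup."""
    with urllib.request.urlopen(search_url(term)) as resp:
        ids = ids_from_search(json.load(resp))
    flags = {}
    for url in lookup_urls(ids):
        with urllib.request.urlopen(url) as resp:
            for taxon in json.load(resp).get("results", []):
                flags[taxon["name"]] = taxon.get("vision")
    return flags
```

`search_url("moose")` reproduces the example search URL above, and `vision_by_name("moose")` would return a name-to-`vision` mapping if the assumed response shapes are right.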

i’m not going to write such code, since i don’t see the benefit in doing something like this, but i have created a page that will present the results of /v1/taxa in a human-friendly format that may or may not be useful: https://jumear.github.io/stirfry/iNatAPIv1_taxa. this other page may also be useful: https://jumear.github.io/stirfry/iNatAPIv1_observations_species_counts?taxon_id=3&order=asc

3 Likes

In July last year, the Pending label was new.

Meanwhile the Pending example I offered has been Included!

But a dedicated list to target … would be wonderful. Feature request?

3 Likes

I know this doesn’t answer your question, but I could think of 10-20 manually if you’re looking to make a list ;)

Feel free to message me a list!

And thank you to the rest of you for your suggestions; they’ll be helpful!

I recently spent a week in Arizona and I was surprised by how many taxa I observed that weren’t included in the computer vision, even though I went to well-known birding sites and am absolutely not an expert in anything.

1 Like

It’s bigger than that. For example, my group (robber flies) are pretty big and charismatic for insects… but it looks like only ~50 species from the USA are included in the CV model, out of >1000 known to exist in the country. Of course part of our problem comes from the group still needing lots of alpha taxonomy, and the difficulty in distinguishing insect species from field photos. But it’s not like the only taxa missing from the model are scrawny forgettable or super rare things. We need more of the common stuff everywhere.

4 Likes

For a long time only two species of microleps would show up as suggested taxa whenever I photographed a small moth in my backyard in California. That resulted in those 2 species being waaaayyyy overrepresented in the area, and lots of unrelated species (most of them stuck in Needs ID forever) being labeled as those species by people just selecting one of the suggested taxa. I like the idea of specifically trying to add enough observations of species to get them included in the CV.

1 Like

I would love to know quickly what taxa are in need of additional observations to be included in the CV model!

Would the iNat staff have a list of the taxa included in the CV model?

I was thinking it would be cool if there was a CV model list I could compare the taxa I’ve observed against, so I know which previously observed species need more observations and attention to help improve the model.

2 Likes

That’s true everywhere in the world. At least the ones in the US may have had something written about them besides the initial species description.

1 Like

I would absolutely so appreciate this! My love for photography is almost as great as my love for nature, so since I joined iNat, one of my “missions” (sorry, too pretentious a word, but you get my meaning) is to add images of even common species that may help others ID their observations. If I could focus on species that were missing from CV, I’d really feel able to make a contribution. Do hope a way can be found.

1 Like

Besides, I like the scrawny forgettable things. Also, hickory galls are neither scrawny nor forgettable!! (OK, they ARE small, and I’m always forgetting the names, but I love them, nonetheless.)

2 Likes

Now if they would just sit still to be photographed…

1 Like

If the question is “Which taxa have been observed in {place}, but not observed enough overall to be included in the CV model?”, then it would be possible to build a script using /v1/observations/species_counts to get all observed taxa in a place and then /v1/taxa/{id}, as @pisum pointed out, to check whether each taxon is included in the model. Wouldn’t be very efficient though. Could be a worthy feature request to ask that vision be filterable (or at least included in the response) for /v1/observations/species_counts.
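A minimal sketch of such a script, for illustration only: it assumes each /v1/observations/species_counts result has a `count` and a nested `taxon` with `id` and `name`, that /v1/taxa/{id} returns a boolean `vision` field, and that `place_id`, `page`, and `per_page` are accepted query parameters. The batch size, the one-second pause, and all function names are my own choices. It only reads the first page of results, so a real script would need to loop over pages.

```python
import json
import time
import urllib.parse
import urllib.request

API_BASE = "https://api.inaturalist.org/v1"
BATCH_SIZE = 30  # assumed safe number of ids per /v1/taxa/{id} request


def get_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def species_counts_url(place_id, page=1, per_page=500):
    """URL for one page of taxa observed in a place."""
    query = urllib.parse.urlencode(
        {"place_id": place_id, "page": page, "per_page": per_page}
    )
    return f"{API_BASE}/observations/species_counts?{query}"


def missing_from_model(count_rows, taxa_rows):
    """Return (name, local_count) for taxa whose `vision` field is exactly False.

    count_rows: `results` from /v1/observations/species_counts
    taxa_rows:  `results` from /v1/taxa/{id} for the same taxa
    Taxa with no lookup result or no `vision` field are skipped, not reported.
    """
    vision = {taxon["id"]: taxon.get("vision") for taxon in taxa_rows}
    return [
        (row["taxon"]["name"], row["count"])
        for row in count_rows
        if vision.get(row["taxon"]["id"]) is False
    ]


def taxa_missing_in_place(place_id):
    """Network driver: first page of species counts only, for brevity."""
    count_rows = get_json(species_counts_url(place_id))["results"]
    ids = [row["taxon"]["id"] for row in count_rows]
    taxa_rows = []
    for start in range(0, len(ids), BATCH_SIZE):
        batch = ",".join(str(i) for i in ids[start:start + BATCH_SIZE])
        taxa_rows.extend(get_json(f"{API_BASE}/taxa/{batch}")["results"])
        time.sleep(1)  # pause between requests to go easy on the API
    return missing_from_model(count_rows, taxa_rows)
```

As noted, this is one lookup request per batch of taxa, so it gets slow for places with thousands of observed species.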

If you use this link:
https://elias.pschernig.com/wildflower/leastobserved.html?user=ellyne

and look for taxa that have <100 total observations (first column), those should match pretty closely with the species from your observations that are not included in the latest CV model.

2 Likes

That was easy - thanks! And of the few species I’ve checked so far, they were either hickory galls or species from my recent trip to Arizona. Oh, darn, I’ll have to go back to Arizona, what a hardship. ;-)

I already had that link bookmarked, thanks.
Will target my nineties, bit daunting to tackle the only 3 obs so far!

That is a hardship. New Mexico is much nicer 🙂 🌶️