iNat misidentifies Xysticus as Bassaniana

I generally agree. For a more or less current breakdown (based on iNat’s Explore page, looking at verifiable observations and iconic taxon counts; the numbers don’t add up exactly because iconic taxa aren’t a perfect partition):

| Taxon | Observations | CV performance |
|---|---:|---|
| Total | 234,614,184 | |
| Birds | 32,919,771 | Good |
| Amphibians | 3,116,913 | Good |
| Reptiles | 4,970,602 | Good |
| Mammals | 4,984,256 | Good |
| Fish | 2,634,471 | Mixed |
| Molluscs | 3,747,166 | Mixed |
| Arachnids | 6,861,761 | Bad |
| Arthropods | 55,083,370 | Mixed |
| Ladybeetles | 1,161,306 | Good |
| Odonates | 3,661,940 | Good |
| Bombus | 1,071,262 | Good |
| Apis | 529,210 | Good |
| Plants | 93,270,169 | Mixed |
| Fungi | 14,831,766 | Bad |
| Protozoans | 369,490 | Bad |
| Other | 530,866 | ? |

The CV performance column is my subjective assessment of how the CV performs on each group; I pulled out a few arthropod subgroups as mentioned above (though I didn’t have an easy way to do butterflies).
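If anyone wants to pull similar counts themselves, here’s a minimal sketch against the public iNaturalist observations API. The iconic taxon names below, and reading total_results with per_page=1 as the cheapest way to get a count, are my assumptions; finer groups like Bombus or ladybeetles would need the taxon_id parameter instead.

```python
import requests

API = "https://api.inaturalist.org/v1/observations"
# Iconic taxon names are my assumption; the Explore groups map roughly to these.
ICONIC_TAXA = ["Aves", "Amphibia", "Reptilia", "Mammalia", "Actinopterygii",
               "Mollusca", "Arachnida", "Insecta", "Plantae", "Fungi", "Protozoa"]

for taxon in ICONIC_TAXA:
    # per_page=1 keeps the response tiny; we only read the total_results count.
    resp = requests.get(API, params={"iconic_taxa": taxon,
                                     "verifiable": "true",
                                     "per_page": 1})
    resp.raise_for_status()
    print(f"{taxon}: {resp.json()['total_results']:,}")
    # Finer groups (Bombus, ladybeetles, etc.) would use taxon_id=... instead.
```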

Plants and Arthropods make up about two-thirds of verifiable observations posted to iNat (93,270,169 + 55,083,370 ≈ 148 million of the 234.6 million total, roughly 63%), so I don’t think it’s accurate to say that

Both plants and arthropods have very mixed success in the CV, depending on which part of the tree of life you are in. That said, I would guess that the most commonly observed plant and arthropod taxa have reasonably decent CV accuracy. But Fungi (where the CV and human identifiers are both awful) has about as many observations as Reptiles, Mammals, and Amphibians combined (14.8 million vs. roughly 13.1 million).


Arthropods at least all have too many legs.

But iconic plants are:
- mosses and worts
- ferns
- pines and assorted other conifers
- and, if you have flowers or fruit, there are monocots (all the cereal-crop staff of life) and dicots (the fruit and veg we eat, the flowers we pick).

Oh yes, and seaweed, if it is red or green. But not brown.

Ahem. All lumped in ‘plants’ because I, as a ‘naturalist’, cannot possibly tell them apart, can I?!

However, the ‘trivial’ third, which is animals, is well spread across many icons. (And I whimper when I ID, as a planty person; it is SO MUCH EASIER to be able to say: that’s a beetle.)


If I stop IDing spiders on iNat it won’t be because I “ragequit”; it will be after deciding that my time and expertise are better used elsewhere. That’s not rage, that’s logic. I can’t speak for others.

I don’t understand the explanation that once the CV learns a few species, it can’t back off to genus. How often do I see “Branta” or “Anas” offered, for example? If the CV can do this for bird genera, why not for spiders? All Bassaniana species in North America are quite distinct in shape from all Xysticus.


The CV isn’t trained on observations that have a genus-level ID once any species in the genus is in the model. This means that if the only species in the model is atypical of the genus, it will have trouble recognizing the genus.

The CV can suggest broader taxa because it has some information about taxonomic hierarchies. But it is doing this using only the material in its training set (i.e., photos representing certain species, not the genus as a whole) – it doesn’t know the genus separately from the species within that genus. The CV is based on pattern recognition only; it doesn’t actually “know” what is in any of the photos it is trained on. To the model, a spider is merely a particular combination of shapes and colors against a larger background of other shapes and colors, so it can’t suggest IDs based on morphology except insofar as those differences show up as consistent features in particular sets of its training material.
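To make that concrete, here is a toy sketch (not iNaturalist’s actual code, and the species scores are invented) of how a classifier trained only on leaf taxa can still offer a genus suggestion by rolling species scores up the hierarchy, and why that genus suggestion is only as good as the few species the model happens to contain:

```python
# Toy illustration, not iNat's implementation: the model only outputs scores
# for the species (leaves) it was trained on; genus "suggestions" come from
# summing the scores of the child species it knows about.

# Invented scores for one photo of a typical Xysticus the model has never seen
species_scores = {
    "Bassaniana versicolor": 0.62,  # hypothetical: the crab spider it knows best
    "Xysticus ferox": 0.05,
    "Xysticus funestus": 0.03,
}

# The hierarchy the model can use: species -> genus
genus_of = {
    "Bassaniana versicolor": "Bassaniana",
    "Xysticus ferox": "Xysticus",
    "Xysticus funestus": "Xysticus",
}

genus_scores = {}
for species, score in species_scores.items():
    genus = genus_of[species]
    genus_scores[genus] = genus_scores.get(genus, 0.0) + score

print({g: round(s, 2) for g, s in genus_scores.items()})
# {'Bassaniana': 0.62, 'Xysticus': 0.08}
# Xysticus scores poorly not because the photo looks like Bassaniana to a human,
# but because none of the few Xysticus species in the model match well; the model
# has no concept of "Xysticus" beyond the leaves it was trained on.
```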

I wrote a longer explanation here: https://forum.inaturalist.org/t/allow-some-non-leaf-taxa-to-be-added-to-the-cv-model/63937/9


So basically this isn’t going to improve? Good to know, but I’m sorry for the observers who are going to continue to be misinformed by the CV about the ID of their spiders.


You can vote on the feature request(s) linked earlier in the thread.

I think the premise of the current training approach is that these problems will be self-correcting, given enough time and observations: once the model learns one species it might temporarily be unable to recognize other members of that genus, but eventually there will be enough observations of other species in the genus that they are included in the model too, and once it has several species it may also be able to recognize congeners outside the model based on visual relatedness. My experience with bees is that this only sort of works – for every improvement it feels like it unlearns something else – and I suspect it will be a very long time, if ever, before it manages to provide even somewhat accurate suggestions for the majority of bee observations. Which is why many of us have been suggesting that the way the model is trained needs to change.
