New Computer Vision Model Released

Yeah, we’re interested in this as well. As I mentioned above, once I wrap up my two current projects I plan to do some evaluation in this area.

4 Likes

My concern with doing it this way is that if the CV decides not to include a particular taxon in suggestions, but the user accepts whatever its new guess is, that guess may now be even less likely to be correct than the original guess would have been. And having predictable mis-IDs is a big advantage for people trying to clean up a taxon to get new taxa into the CV. I am currently looking at a taxon that was previously not in the CV and was more or less 100% misidentified as another species, which made it easy to clean up last fall to get it into the new CV model. Spot checking a few observations now, the CV’s first suggestion appears to have jumped to basically 100% accuracy at splitting the two species. I suspect the jump in accuracy would have been much harder to achieve so cleanly if the misidentifications had been less predictable.

1 Like

I’ve only said that I plan to do evaluation because I’m unsure what approach makes sense at this point - I think we have to account for things like the amount of training data (might more training data solve this?), accuracy at a higher rank if a taxon were excluded (would an intervention even make things better?), whether it makes sense to train as usual and exclude post-hoc, or whether it makes more sense to exclude target taxa before training, etc.
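
For the post-hoc option, the rough shape at inference time would be something like this - a minimal sketch with made-up names, not our actual pipeline:

```python
# Hypothetical sketch of post-hoc exclusion: train the model on all taxa,
# then drop excluded taxa from its output and renormalize the scores.
# None of these names come from the real iNaturalist codebase.

def exclude_post_hoc(scores: dict[str, float], excluded: set[str]) -> dict[str, float]:
    """Remove excluded taxa from a {taxon: score} dict and renormalize."""
    kept = {taxon: s for taxon, s in scores.items() if taxon not in excluded}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {taxon: s / total for taxon, s in kept.items()}

# Example: if "Species X" is excluded, its probability mass is
# redistributed over the remaining suggestions.
raw = {"Species X": 0.5, "Species Y": 0.25, "Genus Z": 0.25}
print(exclude_post_hoc(raw, excluded={"Species X"}))
# {'Species Y': 0.5, 'Genus Z': 0.5}
```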

However, examining what taxa are included in the model is a priority.

2 Likes

I’d second this. The huge majority of new gall taxa we’ve been able to push across the training threshold in the past two years have come not from people tagging their observations in gall projects etc., but just from looking at the observations that had been placed in the few gall taxa the CV knew in the prior round. Having them grouped in only a few places has made it much easier to correct them manually, whereas a more cautious or “correct” CV guess might have made them hard to find.

5 Likes

Makes sense, and the ducks are a good example. I’ve wondered whether there could be a good way to have, say, ‘genus y except species x’ as a leaf in the CV, i.e. for a genus where 99% of the observations are one species and few or none of the other species have enough observations to be included on their own, but in aggregate they have enough to be a leaf.

I suppose from an interface perspective it might still be simplest to display it as just ‘genus y’ in the suggestions, but perhaps if the CV thinks it has significantly higher weight than ‘species x’ it could inhibit the species x suggestion from appearing, or penalize it in the ranking, or something.
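
Something like this, roughly - all names and thresholds here are invented, just to illustrate the idea:

```python
# Hypothetical sketch of the "genus y except species x" idea: if the
# aggregate leaf scores well above species x, suppress the species-level
# suggestion and display the leaf as the plain genus. Names and the
# margin threshold are made up for illustration.

def apply_aggregate_leaf(scores: dict[str, float],
                         aggregate_leaf: str,
                         species: str,
                         display_name: str,
                         margin: float = 2.0) -> dict[str, float]:
    """If the aggregate leaf outweighs the species by `margin`x,
    drop the species suggestion and relabel the leaf as the genus."""
    out = dict(scores)
    agg = out.pop(aggregate_leaf, 0.0)
    if agg > margin * out.get(species, 0.0):
        out.pop(species, None)          # inhibit the species-x suggestion
    out[display_name] = out.get(display_name, 0.0) + agg  # show as plain genus
    return out

scores = {"genus y except species x": 0.5, "species x": 0.1, "species w": 0.4}
print(apply_aggregate_leaf(scores, "genus y except species x",
                           "species x", "genus y"))
# {'species w': 0.4, 'genus y': 0.5}
```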

2 Likes

I would love for this to happen!

I’m confused by the

“We are pretty sure it is X”

followed by a best suggestion of Y, which is something quite different.

1 Like

After reading this thread I agree with marina_gorbunova that most of the problems are now human error. I think the quality of research grade observations for plants would be greatly improved by requiring three human identifications instead of just two. What I see in the majority of new-user observations is the user agreeing with the CV suggestion, followed by agreement with the first experienced identifier if that ID differs from the CV suggestion. That means in reality you only have one experienced/expert ID. As someone who does IDs for a few taxa, I know I can make mistakes, and I know that experts make mistakes. Requiring two people who are actually familiar with the species to confirm the ID would improve the training set, in my opinion.
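
A toy sketch of the proposed rule, assuming identifications arrive as simple (user, taxon) pairs - this is just to illustrate the suggestion, not how iNaturalist actually computes research grade:

```python
# Toy sketch of the proposed three-identifier rule: an observation only
# reaches research grade once three distinct humans agree on the same taxon.
# This illustrates the suggestion above, not iNaturalist's actual logic.
from collections import Counter

def reaches_research_grade(human_ids: list[tuple[str, str]],
                           required: int = 3) -> bool:
    """human_ids is a list of (user, taxon) pairs; CV suggestions excluded."""
    # Count each user at most once per taxon, so repeats don't stack.
    votes = Counter(taxon for user, taxon in set(human_ids))
    return any(count >= required for count in votes.values())

ids = [("observer", "Quercus alba"),      # observer agreeing with the CV
       ("identifier1", "Quercus alba"),
       ("identifier2", "Quercus alba")]
print(reaches_research_grade(ids))      # True: three humans, same taxon
print(reaches_research_grade(ids[:2]))  # False under the proposed rule
```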

1 Like

I see aggressive use of location still isn’t being implemented. This alone could easily cut 50-75% of misidentifications in the marine invertebrates I’m interested in. The AI basically never gets sponges correct, and frankly, the new AI seems slightly worse at it, if anything. But what is really bewildering is that it also seems to be colorblind (both the new and old AI). It suggests Aplysina fistularis all the time, including for photos of A. archeri. These species have pretty reliable coloration, but the AI can’t see purple or something, I guess.
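
By aggressive I mean something like this - a hypothetical sketch where suggestions for taxa never recorded nearby are removed outright rather than merely ranked lower (the nearby_taxa lookup is assumed, not a real iNaturalist call):

```python
# Hypothetical sketch of aggressive location filtering: drop any CV
# suggestion for a taxon with no records near the observation. The
# nearby_taxa set is assumed to come from some range/occurrence lookup.

def filter_by_location(suggestions: list[tuple[str, float]],
                       nearby_taxa: set[str]) -> list[tuple[str, float]]:
    """Keep only suggestions whose taxon has been recorded nearby."""
    return [(taxon, score) for taxon, score in suggestions
            if taxon in nearby_taxa]

# Invented example: if Aplysina fistularis had no records at this spot,
# an aggressive filter would remove it entirely instead of showing it
# as the top suggestion.
suggestions = [("Aplysina fistularis", 0.7), ("Aplysina archeri", 0.3)]
print(filter_by_location(suggestions, nearby_taxa={"Aplysina archeri"}))
# [('Aplysina archeri', 0.3)]
```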

Hi Diana,

Can you provide a more concrete example, or better yet a screenshot?

I see this:

2 Likes

Slightly different: iNat is not sure, but suggests the (splodge of red) is Haemanthus - a bulb, not a shrub with berries.
https://www.inaturalist.org/observations/111514801
The obs is probably https://www.inaturalist.org/taxa/334567-Grewia-occidentalis, which is already Included.

1 Like

What she was referring to is when the CV is “pretty sure this is” a certain genus, but the top (and sometimes only) species suggested is not in that genus and often not even closely related. It was more common with one of the older models, but still happens occasionally. I’m unable to find a relevant example, however.

2 Likes

I also just checked a bunch to try to find an example…
I have come across the same issue from time to time, but couldn’t find one at present.

1 Like

Sounds like a bug, but I’ll need a case that I can reproduce or at least a screenshot to start debugging it.

1 Like

I know the CV is not perfect, but it is decent. I was wondering, though, if it might be possible to add a ‘Consider this’ prompt? I have no idea how hard this would be, but I find a number of Noctuid species are frequently identified as one species when they are actually another - for example, Noctua pronuba identified as Peridroma saucia. A ‘Consider’ prompt might help people look more closely and learn something in the process. I know that BugGuide often includes that information, but I recognise that it has a different way of doing things (i.e. it’s not CV). I’ve been thinking about this for a month or so.

On the web, on the taxon page, we have a Similar Species tab, and sure enough the similar species for Noctua pronuba include Peridroma saucia. This isn’t produced from our vision model; instead it’s computed by counting observations that have conflicting identifications from different users.
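
Assuming the conflicting identifications are available as per-observation sets of taxa, that counting looks roughly like this (a sketch, not our actual implementation):

```python
# Minimal sketch of the similar-species counting described above: for each
# observation, any pair of distinct taxa that users have identified it as
# counts as one "confusion" between those taxa. The data layout is assumed.
from collections import Counter
from itertools import combinations

def similar_species(observations: list[set[str]]) -> Counter:
    """observations: each element is the set of taxa users have ID'd a
    single observation as. Returns counts per conflicting taxon pair."""
    pairs = Counter()
    for taxa in observations:
        for a, b in combinations(sorted(taxa), 2):
            pairs[(a, b)] += 1
    return pairs

obs = [{"Noctua pronuba", "Peridroma saucia"},
       {"Noctua pronuba", "Peridroma saucia"},
       {"Noctua pronuba", "Noctua comes"}]
print(similar_species(obs).most_common(1))
# [(('Noctua pronuba', 'Peridroma saucia'), 2)]
```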

Curiously, when I look at our newest vision model with regard to Noctua pronuba and Peridroma saucia, on our test set the model is 90%+ accurate on both and doesn’t mistake either for the other. Test set mistakes for Peridroma saucia include Atypoides riversi, Euxoa comosa, Euxoa messoria, Euxoa ochrogaster, and Spodoptera exigua, while test set mistakes for Noctua pronuba include Abagrotis alternata, Agrotis segetum, Dargida procinctus, Euxoa auxiliaris, Euxoa tessellata, Noctua comes, Pantydia sparsa and Proxenus miranda.

Please note that our CV test set contains at most 100 observations per taxon, so there is significant sampling bias at work here - don’t read too much into these results - it’s just fun to see how the model makes mistakes!
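
For anyone curious, this kind of per-taxon mistake listing is easy to reproduce on any labeled test set - a toy sketch with invented numbers:

```python
# Toy sketch of the test-set check described above: tally what the model
# predicts for each true taxon and report the off-diagonal mistakes.
# The (true, predicted) pairs below are invented for illustration.
from collections import Counter, defaultdict

def mistakes_by_taxon(results: list[tuple[str, str]]) -> dict[str, Counter]:
    """results: (true_taxon, predicted_taxon) pairs from a test set."""
    confusions: dict[str, Counter] = defaultdict(Counter)
    for true, pred in results:
        if pred != true:
            confusions[true][pred] += 1
    return confusions

# Made-up results for a taxon with the 100-observation test-set cap.
results = [("Peridroma saucia", "Peridroma saucia")] * 95 + \
          [("Peridroma saucia", "Euxoa messoria")] * 3 + \
          [("Peridroma saucia", "Spodoptera exigua")] * 2
print(mistakes_by_taxon(results)["Peridroma saucia"])
# Counter({'Euxoa messoria': 3, 'Spodoptera exigua': 2})
```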

6 Likes

I had forgotten that tab on the taxon page. I’ve even used it!
It is fun to see the mistakes made. A lot of those suggestions do not resemble the target species. In fairness to the model, though, those two moths are extremely variable, making its job a lot harder!
Oh, and I’ve seen some wildly inaccurate identifications made by the Cornell Birds AI. Life is never simple!

Here is one for @alex. https://www.inaturalist.org/observations/111625786
Pretty sure it is Santolina. Yes - and it’s the common garden herb, which is already Included.
Followed by 2 Helichrysum and an Athanasia?
Okay - solved that one. If I go to Not Seen Nearby, it is the first and correct suggestion.

4 Likes

(screenshot)

6 Likes

Another https://www.inaturalist.org/observations/111796456
Pretty sure it’s Lewinskya (that is a moss!). But that genus is Northern Hemisphere, so it’s wrong.
Nearby suggestions are Wahlenbergia or Agathosma. The obs is a herbaceous perennial or small shrub.