Largest genera with every species covered by CV?

Computer vision: We love it, we hate it, we love it again. One of the challenges is that the CV model only covers taxa with enough verifiable observations and photos. The bigger a genus gets, the more likely it will have some species that haven’t yet been observed enough times for iNat’s CV to include them in its model.

It’s interesting when iNat gets to the point that all species in a genus are included in the model. In theory, if we validate the IDs of all those observations, then iNat’s identification competence should be close to “as good as it gets”.

How big (speciose) can a genus be and yet CV still “knows” how to identify every species? Feel free to add your examples. I feel like we should have separate tallies for each taxon group, as CV is fairly comprehensive within bird and mammal genera, and very patchy for insects and spiders. Here are some starters:

Plants: Nemastylis (4 species)
Mammals: Canis (8 species)
Birds: Melanerpes woodpeckers (23 species)

[Confession: I just validated that the species with fewest observations in those genera are in the CV model. Based on that, I assumed all the others are in CV.]
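
As a rough illustration of that shortcut, here is a minimal Python sketch. It assumes you already have per-species observation counts for a genus and a set of the taxa currently in the CV model. The function names and the "Examplus" genus are hypothetical, and the numbers are made-up placeholders rather than real iNat data.

```python
def least_observed(obs_counts):
    """Return the species with the fewest observations."""
    return min(obs_counts, key=obs_counts.get)


def shortcut_check(obs_counts, cv_taxa):
    """The shortcut: test only the least-observed species."""
    return least_observed(obs_counts) in cv_taxa


def full_check(obs_counts, cv_taxa):
    """The thorough version: list every species missing from the CV set."""
    return sorted(s for s in obs_counts if s not in cv_taxa)


# Hypothetical genus with made-up counts (not real iNat data).
counts = {
    "Examplus alpha": 5200,
    "Examplus beta": 310,
    "Examplus gamma": 95,
}
# Hypothetical set of species assumed to be in the CV model.
in_cv = {"Examplus alpha", "Examplus gamma"}

print(shortcut_check(counts, in_cv))  # True
print(full_check(counts, in_cv))      # ['Examplus beta']
```

In this made-up case the two checks disagree, which is the risk of the shortcut: inclusion depends on more than the raw observation count, so a species with more observations can still be missing from the model.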

9 Likes

Calidris sandpipers have 24, all easily over the threshold

Edit: and Larus has 25, can’t believe that wasn’t the first one I checked!

3 Likes

Anas is very close: 30 species, 28 already in CV, and the other two species have 46 and 63 observations thus far

3 Likes

Anas bernieri has 27 obs, no?

1 Like

46 including casual records of zoo/captive animals, which the CV also trains on

4 Likes

These two are particularly difficult groups, though. Do we have any information on how well the CV IDs them? While I personally know about most Larus species, I cannot say that I could ID them in most cases …

3 Likes

Could someone remind me how to find a list of which species are covered by the CV?

2 Likes

Hello hello! :sun:

I believe this might be relevant (:

https://forum.inaturalist.org/t/where-can-i-find-the-latest-list-of-species-that-are-included-in-the-cv-system/70262

3 Likes

According to POWO there are 5 species of Nemastylis. Do you mean “all species” or “all species that iNat knows about”?

A new plant leader: Sarracenia (North American pitcher plants) have 11 accepted species in POWO, and all of them are recognised by Computer Vision. There are also lots of hybrids, not all of which are recognised yet, but it seems fair to count this as complete for full species.

6 Likes

I feel like it has to be a bird. Larus is probably the correct answer, but there may be a genus with more.

2 Likes

Setophaga has 35 species with 3 below 60 obs, but two of those still have a long way to go. The island endemics really hinder the large genera for this.

1 Like

For turtles, Graptemys has all 14. Interesting, since these can be tricky to ID for humans.

4 Likes

Yeah, it’s kind of counterintuitive how our best examples of Larus, Calidris, and Graptemys are all taxa that are sort of infamous for being tough to ID. I think this actually makes some sense, as many congeners with many observations (and thus probably overlapping ranges) is a good recipe for ID headaches.

1 Like

Thanks @ripleyrm! I guess I meant “all species accepted in iNat’s taxonomy”, but truthfully I had no idea that Pierfelice Ravenna had described Nemastylis selidandra from Texas. And, to be honest, I haven’t previously come across any reference to this species. Ravenna published this name in 2004 in his own journal, Onira (volume 9, issue 10, p 48). Onira is very difficult to find, so it’s going to be hard to check the protologue, but a 2024 paper on Ravenna’s taxonomic history tells us which specimen he based the name on.

“Texas, Comal Co., rocky hills along route 21, about 5 km (3 miles) N of Cibolo Creek,” Correll 32517, CTES, holotype; isotypes, Herbarium Lundell, TEX.

That specimen does still exist in the Lundell Herbarium, and is imaged here. It is the only GBIF occurrence assigned to that species. I’ll note that it was determined as Nemastylis geminiflora both by the collector in 1966, and by Peter Goldblatt in 1994, and that it was collected solidly within a core part of the N. geminiflora range. My strong suspicion is that POWO accepts the species solely because no-one in the past 22 years has formally synonymized it with N. geminiflora, and not because it is truly a separate species. But… it’s worth looking into!

Congrats! I knew there had to be some improvement on 4, but every time I thought of larger genera they turned out to have narrowly endemic species that don’t get enough observations to meet the CV threshold.

4 Likes

On the other hand, it needs to be a genus where all species are in locations that are easily accessible for most people. Suppose we had a genus of two birds: one is bright red and lives in downtown Sydney, Australia, and the other is bright blue and lives on Utupua in the Solomon Islands. That genus won’t fulfil the requirements we are looking for, even if the Australian species has 10,000 observations. Utupua is very difficult to get to and almost no one goes there, so the Utupua species will never reach the number of observations and photos needed to be included in CV.

Meanwhile, every species in genus Graptemys is found in the lower 48, as is every species in genus Nemastylis. A sharp-eyed Yank with a phone but no passport could conceivably see every species in the genus.

Olrog’s Gull, the least observed species in genus Larus, is found in Montevideo and eastern Argentina.

The Spoon-billed Sandpiper, the least observed species in genus Calidris, is found throughout Southeast Asia, including Taiwan, HK, and Thailand.

Meanwhile, I guess Melanerpes woodpeckers are simply easier to see than Setophaga warblers for whatever reason, explaining the very rarely sighted endemics in Setophaga.

2 Likes

Here are some of the taxa with the most children. Most (46/60) are genera, and 6 are subgenera. All of the genera and subgenera have at least one species not in CV.

3 Likes

Yes, accessibility and charisma are big factors here. People like birds a lot and flowering plants quite a bit. If these species live relatively close to dense human populations, they will easily exceed the roughly 60-observation threshold to get included in CV. Apparently, enough people find turtles charismatic, too. ;-)

I’m curious whether turtle identifiers find that CV does a good job with Graptemys, and similarly for birders with Larus. It seems that these might provide useful anecdata to plot our journey toward ever-better IDs.

2 Likes

I’ve checked, and Larus (25 species) is the largest bird genus with every species covered by CV. Calidris (24) is a very close second. Melanerpes (23) is third, and the largest that isn’t a charadriiform. Fourth is Campylorhynchus (15), which is the largest that’s a passerine.

Edit: Fixed spelling of Calidris.

3 Likes

That’s a good one for plants, higher than any I thought to check.

Equisetum (horsetails) has 14 of 16 species (excluding hybrids) in the CV, and the missing two, E. diffusum (63 obs.) and E. xylochaetum (53 obs.), are very close to being included. If a taxon expert or two went through the Equisetum disagreements in the regions with those species, I’d wager they’d be included in the next CV update.

As of now though, close but no cigar.

(Aesculus is also pretty close: 11 of 12 species, only missing A. assamica with 34 obs.)

2 Likes