Computer vision: We love it, we hate it, we love it again. One of the challenges is that the CV model only covers taxa with enough verifiable observations and photos. The bigger a genus gets, the more likely it will have some species that haven’t yet been observed enough times for iNat’s CV to include them in its model.
It’s interesting when iNat gets to the point that all species in a genus are included in the model. In theory, if we validate the IDs of all those observations, then iNat’s identification competence should be as close to “as good as it gets”.
How big (speciose) can a genus be and yet CV still “knows” how to identify every species? Feel free to add your examples. I feel like we should have separate tallies for each taxon group, as CV is fairly comprehensive within bird and mammal genera, and very patchy for insects and spiders. Here are some starters:
[Confession: I just validated that the species with fewest observations in those genera are in the CV model. Based on that, I assumed all the others are in CV.]
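For anyone who wants to repeat that shortcut, here is a minimal sketch of the logic. The species names, observation counts, and the `in_cv` lookup below are made-up placeholders, not values pulled from iNat, and the helper names are hypothetical. The point is only to show how the shortcut differs from checking every species.

```python
def least_observed(obs_counts):
    """Return the species with the fewest observations."""
    return min(obs_counts, key=obs_counts.get)


def genus_looks_complete(obs_counts, in_cv):
    """Shortcut: only check whether the least-observed species is in CV."""
    return in_cv[least_observed(obs_counts)]


def genus_is_complete(in_cv):
    """Full check: every species in the genus must be in CV."""
    return all(in_cv.values())


# Placeholder data for a hypothetical three-species genus.
obs_counts = {"sp. A": 12000, "sp. B": 450, "sp. C": 95}
in_cv = {"sp. A": True, "sp. B": True, "sp. C": True}

print(least_observed(obs_counts))
print(genus_looks_complete(obs_counts, in_cv))
print(genus_is_complete(in_cv))
```

The shortcut and the full check can disagree if a species with more observations is missing from the model for some other reason, so the full check is the safer one when the list is short enough to go through.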
These two are particularly difficult groups, though. Do we have any information on how well the CV IDs them? Because while I personally know about most Larus species, I cannot say that I could ID them in most cases …
A new plant leader: Sarracenia (North American pitcher plants) have 11 accepted species in POWO, and all of them are recognised by Computer Vision. There are also lots of hybrids, not all of which are recognised yet, but it seems fair to count this as complete for full species.
Setophaga has 35 species with 3 below 60 obs, but two of those still have a long way to go. The island endemics really hinder the large genera for this.
Yeah, it’s kind of counterintuitive how our best examples of Larus, Calidris, and Graptemys are all taxa that are sort of infamous for being tough to ID. I think this actually makes some sense, as many congeners with many observations (and thus probably overlapping ranges) is a good recipe for ID headaches.
Thanks @ripleyrm! I guess I meant “all species accepted in iNat’s taxonomy”, but truthfully I had no idea that Pierfelice Ravenna had described Nemastylis selidandra from Texas. And, to be honest, I haven’t previously come across any reference to this species. Ravenna published this name in 2004 in his own journal, Onira (volume 9, issue 10, p 48). Onira is very difficult to find, so it’s going to be hard to check the protologue, but a 2024 paper on Ravenna’s taxonomic history tells us which specimen he based the name on.
“Texas, Comal Co., rocky hills along route 21, about 5 km (3 miles) N of Cibolo Creek,” Correll 32517, CTES, holotype; isotypes, Herbarium Lundell, TEX.
That specimen does still exist in the Lundell Herbarium, and is imaged here. It is the only GBIF occurrence assigned to that species. I’ll note that it was determined as Nemastylis geminiflora both by the collector in 1966, and by Peter Goldblatt in 1994, and that it was collected solidly within a core part of the N. geminiflora range. My strong suspicion is that POWO accepts the species solely because no-one in the past 22 years has formally synonymized it with N. geminiflora, and not because it is truly a separate species. But… it’s worth looking into!
Congrats! I knew there had to be some improvement on 4, but every time I thought of larger genera they turned out to have narrowly endemic species that don’t get enough observations to meet the CV threshold.
On the other hand, it needs to be a genus where all species are in locations that are easily accessible for most people. If we had a genus of two birds, one of which lives in downtown Sydney, Australia and is bright red, and the other lives in Utupua in the Solomon Islands and is bright blue, it won’t fulfil the requirements we are looking for, even if the Australian species has 10,000 observations. The Utupua species will never reach the number of observations and photos needed for inclusion in CV, because Utupua is very difficult to get to and almost no one goes there.
Meanwhile, every species in genus Graptemys is found in the lower 48, as is every species in genus Nemastylis. A sharp-eyed Yank with a phone but no passport could conceivably see every species in the genus.
Olrog’s Gull, the least observed species in genus Larus, is found in Montevideo and eastern Argentina.
The Spoon-billed Sandpiper, the least observed species in genus Calidris, is found throughout Southeast Asia, including Taiwan, HK, and Thailand.
Meanwhile, I guess Melanerpes woodpeckers are simply easier to see than Setophaga warblers for whatever reason, explaining the very rarely sighted endemics in Setophaga.
Here are some of the taxa with the most children. Most (46/60) are genera, and 6 are subgenera. All of the genera and subgenera have at least one species not in CV.
Yes, accessibility and charisma are big factors here. People like birds a lot and flowering plants quite a bit. If these species live relatively close to dense human populations, they will easily exceed the roughly 60-observation threshold to get included in CV. Apparently, enough people find turtles charismatic, too. ;-)
I’m curious whether turtle identifiers find that CV does a good job with Graptemys, and similarly for birders with Larus. It seems that these might provide useful anecdata to plot our journey toward ever-better IDs.
I’ve checked, and Larus (25 species) is the largest bird genus with every species covered by CV. Calidris (24) is a very close second. Melanerpes (23) is third, and the largest that isn’t a charadriiform. Fourth is Campylorhynchus (15), which is the largest that’s a passerine.
That’s a good one for plants, higher than any I thought to check.
Equisetum (horsetails) has 14 of 16 species (excluding hybrids) in the CV, and the missing two, E. diffusum (63 obs.) and E. xylochaetum (53 obs.), are very close to being included. If a taxon expert or two went through the Equisetum disagreements in the regions with those species, I’d wager they’d be included in the next CV update.
As of now though, close but no cigar.
(Aesculus is also pretty close: 11 of 12 species, only missing A. assamica with 34 obs.)
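To keep the per-group tallies straight, here is a rough sketch of how they could be tracked. The numbers are the ones posted in this thread so far. The `GenusTally` class, the `largest_complete` helper, and the group labels are my own invention for illustration, and nothing here queries iNat.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GenusTally:
    name: str
    group: str                      # informal taxon group label
    species_total: int              # full species only, hybrids excluded
    # Species not yet in CV, mapped to their observation counts.
    missing: Dict[str, int] = field(default_factory=dict)

    @property
    def covered(self) -> int:
        return self.species_total - len(self.missing)

    @property
    def complete(self) -> bool:
        return not self.missing


def largest_complete(tallies: List[GenusTally], group: str) -> Optional[GenusTally]:
    """Return the most speciose fully covered genus in a group, if any."""
    candidates = [t for t in tallies if t.group == group and t.complete]
    return max(candidates, key=lambda t: t.species_total, default=None)


# Figures as reported in this thread.
tallies = [
    GenusTally("Larus", "birds", 25),
    GenusTally("Calidris", "birds", 24),
    GenusTally("Melanerpes", "birds", 23),
    GenusTally("Campylorhynchus", "birds", 15),
    GenusTally("Sarracenia", "plants", 11),
    GenusTally("Equisetum", "plants", 16,
               {"E. diffusum": 63, "E. xylochaetum": 53}),
    GenusTally("Aesculus", "plants", 12, {"A. assamica": 34}),
]

for group in ("birds", "plants"):
    leader = largest_complete(tallies, group)
    print(group, leader.name if leader else "none yet")

for t in tallies:
    if not t.complete:
        print(f"{t.name}: {t.covered} of {t.species_total}, missing {t.missing}")
```

Adding a new example is just one more `GenusTally(...)` line, and the near misses stay visible alongside the complete genera.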