I was inspired by some of the comments on the iNat blog post on their new Anomaly Detector to work up some numbers on the percent of Needs ID plant observations currently above Family level vs. Family level and below.
These numbers are for Verifiable plant observations in New England as of 5 October 2024. Currently, there are 2,779,871 such observations. Of those, 1,624,603 are Research Grade (58%) and 1,155,279 are Needs ID (42%).
Of the Needs ID observations, 111,515 (10%) are at Epifamily level and above; 1,043,756 are at Family level and below (90%).
Having 10% of the Needs ID plant observations in the Kingdom-to-Epifamily range seems quite reasonable to me (especially given the number of student observations I’ve been forced to leave at Vascular Plants in the past couple of weeks!). But New England has a very active and competent community of plant identifiers, so maybe the region is anomalous compared to the rest of the world.
So, I ran the numbers for ALL plant observations worldwide. That’s 85,890,668 Verifiable observations: 51,592,544 Research Grade (60%) and 34,298,135 Needs ID (40%).
Of the Needs ID observations, 3,484,198 (10%) are currently at Kingdom through Epifamily; 30,813,944 (90%) are currently at Family level and below. Again, that ratio seems quite reasonable to me.
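For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes the Kingdom-to-Epifamily share for both regions from the counts quoted above. The names `share`, `counts`, `coarse`, and `fine` are my own labels, not anything from iNaturalist. I use the sum of the two rank bands as the denominator; that sum differs from the stated Needs ID totals by a handful of observations, presumably because the counts shifted slightly between searches.

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

# Needs ID plant observations, split by rank band (counts from the post).
# "coarse" = Kingdom through Epifamily; "fine" = Family level and below.
counts = {
    "New England": {"coarse": 111_515, "fine": 1_043_756},
    "Worldwide": {"coarse": 3_484_198, "fine": 30_813_944},
}

shares = {}
for region, c in counts.items():
    needs_id = c["coarse"] + c["fine"]  # denominator: both bands combined
    shares[region] = share(c["coarse"], needs_id)
    print(f"{region}: {shares[region]:.1f}% at Kingdom through Epifamily")
```

Both regions come out at roughly 10% when rounded to a whole percent, which matches the figures above.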
But, of course, the real question is not where the already identified-to-some-point plant observations lie in the classification; it’s what to do with Unknown observations. Unknown observations are a small percentage of all Needs ID observations, but still important, to some degree. I don’t know of a way to estimate what percent of Unknown observations are of plants - @pisum, @jeanphilippeb, can either of you help with this?
My guess is that around 60% of Unknowns are plants, but I could be wildly off. For that matter, even if ALL Unknowns that are plants get identified to somewhere in the Kingdom-to-Epifamily range, would it really make that much of a difference? Currently, there are 684,574 Verifiable Unknowns worldwide; if all of those are plants and all get an initial ID of Epifamily or above, that changes the numbers to 4,168,772 out of 34,982,709, or 12%.
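Here is a rough sketch of that scenario in Python, so the assumption about what fraction of Unknowns are plants can be varied. The function name `coarse_share_with_unknowns` and the `plant_fraction` parameter are hypothetical, just for illustration; the 0.6 value is only my guess from above, not a measured figure. It assumes every plant Unknown lands in the Kingdom-to-Epifamily range and is added to both the numerator and the denominator.

```python
def coarse_share_with_unknowns(coarse, needs_id, unknowns, plant_fraction):
    """Percent of Needs ID plants at Kingdom-to-Epifamily if the given
    fraction of Unknowns are plants and all receive a coarse first ID."""
    added = unknowns * plant_fraction
    return 100 * (coarse + added) / (needs_id + added)

# Worldwide counts quoted in the post.
COARSE = 3_484_198      # Needs ID plants at Kingdom through Epifamily
NEEDS_ID = 34_298_135   # all Needs ID plant observations
UNKNOWNS = 684_574      # Verifiable Unknowns

# Worst case: every Unknown is a plant.
worst_case = coarse_share_with_unknowns(COARSE, NEEDS_ID, UNKNOWNS, 1.0)
# Guess (an assumption, not data): about 60% of Unknowns are plants.
guess = coarse_share_with_unknowns(COARSE, NEEDS_ID, UNKNOWNS, 0.6)

print(f"All Unknowns are plants: {worst_case:.1f}%")
print(f"60% of Unknowns are plants: {guess:.1f}%")
```

The worst case rounds to 12%, and any smaller plant fraction gives a result between the current 10% and that ceiling.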
Assuming I’ve done the numbers correctly, 10% or 12% hardly seems worth all the fuss. Or am I missing something?