Does anyone else get bothered by how many observations are marked as "unknown species"?

lynnharper · December 22, 2022, 1:46pm

Yes, they are independent. I ask because if I filter for Unknowns that are Verifiable (https://www.inaturalist.org/observations?iconic_taxa=unknown&identified=false&place_id=any&quality_grade=needs_id), right now there are only 336,868 such observations worldwide, much less than 1% of all Verifiable observations (currently 123,397,480).

Thank you for all your work pulling together these numbers!

lynnharper · December 22, 2022, 1:53pm

Yeah, my guess is that somewhere up to 5% of Verifiable observations are submitted as Unknowns and it’s only the valiant efforts of identifiers like you that keep the percent of Unknowns down so low when measured at any given moment.

matthewvosper · December 22, 2022, 2:04pm

Interesting analysis: I’ve done the same for the needs ID pile:

Rank	Number	%	URL
NeedsID	46932224	100.00	https://www.inaturalist.org/observations?place_id=any&quality_grade=needs_id&subview=map
Unknown	336238	0.72	by subtraction
StateofMatterLife	53888	0.11	https://www.inaturalist.org/observations?rank=stateofmatter&place_id=any&quality_grade=needs_id&subview=map
Kingdom	1357606	2.89	https://www.inaturalist.org/observations?rank=kingdom&place_id=any&quality_grade=needs_id&subview=map
Phylum	480917	1.02	https://www.inaturalist.org/observations?rank=phylum&place_id=any&quality_grade=needs_id&subview=map
Subphylum	444831	0.95	https://www.inaturalist.org/observations?rank=subphylum&place_id=any&quality_grade=needs_id&subview=map
Superclass	85	0.00	https://www.inaturalist.org/observations?rank=superclass&place_id=any&quality_grade=needs_id&subview=map
Class	1632305	3.48	https://www.inaturalist.org/observations?rank=class&place_id=any&quality_grade=needs_id&subview=map
Subclass	136544	0.29	https://www.inaturalist.org/observations?rank=subclass&place_id=any&quality_grade=needs_id&subview=map
Infraclass	11491	0.02	https://www.inaturalist.org/observations?rank=infraclass&place_id=any&quality_grade=needs_id&subview=map
Subterclass	3621	0.01	https://www.inaturalist.org/observations?rank=subterclass&place_id=any&quality_grade=needs_id&subview=map
Superorder	32458	0.07	https://www.inaturalist.org/observations?rank=superorder&place_id=any&quality_grade=needs_id&subview=map
Order	1866068	3.98	https://www.inaturalist.org/observations?rank=order&place_id=any&quality_grade=needs_id&subview=map
Suborder	320111	0.68	https://www.inaturalist.org/observations?rank=suborder&place_id=any&quality_grade=needs_id&subview=map
Infraorder	191969	0.41	https://www.inaturalist.org/observations?rank=infraorder&place_id=any&quality_grade=needs_id&subview=map
Parvorder	1974	0.00	https://www.inaturalist.org/observations?rank=parvorder&place_id=any&quality_grade=needs_id&subview=map
Zoosection	14312	0.03	https://www.inaturalist.org/observations?rank=zoosection&place_id=any&quality_grade=needs_id&subview=map
Zoosubsection	47679	0.10	https://www.inaturalist.org/observations?rank=zoosubsection&place_id=any&quality_grade=needs_id&subview=map
Superfamily	495222	1.06	https://www.inaturalist.org/observations?rank=superfamily&place_id=any&quality_grade=needs_id&subview=map
Epifamily	55445	0.12	https://www.inaturalist.org/observations?rank=epifamily&place_id=any&quality_grade=needs_id&subview=map
Family	3859573	8.22	https://www.inaturalist.org/observations?rank=family&place_id=any&quality_grade=needs_id&subview=map
Subfamily	1368677	2.92	https://www.inaturalist.org/observations?rank=subfamily&place_id=any&quality_grade=needs_id&subview=map
Supertribe	1361	0.00	https://www.inaturalist.org/observations?rank=supertribe&place_id=any&quality_grade=needs_id&subview=map
Tribe	954707	2.03	https://www.inaturalist.org/observations?rank=tribe&place_id=any&quality_grade=needs_id&subview=map
Subtribe	186369	0.40	https://www.inaturalist.org/observations?rank=subtribe&place_id=any&quality_grade=needs_id&subview=map
Genus	14303596	30.48	https://www.inaturalist.org/observations?rank=genus&place_id=any&quality_grade=needs_id&subview=map
Genushybrid	388	0.00	https://www.inaturalist.org/observations?rank=genushybrid&place_id=any&quality_grade=needs_id&subview=map
Subgenus	392732	0.84	https://www.inaturalist.org/observations?rank=subgenus&place_id=any&quality_grade=needs_id&subview=map
Section	204810	0.44	https://www.inaturalist.org/observations?rank=section&place_id=any&quality_grade=needs_id&subview=map
Subsection	16550	0.04	https://www.inaturalist.org/observations?rank=subsection&place_id=any&quality_grade=needs_id&subview=map
Complex	258240	0.55	https://www.inaturalist.org/observations?rank=complex&place_id=any&quality_grade=needs_id&subview=map
Species	17667239	37.64	https://www.inaturalist.org/observations?rank=species&place_id=any&quality_grade=needs_id&subview=map
Hybrid	55852	0.12	https://www.inaturalist.org/observations?rank=hybrid&place_id=any&quality_grade=needs_id&subview=map
Subspecies	119538	0.25	https://www.inaturalist.org/observations?rank=subspecies&place_id=any&quality_grade=needs_id&subview=map
Variety	57306	0.12	https://www.inaturalist.org/observations?rank=variety&place_id=any&quality_grade=needs_id&subview=map
Form	2449	0.01	https://www.inaturalist.org/observations?rank=form&place_id=any&quality_grade=needs_id&subview=map
Infrahybrid	73	0.00	https://www.inaturalist.org/observations?rank=infrahybrid&place_id=any&quality_grade=needs_id&subview=map

No surprise that the ‘peaks’ are at the major taxonomic ranks. Interesting that >2/3 of the needsID pile is either at Genus or Species. Testament to the work of the ‘higher order improvers’ and also encouragement to those working at the finer end of the scale that it’s a good place for them to focus. I’m surprised that Family is only 8%, and Order only 4%.

If unknowns are coming in circa 5% then the fact that they are well below 1% of the pile is a great achievement.
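For anyone wondering how the "by subtraction" row works: a minimal sketch, assuming it means the Needs ID total minus the sum of every ranked row. The function names here are only illustrative; the counts are copied from the table above, and this reading does reproduce the 336238 shown there.

```python
# Needs ID counts per rank, copied from the table above (22 December 2022).
NEEDS_ID_TOTAL = 46932224
RANK_COUNTS = {
    "stateofmatter": 53888, "kingdom": 1357606, "phylum": 480917,
    "subphylum": 444831, "superclass": 85, "class": 1632305,
    "subclass": 136544, "infraclass": 11491, "subterclass": 3621,
    "superorder": 32458, "order": 1866068, "suborder": 320111,
    "infraorder": 191969, "parvorder": 1974, "zoosection": 14312,
    "zoosubsection": 47679, "superfamily": 495222, "epifamily": 55445,
    "family": 3859573, "subfamily": 1368677, "supertribe": 1361,
    "tribe": 954707, "subtribe": 186369, "genus": 14303596,
    "genushybrid": 388, "subgenus": 392732, "section": 204810,
    "subsection": 16550, "complex": 258240, "species": 17667239,
    "hybrid": 55852, "subspecies": 119538, "variety": 57306,
    "form": 2449, "infrahybrid": 73,
}


def unknown_by_subtraction(total, rank_counts):
    """Observations with no rank at all: the total minus every ranked row."""
    return total - sum(rank_counts.values())


def share(count, total):
    """Percentage of the pile, rounded to two decimals as in the table."""
    return round(100 * count / total, 2)


unknown = unknown_by_subtraction(NEEDS_ID_TOTAL, RANK_COUNTS)
genus_or_species = RANK_COUNTS["genus"] + RANK_COUNTS["species"]

print(unknown, share(unknown, NEEDS_ID_TOTAL))    # 336238 0.72
print(share(genus_or_species, NEEDS_ID_TOTAL))    # 68.12
```

The second figure is the ">2/3 at Genus or Species" share mentioned above.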

jeanphilippeb · December 22, 2022, 6:54pm

I don’t understand. “Needs ID” depends on the ID rank (if above rank species, an observation will always need an ID, if not casual), on the “casual” status and on the “research grade” status. So the rationale is not simple when plotting “Needs ID” rank by rank, because “Needs ID” and rank are not independent by definition. But I don’t know what you are looking for, or evaluating.

All “unknowns” (2.5 % of all observations) should be “needs ID”.
Update: only all those not “casual” are “needs ID”.

I also don’t understand the subtraction. What is this subtraction?
I think any subtraction is risky, because we could miss a subtlety. A direct request for counting the results would be better. (Then we can sum the counts for consistency checking.)

I think that:

For Species, Hybrid, Subspecies, Variety, Form, Infrahybrid, I would consider the percentages of observations (at the rank considered) that are “casual”, that “need ID” and that are “research grade”.
For all other ranks, and for “unknown” observations, only 2 of the 3 categories: “casual” and “needs ID”.

For the rank species:

https://api.inaturalist.org/v1/observations?rank=species
99272604 observations at rank Species
100 %

https://api.inaturalist.org/v1/observations?rank=species&quality_grade=needs_id
17668781 “Needs ID”
17.80 %

https://api.inaturalist.org/v1/observations?rank=species&quality_grade=research
73473121 “Research grade”
74.01 %

https://api.inaturalist.org/v1/observations?rank=species&quality_grade=casual
8130732 “Casual”
8.19 %

Consistency checking:
17.80 % + 74.01 % + 8.19 % = 100 %

For the rank Family:

https://api.inaturalist.org/v1/observations?rank=family
4203699 observations at rank Family
100 %

https://api.inaturalist.org/v1/observations?rank=family&quality_grade=needs_id
3860377 “Needs ID”
91.83 %

https://api.inaturalist.org/v1/observations?rank=family&quality_grade=research
32 “Research grade”!? Bug or intentional?
Looking at 3 of them, they have “Maverick” with only 3 IDs (2 + 1, not 3 + 1):
https://www.inaturalist.org/observations/136149727
https://www.inaturalist.org/observations/134781504
https://www.inaturalist.org/observations/133800602

https://api.inaturalist.org/v1/observations?rank=family&quality_grade=casual
343289 “Casual”
8.17 %

Consistency checking:
91.83 % + 0 % + 8.17 % = 100 %

For “unknown” observations:

https://api.inaturalist.org/v1/observations?identified=false
3472426 observations without identification
100 %

https://api.inaturalist.org/v1/observations?identified=false&quality_grade=needs_id
338361 “Needs ID”
9.74 %

https://api.inaturalist.org/v1/observations?identified=false&quality_grade=research
2 “Research grade”!? Bug!
Still stranger… For both the “user was suspended”:
https://www.inaturalist.org/observations/42192459
https://www.inaturalist.org/observations/13509863

https://api.inaturalist.org/v1/observations?identified=false&quality_grade=casual
3134069 “Casual”
90.26 %

Consistency checking:
9.74 % + 0 % + 90.26 % = 100 %
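The per-grade percentages and the consistency check above can be sketched in a few lines. This is only an illustration: `count_url`, `fetch_count` and `grade_shares` are names made up here, `total_results` is the count field in the API's JSON response, and passing `per_page=0` to keep the response small is an assumption worth verifying. The species counts are the ones quoted above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.inaturalist.org/v1/observations"


def count_url(**filters):
    """Build an API URL for the given filters, e.g. rank='species'."""
    return f"{API}?{urlencode(filters)}"


def fetch_count(**filters):
    """Return the number of matching observations (needs network access).

    per_page=0 is an assumption, used only to avoid downloading results.
    """
    with urlopen(count_url(per_page=0, **filters)) as resp:
        return json.load(resp)["total_results"]


def grade_shares(total, by_grade):
    """Percentage of `total` in each quality grade, rounded to 2 decimals."""
    return {grade: round(100 * n / total, 2) for grade, n in by_grade.items()}


# Counts for rank=species as quoted above (not fetched live here).
species_total = 99272604
species = {"needs_id": 17668781, "research": 73473121, "casual": 8130732}

shares = grade_shares(species_total, species)
drift = sum(species.values()) - species_total

print(shares)   # {'needs_id': 17.8, 'research': 74.01, 'casual': 8.19}
print(drift)    # 30
```

The three counts sum to 30 more than the separately requested total, presumably because the requests hit a live database a few moments apart; the percentages still add up to 100 % at two decimals.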

jeanphilippeb · December 22, 2022, 6:57pm

I am surprised to see that more than 90% of the “unknowns” are “casual”.

This suggests that:

It is important to label casual observations as such.
Most of the work for identifying the “unknowns” is directed toward those that are not “casual” (those that “need ID”).
The worst case for an observation of poor value is a casual observation left by the observer without ID.

fffffffff · December 22, 2022, 7:17pm

I think it’s normal. If an observation lacks some data, the user is more likely to skip the ID too; new users like uploading without any ID; and many identifiers mark cultivated plants without identifying them. It’s not optimal, but when one group of school kids uploads a thousand planted petunias, it’s fine not to ID every duplicate.

matthewvosper · December 22, 2022, 8:15pm

Needs ID is everything that comes up by default in the identify view. It’s what most people base their identifying effort on. It includes everything that’s not Research Grade or Casual, so it doesn’t depend on rank very much: any rank can be casual, and any rank below Family can become Research Grade. It’s interesting to see unknowns as a proportion of all observations, but I think it’s also helpful to see the proportion of unknowns as a fraction of ‘what we have left to do’.

Actually searching for unknowns can miss a subtlety because it actually includes everything outside of an iconic taxon (inc. identified bacteria/viruses etc). Although, to be fair what I should have done is use identified=false like you did earlier: https://www.inaturalist.org/observations?place_id=any&quality_grade=needs_id&subview=map&identified=false (currently 336049 so someone’s done a few today :-D )

jeanphilippeb · December 25, 2022, 8:07pm

I don’t know if it is still useful, but I am making it available.
I can regenerate it later, or with an additional filter (for instance, only >5 year old observations), without additional effort (it is generated by software).

Each cell shows:

Number of observations (obtained from the API).
Percent relative to the row.
Global percent.

	Casual	Needs ID	Res. Grade	TOTAL
No ID	3,135,380 90.423 % 2.2545 %	332,047 9.5762 % 0.2388 %	2 0.0001 % 0.0000 %	3,467,429 100.00 % 2.4932 %
Stateofmatter	38,641 41.771 % 0.0278 %	53,864 58.228 % 0.0387 %		92,505 100.00 % 0.0665 %
Kingdom	204,408 13.081 % 0.1470 %	1,358,178 86.918 % 0.9766 %		1,562,586 100.00 % 1.1236 %
Phylum	56,438 10.481 % 0.0406 %	482,035 89.518 % 0.3466 %		538,473 100.00 % 0.3872 %
Subphylum	104,380 18.987 % 0.0751 %	445,360 81.012 % 0.3202 %	2 0.0004 % 0.0000 %	549,742 100.00 % 0.3953 %
Superclass	7 7.5269 % 0.0000 %	86 92.473 % 0.0001 %		93 100.00 % 0.0001 %
Class	315,002 16.145 % 0.2265 %	1,635,992 83.853 % 1.1764 %	11 0.0006 % 0.0000 %	1,951,005 100.00 % 1.4029 %
Subclass	10,563 7.1727 % 0.0076 %	136,703 92.826 % 0.0983 %	1 0.0007 % 0.0000 %	147,267 100.00 % 0.1059 %
Infraclass	2,447 17.491 % 0.0018 %	11,543 82.508 % 0.0083 %		13,990 100.00 % 0.0101 %
Subterclass	93 2.5034 % 0.0001 %	3,622 97.496 % 0.0026 %		3,715 100.00 % 0.0027 %
Superorder	1,715 5.0193 % 0.0012 %	32,453 94.980 % 0.0233 %		34,168 100.00 % 0.0246 %
Order	132,123 6.6084 % 0.0950 %	1,867,188 93.391 % 1.3426 %	9 0.0005 % 0.0000 %	1,999,320 100.00 % 1.4376 %
Suborder	20,713 6.0684 % 0.0149 %	320,609 93.929 % 0.2305 %	6 0.0018 % 0.0000 %	341,328 100.00 % 0.2454 %
Infraorder	8,873 4.4119 % 0.0064 %	192,236 95.585 % 0.1382 %	5 0.0025 % 0.0000 %	201,114 100.00 % 0.1446 %
Parvorder	337 14.557 % 0.0002 %	1,978 85.442 % 0.0014 %		2,315 100.00 % 0.0017 %
Zoosection	310 2.1149 % 0.0002 %	14,348 97.885 % 0.0103 %		14,658 100.00 % 0.0105 %
Zoosubsection	588 1.2173 % 0.0004 %	47,714 98.782 % 0.0343 %		48,302 100.00 % 0.0347 %
Superfamily	20,676 4.0005 % 0.0149 %	496,157 95.998 % 0.3568 %	3 0.0006 % 0.0000 %	516,836 100.00 % 0.3716 %
Epifamily	3,299 5.6281 % 0.0024 %	55,318 94.371 % 0.0398 %		58,617 100.00 % 0.0421 %
Family	343,743 8.1660 % 0.2472 %	3,865,692 91.833 % 2.7796 %	31 0.0007 % 0.0000 %	4,209,466 100.00 % 3.0268 %
Subfamily	123,160 8.1979 % 0.0886 %	1,370,367 91.215 % 0.9854 %	8,816 0.5868 % 0.0063 %	1,502,343 100.00 % 1.0803 %
Supertribe	15 1.0691 % 0.0000 %	1,385 98.717 % 0.0010 %	3 0.2138 % 0.0000 %	1,403 100.00 % 0.0010 %
Tribe	62,313 6.0742 % 0.0448 %	955,844 93.175 % 0.6873 %	7,700 0.7506 % 0.0055 %	1,025,857 100.00 % 0.7376 %
Subtribe	13,659 6.7940 % 0.0098 %	186,741 92.885 % 0.1343 %	644 0.3203 % 0.0005 %	201,044 100.00 % 0.1446 %
Genus	2,309,444 13.621 % 1.6606 %	14,314,991 84.431 % 10.293 %	330,037 1.9466 % 0.2373 %	16,954,472 100.00 % 12.191 %
Genushybrid	998 68.875 % 0.0007 %	392 27.053 % 0.0003 %	59 4.0718 % 0.0000 %	1,449 100.00 % 0.0010 %
Subgenus	16,247 3.7380 % 0.0117 %	394,257 90.708 % 0.2835 %	24,139 5.5538 % 0.0174 %	434,643 100.00 % 0.3125 %
Section	21,134 9.1913 % 0.0152 %	203,960 88.703 % 0.1467 %	4,841 2.1054 % 0.0035 %	229,935 100.00 % 0.1653 %
Subsection	1,041 5.8701 % 0.0007 %	16,577 93.475 % 0.0119 %	116 0.6541 % 0.0001 %	17,734 100.00 % 0.0128 %
Complex	15,308 4.7571 % 0.0110 %	259,974 80.788 % 0.1869 %	46,514 14.454 % 0.0334 %	321,796 100.00 % 0.2314 %
Species	8,143,674 8.1912 % 5.8557 %	17,662,142 17.765 % 12.699 %	73,613,533 74.043 % 52.931 %	99,419,349 100.00 % 71.487 %
Hybrid	174,988 54.425 % 0.1258 %	55,820 17.361 % 0.0401 %	90,710 28.213 % 0.0652 %	321,518 100.00 % 0.2312 %
Subspecies	137,149 5.8162 % 0.0986 %	119,582 5.0712 % 0.0860 %	2,101,333 89.112 % 1.5110 %	2,358,064 100.00 % 1.6956 %
Variety	55,998 11.135 % 0.0403 %	57,331 11.400 % 0.0412 %	389,568 77.464 % 0.2801 %	502,897 100.00 % 0.3616 %
Form	6,521 26.831 % 0.0047 %	2,441 10.043 % 0.0018 %	15,342 63.125 % 0.0110 %	24,304 100.00 % 0.0175 %
Infrahybrid	228 7.3572 % 0.0002 %	74 2.3879 % 0.0001 %	2,797 90.254 % 0.0020 %	3,099 100.00 % 0.0022 %
TOTAL	15,481,613 11.132 %	46,955,001 33.762 %	76,636,222 55.105 %	139,072,836 100.00 %
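For reference, the three values in each cell can be recomputed as below. This is a sketch with a made-up helper name (`cell`), using the "No ID" row and the grand total from the table; the printed precision differs slightly from the table's.

```python
def cell(count, row_total, grand_total):
    """Return (count, percent of the row, percent of all observations)."""
    return count, 100 * count / row_total, 100 * count / grand_total


# "No ID" row and grand total, copied from the table above.
NO_ID = {"casual": 3135380, "needs_id": 332047, "research": 2}
GRAND_TOTAL = 139072836

row_total = sum(NO_ID.values())  # matches the row's TOTAL column

for grade, n in NO_ID.items():
    count, row_pct, global_pct = cell(n, row_total, GRAND_TOTAL)
    print(f"{grade:9} {count:>10,} {row_pct:8.4f} % {global_pct:7.4f} %")
```

Summing each row first, as done for `row_total`, is also a cheap check that the three grade counts agree with the row's TOTAL.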

jeanphilippeb · December 27, 2022, 5:09am

There are many “unknown” observations that are casual just because they lack an observation date (not “verifiable”), although they have a location (which is much more important than a date?).
It’s a pity to miss identifying so many observations just because of that.

Is observation date that important?
Anyway, they have a submission date, which is better than no date at all.
Shouldn’t observations with missing observation date be considered “needs ID” (“verifiable”)?

fffffffff · December 27, 2022, 7:25am

Of course, and most such misses are down to the observer not checking how the observation was uploaded.

dianastuder · December 27, 2022, 7:53am

Date is vital for phenology. When does it bloom, this year, last year, during the drought years, after the Polar Vortex …

Also problems from people battling internet connection, or loadshedding (ask me how I know - but mostly from comments, since my own photos are camera and uploaded thoughtfully much later)

We need to split Captive / Cultivated (better named, honestly, Not Wild)
from Casual (again, better named Lacking Data).

And Needs ID should be running independently from Wild / Not / Lacking. Been fighting for that since I landed on iNat.

sedgequeen · December 27, 2022, 5:29pm

You are so right, @dianastuder !

jeanphilippeb · December 27, 2022, 7:20pm

Fortunately, I can push the “Casual” and the “Needs ID” to the same projects,
and let you append this filter to the URL, for reviewing only the “Needs ID”:

&quality_grade=needs_id

For instance, from this link (for identifying all the 61 x 30 observations in 3 projects):

you would obtain this other URL (only 15 x 30 observations):

https://www.inaturalist.org/observations/identify?quality_grade=casual,needs_id&not_in_project=153322&project_id=153422,153421,153420&identified=false&quality_grade=needs_id

Some questions/responses here:
https://www.inaturalist.org/posts/73398-draft-for-creating-projects-for-unknown-observations
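Appending the filter by hand leaves `quality_grade` in the URL twice, as in the example above. How the site resolves the duplicate is not stated in this thread, so here is a small sketch that replaces the parameter instead of appending it. `set_query_param` is a hypothetical helper written for this illustration, using only Python's standard `urllib.parse`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_param(url, key, value):
    """Return `url` with every existing `key` removed and key=value appended."""
    parts = urlsplit(url)
    pairs = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != key]
    pairs.append((key, value))
    # safe="," keeps comma-separated lists (e.g. project_id) readable.
    return urlunsplit(parts._replace(query=urlencode(pairs, safe=",")))


base = ("https://www.inaturalist.org/observations/identify"
        "?quality_grade=casual,needs_id&not_in_project=153322"
        "&project_id=153422,153421,153420&identified=false")

print(set_query_param(base, "quality_grade", "needs_id"))
```

The result keeps the other filters in their original order and ends with a single `quality_grade=needs_id`.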

system · February 25, 2023, 7:20pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
