Hi everyone, I wanted to post here about an iNaturalist project I participated in for an undergraduate Big Data/Geography course at San Diego State University in Fall 2020. Our team analyzed iNaturalist users themselves, and how they use the platform. Our insights come in four general categories:
United States Activity
Growth Around the Globe
Taxonomic Categories
Super Users
We have a project-specific website with a page for each of these categories and a full report on the last page, linked below. Big-time credit to team members Jessica Embury (iNat: "jessemb"), Tim Anderson, and Amy Petris, as well as Professor Ming-Hsiang Tsou for leading the course that instigated this project.
Hopefully it serves as an example of how this platform can be used in an undergraduate environment; the observation data from GBIF is freely available!
Some more typos in the taxonomy section:
“iNaturalist has produced many more research-grade observations on animals and plants (Kingdoms Animalia and Plantae ) than on fungi.”
" Class Aves (birds) and Class Insecta have many more observations in Kingdom Animalia that the runner ups: reptiles, mammals, amphibians, and Actinopterygii ( ray-finned fish)."
" Order Passeriformes was the most frequent category among birds. This is reasonable, since Passeriformesā¦"
Unfortunately I never find them in my own texts ;-)
“This excludes lichens and moss, which perhaps are considered ‘boring’ to observe and identify among iNaturalist’s user-base.”
I would not say that these are “boring” to identify, but along with fungi, they might need chemical, microscopic, or other analysis, rather than just visual inspection, in order to be identified to species.
As far as I can tell, the report covers only active users, correct? By āactiveā, I mean users who have posted at least one observation. By inspection of the iNat user community, it seems evident that the median number of observations among all iNat accounts is zero. Curious whether your dataset allows you to assess the veracity of that.
This apparently large share of completely inactive users makes name-searching quite a chore and eats into curatorial time. And, I think, it reflects over-enthusiastic recruitment.
To clarify, the median user has zero activity of any type. I reached this by randomly searching user names and perusing results. (Try it yourself, and see what you find)
I posted my remark to see whether there is data to support the assertion, which I arrived at entirely casually. Though I suppose it could be asserted that I used a Monte Carlo approach.
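The "Monte Carlo approach" of spot-checking random usernames can be made a bit more formal: sample accounts uniformly at random, count the share with zero activity, and attach a confidence interval to that proportion. A minimal sketch (the population here is simulated, and the 55% inactive share is a made-up assumption, since we have no real account data):

```python
import math
import random

random.seed(42)

# Hypothetical population: True = zero-activity account. The 55% share
# is an assumption for illustration, not a measured iNat figure.
TRUE_INACTIVE_SHARE = 0.55
population = [random.random() < TRUE_INACTIVE_SHARE for _ in range(100_000)]

# Monte Carlo estimate: draw n accounts uniformly at random and
# compute the observed share of inactive ones.
n = 500
sample = random.sample(population, n)
p_hat = sum(sample) / n

# 95% normal-approximation (Wald) confidence interval for a proportion.
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"estimated inactive share: {p_hat:.3f} "
      f"(95% CI: {p_hat - half_width:.3f} to {p_hat + half_width:.3f})")
```

If the whole interval sits above 0.5, the sample supports the claim that the median account has zero activity; if it straddles 0.5, casual searching can't settle the question.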
I kind of wish that in the sections on activity and super-users, you'd looked at identification activity and top identifiers as well as observation activity and top observers. After all, there are many databases you can submit observations to, but iNaturalist is one of the few where identification has such a large role. It would have been interesting to see side-by-side comparisons of the growth of observers/observations and identifiers/identifications, and maps showing super-identifiers who do the most geographically distributed and/or concentrated identifications. I'd especially have liked to see the share of leading and improving identifications done by the top 1% of identifiers vs everyone else, as was done for observers, and the taxa where the largest numbers of leading and improving identifications are done by the fewest users.
Maybe that's too much to ask for an already large project. Next fall's project could be to add all the parallel analysis for identifications.
loarie compiled a lot of statistics on this in this blog post.
from the chart, it would appear that roughly half of iNat users have zero obs and zero IDs.
I remember that! So sad to know how many observations are stored in the phone memory of those "white-circle" people. Still, it doesn't show that the median user has 0 of everything if half of all users do have observations.
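This is exactly the boundary case: "roughly half" puts the median right on the edge, so whether it is zero depends on which side of 50% the inactive share actually falls. A toy illustration with made-up account counts:

```python
from statistics import median

# Hypothetical per-account observation counts, for illustration only.

# Just under half inactive: the median is nonzero.
mostly_active = [0, 0, 1, 3, 7]        # 2 of 5 accounts have zero obs
print(median(mostly_active))           # -> 1

# Just over half inactive: the median is zero.
mostly_inactive = [0, 0, 0, 2, 5]      # 3 of 5 accounts have zero obs
print(median(mostly_inactive))         # -> 0

# Exactly half inactive: the median straddles the boundary.
split = [0, 0, 0, 1, 2, 3]             # 3 of 6 accounts have zero obs
print(median(split))                   # -> 0.5
```

So a chart showing "roughly half" at zero is consistent with a zero median, but doesn't prove it.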