iNaturalist Research Project Fall 2020

Hi everyone, I wanted to post here about an iNaturalist project I participated in for an undergraduate Big Data/Geography course at San Diego State University in Fall 2020. Our team analyzed iNaturalist users themselves and how they use the platform. Our insights fall into four general categories:

  • United States Activity
  • Growth Around the Globe
  • Taxonomic Categories
  • Super Users

We have a project-specific website with a page for each of these categories and a full report on the last page, linked below. Big-time credit to team members Jessica Embury (iNat: “jessemb”), Tim Anderson, and Amy Petris, as well as Professor Ming-Hsiang Tsou for leading the course that instigated this project.

Hopefully it serves as an example of how this platform can be used in an undergraduate environment; the observation data from GBIF is freely available!

GBIF data:

Please comment if you have any access issues or general questions and I’ll respond as soon as possible!

Dustin Smith
San Diego State University


Looks great!

where’s Australia?! We’ve been 4th in the world for observations for a while now!


I found a typo P:


Australia is there; it’s just that the top-10 list shows only 9, which needs to be changed!


good catch, I’ll call off the attack kangaroos


Updated: Passerformes > Passeriformes

Thanks for letting me know!


Updated: Apologies for the omission; Australia now shows up third in the sentence order to make up for it. No emu war necessary!


Cool project! Thanks for sharing it.


some more typos in the taxonomy section:
“iNaturalist has produced many more research-grade observations on animals and plants (Kingdoms Animalia and Plantae ) than on fungi.”
" Class Aves (birds) and Class Insecta have many more observations in Kingdom Animalia that the runner ups: reptiles, mammals, amphibians, and Actinopterygii ( ray-finned fish)."
" Order Passeriformes was the most frequent category among birds. This is reasonable, since Passeriformes…"
Unfortunately I never find them in my own texts ;-)


“This excludes lichens and moss, which perhaps are considered “boring” to observe and identify among iNaturalist’s user-base.”

I would not say that these are “boring” to identify, but they, along with fungi, might need chemical, microscopic, or other analysis, rather than just visual inspection, in order to be identified to species.


I’m shocked one only has to observe 1250+ times per year to be in the top 1% of observers!


I’d love to see “II. Do different geographic regions have different popular taxonomic categories?” done with plants


As far as I can tell, the report covers only active users, correct? By ‘active’, I mean users who have posted at least one observation. By inspection of the iNat user community, it seems evident that the median number of observations among all iNat accounts is zero. Curious whether your dataset allows you to assess the veracity of that.
This apparent large burden of completely inactive users makes name-searching quite a chore, burdening curatorial time. And, I think, it reflects over-enthusiastic recruitment.


this is not necessarily true; many of these zero observation users are very active identifiers, including users with hundreds of thousands of IDs


statistically, only a few percent of zero-observation users have any IDs at all. loarie did a blog post and mentioned that somewhere…


To clarify, the median user has zero activity of any type. I reached this by randomly searching user names and perusing the results. (Try it yourself, and see what you find.)
I posted my remark to see whether there is data to support the assertion, which I arrived at entirely casually. Though I suppose it could be asserted that I used a Monte Carlo approach.

Do you mean statistically or just from experience?

I kind of wish that in the sections on activity and super-users, you’d looked at identification activity and top identifiers as well as observation activity and top observers. After all, there are many databases you can submit observations to, but iNaturalist is one of the few where identification has such a large role. It would have been interesting to see side-by-side comparisons of the growth of observers/observations and identifiers/identifications, and maps showing super-identifiers who do the most geographically distributed and/or concentrated identifications. I’d especially have liked to see the share of leading and improving identifications done by the top 1% of identifiers vs everyone else, as was done for observers, and the taxa where the largest numbers of leading and improving identifications are done by the fewest users.

Maybe that’s too much to ask for an already large project. Next fall’s project could be to add all the parallel analysis for identifications.

Good work.


loarie compiled a lot of statistics on this in this blog post.
from the chart, it would appear that roughly half of iNat users have zero obs and zero IDs.

I remember that! So sad to know how many observations are stored in the phone memory of those “white-circle” people. Still, it doesn’t show that the median user has 0 everything if half of all users do have observations.
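
To make the median point concrete, here is a minimal Python sketch with made-up numbers (not iNaturalist or GBIF data; the function name `median_activity` and all counts are hypothetical). It suggests that the median only lands on exactly zero once strictly more than half of the accounts have no activity; at roughly half, it can fall on either side.

```python
from statistics import median


def median_activity(zero_users, active_counts):
    """Median activity count when `zero_users` accounts have no activity.

    `active_counts` holds the activity counts of the remaining accounts.
    """
    return median([0] * zero_users + list(active_counts))


# Hypothetical activity counts for four active accounts (illustration only).
active = [1, 2, 5, 40]

# Fewer than half inactive (3 of 7): median is the smallest nonzero count.
print(median_activity(3, active))

# Exactly half inactive (4 of 8): median sits between 0 and that count.
print(median_activity(4, active))

# More than half inactive (5 of 9): median is zero.
print(median_activity(5, active))
```

So a chart showing "roughly half" of users at zero is not enough on its own to settle the question; it depends on which side of 50% the true share falls.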