I thought @comradejon’s map of iNat contribution rates across the US was really neat (check it out here, also copied below) and was curious what variables might predict a region having a high rate of iNat participation. I was especially curious because some of the usual suspects don’t seem to be very predictive in this situation (politics, climate, etc.; rare that Vermont/Wyoming and North Dakota/Georgia cluster!). So I thought it would be interesting to make a quick model to try and glean some insight about what’s going on.
First, I hypothesized what variables might predict high or low iNat contributions in a state (many of these came from ideas people had on @comradejon’s original post).
- Socio-demographics: past work (e.g. this paper of ours) shows that citizen/participatory science volunteers are more likely to be White and have high levels of education, so states with more people fitting this demographic might have higher rates of iNat participation. [Quick aside that citizen science’s lack of diversity is an important problem that our community should work hard to understand and remedy; some useful literature here and here if you’re interested in learning more.] Age, political persuasion, profession and income might also be important (some work finds that citizen scientists are more likely to be higher-income, retired, politically liberal and work/used to work in STEM).
- Urban Population: @wendyjegla pointed out that regions with more people living in urban areas might have lower participation rates, since there may be fewer opportunities to observe wild organisms (or at least, it may be a less common/accessible activity to do there).
- Climate: In temperate parts of the world, iNat contributions drop every winter, so it seems probable that places with pleasant weather much of the year would have more observations (suggested by a few people e.g. @scottdwright).
- Environment-oriented Tourism: As @whitneybrook and @comradejon pointed out, a state like Wyoming might have particularly high rates of participation as a result of many tourists visiting for the primary purpose of interacting with nature (e.g. in Yellowstone).
- Amount of Natural Lands: As @ciafre pointed out, areas that are primarily agricultural might have fewer interesting/desirable places to record biodiversity (related to the Urban Population point above), while a region with lots of protected natural lands might encourage more observations.
Control variables
- Population: obviously, the more people in a state, the more participation we would expect. @comradejon’s map already takes this into account (it’s a map of per capita participation).
Other variables that could be important
- As pointed out by @bugbaer and @kpmcfarland, colleges/community organizations that promote iNat widely (e.g. university extension or the Vermont Center for Ecostudies) could play a role in increasing iNat participation rates in an area. I’m not sure what kind of national level variable I could use to measure this but it could be important.
- Individual-level effects: As @upupa-epops pointed out, a handful of super-users could drive up the per capita participation rate, particularly in a state with a small population. If these users are randomly distributed across the US (probable?), you wouldn’t really need to control for this (it would just be noise), but with a small sample size (50) it could still skew things or reduce power.
- Something norms-related would be interesting but I’m not quite sure what robust data there is for that. Are there states where spending time outside/observing nature is more or less normalized? Probably!
- Spatial control: States from the same region might have similar contribution levels because of their proximity to one another (“spatial autocorrelation”). A robust model might include/control for some spatial effect explicitly.
First, let’s confirm that a state’s population is a confounder by plotting a state’s population against its number of iNaturalist observations.
Indeed it is! A state’s population explains 88% of the variance in the number of iNaturalist observations from that state in 2024. We could include state population as a predictor to control for this, but let’s just make the outcome “per capita iNat contributions,” in line with the map.
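For anyone who wants to reproduce this step, here is a minimal Python sketch (not the R code used for the post) of the two calculations involved: the R² of a straight-line fit of observations on population, and the conversion to observations per 100,000 residents. The function names and the five data points are made up for illustration; they are not the real state data.

```python
from statistics import mean


def r_squared(xs, ys):
    """R^2 of a simple least-squares line fit of ys on xs.

    For one predictor this equals the squared Pearson correlation.
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy ** 2) / (sxx * syy)


def per_100k(observations, population):
    """Observations per 100,000 residents."""
    return observations / population * 100_000


# Made-up illustrative numbers, NOT the real state data.
population = [600_000, 1_400_000, 3_000_000, 7_000_000, 20_000_000]
observations = [90_000, 60_000, 100_000, 350_000, 700_000]

r2 = r_squared(population, observations)
rates = [per_100k(o, p) for o, p in zip(observations, population)]
print(f"R^2 = {r2:.2f}")
print([round(r) for r in rates])
```

Dividing by population before modelling means population no longer needs to appear as a predictor, which saves a degree of freedom with only 50 data points.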
1. Race/ethnicity: not very predictive of per capita iNat contributions, at least when just looking at % White, non-Hispanic residents (note that this way of categorizing race obviously has limitations; people of color are not a monolith). New England over-performs but the Midwest under-performs, both regions with large White populations.
2. Education: States with a higher proportion of residents with bachelor’s degrees seem to have moderately higher per capita contributions. Income and % STEM professional were highly correlated with education; I hypothesized that education would be the most relevant of the three for iNaturalist users, so I focus on it here.
3. Age: states with more retired-age residents (65 and older) seem to have more iNat participation.
4. Politics: more liberal-leaning states do seem to have more participation, though it’s interesting to note the states that buck this trend (Wyoming way over-performing, New York under-performing).
5. Urban Population: States with a higher proportion of residents living in urban areas appear to have lower per capita participation (data from here).
6. Climate: Interestingly, warmer states seem to have lower rates of participation.
7. Environmental Tourism: Not a very clear relationship between per capita visits to national parks, national preserves, and other non-history-related NPS sites, and level of iNat participation. Data from here. This is not an ideal estimator of what we’re interested in (doesn’t include state parks visitation etc.) so if anyone can think of a better data source let me know.
8. Natural Lands: states with a larger percentage of “natural” land (non-agricultural and non-developed) tend to have higher rates of iNat participation. Data from here.
[Quick note that Vermont is obviously quite an outlier in the scatterplots above, with double the per capita contribution rate of the next most-engaged state. As a result, when testing OLS assumptions, Vermont posed some issues with homoscedasticity/normality of residuals. So the model below excludes Vermont, though when I did a sensitivity analysis (including Vermont), the results ended up being similar anyway.]
Theoretically speaking, all 8 of these predictors could be important (except environmental tourism, just because I don’t trust the way we’re measuring it, so I’m not including it in the discussion below). However, that’s a lot of predictor variables for a sample size of 50. So I calculated AIC values, starting from a model with all predictors (AIC is a statistical technique for choosing a well-fitting model while penalizing having too many variables). AIC suggested only including average temperature, political persuasion, percent urban population, and percent natural land. This generally makes sense to me; those last three variables in particular seemed to have the strongest relationship with our outcome according to the scatterplots above.
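To make the AIC idea concrete, here is a small Python sketch (again, not the R code behind the post). For an OLS model with Gaussian errors, AIC can be written as n·ln(RSS/n) + 2k, up to an additive constant that is the same for every model fitted to the same data, where RSS is the residual sum of squares and k is the number of estimated coefficients. The RSS values and parameter counts below are hypothetical and only illustrate the trade-off; lower AIC is better.

```python
import math


def aic_ols(rss, n, k):
    """AIC for a Gaussian OLS model, up to a constant shared by all
    models fitted to the same n observations.

    rss: residual sum of squares
    n:   number of observations
    k:   number of estimated coefficients (including the intercept)
    """
    return n * math.log(rss / n) + 2 * k


n = 49  # assumption: 50 states minus Vermont

# Hypothetical RSS values, NOT taken from the real models.
full = aic_ols(rss=1.00e9, n=n, k=8)     # more predictors, slightly better fit
reduced = aic_ols(rss=1.05e9, n=n, k=5)  # four predictors plus intercept

prefer_reduced = reduced < full
print(f"full: {full:.1f}, reduced: {reduced:.1f}, prefer reduced: {prefer_reduced}")
```

In this made-up case the extra predictors reduce RSS by too little to pay for their 2-points-per-parameter penalty, so the smaller model wins.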
Here’s the output of the OLS model predicting iNat contributions in 2024 per 100,000 residents among the US states except Vermont.
Coefficient | Estimate | Standardized Estimate | p |
---|---|---|---|
Intercept | 3764.1 | NA | 0.16 |
Dem vote share 2024 | 173.2 | 0.543 | <0.001 |
% of pop living in urban areas | -103.8 | -0.472 | <0.001 |
Avg temp (F) | -57.8 | -0.164 | 0.12 |
% natural land (not developed, not agriculture) | 77.6 | 0.525 | <0.0001 |
Adjusted R^2 = 0.54
This is telling us that, controlling for the other variables in the model…
- a state with a 1 percentage point higher Democratic vote share will have, on average, 173 more iNat observations per 100k residents.
- a state with 1 percentage point more of its residents living in urban areas will have, on average, 104 fewer iNat observations per 100k residents.
- warmer states have fewer per capita observations (a bit of a head-scratcher), but this relationship is not statistically significant (p = 0.12).
- a state with 1 percentage point more natural land will have, on average, 78 more iNat observations per 100k residents.
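The interpretation above can be checked with a short Python sketch that plugs the estimates from the table into the linear equation. The dictionary keys, the function name, and the example state are hypothetical; the sketch also assumes the percentage predictors are on a 0 to 100 scale and temperature is in °F, which matches the table labels but is not confirmed by the underlying data file.

```python
# Coefficient estimates copied from the table above.
COEF = {
    "intercept": 3764.1,
    "dem_vote_share": 173.2,     # per percentage point (assumed 0-100 scale)
    "pct_urban": -103.8,         # per percentage point (assumed 0-100 scale)
    "avg_temp_f": -57.8,         # per degree Fahrenheit
    "pct_natural_land": 77.6,    # per percentage point (assumed 0-100 scale)
}


def predict_obs_per_100k(dem_vote_share, pct_urban, avg_temp_f, pct_natural_land):
    """Predicted 2024 iNat observations per 100,000 residents."""
    return (
        COEF["intercept"]
        + COEF["dem_vote_share"] * dem_vote_share
        + COEF["pct_urban"] * pct_urban
        + COEF["avg_temp_f"] * avg_temp_f
        + COEF["pct_natural_land"] * pct_natural_land
    )


# A made-up state, and the same state with a 1-point higher Democratic vote share.
base = predict_obs_per_100k(50, 70, 50, 40)
bluer = predict_obs_per_100k(51, 70, 50, 40)
print(f"difference: {bluer - base:.1f} observations per 100k")
```

Holding the other three inputs fixed and changing one of them by a single unit shifts the prediction by exactly that variable's coefficient, which is what "controlling for the other variables" means here.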
Together, these four variables explain about 54% of the variance in per capita iNat contributions (going by the adjusted R^2). Looking back at the map, we now have a bit more insight. States with high rates of iNat participation tend to be more liberal-leaning, rural states with large tracts of forest and other non-developed, non-agricultural land. Let’s look at some of the states with the highest and lowest per capita participation and see if we can make sense of their performance, given this.
Highest-performing states
Vermont, Maine, New Hampshire (#1, #2, #4): These are rare blue-leaning states with large proportions of their population living in rural areas. This, combined with their large share of natural, non-agricultural land, partly explains how they perform so well in iNat participation.
Alaska (#3): Alaska is a red-leaning state, but that factor seems to be outweighed by Alaska’s fairly large share of rural residents and massive abundance of natural land by % of land area (#1 in the country).
Oregon, Washington (#5, #10): These PNW states are blue and have large amounts of natural land, which outweighs their middling share of rural residents.
Wyoming (#6): Wyoming is the reddest state in the country, BUT is also #4 in natural land and #12 in rural population. Quick aside that my metric for natural land must be including pasture as natural, given this. I suppose that makes sense.
Hawaii (#7): Hawaii has a fairly low rural population and modest % natural land, but is very blue, which seems to explain its performance.
New Mexico (#8): New Mexico is #3 in natural land and blue-ish, despite being somewhat urban, explaining its performance.
Massachusetts (#9): Somewhat surprising given its very urban population and middling amount of natural land. But it is the third bluest state in the country.
Lowest-performing states
North Dakota, Iowa, Nebraska, Kansas, Indiana (#1, #2, #3, #7, #8): These conservative agricultural states of the Midwest have relatively little natural land, which seems to outweigh their fairly low urban populations.
Georgia, Mississippi, South Carolina (#4, #6, #9): These southern states have medium to low urban populations and medium amounts of natural land, but they’re red to red-ish, which I guess explains to some degree their position here. Still sort of confusing.
Nevada (#5): This purple state has the second-highest % natural land in the country but also has the second-highest % urban residents.
New Jersey (#10): Having the third-highest urban population and low % natural land explains this state’s position despite its strong blue lean.
So, there you have it! This is very much a quick “back of the envelope” attempt (let me know if I missed something, which is likely), but I think it can at least serve as a starting point for understanding what drives iNaturalist participation. That, in turn, might help us figure out how to reach new audiences, which could both improve the spatial coverage of the data and bring the joy and benefits of biodiversity monitoring to more people.
I’d be honored if anyone wanted to try and improve on this. Could maybe even work this into a paper? CSV here. I can share R code too, though everything I did was basically just ggplot and the base lm() function.