New paper offers suggestions for improving the value of citizen (community) science data

andy71 · July 18, 2019, 10:55pm

A new open access paper in PLOS Biology critiques the haphazard sampling usually found in citizen (community) science databases (such as iNat).

https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000357

There is some nice discussion here of the statistical limitations of projects such as iNat that I think echoes discussions we’ve had on the forum. In a way, the paper is aimed more at the organizations running citizen (community) science projects than at the users themselves, but I think everyone could learn something from considering the authors’ points.

The authors make a fairly non-specific recommendation to incentivize sampling in particular times and places rather than sampling (and detecting) specific species. This would improve coverage and give us better data on, for instance, where organisms are NOT found. Current leaderboards in iNaturalist, by contrast, incentivize finding species and making observations rather than sampling per se. The authors don’t provide a lot of detail on how to build such a system, and I don’t know how their suggestions would fit into iNaturalist. But their paper is very relevant to the ultimate scientific value of iNaturalist.

I hope some of you find it of interest.

thebeachcomber · July 18, 2019, 11:03pm

If you liked this one stay tuned for another one coming out soonish looking at the important future areas of citizen science :))

andrewgillespie · July 19, 2019, 1:09pm

Hold a bioblitz if you want observations from a particular place.

caththalictroides · July 19, 2019, 1:31pm

But, what can individuals do on an ongoing basis?

I would love more of an emphasis on natural communities, and what makes up the community in a particular ‘patch’. There must be ways to break that down so that individuals can work on that.

I visit a limited number of natural areas, trying to delineate the ‘architecture’ of that area. But I don’t know if the resulting observations are of any use, beyond just presence of the individual organisms. It would be interesting to focus on how iNat could structure the observations in a way that meets the paper’s recommendation.

thebeachcomber · July 19, 2019, 2:18pm

It goes a bit beyond just holding a bioblitz. The key is incentivisation, and a lot of people aren’t necessarily incentivised by a bioblitz. The benefits aren’t always immediately obvious to them.

andy71 · July 19, 2019, 3:15pm

Thanks for the suggestion, but just to clarify, the authors are addressing a slightly different issue. The problem is the scientific value of the observations, not the number of observations per se. These are not necessarily the same. For instance, knowing where organisms are not found and how hard the users looked for them are valuable data that are not collected during bioblitzes. I won’t elaborate but you get the picture.

andy71 · July 19, 2019, 3:37pm

I think it’s awesome to regularly visit the same local spots and record what’s there. I find that more rewarding than traveling to far flung places but that’s just my personality.

I’m not sure what you can do as an individual. Projects like iNaturalist kind of get their strength from having a whole lot of people involved all making the same kind of observations. In general, recording some notes about your route through the ecosystem of interest, how long you were there, and more detailed data on phenology and behavior of what you saw is always valuable. Keeping counts of the number of individuals of certain species is valuable (but very difficult for numerous things like common plants and insects).

A good model, perhaps, is the journals of 19th century naturalists such as Henry David Thoreau. Ecologists have gone back to his journals and used them extensively to compare the historical ecosystem to what we observe in the same place today. Here’s a paper from the journal BioScience that summarizes some of that research (let me know if the link doesn’t work): https://preview.tinyurl.com/yxl3jx86

You can actually read his journals here which is pretty cool. https://www.walden.org/collection/journals/

Lots of other naturalists who were not professional scientists also kept valuable journals like this that have made really important contributions.

bouteloua · July 19, 2019, 3:42pm

Related is the Trips feature, which you can read more about and see some examples here: https://www.inaturalist.org/trips

This type/level of data would increase the value a lot.

caththalictroides · July 19, 2019, 6:47pm

“Related is the Trips feature, which you can read more about and see some examples here: https://www.inaturalist.org/trips”

Cool! So I could make up, say, a list of the dominant overstory trees in my region, and record presence/absence of them at all my sites. Which raises issues when I make repeated trips to the same small place… maybe do a tree inventory once a year, ditto an understory tree/bush inventory once a year, etc. And record bedrock and soil type, slope, water regime, etc. in the comments. If iNat develops a more systematic way to record this info, it would be easy to update.

pisum · July 19, 2019, 11:25pm

long ago, i worked in marketing research. most of the time before we did a survey, we would look at past research, study trends, and do a focus group to figure out what would even be relevant to consider in a more structured survey. i think of stuff in iNaturalist as similar to that background and focus group research. i think it’s a mistake to try to make it out to be anything more than that in general.

sure, you can structure how some people are collecting data, but you will make that data more useful only for whoever has defined that process. you define the process according to what kind of question you’re trying to answer. ask a different question, and you’ll need a different process.

jennformatics · July 20, 2019, 3:41pm

I read recently about an ongoing citizen-leveraging flooding survey that’s structured to make sure people go out and take pictures not only of the “impressive” spots that usually get photo-shared, but also known spots that might have flooded but didn’t this time. If you have people you can count on to make an observation routinely at a given place during each major rain event, then you get something more like the kind of robust data you need to do effort statistics.

Not something individuals can do, but perhaps something specific area or project managers could do locally — making sure your bioblitzes evenly cover the area rather than focusing on the high-diversity spots. Some of your volunteers, on some of the occasions, would have to sacrifice their likelihood of getting rare observations or lots of observations, but bird counts do this sometimes — somebody’s got to take the side of the park with all the pigeons, for the good of Science!

silkenpaw · July 20, 2019, 7:17pm

They are also data that are not collected by iNaturalist. That would require a totally different approach, and it would be difficult to get citizen scientists to collect them, IMO. Negative data are always problematic.

pisum · July 20, 2019, 7:32pm

it would be neat for iNaturalist to have a feature to match people who want to do the footwork for science and people who have projects that need foot soldiers. i think that would definitely be in the spirit of the platform / community.

bouteloua · July 20, 2019, 7:41pm

sounds like scistarter.org

pisum · July 20, 2019, 7:53pm

yes, but done within the familiar ecosystem and community of iNaturalist.

elleng · July 22, 2019, 2:45pm

I’ve hesitated to use iNat for “tracking-over-time” projects. As an example, I am involved in tracking the progress of an invasive plant in a specific (small-ish) area. iNat doesn’t seem to be the right software for this, but I can’t put my finger on why it isn’t.

carrieseltzer · July 23, 2019, 8:30pm

Again not within iNat, but Adventure Scientists was established to matchmake adventurers and scientists. They’ve used iNat for some of their projects.

sandboa · July 24, 2019, 12:07pm

I didn’t read the paper thoroughly, I just scanned it. But I disagree with the premise of the analysis in this paper.

Our current scientific methodologies for sampling and statistical analysis were developed over centuries of parallel intellectual work by scientists and mathematicians. That’s why those methods work so well.

But the sampling methodology of Citizen Science doesn’t meet those rigorous standards because it isn’t designed to do so. It is a new type of sampling. The burden of responsibility shouldn’t fall on the Citizen Scientist who is contributing data, it should fall on the statisticians and professional scientists to figure out how to use the data that is contributed.

Incentivising more rigorous data collection policies has to be done very carefully. Citizen Scientists have established how they want to gather data. We shouldn’t discourage that in the name of a framework of scientific rigor developed for a different kind of data collection.

jdmore · July 24, 2019, 6:14pm

Well said!

fogartyf · July 24, 2019, 7:28pm

I think it’s important to recognize that statistics are not magic. In most cases, we can’t just “invent” a new method that overcomes the limitations of a sampling design. Statistics can be clever about new ways to account for biases and variability in measured variables, but it generally can’t replace data that wasn’t collected.

This is a struggle even in the research community - I work with some particular types of population models and very often my first piece of advice to colleagues is “think about your analysis first, then design your data collection protocol”. This inevitably saves a lot of headaches later and helps ensure that their data are actually useful.

I wouldn’t suggest totally overhauling community science protocols to make them as robust as many used by researchers, but I think there is a lot of room to make small improvements to data collection that could vastly improve the utility of these data.

For example, the ability to easily record presence/absence information or effort (even if not standardized) opens up a lot more options for using the data.
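
As a rough illustration of that last point, here is a minimal Python sketch of what per-visit presence/absence and effort records would allow. Everything in it is an assumption made up for the example: the record fields (`site`, `minutes`, `seen`), the fixed target list, and the `summarize` helper are hypothetical and are not part of iNaturalist or of the paper.

```python
from collections import defaultdict

# Hypothetical fixed checklist that the observer looks for on every visit.
TARGET_SPECIES = ["Quercus alba", "Acer rubrum"]

# Hypothetical visit records: site, minutes spent searching (effort),
# and which target species were actually seen on that visit.
visits = [
    {"site": "A", "minutes": 30, "seen": {"Quercus alba"}},
    {"site": "A", "minutes": 60, "seen": {"Quercus alba", "Acer rubrum"}},
    {"site": "B", "minutes": 45, "seen": set()},
]

def summarize(visits, targets):
    """Tally visits, detections, and search hours per (site, species) pair.

    Because every target species is tallied on every visit, a species that
    was looked for but not seen shows up as an explicit non-detection.
    """
    summary = defaultdict(lambda: {"visits": 0, "detections": 0, "hours": 0.0})
    for visit in visits:
        for species in targets:
            row = summary[(visit["site"], species)]
            row["visits"] += 1
            row["hours"] += visit["minutes"] / 60
            if species in visit["seen"]:
                row["detections"] += 1
    return dict(summary)

result = summarize(visits, TARGET_SPECIES)
```

With records like these, a site where a species was searched for and never found is distinguishable from a site nobody visited, which ordinary presence-only observations cannot show.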
