Platform(s), such as mobile, website, API, other: all platforms from which data can be downloaded
URLs (aka web addresses) of any pages, if relevant: [not sure I understand; any pages could be relevant]
Description of need:
Some people seem to be anxiously clutching their pearls as they try to protect researchers from data the researchers may want to see by shifting that data to Casual. I think that’s bad. Shouldn’t researchers be able to make their own choices? Also, people argue on the Forum about what to protect researchers from, and I’m tired of seeing that. Finally, iNaturalist data do have some real problems that researchers should consider, and there’s evidence that researchers sometimes don’t deal with these problems, at least the first time they use the data.
Feature request details:
Need: iNaturalist data are great and useful for many kinds of research, but they’re not perfect. I think warning researchers about the problems would be useful to them. It would also offer an alternative to the actions of some people who shift observations to the cesspit that is Casual just because the data don’t meet their own standards, even though the data would meet some researchers’ needs. Additionally, presenting such a list would let us short-circuit some endless, rancorous discussions on the Forum: we could just point people to it and ask them to improve it.
Proposed solution: Show a pop-up when someone downloads data, saying something like, “Data from iNaturalist are great but not always perfect. Would you like to learn about problems you should watch for when using these data?” Include on the pop-up a “Don’t show me this again” option. (But maybe make it show up once a year anyway.)
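To make the “show it once a year anyway” idea concrete, here is a minimal sketch of the timing logic. This is purely illustrative pseudologic expressed in Python; the real implementation would live in iNaturalist’s web code, and names like `should_show_warning` and `last_dismissed` are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical sketch: show the pop-up unless the user dismissed it
# within the last year. None of these names come from iNaturalist's code.
REMINDER_INTERVAL = timedelta(days=365)

def should_show_warning(last_dismissed: datetime | None, now: datetime) -> bool:
    """Decide whether the data-quality pop-up appears on download."""
    if last_dismissed is None:
        # User has never clicked "Don't show me this again."
        return True
    # Even after dismissal, re-show the warning once a year.
    return now - last_dismissed >= REMINDER_INTERVAL

# Example: dismissed 14 months ago, so the warning appears again.
print(should_show_warning(datetime(2024, 1, 1), datetime(2025, 3, 1)))  # True
```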
What should such a list include? I recommend the following, but no doubt you could improve this list. (Maybe the items could include links to iNaturalist descriptions or explanations of the problems.)
Even RG (Research Grade) observations may not be correctly identified; we recommend you sample the observations to check the rate of error, and especially check geographic and temporal outliers. The percentage of accurate identifications is often high (over 90%, in many cases over 95%) but can be dismal for some species. (One way to draw such a sample is sketched after this list.)
Captive/cultivated organisms may not be marked as such. (Please mark those you notice as “captive/cultivated” or “not wild.”)
Observations with geoprivacy set to “obscured” are displayed by iNaturalist at a random point within a cell of roughly 0.2° of latitude and longitude containing the true location; their locations as presented are not accurate, though generally within about 20 miles of the true location.
Some observations have huge error circles around the reported location. The largest circles often result from data-entry errors; in these cases, the true location may still be reasonably close to the reported point. Other large circles result from recording the center of a lake when the observation was made on the shore, or using park headquarters as the location for anything in a park; the post office location may stand in for anything in a town. A large circle may simply mean “seen somewhere along this trail.”
Fraud is rare and we try to keep it out, but we can’t entirely. It seems most common in observations posted by students using iNaturalist for a graded assignment, or in a few problem projects where events like the City Nature Challenge (CNC) and Great Southern Bioblitz (GSB) are treated as contests to be “won.” Fraud using AI-generated pictures also exists. (If you find fraudulent observations, please flag them.)
iNaturalist data are inevitably biased by the interests, skills, and distribution of the volunteer citizen-scientists who post these data and who identify them.
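To make the “sample and check” advice in the first item concrete (and the captive, geoprivacy, and error-circle caveats checkable), here is a rough sketch of how a researcher might screen a CSV export before analysis. This is my own illustration, not an official iNaturalist recipe: the column names follow the standard CSV export as I understand it, “observations.csv” is a placeholder filename, and the 10 km cutoff and sample size of 100 are arbitrary examples.

```python
import pandas as pd

# Load a standard iNaturalist CSV export. The column names used below
# (quality_grade, captive_cultivated, coordinates_obscured,
# positional_accuracy) match exports I've seen, but verify them
# against your own download.
obs = pd.read_csv("observations.csv")

def is_true(col: pd.Series) -> pd.Series:
    # Works whether the column was parsed as booleans or "true"/"false" strings.
    return col.astype(str).str.lower() == "true"

# Keep Research Grade records only; RG is still no guarantee of a correct ID.
rg = obs[obs["quality_grade"] == "research"]

# Flag records that deserve a closer look before analysis:
flagged = rg[
    is_true(rg["captive_cultivated"])        # possibly not wild
    | is_true(rg["coordinates_obscured"])    # location shifted by geoprivacy
    | (rg["positional_accuracy"] > 10_000)   # error circle > 10 km (arbitrary cutoff)
]
print(f"{len(flagged)} of {len(rg)} RG records flagged for review")

# Draw a reproducible random sample to verify identifications by hand
# and estimate the error rate for your taxon and region.
sample = rg.sample(n=min(100, len(rg)), random_state=42)
sample.to_csv("sample_to_verify.csv", index=False)
```

Fixing `random_state` just makes the sample reproducible, so two people checking the same export will hand-verify the same records.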
So . . . Do you think presenting such a list on download would be useful? What do you think such a list should include?