Improving Data Quality

raphael1c · June 7, 2022, 10:01am

We know that people’s biases, particularly amateur biases, affect the quality of the data in iNaturalist. There are more observations of eye-catching or charismatic species than equally important/valuable species that lack those qualities. Is there a guide for users to counter those biases and improve the quality of the data on the site? Should we just focus on tardigrades and not birds? Any guidance would be welcome.

DianaStuder · June 7, 2022, 10:36am

We each focus on what interests us. If tardigrades speak to you …

iNat does bias towards both - common enough to be seen - and eye-catching enough (to be seen). It is outreach to people who are being encouraged to ‘see’ nature. There are lots of projects to encourage going beyond the obvious, been there done that.

If you help to ID Unknowns there is an endless stream of - Never Seen One of Those Before. Which helps to tip the balance to biodiversity. It’s sad to find good pictures of interesting life forms languishing in Unknown.

raphael1c · June 7, 2022, 10:45am

Excellent info! Many thanks!!

matthewvosper · June 7, 2022, 11:59am

It’s a really interesting point. For sure iNat data skews to the big and beautiful: it always will. About the most abundant creatures on the planet are nematodes, but they will never compete with Mallard for observations :).

For those of us who are into a particular taxon it can be surprising perhaps how many of these big and beautiful things are first discoveries for many users - it should be borne in mind that what is obvious to some will be obscure to others. But I could see a use for a ‘how to go further’ tutorial to help people to search out less observed taxa and make useful observations of them once they’ve been hooked - taking account of people’s different levels.

cthawley · June 7, 2022, 12:04pm

Agreed with what others have noted here.

I don’t think that there’s any way to seriously counteract the major biases in iNat data. It is really the responsibility of scientists and other data users to educate themselves about those biases and account for them in whatever way they are analyzing/using the data. So I wouldn’t worry too much about the issue of bias and would focus on people using the site in a way that they find rewarding (though don’t get me going about users who enter incorrect/falsified data…;)

Additionally, the data is not the primary purpose of iNat’s existence - it is a “happy byproduct” (though for some users it may be a primary motivation to use). iNat’s core purpose is to encourage interaction with the natural world, so each user should find that for themselves. And if it creates usable data, all the better!

janetwright · June 7, 2022, 12:36pm

Those biases for the common and spectacular have traditionally dominated field guides, too (especially for taxa like plants or seashell mollusks, where there are way more species than can be covered). It’s OK, and as others point out, encouraging connection with nature is the main goal of iNaturalist.

As identifiers and commenters, we can make more effort to give positive reinforcement to folks who make observations of obscure groups. I know I got sucked into Sphagnum identification because of the enthusiasm of one identifier (@schneidried) who raved about how beautiful and fascinating they are when I posted one. I’m still a beginner at that but I keep plugging. Pointing out resources to learn more helps.

mamestraconfigurata · June 7, 2022, 2:36pm

I’d also like to point out that iNat’s data are also biased towards populated and fairly well off places, as well as the big and flashy organisms. Sparsely populated areas like N. Canada and places where people struggle to make ends meet are also underrepresented. It’s not a criticism, but merely another aspect of data quality.

bbk-htx · June 7, 2022, 2:46pm

I never thought of iNaturalist as being about people’s biases, but about people’s observations.
It is people’s biases that improve our knowledge of that organism.

billryerson · June 7, 2022, 3:42pm

It’s not something the scientific community is blind to. There’s a really interesting summary by Di Cecco et al 2021 (PDF). In addition to the charismatic species bias, there is also a bias in “newness”. Many users will log a species/photo the first time they observe it (and never again) or more frequently if they think that species is rarer. For example, I’m sure there are many people who would photograph every single bald eagle they see, but are they likely to do the same with robins?

sbushes · June 7, 2022, 3:47pm

I recommend trying to find 1000 species within 1 sq km as a way to dig deeper into what’s around you. It forces you to constantly move into new taxa and learn more as you go.

egordon88 · June 7, 2022, 4:03pm

This is something I think about, because plants are easy to photograph, birds are easy to record, and often arthropods are neither. There are some people mostly looking for one set of organisms or another or we get tunnel vision and chase a butterfly for miles and miss all the ants under our feet. If I only take my smart phone on a walk, I’m going to have 0 identifiable Chalcid wasps or springtails.

I had an idea for a project a couple weeks ago to compensate for some of the biases. I’m compiling a database of cleptoparasitic bees and nest parasites of native bees and I plan to analyze iNaturalist data for different bee families. Are people seeing a lot of bee flies or velvet ants in an area, while ground nesting bees are underreported? Likely, it will be the opposite - bees implying the presence of parasites, but we’ll see.

For the metric-ally challenged among us, 1 sq km is 247 acres. Since I’ve found about 675 species on my 0.0008 sq km lot (0.2 acres), I approve of this challenge!
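For anyone who wants to check the arithmetic above, here is a minimal sketch of the conversion. The function names are made up for illustration; the only fixed input is the definition of the international acre (4046.8564224 square metres).

```python
# Area conversion between square kilometres and acres.
# Function names are hypothetical, chosen only for this example.

SQ_M_PER_ACRE = 4046.8564224   # international acre, exact by definition
SQ_M_PER_SQ_KM = 1_000_000     # 1 km x 1 km in square metres


def sq_km_to_acres(sq_km: float) -> float:
    """Convert an area in square kilometres to acres."""
    return sq_km * SQ_M_PER_SQ_KM / SQ_M_PER_ACRE


def acres_to_sq_km(acres: float) -> float:
    """Convert an area in acres to square kilometres."""
    return acres * SQ_M_PER_ACRE / SQ_M_PER_SQ_KM


if __name__ == "__main__":
    print(round(sq_km_to_acres(1)))        # the 1 sq km challenge area, in acres
    print(round(acres_to_sq_km(0.2), 4))   # the 0.2 acre lot, in sq km
```

Rounded, this agrees with the figures quoted: about 247 acres in 1 sq km, and about 0.0008 sq km for a 0.2 acre lot.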

thomasgreen · June 7, 2022, 4:08pm

I realize my observations have strong bias, and I wonder if they could be more valuable if I capture more information instead of mostly pictures and location. For plants, is it worthwhile to describe habitat characters (slope, aspect, soil, sun), pollinators present (I’m even worse at identifying insects), population size (this could be a lot of effort), associated species, etc? My confidence in capturing some of this data is low, so I’m afraid it could be poor quality, and I don’t know if the effort required would provide much value to how these observations are used. Maybe that’s best done within a project with goals to focus on biodiversity or species characteristics, and I should keep my observations casual? Usually the only comments I provide are measurements or unique habitat descriptors to aid in identification.

egordon88 · June 7, 2022, 4:11pm

Associated species can tell a lot. Is this wildflower growing at 9,000 ft in a Ponderosa Pine forest or 11,000 ft in a spruce-fir forest? From an accurate GPS, a curious identifier can view the location on Google Earth or NRCS Web Soil Survey (in USA) and find most of that data.

sbushes · June 7, 2022, 4:19pm

As someone who records predominantly arthropods, I feel the opposite tbh!
With a macro lens on my DSLR, plants always seem really difficult to capture well.
I resort to smartphone but quality lacks and I am never sure what to photograph.
Birds also seem way harder to capture to me, even with an average telephoto.
In the UK it’s also pretty boring in terms of diversity though, so not much incentive to focus on them…

Even with the best DSLR and best lens, many chalcids and many springtails will remain unidentifiable to species though to be fair. Of the identifiable springtails, I have a good few at species level using a smartphone (+add-on lens). I’d even wager one could do 1000 species in 1 sq km with only a smartphone and an add-on lens :)
( if you count species as nodes like iNat does ).

egordon88 · June 7, 2022, 4:33pm

Not to get too far in the weeds … I agree on bird pictures, that’s why I mentioned recordings. I’d rather have a song thrush’s song than a grainy brown blob.

You may be in the minority, with Mallards and Robins as the #2 and 3 most uploaded UK species. But what is the appropriate proportion of common birds vs common plants vs nematodes when making observations?

With a macro lens, you should have a great opportunity to supplement phone pictures with closeups of flowers, leaf hairs, etc. The difference between https://www.inaturalist.org/observations/117005643 and https://www.inaturalist.org/observations/116641624, for example.

sbushes · June 7, 2022, 4:41pm

Yes… I do supplement at times, just always seems a bit of an awkward workflow so try and use smartphone and macro lens alone where possible. I admit desire probably plays into this more than anything - I just had minimal interest in plants for a long time. But I am finding curiosity now - both to complete my mapping of my local sq km… and to dig deeper into plant-insect interactions.

graysquirrel · June 7, 2022, 4:44pm

I was thinking about this recently when looking at maps of plant species distribution - many species get frequently recorded in the areas where they’re not common, and ignored in the spots they are common, resulting in a map that is pretty much the inverse of the actual species density!

This is why I go out of my way to make a lot of observations of very common species as well as the unusual ones. Every so often when hiking, I’ll stop, look around, and try to take a picture of every species I can see from the spot I’m standing in. I’ve actually found a lot of new species I never would have noticed by doing that, too.

I think most of the value of iNat for researchers is giving them starting points of where to find populations they want to gather data on. I wouldn’t bother with the other stuff unless you actually want to do it for your own reasons - for most studies, it wouldn’t be that useful, because other observers aren’t collecting the same information or using the same methodology to do so. And it sounds like doing it would feel like a chore and discourage you from making more observations.

This is a good point - I think the best way to gather this data is just to make more observations of other things near your first one. If you find a cool bug, stop and observe 3 of the shrubs nearby as well. If you’re photographing a pollinator, make an observation of the plant it’s on as well. The habitat and possible associations will become obvious on their own, eventually.

fffffffff · June 7, 2022, 4:54pm

Plants are super easy to photograph with your phone, and most are IDable. There’s a topic dedicated just to “how to” with plants, but I go by a simple rule if I don’t know what it is: a photo of the whole plant; then a leaf (if all are the same) from both sides, making sure I hold it so the photo shows its shape from base to end; then a photo of the stem, to see if it has ridges or what type of hairs it has; then the base of the plant; then flowers from 3 sides.
UK is not boring with birds, and with how often you can find random birds from Asia and NA there, it is worth looking.

sbushes · June 7, 2022, 6:02pm

Well, I guess depends on your POV. Relative to invertebrate fauna it is in my book! :)
I grew up loving birds and have done plenty of birding in the past in UK and in Iceland, but to me now it seems a lot of effort for minimal gain in comparison with observing other taxa.

In UK in total we have ~20000 insect species vs ~600 bird species.
There are 54 bird species on iNaturalist observed in my town, of which I’ve seen 42 already. I would struggle to record many of the other 12 I think (e.g. kingfishers are rare to see here, let alone successfully record). Whereas even after finding over 1000 arthropod species, I can go to any local green space any day of the week and still find something new.

fffffffff · June 7, 2022, 6:38pm

I would say 600 is a lot for that territory, and you likely have much more than 54 species in a town, with any major green area it should be 100+, but I agree with what you say about arthropods.)
