I’ve already read many interesting posts on data quality and biases, but is anyone aware of a “scientific” comparison of data quality as assessed by iNaturalist against expert knowledge? How valid are iNat IDs for particular taxonomic groups? And does something like the “community taxon” improve iNat data? I would be happy to get some comments and references.
not sure how “scientific” this is, but you may want to read this if you haven’t already: https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507.
Great question, and I appreciate the interest in the more formalized “scientific” information you seek compared to more anecdotal “I feel like…” types of responses to this type of question. I’m not sure if you’re familiar with Google Scholar (scholar.google.com), but it’s a great way to search for published research on certain topics. Some of the papers can be dense for those not used to reading peer-reviewed research, however.
Here are a few quick excerpts from articles that I found simply by searching Google Scholar with the phrase “iNaturalist ID”:
Overall, it can be concluded that the classification quality of iNaturalist termite data is excellent with error rates up to only 5.1% for genus designations and 10.2% for family designations. (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226534#sec014)
Forty-five percent of the observations [of herpetofauna in Sonoma County, California] made it to the iNaturalist “research grade” quality using their crowdsourcing tools. Independently we evaluated the crowdsourcing accuracy of the research grade observations and found them to be 100% accurate to the species level. (https://theoryandpractice.citizenscienceassociation.org/articles/10.5334/cstp.131/)
Our assessment shows that a high number of lichen observations on iNaturalist are misidentified or lack the necessary microscopic and (or) chemical information required for accurate identification. (https://cdnsciencepub.com/doi/pdf/10.1139/cjb-2021-0160)
In summary, seems like there’s a massive amount of variation in data quality between taxonomic groups. Several articles I found also noted that geography plays a strong role since the AI often suggests species from more well-observed areas (e.g., the USA) in areas that are under-observed, though this trend seems to be getting better in recent years:
In 2018, most images of plants and animals from any part of the world were likely to receive from the system an identification of species inhabiting North America. Over the course of 2019 and 2020, AI has almost stopped suggesting incorrect identifications for plants within European Russia (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7686226/)
If you’re interested in this type of info for certain groups, it’s best to reach out to taxon experts on iNaturalist.
Also, it’s worth recalling that “expert knowledge” is also not a consistent variable depending on the group and context–when I visit museum collections I sometimes find dozens of misidentified insect specimens in a single tray.
iNat has the (online) advantage that the ID is not ‘carved in stone’.
If the taxonomy changes it can be automagically updated, with the previous synonyms still searchable.
When a new-to-iNat scientist trawls thru existing records, they can make appropriate changes. Public discussions on an obs can be interesting for the rest of us to follow.
It is one of the perks of trawling thru old Needs ID (for this NON-scientist)
@jmillsand That was really interesting how you pulled out such pertinent research and to see what other entities have said about iNat data. :)
I’ve observed that the rate of improvement in the Computer Vision identifications has been just amazing in the last year or so.
For older research assessments of iNat, keep in mind that a great deal of resources went into Computer Vision training and re-tooling in 2022. And, the pace of improvement seems to be accelerating.
I feel like …
especially Bryologists (moss experts) and Lichenologists have a ‘hard time’ here regarding ID quality.
Both those groups are common (I mean the observed taxa, not the respective experts - these are actually a rare species here on the platform) and speciose in most areas, easy targets for a quick cell phone photo, but often need very good close-ups, microscopy or chemical methods to ID, while the CV on the other side is quite confident with its suggestions (on species level!) of the most common species, where often it’s even a different family.
Thus, a high number of observations, many of those with inaccurate IDs on the one hand, and few IDers (very much appreciate your efforts!! ) on the other hand.
Ask an Odonatologist or Herpetologist, and they will tell a different story of course.
Well, what is “expert knowledge” to you?
The professor emeritus of botany whose life work is the taxonomy of a branch of plantae?
The English literature professor with a passion for botany with co-author credit on publications of the former?
The government or NGO employee with the job title “vegetation ecologist”?
The private consultant with the job title “environmental scientist”?
The factory foreman with a lifelong field botany hobby whose decades of collections and field work have culminated in a checklist of flora provided to their local field naturalists club?
I would argue that all of these individuals can hold the same depth and breadth of knowledge, but, traditionally, our understanding of “expert” has been fairly narrow. iNaturalist gives an unprecedented legitimacy to those latter individuals. It’s up to the data users to decide on the quality of that knowledge.
iNat data is created by expert knowledge, so I don’t know how you can compare them. If there are serious problems with IDs, it means experts aren’t playing their role.
@teellbee at least this article is not really a separate entity, it’s our project here on iNat
This is a fascinating discussion with many great points, but maybe what @alexwirth was actually asking was “how accurate is iNaturalist’s Computer Vision?”, with “expert” opinion assumed accurate. This would be hard to assess from existing records, in part because many of us pull up the CV for an observation (because it’s easier than typing in a name) and then accept or reject it based on whatever our level of expert knowledge is. So even if the ID is tagged as being done by the CV (the little shield icon), it’s been vetted by the observer.
Also, the more experts do IDs, the faster the CV improves! I can’t wait to see when it can do a good job with grasses (the mosses, yes, are another issue).
If that is the question, a bad workman blames his tools. CV is a suggestion - we don’t all have a row of field guides, or a university library, or the time and skill to ‘key it out’. CV is a good place to start, but it is our name attached to our ID. I decided this obs is This species and I can’t blame CV if I called it wrong.
The quality of computer-vision ID suggestions has definitely improved over time. Several factors appear to have contributed to this:
- iNat has accumulated more observations and more human identifications over time, so computer vision has a bigger database to work from, which means it now covers more taxa.
- There was a change to the identification algorithm to focus CV suggestions on taxa observed relatively nearby. This I think is the biggest factor in improving IDs outside of North America and Western Europe.
- The CV training process was reworked to enable it to be updated almost every month rather than 1–2 times a year. Also, it was made more flexible so that it can now identify at a genus level even when there are not enough observations to support species-level IDs.
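To make the second point concrete, here is a purely illustrative sketch of what "focusing suggestions on taxa observed nearby" could look like. It is not iNaturalist's actual implementation: the function name `rerank_by_locality`, the `penalty` value, and the placeholder taxon names are all assumptions invented for this example.

```python
def rerank_by_locality(vision_scores, nearby_taxa, penalty=0.1):
    """Re-rank image-based suggestions using a simple 'seen nearby' rule.

    vision_scores: dict mapping taxon name -> score from the image model
    nearby_taxa:   set of taxa that have been observed near the location
    penalty:       hypothetical factor applied to taxa NOT observed nearby
    """
    adjusted = {
        taxon: (score if taxon in nearby_taxa else score * penalty)
        for taxon, score in vision_scores.items()
    }
    # Highest adjusted score first
    return sorted(adjusted.items(), key=lambda item: item[1], reverse=True)


# Placeholder taxa: a look-alike from a well-observed region scores highest
# on the image alone, but only the local taxon has nearby observations.
scores = {"taxon_well_observed_elsewhere": 0.6, "taxon_local": 0.3}
ranking = rerank_by_locality(scores, nearby_taxa={"taxon_local"})
print(ranking[0][0])
```

With a rule like this, a visually similar species from a heavily observed region no longer automatically outranks the local one, which matches the kind of improvement described above for under-observed areas.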
Of course, computer vision is just one factor that informs a proportion of iNat IDs. The exchange of real live knowledge by people from interested amateurs to seasoned experts is still the core of the process.
To me, and I think for most others here, “expert” is a person who knows the group well and can identify with high accuracy at least some individuals of the taxonomic group, in some geographic areas, if the photos are good enough.
Titles and job descriptions may hint at who may be experts, but those of us who work in these fields know that knowledge and titles don’t necessarily go together.
I think the best we can do is hold taxa to a set standard, e.g. someone takes the reins on identifying a group and makes consistent enough mistakes to make fixing them later much easier. It’s why I’ve made an effort to utilize my IDs to standardize the groups I like, in order to form a baseline for others to build on. Species with a great number of observations aren’t really feasible to comb through with near-perfect accuracy, even for “experts,” (due to the nature of identifying from images) but I think it’s quite possible for someone specializing in an identifiable group to achieve over 99% accuracy.
I can say from personal experience that having someone pick through the more “unsavory” groups can quite quickly begin to improve data quality. After sorting through most of Polygonatum in North America, definite range maps miraculously appeared, neatly sorting observations into their expected and predicted ranges. Same goes for the genus Prosartes. Using the Compare tool, I can make custom maps showing different species observations alongside one another, which is helpful for groups like these two since the North American representatives have different habitat preferences. In going through existing observations, I can reliably find my old misidentifications by where observations of unlikely-to-coexist, similar-looking-but-different-habitat-preferring species exist.
Oh wow… thank you all for your interesting comments on my question. I’m very happy to see so many interesting answers and ideas. @wdvanhem I fully agree about the narrow view of “what are experts”, and I’m happy that iNaturalist allows everyone to be an “expert”. Citizen science is sometimes reported (“it feels like”) to be incorrect and unreliable, most likely on the reasoning that since everyone can contribute, and the data are not curated by experts (people who have reputations in the field, basically through scientific publications), the data have to be less correct. (That’s not my personal opinion, but “common sense” in some parts of “society”.) So I am interested in publications such as those @jmillsand (or @pisum) posted, in which iNat data are compared to differently curated data sets or to re-evaluated iNat data sets. Of course this will differ between taxonomic groups, but how it differs between groups is interesting in itself.
So thanks again for ALL the interesting posts.
Just for all who are interested in this topic as well: I came across this interesting thesis addressing, and in part comparing, different systems, not only iNat.
I’m interested enough in this, that I did a small scientific test, for moths in western Canada. First caveat: I consider myself to be an expert. So, I posted a large number of moth photos, accepted the CV identification, and then waited 6 months for citizen scientists to weigh in, before correcting the IDs based on my own expertise (with dissection as required, to confirm). Short answers: CV was about 80% accurate on macromoths, but less than 50% accurate on micromoths. The real humans - when they attempted an ID - improved upon the CV significantly, bringing macromoths up to about 95% species accuracy, which is almost good enough for me as a scientist. Unfortunately there were not enough humans with enough time to attempt all the images, particularly among micromoths.
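For anyone wanting to run a similar comparison, here is a minimal sketch of how the tallying could be done, assuming you have a list of (group, suggested ID, expert-confirmed ID) records. The records below are made-up placeholders, not the moth data described above, and the function names are hypothetical. The Wilson score interval is added here only as one common way to show how uncertain an accuracy figure is when the sample is small.

```python
import math
from collections import defaultdict


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def accuracy_by_group(records):
    """Return {group: (n_correct, n_total, accuracy)}.

    records: iterable of (group, suggested_id, expert_id) tuples, where
    expert_id is treated as the reference answer.
    """
    tallies = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for group, suggested, expert in records:
        tallies[group][1] += 1
        if suggested == expert:
            tallies[group][0] += 1
    return {g: (c, n, c / n) for g, (c, n) in tallies.items()}


# Placeholder records for illustration only (not real observations)
records = [
    ("macromoth", "species A", "species A"),
    ("macromoth", "species B", "species B"),
    ("macromoth", "species C", "species D"),
    ("micromoth", "species E", "species F"),
    ("micromoth", "species G", "species G"),
]

for group, (correct, total, acc) in accuracy_by_group(records).items():
    low, high = wilson_interval(correct, total)
    print(f"{group}: {correct}/{total} correct, interval {low:.2f} to {high:.2f}")
```

The same tally could be run twice, once on the CV suggestions and once on the community IDs after the waiting period, to compare the two against the expert reference.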
Thanks for the interesting answer and the attempt to address this issue.
Fascinating! Would it be worth repeating the experiment in, say, five years, to see if the CV had improved significantly in that time period?
This would be interesting, but you’d need an independent set of images. Those observations and their images will now be used for training themselves, so you’d need an independent sample with the same coverage.