I’ve mentioned from time to time that scientists shouldn’t blindly use databases like GBIF (or even the databases of specific scientific museums, like the Smithsonian) or iNaturalist data for research purposes. By “blindly” I mean not verifying the accuracy of the identifications and not following up on outlier observations (records that fall geographically out of range). Doing so is dangerous because every biodiversity database contains a large number of errors, largely because natural history collections are underfunded and understaffed.
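To make the “following up” part concrete, here is a minimal sketch, assuming GBIF’s public occurrence-search API (api.gbif.org/v1/occurrence/search), of pulling records for a species and flagging coordinates that sit far from the bulk of the data. The species name, the 2,000 km cutoff, and the single 300-record page are illustrative choices on my part, not a recommended workflow.

```python
# A sketch only: fetch GBIF occurrences for one species and flag records
# whose coordinates fall far from the median of all records.
import math
import requests

GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def fetch_occurrences(scientific_name, limit=300):
    """Fetch up to `limit` georeferenced occurrence records (one page) from GBIF."""
    params = {"scientificName": scientific_name, "hasCoordinate": "true", "limit": limit}
    resp = requests.get(GBIF_SEARCH, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["results"]


def flag_geographic_outliers(records, cutoff_km=2000.0):
    """Return records lying more than `cutoff_km` from the median coordinate."""
    coords = [(r["decimalLatitude"], r["decimalLongitude"])
              for r in records
              if r.get("decimalLatitude") is not None and r.get("decimalLongitude") is not None]
    if not coords:
        return []
    med_lat = sorted(lat for lat, _ in coords)[len(coords) // 2]
    med_lon = sorted(lon for _, lon in coords)[len(coords) // 2]
    flagged = []
    for r in records:
        lat, lon = r.get("decimalLatitude"), r.get("decimalLongitude")
        if lat is None or lon is None:
            continue
        dist = haversine_km(lat, lon, med_lat, med_lon)
        if dist > cutoff_km:
            flagged.append((round(dist), r.get("gbifID"), r.get("country"), r.get("issues", [])))
    return sorted(flagged, key=lambda t: t[0], reverse=True)


if __name__ == "__main__":
    records = fetch_occurrences("Bombus pensylvanicus")
    for dist, gbif_id, country, issues in flag_geographic_outliers(records):
        # Candidates for expert review, not records to silently delete.
        print(f"{dist:>6} km from median  gbifID={gbif_id}  country={country}  issues={issues}")
```

A flagged record isn’t automatically wrong; it’s a prompt to go back to the specimen, the photograph, or the collector before the point ends up in an analysis.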
A real-world example of overreliance on these databases has just occurred. The bumble bee Bombus pensylvanicus has been petitioned for listing as an endangered species. The petition rests on studies that made the broad assumption that the database records are largely correct. Here’s one such study… and a group of scientists and curators have written a letter in response to it. Both the study and the response letter are here:
https://science.sciencemag.org/content/367/6478/685/tab-e-letters
iNaturalist is extremely valuable for documenting occurrences of species. But if scientists don’t examine the photographs themselves and use their own expertise to verify accuracy, very serious mistakes can be made. The same is true of examining specimens in scientific museum collections. In both cases, assume the identifications are incorrect unless you can document otherwise. Too much work, you say? Not as much work as having to rein in bad science once it escapes.
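In the same spirit, here is a rough sketch, assuming iNaturalist’s public v1 observations API, that does nothing clever: it just collects the photo URLs attached to research-grade observations of a species so that someone with the relevant expertise can actually look at the images. The field names reflect my reading of the API’s JSON, and the photo-size substitution is a common convention rather than a documented guarantee.

```python
# A sketch only: list photos from research-grade iNaturalist observations
# of one species so an expert can review the images themselves.
import requests

INAT_OBSERVATIONS = "https://api.inaturalist.org/v1/observations"


def observation_photos(taxon_name, per_page=30):
    """Yield (observation URL, place guess, photo URL) for research-grade records."""
    params = {
        "taxon_name": taxon_name,
        "quality_grade": "research",  # community-vetted, but still worth checking
        "per_page": per_page,
    }
    resp = requests.get(INAT_OBSERVATIONS, params=params, timeout=30)
    resp.raise_for_status()
    for obs in resp.json()["results"]:
        for photo in obs.get("photos", []):
            # Thumbnails come back by default; swapping the size token for
            # "large" is a common convention for getting a reviewable image.
            url = (photo.get("url") or "").replace("square", "large")
            yield obs.get("uri"), obs.get("place_guess"), url


if __name__ == "__main__":
    for obs_uri, place, photo_url in observation_photos("Bombus pensylvanicus"):
        print(f"{place or 'unknown locality'}\n  observation: {obs_uri}\n  photo: {photo_url}")
```

The script doesn’t verify anything by itself; it just puts the evidence in front of the person who can.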
It makes me wonder whether uncurated database repositories like GBIF will end up causing more harm to scientific knowledge than benefit. They make it too easy to do “big science” in a bad way.
Note: I’m not arguing against iNaturalist or against placing voucher specimens in scientific collections, just against the blind use of databases that rely on those specimens and observations being accurately identified.