Taxon rank and name mismatches between iNaturalist and GBIF

iNat Research Grade (RG) records are also added to GBIF, which additionally aggregates records from other sources, including museum specimens. GBIF is a great source, although parts of its taxonomy need updating. It lists some previous (now invalid) taxon names, which prevent the maps on the corresponding iNat taxon pages (which use the current valid name) from displaying the GBIF data filter. GBIF also uses some previous (now invalid) taxonomic ranks, e.g. subspecies that have since been elevated to species rank elsewhere in the literature.

What if anything can be done to give feedback or correction to GBIF to work toward a more fully aligned taxonomy without name or rank mismatches?

Curation is a major investment of effort on iNat alone, and it becomes infeasible if it has to be done manually on multiple individual databases (GBIF and others, e.g. ITIS). Any solution toward greater standardization, where corrections would only need to be made in one place, would be very useful to all databases.

Examples:

  • Name mismatch between iNat and GBIF: Lasioglossum zephyrus vs. zephyrum (as a result, the GBIF map filter isn’t available for the species on iNat’s taxa page map).

  • GBIF receives images of museum specimens and uses the names written on the physical specimen labels in the photos, but those names are often now invalid, having since been revised in the academic literature. This can affect both the name and the taxonomic rank (e.g. species vs. subspecies).

I see you edited your post to remove an example you were using, but I will use it here.

GBIF aggregates occurrence data containing taxon names from multiple sources. To provide some level of standardization, especially with regard to synonymy and higher classification, those names are mapped to the GBIF taxonomy database. That separate database is also an aggregation/integration of data from multiple sources.

Take your (now deleted) Lasioglossum zephyrum example.

If you look at the GBIF page …
https://www.gbif.org/species/1353533
You see it says “Source: Catalogue of Life Checklist”
and if you go to the CoL annual checklist and look up the name …
https://www.catalogueoflife.org/data/taxon/3SFC6
You see the source of the name (and its synonyms/higher classification) is ITIS …
https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=759587#null
So in this case there is a single source of the data, and that is usually the case. That is where you need to direct your feedback about errors.
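For anyone who wants to do that trace without clicking through pages, here is a minimal sketch in Python. It assumes the public GBIF API serves a backbone record at https://api.gbif.org/v1/species/<key>; the field names it picks out (datasetKey, constituentKey, and so on) are my assumptions about the JSON layout, and the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

GBIF_API = "https://api.gbif.org/v1"


def species_url(usage_key: int) -> str:
    """Build the URL of the GBIF record for a taxon key (e.g. 1353533)."""
    return f"{GBIF_API}/species/{usage_key}"


def summarize_source(record: dict) -> dict:
    """Pick out the fields that hint at where a backbone name came from.

    The field names are assumptions about the JSON layout; any field that
    is absent comes back as None instead of raising an error.
    """
    wanted = ("scientificName", "taxonomicStatus", "rank",
              "datasetKey", "constituentKey")
    return {field: record.get(field) for field in wanted}


def fetch_source(usage_key: int) -> dict:
    """Download one record and summarize it (needs network access)."""
    with urlopen(species_url(usage_key), timeout=30) as response:
        return summarize_source(json.load(response))
```

Calling fetch_source(1353533) should return the summary for the page linked above. Whatever dataset it points to is only the first hop; you would still follow it back to the original provider by hand, as in the links above.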

The sources of data feeding into the global Catalogue of Life are of variable quality and currency. For some groups it is very likely the iNat taxonomy will be better, especially for easily observed/identified taxa that are dominant in iNat observations, and that is largely because the maintenance of the iNat taxonomy is crowdsourced and dynamic. It’s a good model that could be adopted by entities like ITIS and the Catalogue of Life. The flip side is that the ‘Global Species Databases’ that feed into CoL are often maintained by leading experts who might find the iNat model challenging.

Do you know if it would be possible or worthwhile to ever contact GBIF themselves, for any of the various issues?

Generally no, there is no point.
GBIF aggregates data from multiple providers. You need to track back to the source of the data to provide feedback. For occurrence records there is (or was) a mechanism where feedback could be entered on the GBIF site and it was automatically directed to the source data provider. I don’t believe there is an equivalent for the data used to synthesize the taxonomic backbone database.

In that case do you have experience or advice about contacting any of the other sources? (especially to give corrections) I have contacted ITIS before and they replied, but it wasn’t about corrections.

I agree, standardization between platforms would be ideal, if ever possible (ITIS, CoL, also Discover Life, BugGuide, etc).

The topic didn’t get any replies at first, so I tried to simplify it. I’ll add some details back now. The Lasioglossum example is also complex: the GBIF name is incorrect (per the taxonomic code), yet GBIF’s name was based on an academic publication.

Remaining questions:

Are more standardization and corrections possible and worth pursuing?

Also, GBIF sometimes creates invalid names or taxonomic ranks when receiving museum collection specimen photos with physical determination labels showing names which were since revised in literature.

A major point these issues raise is that it would be better if GBIF didn’t draw taxon names and ranks from multiple sources, because doing so creates more invalid names and ranks.

Finally, what happens if iNat. adds a species GBIF doesn’t even have in any form? Does GBIF add the species via the iNat RG records, or can it not add it?

Here is GBIF’s information on their taxonomic backbone …
GBIF Backbone Taxonomy

These problems are common and not limited to these two portals, so we should always rely on current scientific publications in a given area; otherwise we will, unfortunately, run into errors. The solution would be to hire professional staff to edit such content, but that is too expensive.

The iNat taxonomy is based on established authority lists (where they exist) and is generally not based on current publications. The reasons for that are stated in the curator’s guide …
Curator Guide · iNaturalist

Some iNat taxon rank and name changes are based on publications too. I created curation flags to request taxon changes based on articles, some of which have been completed (e.g. adding species). So, iNat uses both reference/authority database or handbook sources, and sometimes publications, essentially on a case by case basis.

Re: the main topic, I wonder if each major online database (e.g. GBIF, EOL, ITIS, iNat, Discover Life), or specifically those which add citizen science data (iNat, GBIF, DL), would consider solutions toward greater standardization if someone contacted them, explained the current issues, and suggested solutions like more editing or greater standardization.

The idea is, it would be ideal to have one “upstream” database/source which all taxon changes would be made to, which would then automatically be made in all “downstream” sources. Or similarly, for each platform to have equal standing, but where changes to any would reflect on all others. If possible also, it would help for the standardization to only apply for shared taxa, but where any of the platforms could also potentially have additional species (like GBIF currently does), which wouldn’t be affected by standardization.
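To make the “upstream/downstream” idea concrete, here is a toy sketch in Python. It is purely illustrative: no such shared system exists as far as this thread establishes, the class and method names are invented, and the taxon names are placeholders. It shows a rename made once upstream reaching every platform that shares the taxon, while a platform’s extra taxa are left alone.

```python
class Downstream:
    """A platform (hypothetical) holding its own set of taxon names."""

    def __init__(self, label, names):
        self.label = label
        self.names = set(names)

    def apply_rename(self, old, new):
        """Apply an upstream rename only if this platform shares the taxon."""
        if old not in self.names:
            return False
        self.names.remove(old)
        self.names.add(new)
        return True


class Upstream:
    """The single place where taxon changes would be made."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, platform):
        self.subscribers.append(platform)

    def rename(self, old, new):
        """Push one rename to all platforms; return labels of those changed."""
        return [p.label for p in self.subscribers if p.apply_rename(old, new)]


# Placeholder names: "Aus bus" is shared, "Aus cus" exists on one platform only.
platform_a = Downstream("A", ["Aus bus"])
platform_b = Downstream("B", ["Aus bus", "Aus cus"])
hub = Upstream()
hub.subscribe(platform_a)
hub.subscribe(platform_b)
changed = hub.rename("Aus bus", "Aus bum")
```

After the rename, both platforms carry the new name and platform B still has its extra taxon untouched, which is the “only apply to shared taxa” behaviour described above.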

If no standardization turns out to be possible, it would at least be better if GBIF noted/flagged taxon rank or name mismatches more clearly and people responded to fix those issues, or if GBIF accepted feedback or direct edits (e.g. if some iNat users volunteered to make GBIF corrections).

Some things can already be done by iNat users too, since the issues aren’t entirely on GBIF’s end. Users can look for taxon pages whose maps are missing the GBIF filter. For each one, they can check whether GBIF (the site) has data points on its own map. If it does, there is a name mismatch between iNat and GBIF. Those iNat taxon pages could then be flagged for attention (to address the iNat-GBIF mismatch).
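That check could be partly automated. Below is a rough sketch in Python, assuming GBIF’s name-matching endpoint at https://api.gbif.org/v1/species/match and assuming its JSON reply carries matchType, canonicalName and status fields; the function names and the category labels are my own, not anything GBIF or iNat defines.

```python
from urllib.parse import urlencode

MATCH_ENDPOINT = "https://api.gbif.org/v1/species/match"


def match_url(inat_name: str) -> str:
    """Build the GBIF name-match request URL for an iNat taxon name."""
    return MATCH_ENDPOINT + "?" + urlencode({"name": inat_name})


def classify_match(inat_name: str, match: dict) -> str:
    """Sort a GBIF match reply (already parsed from JSON) into a category.

    The keys read here are assumptions about the reply format.
    """
    match_type = match.get("matchType", "NONE")
    gbif_name = match.get("canonicalName")
    if match_type == "NONE" or gbif_name is None:
        return "not found in GBIF"
    if match_type == "HIGHERRANK":
        return "only a higher taxon matched"
    if gbif_name != inat_name:
        return "name mismatch"
    if match.get("status") != "ACCEPTED":
        return "same name, but not accepted in GBIF"
    return "aligned"


# Hand-written example reply for illustration, not real GBIF output.
example_reply = {
    "matchType": "FUZZY",
    "canonicalName": "Lasioglossum zephyrum",
    "status": "ACCEPTED",
}
result = classify_match("Lasioglossum zephyrus", example_reply)
```

Anything that comes back as “name mismatch” or “same name, but not accepted in GBIF” would be a candidate for a flag on the iNat side; the actual fetching (one request per taxon name) is left out here to keep the sketch short.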
