I’d like to be able to match GBIF taxon identifiers with their respective iNaturalist taxon identifiers (example GBIF:5141342 = iNaturalist:49972). iNaturalist periodically dumps observations to GBIF, so somewhere there must exist a database or something to translate between taxon IDs. I understand the taxonomy backbones probably aren’t one-to-one (for example iNat may include a taxon that GBIF doesn’t) and that’s ok. Does anyone know a straightforward way to go between GBIF and iNat taxon IDs?
iNaturalist taxon pages do list other external taxon identifiers (example) including GBIF, but iNat’s get_taxa_id API endpoint doesn’t appear to return the GBIF ID.
Some possible solutions are to go through WikiData (as suggested by @pisum) or maybe the ChecklistBank API, but I’d love to find a more direct go-between, ideally one where I can search on the GBIF taxon ID and get the corresponding iNat taxon ID.
Welcome to the forum!
iNat doesn’t keep track of GBIF taxon identifiers for the purposes of data export. Instead it sends over the scientific name and GBIF uses its name matching tool.
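For anyone who wants to see what that name matching looks like from the outside, here is a minimal Python sketch against GBIF's public species match endpoint (https://api.gbif.org/v1/species/match). The helper names are my own, and this only illustrates the name-based lookup; I am not claiming it reproduces exactly what happens when the iNat export is ingested.

```python
import json
import urllib.parse
import urllib.request

GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def gbif_match_url(scientific_name):
    """Build the species match URL for a scientific name (hypothetical helper)."""
    return GBIF_MATCH_URL + "?" + urllib.parse.urlencode({"name": scientific_name})


def usage_key(payload):
    """Return the matched GBIF usageKey, or None when GBIF reports no match."""
    if payload.get("matchType") == "NONE":
        return None
    return payload.get("usageKey")


def gbif_key_for_name(scientific_name, timeout=30):
    """Ask GBIF to match a name and return the backbone key (network call)."""
    with urllib.request.urlopen(gbif_match_url(scientific_name), timeout=timeout) as resp:
        return usage_key(json.load(resp))
```

Note that a fuzzy or higher-rank match still returns a key, so it is worth checking the matchType field before trusting the result.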
Some taxa know their GBIF id through a taxon scheme, but this is not available in the API, see here.
Assuming you have the species (scientific) name associated with the GBIF id, I think you can use the get_taxa endpoint’s q parameter to submit a query for the name and get the iNat id. You may not get anything in some cases, though, and you will have to sort through the results to find the matching name, if there is one.
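A rough Python sketch of that name lookup, in case it helps. It assumes the public iNat API at https://api.inaturalist.org/v1/taxa and that each result carries "id" and "name" fields; the function names are made up for illustration, and the exact-name filter is just one way of sorting through the results.

```python
import json
import urllib.parse
import urllib.request

INAT_TAXA_URL = "https://api.inaturalist.org/v1/taxa"


def pick_exact_match(results, scientific_name):
    """Return the id of the first result whose name equals scientific_name.

    Comparison ignores case and surrounding whitespace; returns None when
    nothing matches exactly.
    """
    wanted = scientific_name.strip().lower()
    for taxon in results:
        if taxon.get("name", "").strip().lower() == wanted:
            return taxon.get("id")
    return None


def inat_id_for_name(scientific_name, timeout=30):
    """Query the taxa endpoint with q=<name> and filter the results (network call)."""
    query = urllib.parse.urlencode({"q": scientific_name, "per_page": 30})
    with urllib.request.urlopen(f"{INAT_TAXA_URL}?{query}", timeout=timeout) as resp:
        payload = json.load(resp)
    return pick_exact_match(payload.get("results", []), scientific_name)
```

Because the match is on the name string only, homonyms and synonyms can still give a wrong or missing id, so results are worth spot-checking.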
At least historically the GBIF ‘Taxon IDs’ were non-persistent and were identifiers for instances of name-strings, not taxon concepts. So, any linkages using these identifiers would have transient validity. That may have changed.
Personally I would have preferred just 49972 as the identifier, but iNat has decided to publish the DwC archive with the website URLs instead. Maybe that should be reviewed?
You can then list related names from any of the other datasets by using this API call (53147 being the GBIF Backbone datasetKey):
The crossmapping is based on names, not taxa. We have also planned to work on a taxon concept mapping more like Avibase does, but that is not on our immediate priority list right now.
Thanks @jwidness for the tip regarding Wikidata. For anyone who needs it, here’s a Wikidata SPARQL query that takes a GBIF taxon id (“5141342” in this example) and returns the corresponding (1) iNaturalist taxon id and (2) ITIS TSN, if known.
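For those who would rather call it from a script, below is a minimal Python sketch of that kind of lookup. It is my own reconstruction, not necessarily the exact query from the post: it assumes the Wikidata properties P846 (GBIF taxon ID), P3151 (iNaturalist taxon ID) and P815 (ITIS TSN), the public endpoint at https://query.wikidata.org/sparql, and a placeholder User-Agent string that you should replace with your own.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"


def build_query(gbif_id):
    """Build a SPARQL query mapping a GBIF taxon id to iNat and ITIS ids."""
    gbif_id = str(gbif_id)
    if not gbif_id.isdigit():
        raise ValueError("GBIF taxon id must be numeric")
    return (
        "SELECT ?item ?inat ?itis WHERE {\n"
        f'  ?item wdt:P846 "{gbif_id}" .\n'      # GBIF taxon ID
        "  OPTIONAL { ?item wdt:P3151 ?inat . }\n"  # iNaturalist taxon ID
        "  OPTIONAL { ?item wdt:P815 ?itis . }\n"   # ITIS TSN
        "}"
    )


def parse_bindings(payload):
    """Turn a SPARQL JSON response into a list of {'inat': ..., 'itis': ...} dicts."""
    rows = []
    for binding in payload.get("results", {}).get("bindings", []):
        rows.append({
            "inat": binding.get("inat", {}).get("value"),
            "itis": binding.get("itis", {}).get("value"),
        })
    return rows


def lookup(gbif_id, timeout=60):
    """Run the query against Wikidata (network call)."""
    params = urllib.parse.urlencode({"query": build_query(gbif_id), "format": "json"})
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL_URL}?{params}",
        headers={"User-Agent": "gbif-inat-crosswalk-example/0.1"},  # placeholder
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_bindings(json.load(resp))
```

Calling lookup("5141342") should return one row per matching Wikidata item, with None wherever Wikidata has no iNat or ITIS id recorded.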