Use Wikidata to link to appropriate Wikipedia articles in all languages

Thanks to @bouteloua for pointing me to this thread. A couple of notes:

  • Matching iNat taxa to Wikidata items would rely not on strings, as suggested earlier in this thread, but on Wikidata’s and iNat’s IDs. Every Wikidata item has a free-form label in multiple languages. While the text label is subject to change, the numeric ID of the item (what Wikidata calls a “QID”) is persistent. For example, item Q630829 represents Larus occidentalis and has labels in 47 different languages. Each Wikidata item also contains links to the corresponding Wikipedia articles, when they exist. There are currently Wikipedia articles about Larus occidentalis in 28 language editions, from English to Navajo to Hungarian. Finally, each Wikidata item also links to many external databases in its Identifiers section: the item about Larus occidentalis, for example, links to 21 other databases, from NCBI to eBird.
  • Now, one of these identifiers is the iNaturalist Taxon ID. We created this property a while ago for the purpose of reconciling iNat taxa and Wikidata items and, as far as I can tell, it has been extensively populated in Wikidata (it’s currently used by over half a million Wikidata items). The first batch of iNat IDs mapped to Wikidata via Mix’n’Match got imported by a script (since the property didn’t exist at that time), but future edits made in Mix’n’Match should go live on Wikidata immediately. I don’t know how often Magnus Manske refreshes the Mix’n’Match catalog with new data dumps from iNat, but we can ask him :)
  • So why does this all matter? Having an iNat ID → Wikidata QID mapping means that it’s straightforward to automatically retrieve all the associated Wikipedia articles for a given iNat taxon, irrespective of spelling or variation in the title. How best to ingest these links depends on what works for iNaturalist, but this is by far the best mechanism for getting correct matches from iNat to Wikipedia, instead of relying on heuristics that match labels.
  • Finally, I’d like to make a case for adding a Wikidata link directly in the “More info” list. There’s a lot of information about taxa in Wikidata that would be valuable to iNat users, if the link were displayed alongside GBIF, BOLD, Google Scholar and others.
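
To make the ID-based lookup above a bit more concrete, here is a minimal Python sketch of how one could go from an iNat taxon ID to Wikipedia articles in all languages through the public Wikidata Query Service. A few things here are my assumptions rather than anything decided in this thread: that the iNaturalist Taxon ID property is P3151, that a SPARQL query is an acceptable way to ingest the links (a dump-based import may suit iNat better), and the function names and User-Agent string, which are purely illustrative.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
# Assumption: P3151 is the Wikidata property for the iNaturalist Taxon ID.
INAT_TAXON_PROPERTY = "P3151"


def build_query(inat_taxon_id: int) -> str:
    """Build a SPARQL query for the Wikipedia articles of one iNat taxon."""
    taxon_id = int(inat_taxon_id)  # raises ValueError on non-numeric input
    return (
        "SELECT ?item ?article ?lang WHERE {\n"
        f'  ?item wdt:{INAT_TAXON_PROPERTY} "{taxon_id}" .\n'
        "  ?article schema:about ?item ;\n"
        "           schema:inLanguage ?lang ;\n"
        "           schema:isPartOf ?site .\n"
        '  ?site wikibase:wikiGroup "wikipedia" .\n'
        "}"
    )


def parse_articles(response: dict) -> dict:
    """Turn a SPARQL JSON result into a {language code: article URL} map."""
    articles = {}
    for binding in response["results"]["bindings"]:
        articles[binding["lang"]["value"]] = binding["article"]["value"]
    return articles


def fetch_articles(inat_taxon_id: int,
                   user_agent: str = "inat-wikidata-sketch/0.1 (example)") -> dict:
    """Query the Wikidata Query Service (needs network access)."""
    params = urllib.parse.urlencode(
        {"query": build_query(inat_taxon_id), "format": "json"}
    )
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        headers={"User-Agent": user_agent},
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_articles(json.load(resp))
```

Calling `fetch_articles(some_taxon_id)` with a real iNat taxon ID should return one URL per language edition that has an article. Note that nothing in the query touches a label or an article title: the match goes through the two identifiers only, which is the whole point.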