Use Wikidata to link to appropriate Wikipedia articles in all languages

Unfortunately, this is not something you can count on. There is no standard for naming and, in typical Wiki fashion, agreement seems impossible. There is an ongoing edit war between users setting the scientific name as the default and users replacing it with the English name.

Right – “for the most part” as in the grand scheme of the number of taxa listed there.

Also just wanted to clarify that the connection doesn’t have to rely on a scientific name in iNat matching up to the title of the item on Wikidata, because the iNaturalist taxon_id is/can be linked to each Wikidata item.


Here's a higher visibility example that just came up: the iNat taxon for mammals is directing to a 2016 film called Mammal when iNaturalist is viewed in Dutch...

Using any label for matching seems like a bad idea, in any case. If we want to do text matching rather than manual linking via IDs, we should be looking exclusively at the taxon name property. And of course this matching should still be overridden by a manual match to an iNaturalist taxon ID on the Wikidata side.

Edit: it seems most iNat taxa should already be matched to Wikidata properly anyway, according to the appropriate mixnmatch catalog, meaning there should be a very easy approach to just move to Wikidata ASAP.

I’m not really clear on how to interpret that mixnmatch data. I have added numerous iNat taxon ID numbers, but my user name does not appear there at all, while even users with 1 edit are listed. Is it only showing edits made with that tool in the stats?

Yes, I think so. It might be that the ones that already existed were “manually matched” by Manske via some script, or that the system skipped the already matched ones entirely. I have only used mix’n’match for stuff that wasn’t in Wikidata at all yet, so I’m not sure what it does otherwise.

Thanks to @bouteloua for pointing me to this thread. A couple of notes:

  • matching iNat taxa to Wikidata items would not rely on strings, as suggested earlier in this thread, but on Wikidata’s and iNat’s IDs. Every Wikidata item has a free-form label in multiple languages. While the text label is subject to change, the numeric ID of the item (what Wikidata calls a “QID”) is persistent. E.g. item Q630829 represents Larus occidentalis and it has labels in 47 different languages. Each Wikidata item also contains links to the corresponding Wikipedia articles, when they exist. There are currently Wikipedia articles about Larus occidentalis in 28 language editions, from English to Navajo or Hungarian. Finally, each Wikidata item also contains links to many external databases in the Identifiers section, for example the item about Larus occidentalis links to 21 other databases, from NCBI to eBird etc.
  • Now, one of these identifiers is the iNaturalist Taxon ID. We created this property a while ago for the purpose of reconciling iNat taxa and Wikidata items and, as far as I can tell, it has been extensively populated in Wikidata (it’s currently used by over half a million Wikidata items). The first batch of iNat IDs mapped to Wikidata via Mix’n’Match got imported by a script (since the property didn’t exist at that time), but future edits made in Mix’N’Match should go live on Wikidata immediately. I don’t know how often Magnus Manske refreshes the Mix’N’Match catalog with new data dumps from iNat but we can ask him :)
  • So why does this all matter? Having an iNat ID → Wikidata QID mapping means that it’s straightforward to automatically retrieve all the associated Wikipedia articles for a given iNat taxon, irrespective of spelling or variation in the title. How best to ingest these links depends on what works best for iNaturalist, but this is by far the best mechanism for getting correct matches from iNat to Wikipedia, rather than relying on heuristics that match labels.
  • Finally, I’d like to make a case for adding a Wikidata link directly in the “More info” list. There’s a lot of information about taxa in Wikidata that would be valuable to iNat users, if the link were displayed alongside GBIF, BOLD, Google Scholar and others.
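
For readers who want to see how the pieces described in the notes above (labels, Wikipedia links, identifiers) show up in practice, here is a minimal Python sketch that reads them from a Wikidata entity document such as the JSON served at `Special:EntityData/<QID>.json`. The function names are made up for illustration, and the assumption that a Wikipedia edition's site key is the language code followed by `wiki` holds for most, but not necessarily all, editions. P3151 is the iNaturalist taxon ID property.

```python
import re

# URL pattern for the JSON form of a Wikidata item.
ENTITY_DATA_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
INAT_TAXON_ID = "P3151"  # Wikidata property: iNaturalist taxon ID


def entity_data_url(qid):
    """Return the JSON URL for an item, rejecting anything that is not a QID."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a QID: {qid!r}")
    return ENTITY_DATA_URL.format(qid=qid)


def label(entity, lang):
    """Free-form label in one language, or None if there is none."""
    entry = entity.get("labels", {}).get(lang)
    return entry["value"] if entry else None


def wikipedia_title(entity, lang):
    """Title of the linked Wikipedia article in one language, or None."""
    # Assumption: the site key is "<lang>wiki", with hyphens as underscores.
    site = lang.replace("-", "_") + "wiki"
    link = entity.get("sitelinks", {}).get(site)
    return link["title"] if link else None


def inat_taxon_ids(entity):
    """All iNaturalist taxon IDs recorded on the item, as strings."""
    ids = []
    for claim in entity.get("claims", {}).get(INAT_TAXON_ID, []):
        snak = claim["mainsnak"]
        if snak.get("snaktype") == "value":  # skip "no value" / "unknown value"
            ids.append(snak["datavalue"]["value"])
    return ids
```

Here `entity` is the per-item object found under `entities` → `<QID>` in the downloaded JSON. Because the lookup goes through the QID and the site key, a label or article title being renamed does not break it.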

I was reporting some wrong links on Dutch pages and got linked to this. So if I’m understanding correctly there is currently no way to fix it?

Hi @lwgph, welcome to the iNat Forum. Yes, the only way to fix it currently would be to change the titles on the Dutch Wikipedia to exactly match the taxon on iNaturalist. But that won’t make sense in many cases (e.g. when the name is more commonly used to refer to something other than the taxon).

I hope this feature can get implemented to help communities who use iNaturalist in a language other than English.

So I get that my reports on wrong Wikipedia links are directed here, but I also found a few instances where it uses eol.org and shows a page about a completely different species. When I click the link, it turns out the page doesn’t exist. Those reports were also redirected here, but that seems like a different problem to me. For example, https://www.inaturalist.org/taxa/92957-Anomala shows a page about a bird. Are there any solutions for that?

@radrat, can you provide an example of such a straightforward API request?

I found this because of this issue with Seek (https://github.com/inaturalist/SeekReactNative/issues/258). Mapping with Wikidata could solve lots of problems, like having local names and Wikipedia links in the appropriate language. With iNaturalist this is important, but in Seek it is vital, because most children in the world don’t understand English.

Sorry for the late reply, this fell off my radar.

See the examples below provided by @rdmpage and @andrawaag:

What’s the most efficient and robust mechanism we can give the iNat developers for programmatically looking up a Wikidata QID for a given iNat taxon ID?

A SPARQL query will do it, e.g.

SELECT * WHERE { ?qid wdt:P3151 "342395" . OPTIONAL { ?qid wdt:P225 ?wikidataTaxonName. } }

Where 342395 is the iNaturalist Taxon ID. See: query / query results

If you URL encode the SPARQL query and append it to the URL then you will get a SPARQL result with the init ID (if it exists): https://query.wikidata.org/bigdata/namespace/wdq/sparql?query= Note that the iNat developers will likely want JSON, in which case they’ll want to use content negotiation.
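
To make the request above concrete, here is a minimal Python sketch using only the standard library. It assumes the endpoint URL quoted above and asks for JSON through an `Accept: application/sparql-results+json` header, which is one way to do the content negotiation mentioned. The function names and the User-Agent string are made up for illustration; this is a sketch, not how iNaturalist actually does it.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as quoted in the thread.
WDQS_ENDPOINT = "https://query.wikidata.org/bigdata/namespace/wdq/sparql"


def build_qid_query(inat_taxon_id):
    """SPARQL query that finds the Wikidata item for an iNat taxon ID."""
    taxon_id = int(inat_taxon_id)  # raises ValueError for non-numeric input
    return (
        "SELECT ?qid ?wikidataTaxonName WHERE { "
        f'?qid wdt:P3151 "{taxon_id}" . '
        "OPTIONAL { ?qid wdt:P225 ?wikidataTaxonName . } }"
    )


def build_request_url(query, endpoint=WDQS_ENDPOINT):
    """URL-encode the query and append it to the endpoint."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def extract_qids(sparql_json):
    """Pull bare QIDs out of a SPARQL JSON results document."""
    qids = []
    for row in sparql_json["results"]["bindings"]:
        # Values look like http://www.wikidata.org/entity/Q...; keep the last part.
        qids.append(row["qid"]["value"].rsplit("/", 1)[-1])
    return qids


def lookup_qids(inat_taxon_id):
    """Run the query over HTTP (needs network access)."""
    request = urllib.request.Request(
        build_request_url(build_qid_query(inat_taxon_id)),
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical identifier; replace with your own tool name and contact.
            "User-Agent": "inat-wikidata-example/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_qids(json.load(response))
```

Calling `lookup_qids(342395)` would return a list of QIDs, which is empty when no Wikidata item carries that iNat taxon ID.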

@andrawaag also noted:

I am currently working on a DOI resolver for my upcoming Wikicite talk next Wednesday. Happy to create an iNaturalist equivalent. If next week is early enough and the developers don’t mind Python, I can add iNat identifier lookup to the Wikidata Integrator like we currently do for PMID and DOI. This is the code that would enable that.

Note that you can also directly look up the list of existing articles in the various Wikipedia language editions associated with that Wikidata QID, in a single query.
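
As a rough illustration of such a single query, the Python sketch below builds a SPARQL query that goes from an iNat taxon ID to every page linked to the matching Wikidata item, then keeps only the Wikipedia articles. It relies on the `schema:about` triples that the Wikidata Query Service exposes for sitelinks; the function names are hypothetical, and filtering on the `.wikipedia.org` hostname on the client side is just one simple way to drop Commons, Wikispecies and similar links.

```python
from urllib.parse import urlparse


def build_articles_query(inat_taxon_id):
    """SPARQL query: iNat taxon ID -> Wikidata item -> all linked pages."""
    taxon_id = int(inat_taxon_id)  # raises ValueError for non-numeric input
    return (
        "SELECT ?qid ?article WHERE { "
        f'?qid wdt:P3151 "{taxon_id}" . '
        "?article schema:about ?qid . }"
    )


def wikipedia_articles(sparql_json):
    """Map Wikipedia subdomain (usually the language code) to article URL."""
    articles = {}
    suffix = ".wikipedia.org"
    for row in sparql_json["results"]["bindings"]:
        url = row["article"]["value"]
        host = urlparse(url).hostname or ""
        if host.endswith(suffix):
            # "sv.wikipedia.org" -> "sv"
            articles[host[: -len(suffix)]] = url
    return articles
```

The query string can be sent to the query service in the same way as the QID lookup, and `wikipedia_articles(result).get("nl")` would then give the Dutch article URL, or `None` when there is no Dutch article.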

i wonder if this discussion should be merged with https://forum.inaturalist.org/t/use-wikidata-for-place-names-and-wikipedia-descriptions/7702 at this point?

seems like they’re converging (see https://forum.inaturalist.org/t/use-wikidata-for-place-names-and-wikipedia-descriptions/7702/31).

i think this has been implemented in https://github.com/inaturalist/inaturalist/commit/29ec770e47dec9de4b9e65e1293cb4c1e49ab7de. can anyone confirm?

Not sure whether this is because not all mappings between iNat taxon IDs and Wikidata have been done, but I’m not seeing it (universally).

‘Ingen matchende artikel fra Wikipedia’ means no matching article from Wikipedia.

It’s Danish if anyone wants to try and replicate.

The species has an English Wikipedia article -> https://en.wikipedia.org/wiki/Familiar_bluet

And the linkage exists in Wikidata -> https://www.wikidata.org/wiki/Q1633993

I switched over to Swedish (even if they spell everything funny :wink:), which has a Swedish-language Wikipedia article https://sv.wikipedia.org/wiki/Enallagma_civile and the Wikidata mapping, but I still get the no matching article message.

So either the mappings are not done or it can’t resolve them for this taxon, which was literally the first one I opened this morning.

Someone had unchecked the box to automatically retrieve Wikipedia articles for Enallagma civile.

I checked the box, saved, and get these results:

5 posts were split to a new topic: EOL content on About tab