from that, you could get the "title" from "nlwiki" to build a link to https://nl.wikipedia.org/wiki/Westpark_(Groningen). (you'd have to pick the right Wikipedia instance to match the preferred location, if there are articles for multiple Wikipedia instances.)
if you need to look up the Wikidata entity ID based on the iNaturalist place ID (160964), then the query would be something like this:
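A minimal sketch of such a lookup (P7471 is the iNaturalist place ID property; external IDs are matched as plain strings, and the label service line is optional):

```sparql
# Sketch: find the Wikidata item whose iNaturalist place ID (P7471) is 160964
SELECT ?place ?placeLabel WHERE {
  ?place wdt:P7471 "160964" .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```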
Getting closer. Can you put that in a curl command so I can see the endpoint and method in addition to the request body (which I presume is the sparql query)?
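A sketch of that request as a curl command. The endpoint, method, and Accept header are the standard Wikidata Query Service ones; the SPARQL body here is an assumed minimal place lookup by P7471, not the full query. The script below just prints the command so the pieces are visible:

```shell
# Sketch: WDQS endpoint and content negotiation; SPARQL body is an
# assumed minimal lookup by iNaturalist place ID (P7471).
ENDPOINT="https://query.wikidata.org/sparql"
QUERY='SELECT ?place WHERE { ?place wdt:P7471 "160964" }'
# GET with the query URL-encoded ("--get" moves the --data-urlencode
# field into the query string); a POST with the same form body also works.
CMD="curl --get $ENDPOINT -H 'Accept: application/sparql-results+json' --data-urlencode 'query=$QUERY'"
echo "$CMD"
```

Running the printed command returns the standard SPARQL JSON results format (a `head`/`results.bindings` structure); so yes, the request body is just the SPARQL query in a `query=` form field.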
That takes 12s, so there's no way we're going to make that request dynamically, for taxa or places, especially if we then have to make additional queries to map a Wikidata identifier like Q36341 to a Wikipedia page URL like https://en.wikipedia.org/wiki/Brown_bear. I guess we could have some process that grinds through the taxa on a regular basis and stores Wikipedia URLs. Can anyone figure a way to get the corresponding Wikipedia URLs for all locales in a single request given a bunch of iNat IDs (for taxa or places)?
A couple of things. I'm very much a basic SPARQL author, so it is entirely possible there is a much more efficient way to make it run faster; other SPARQL-fluent folks may be able to help. You can probably delete the wikibase:label service call, as you neither need nor care what language the item label is returned in.
In my code library at home I have a query that returns all taxa in Wikidata that have an associated entry for their iNat identifier property. I assume that could be run and the results stored periodically?
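A sketch of what such a query presumably looks like (P3151 is the iNaturalist taxon ID property and P225 the taxon name; the result set is large, so interactively you may want a LIMIT):

```sparql
# Sketch: every Wikidata item carrying an iNaturalist taxon ID (P3151)
SELECT ?taxon ?name ?iNatID WHERE {
  ?taxon wdt:P3151 ?iNatID ;
         wdt:P225 ?name .
}
```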
One thing you may have to account for: it is possible, in fact I am virtually certain, that multiple Wikidata entities can have the same iNat ID listed. When that happens a warning flag is added to the entry, but I believe the data is still physically saved.
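A sketch of how such duplicates could be detected, by grouping on the iNat ID and keeping only IDs that occur on more than one item:

```sparql
# Sketch: iNat taxon IDs (P3151) that appear on more than one Wikidata item
SELECT ?iNatID (COUNT(?taxon) AS ?items) WHERE {
  ?taxon wdt:P3151 ?iNatID .
}
GROUP BY ?iNatID
HAVING (COUNT(?taxon) > 1)
```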
It should also be possible to get the Wikipedia URL directly in the same query; I just did not realize you wanted that. In fact, in theory you can get the URL for any and all languages that have a Wikipedia page.
The correct approach would be to ask Wikidata for the English article corresponding to the iNaturalist place ID 152114, and as the Wikidata property is Property:P7471, this hub tool can help
you could check out the tool to see exactly what it's doing, or maybe it could just be used directly rather than trying to replicate its functionality.
One thing likely slowing it down is that the query matches any level in the taxonomic tree. If you know, for example, that you only want species, you can filter for that rank, which dramatically cuts the query time. I have saved queries that I know I only want species-level results for, and they run in milliseconds. Not sure if there is a way to replicate that here.
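As a sketch, restricting to species is just one extra triple (P105 is the taxon rank property and Q7432 the item for "species"):

```sparql
# Sketch: only items whose taxon rank (P105) is species (Q7432)
SELECT ?taxon ?name ?iNatID WHERE {
  ?taxon wdt:P3151 ?iNatID ;
         wdt:P105 wd:Q7432 ;
         wdt:P225 ?name .
}
```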
Wow, a lot happened on this thread overnight! Firstly, you've overcomplicated the query by calling for all iNat places and then filtering for the one you want. Instead you can just ask for the one you want. https://w.wiki/gzX
But I actually agree with your conclusion that it is not good to do this on the fly. I suggest you run a query once every day or so to get the whole list, so that as soon as a user goes to any taxon page, the Wikipedia article (in whichever language) is already connected.
It turns out the reason I joined the forum is because I've developed a browser extension (called Entity Explosion) which actually does an end-run around websites' "more info" links and allows users to directly navigate to identical items on other websites, as long as they are linked on Wikidata (which is why I wanted to start linking your place data).
It is pretty new, so only has about 500 users so far, but I'm confident that it plays a role that no other tool yet plays, and works well on iNaturalist (but also about 5000 other sites). I recommend you all try it out: https://www.wikidata.org/wiki/Wikidata:Entity_Explosion
Here are some demonstrations. The first two are homonyms, so will always fail if you are just matching strings. The third is an example of coming to iNat from elsewhere (in this case the Dutch Wikipedia page):
I still think a significant performance hit comes from having to dynamically search for the iNat ID to match it to a Wikidata Q item.
I changed it slightly to also show how to get the Wikipedia article, but that slows it down even more. The question is whether you only want the English article or multiple languages. I made it return the English and Danish articles, but this approach is neither scalable nor a viable format.
A better option: you can list all the Wikipedia articles for a given item, in any language, via the sitelinks.
If you know the Q item, this example lists all Wikipedia articles for all taxa set as children in the hierarchy of genus Ursus (Q243359 is the Wikidata item for genus Ursus):
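A sketch of such a query, reconstructed from the description (P171 is the parent-taxon property, and the schema:about pattern pulls each article's sitelink):

```sparql
# Sketch: all taxa with Ursus (Q243359) in their parent-taxon (P171) chain,
# plus every Wikipedia article about each, in any language
SELECT ?taxon ?name ?article ?lang WHERE {
  ?taxon wdt:P171* wd:Q243359 ;
         wdt:P225 ?name .
  ?article schema:about ?taxon ;
           schema:inLanguage ?lang ;
           schema:isPartOf [ wikibase:wikiGroup "wikipedia" ] .
}
```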
Trying to do the same query as above, where you don't know the Q item and have to dynamically search for the one corresponding to a given iNat ID, times out.
Better SPARQL authors than I may be able to suggest how to optimize the search, but clearly dynamically searching for a given iNat ID has a big impact on query time.
i think this might get you closer to what you're describing here:
SELECT ?iNatTaxon ?wdTaxon ?wdTaxonName ?member ?memberName ?memberLabel ?memberRankLabel ?wpLang ?wpArticleLink ?wpArticleName WHERE {
  #1. define a list of iNat taxa IDs that you want to look up in Wikidata
  VALUES ?iNatTaxon {"43328" "41636"} . #horse and bear families
  #2. get the corresponding Wikidata IDs
  ?wdTaxon wdt:P3151 ?iNatTaxon ;
           wdt:P225 ?wdTaxonName .
  #3. get the members of the WD Taxa. (for example, if you start with a family-level taxon, this will get the family and all the member genera, species, subspecies, etc. that are defined in Wikidata.)
  ?member wdt:P171* ?wdTaxon ;
          wdt:P105 ?memberRank ;
          wdt:P225 ?memberName .
  #3b. you can also just filter by, say, species
  # FILTER(?memberRank in (wd:Q7432)) .
  #4. if one exists, get a Wikipedia article for each of the members
  OPTIONAL {
    ?wpArticleLink schema:about ?member ;
                   schema:inLanguage ?wpLang ;
                   schema:name ?wpArticleName ;
                   schema:isPartOf [ wikibase:wikiGroup "wikipedia" ] .
    FILTER(?wpLang in ('en')) . #define which version of the Wikipedia article you want
  }
  #5. define your language preferences for Labels
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]","en" }
}
ORDER BY ?wdTaxonName ?memberName ?wpLang
Interesting that doing the filter as a VALUES statement seems faster. There does seem to be a cross-join or something in the returned data: all child species return the iNat ID of the parent. So, for instance, all bears in the list return 41636 as the iNat ID, which is the iNat ID of the bear family, not of the individual species.
yes. that "iNatTaxon" is supposed to represent the input taxon ID (of the parent), not the iNat taxon ID of the members. if you want to also see the latter, you could do something like this:
SELECT ?iNatTaxon ?wdTaxon ?wdTaxonName ?member ?memberName ?memberLabel ?memberRankLabel ?memberiNatTaxon ?wpLang ?wpArticleLink ?wpArticleName WHERE {
  #1. define a list of iNat taxa IDs that you want to look up in Wikidata
  VALUES ?iNatTaxon {"43328" "41636"} . #horse and bear families
  #2. get the corresponding Wikidata IDs
  ?wdTaxon wdt:P3151 ?iNatTaxon ;
           wdt:P225 ?wdTaxonName .
  #3. get the members of the WD Taxa. (for example, if you start with a family-level taxon, this will get the family and all the member genera, species, subspecies, etc. that are defined in Wikidata.)
  ?member wdt:P171* ?wdTaxon ;
          wdt:P105 ?memberRank ;
          wdt:P225 ?memberName .
  #3b. you can also just filter by, say, species
  # FILTER(?memberRank in (wd:Q7432)) .
  #3c. if one exists, get the member's iNat ID from Wikidata
  OPTIONAL { ?member wdt:P3151 ?memberiNatTaxon } .
  #4. if one exists, get a Wikipedia article for each of the members
  OPTIONAL {
    ?wpArticleLink schema:about ?member ;
                   schema:inLanguage ?wpLang ;
                   schema:name ?wpArticleName ;
                   schema:isPartOf [ wikibase:wikiGroup "wikipedia" ] .
    #4b. define which version of the Wikipedia article you want
    FILTER(?wpLang in ('en')) .
  }
  #5. define your language preferences for Labels
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]","en" }
}
ORDER BY ?wdTaxonName ?memberName ?wpLang
yes. i just started using sparql <24 hrs ago, but it seems like filters may be applied later in execution: you retrieve a bunch of data first and then filter it down (inefficient), as opposed to just defining the specific entities you want up front (more efficient).
apparently the third way to limit your set (besides filter and values) is to union a bunch of statements like so:
SELECT ?item WHERE {
  { ?item wdt:P3151 "43328" }
  UNION { ?item wdt:P3151 "41636" } .
}
the union method seems to be fast like the values method, but it is obviously more cumbersome to write out.
Yeah, I don't pretend to understand anything about the internals of the SPARQL engine, but I'm surprised it does not seem to optimize the query into the most efficient plan possible regardless of where the qualifiers are (filters, values, where statements, etc.). All the options generate the same result set, so you'd think they could be optimized alike.
Where should we go to ask for optimization? Is there an email address, forum, or help desk somewhere?
Is Wikimedia Category "Eelderbaan" connected to Eelderbaan, Q100257347?
Is Wikimedia Category "Roege Bos" related to Q100257307?
Is Wikimedia Category "Westpark" related to Q100256456?
on the Wikidata Query Service page (https://query.wikidata.org/), there is a Help menu at the top that gives you an option to provide some feedback. that's probably where i would start.
Just to be clear, I have no idea whether the query engine optimizes anything or not. If it doesn't, there must be a very good reason why an optimizer has not been added, given the site has been active for years.
Where? Online, using BigBlueButton. Please register here (free of charge, a valid email address is needed). The link to the call will then be sent to you.
What? Discussing about what makes you feel enthusiastic about/with Wikidata
Who can join? Everyone! The call will be moderated by various people (see schedule below)
Ground rules: all participants must comply with the code of conduct. No recording allowed. No screenshots allowed without express consent from all participants. Showing video is not mandatory; sound-only is perfectly OK.
Contacts: in case of technical problems, if you have trouble joining the call, etc., feel free to contact Lea Lacroix (WMDE). During the meetup, you can write a private message to the person taking care of moderation (having "(moderation)" in their nickname, also mentioned in the table below). If you want to report a problem with a participant's behaviour, you can contact techconduct@wikimedia.org (see details)