Wikipedia content not recognized by iNat place

Here’s a new place: https://www.inaturalist.org/places/camel-s-hump-forest-reserve

If you click the About tab, it says “There is no Wikipedia page” but in fact there is: https://en.wikipedia.org/wiki/Camel’s%20Hump%20Forest%20Reserve

Details: This place was created about 36 hours ago. The Wikipedia page was created about 16 hours ago.

I would have expected the Wikipedia content to be incorporated into this new place by now. Is this a manual process?

i wonder if it has to do with the apostrophe in the place name? i wish there was a way to manually link an iNat place to a Wikipedia page so that you could link names even if they didn’t match exactly…

Thanks for the reply.

I was wondering the same thing. I don’t really want to give up the apostrophe, though.

+1

So far I have created three places:

https://www.inaturalist.org/places/camel-s-hump-forest-reserve
https://www.inaturalist.org/places/camel-s-hump-national-natural-landmark
https://www.inaturalist.org/places/camel-s-hump-state-park

Wikipedia content still will not load for the first two. However, for the third place (which I created just moments ago), the Wikipedia content loaded immediately.

I’m not sure what’s going on here. Something may have changed on the back end, I don’t know. Any ideas?

Your second one (national natural landmark) does not appear to have a corresponding Wikipedia page.

Not sure why the 3rd one would be working, but the first one not, since the iNat vs. Wikipedia name patterns seem to be the same in both cases. Maybe different kinds of apostrophes? If you edited the name of the first one to match, maybe it doesn’t repeat the Wikipedia search, or still uses the original name. You could try deleting and re-creating the forest reserve place with an exactly matching name, and see if it links up then.

Yes, you’re right of course, I don’t know what I was thinking.

I deleted and re-created the following place as requested:

https://www.inaturalist.org/places/camel-s-hump-forest-reserve

Still no Wikipedia content. Okay, so I created a new place:

https://www.inaturalist.org/places/camel-s-hump-state-forest

The corresponding Wikipedia page loads fine on the About tab, but interestingly the content is stale. None of the changes I made to the Wikipedia page during the last two days appear on the About tab. So iNat evidently serves a stored snapshot of the Wikipedia content rather than fetching it live on each visit.

Now we’re getting warm :-) The following Wikipedia page was created ten days ago:

https://en.wikipedia.org/wiki/Camel's_Hump_Forest_Reserve

Could it be that iNat has not yet processed this Wikipedia content? Does anyone know the schedule?

i don’t know Ruby well enough to visualize exactly what it’s doing, but the iNaturalist Wikipedia service code (https://github.com/inaturalist/inaturalist/blob/a75bcc94f479f5f652d0b5386e25dab3ae7ae878/lib/wikipedia_service.rb) seems to reference a cache length of 720 hours (30 days). maybe that’s related?

https://www.inaturalist.org/places/wikipedia/Camel’s%20Hump%20Forest%20Reserve
https://www.inaturalist.org/places/wikipedia/Camel’s%20Hump%20State%20Park

Thanks! That explains everything I’ve seen so far. If that’s true, then the “bug” will eventually be revealed. Since this thread was started 13 days ago, we have at most 17 days to wait ;-)

I did another experiment to rule out the apostrophe as the cause. I created a new place:

https://www.inaturalist.org/places/huntington-gap-wildlife-management-area

When I created the place about three-and-a-half hours ago, there was no corresponding Wikipedia article. Since then, I created a new Wikipedia page:

https://en.wikipedia.org/wiki/Huntington_Gap_Wildlife_Management_Area

However, there’s still no Wikipedia content on the About tab. If the hypothesis above is true, this will fix itself when the Ruby script finally runs.

Oh, I didn’t know about that, thanks. Actually, you’ve identified another bug since the apostrophe is a reserved URI character that needs to be percent-encoded.
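For anyone comparing the links in this thread, note that they use two different characters: the ASCII apostrophe (U+0027) and the curly right single quotation mark (U+2019). As far as I can tell from RFC 3986, the ASCII apostrophe is a "sub-delim" that may appear literally in a path, while the curly one is non-ASCII and always has to be percent-encoded as UTF-8. Here is a small Python sketch of how each one encodes. This is only an illustration using the standard library; the variable names are mine, and it says nothing about how iNat itself builds these URLs.

```python
from urllib.parse import quote, unquote

# Two spellings of the same place name (illustrative values).
ascii_title = "Camel's Hump Forest Reserve"       # U+0027 ASCII apostrophe
curly_title = "Camel\u2019s Hump Forest Reserve"  # U+2019 right single quotation mark

# safe="" asks quote() to percent-encode everything except
# letters, digits, and the characters "_.-~".
encoded_ascii = quote(ascii_title, safe="")
encoded_curly = quote(curly_title, safe="")

print(encoded_ascii)  # apostrophe becomes %27
print(encoded_curly)  # curly quote becomes the three UTF-8 bytes %E2%80%99

# Decoding gives back the original strings, which are still different titles.
print(unquote(encoded_ascii) == unquote(encoded_curly))
```

Because the two encoded forms differ, a lookup by exact title could treat them as different pages, which is one reason the apostrophe was worth ruling out.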

you could probably verify this sooner by comparing the iNat version of Wikipedia content vs the Wikipedia revision history. i assume iNaturalist stores a snapshot of content when someone looks for it, and then it relies on that copy until it expires (after 30 days). so on actively viewed places, i would assume the content will always be somewhere between 0 and 30 days old.
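If you would rather not click through the revision history by hand, the MediaWiki Action API can report the latest revision of a page. The sketch below only builds the query URL (it does not make a network request), and it assumes the standard `action=query` / `prop=revisions` parameters on English Wikipedia. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public English Wikipedia Action API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def latest_revision_url(title: str) -> str:
    """Build a URL asking for the id and timestamp of a page's newest revision."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp",  # revision id plus when it was saved
        "rvlimit": 1,               # newest revision only
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = latest_revision_url("Camel's Hump State Forest")
print(url)
```

Opening that URL in a browser should show the newest revision's timestamp, which you can then compare against whatever the About tab is displaying.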

That doesn’t explain why these places have no Wikipedia content on the About tab:

https://www.inaturalist.org/places/camel-s-hump-forest-reserve
https://www.inaturalist.org/places/huntington-gap-wildlife-management-area

Unless I’m missing something, if an iNat place is created before the Wikipedia article is created, you have to wait up to 30 days for the content to appear on the About tab.

The Wikipedia content on the About tab of this page is stale:

https://www.inaturalist.org/places/camel-s-hump-state-forest

iNat cached the Wikipedia page on 2/12 but the page was last modified on 2/13. I don’t know how to get iNat to refresh its cache.

i think the pull of data is initiated when someone pulls up the About tab for the place for the first time. so if you went to the tab prior to the creation of the Wikipedia page, i think iNaturalist will assume there is no information for 30 days. you can test this out by first creating a Wikipedia page and then creating the iNat place. (it has to be an entirely new place, i think. otherwise, the system may still remember the name of a previous place.)

@pisum Thanks for hanging in there and hammering out the details of this bug.

TIP. A cache timestamp is embedded in the source. For example, consider the following cached file:

https://www.inaturalist.org/places/wikipedia/Camel's%20Hump%20State%20Park

The timestamp is at the very bottom of the HTML source:

<!-- Saved in parser cache with key enwiki:pcache:idhash:36421533-0!canonical and timestamp 20200213184749 and revision id 940634623
 -->
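If you want to pull the timestamp and revision id out of that comment programmatically, something like the following Python sketch should work. It assumes the comment keeps the wording shown above and that the 14-digit timestamp is `YYYYMMDDHHMMSS` in UTC (my understanding of MediaWiki timestamps, so treat it as an assumption). The function name is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches the "timestamp ... and revision id ..." part of the parser cache comment.
CACHE_COMMENT = re.compile(r"timestamp (\d{14}) and revision id (\d+)")


def parse_parser_cache_comment(html: str):
    """Return (saved_at, revision_id), or None if the comment is not found."""
    match = CACHE_COMMENT.search(html)
    if match is None:
        return None
    # Assumption: the timestamp is UTC.
    saved_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S").replace(
        tzinfo=timezone.utc
    )
    return saved_at, int(match.group(2))


sample = (
    "<!-- Saved in parser cache with key enwiki:pcache:idhash:36421533-0!canonical "
    "and timestamp 20200213184749 and revision id 940634623\n -->"
)
result = parse_parser_cache_comment(sample)
print(result)
```

The revision id is probably the more useful of the two values, since it can be matched directly against an entry in the Wikipedia page history.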

Yes, I’ve done this experiment at least twice since I started this thread. I created this iNat place after the corresponding Wikipedia article had been created:

https://www.inaturalist.org/places/camel-s-hump-state-park
https://www.inaturalist.org/places/wikipedia/Camel's%20Hump%20State%20Park

The Wikipedia content was loaded immediately on the About tab and AFAICT the cache file is up to date.

I also created this iNat place after the corresponding Wikipedia article had been created:

https://www.inaturalist.org/places/camel-s-hump-state-forest
https://www.inaturalist.org/places/wikipedia/Camel's%20Hump%20State%20Forest

The Wikipedia content loaded immediately on the About tab but the cache file is now stale:

<!-- Saved in parser cache with key enwiki:pcache:idhash:53096390-0!canonical and timestamp 20200128004304 and revision id 937862474
 -->

You can confirm the content is stale by inspecting the Wikipedia history.

Bottom line: I don’t know what triggers a cache replace.

… probably because State Park was created in iNat on the 13th (and you first used the About tab that day), and there haven’t been any changes in Wikipedia since then. i’m just guessing, but i think any changes you make in Wikipedia won’t be reflected in iNaturalist until you visit the About tab on March 14th.

… probably because the content in Wikipedia has been changed since the 14th (when the iNat place was created and you first visited the About tab). iNat is probably pulling a copy of the Wikipedia page as it existed on the 14th. probably it won’t pull the latest from Wikipedia again until you visit the About tab on March 15th.

probably the trigger is either (1) the first visit to the About tab for a place (assuming no other existing places with the same name), or (2) the first subsequent visit >30 days after the date of the snapshot of the page. (in other words, the first snapshot would be taken on that first About tab visit, and the second snapshot would be taken on the first About tab visit 30 days after the first About tab visit… if things are working the way i suspect they’re working.)
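To make that guess concrete, here is a toy Python model of a cache that takes a snapshot on first request and keeps it for 720 hours, including the case where the snapshot is "no page found". This is only a sketch of the behavior suspected in this thread, not a translation of the actual Ruby code; the class, the `fetch` callback, and the "Example Place" title are all made up for illustration.

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(hours=720)  # the 30-day figure mentioned in this thread


class SnapshotCache:
    """Toy cache: fetch on first request, then reuse the snapshot until it expires."""

    def __init__(self, fetch, ttl=CACHE_TTL):
        self._fetch = fetch    # callable: title -> content, or None if no page
        self._ttl = ttl
        self._entries = {}     # title -> (fetched_at, content or None)

    def get(self, title, now):
        entry = self._entries.get(title)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]    # still fresh, even if the snapshot is "no page"
        content = self._fetch(title)
        self._entries[title] = (now, content)
        return content


# Simulate: the place is viewed before the article exists.
wiki = {}  # stand-in for Wikipedia
cache = SnapshotCache(lambda title: wiki.get(title))
t0 = datetime(2020, 2, 13)

first = cache.get("Example Place", t0)                        # no article yet
wiki["Example Place"] = "v1"                                  # article created
day_10 = cache.get("Example Place", t0 + timedelta(days=10))  # cached miss
day_30 = cache.get("Example Place", t0 + timedelta(days=30))  # expired, refetch
wiki["Example Place"] = "v2"                                  # article edited
day_40 = cache.get("Example Place", t0 + timedelta(days=40))  # stale snapshot

print(first, day_10, day_30, day_40)
```

In this model the place shows no content for the first 30 days and then shows stale content after the article is edited, which matches both symptoms reported above. Whether iNat really caches the "no page" result this way is still a guess.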

@trscavo – just FYI – the Wikipedia description seems to be pulled in for the Forest Reserve.

@trscavo – i wonder if this bug report can be closed at this point?

it seems pretty clear to me that the problem you noted initially has to do with the length of the interval between snapshots of the Wikipedia page. that seems to be an intentional design choice rather than a bug. here’s another thread where we noted a similar lag between activity in Wikipedia and the eventual reflection of those changes over in iNat: https://forum.inaturalist.org/t/creating-inaturalist-places-and-linking-to-wikidata-using-geojson/12220/19.
