Can't import newer images from iNaturalist to Wikimedia Commons

Whenever I try to import certain images from iNat to Wikimedia Commons, I get a “403 Forbidden” server error. It looks like this is because the iNat images are no longer being hosted at https://static.inaturalist.org/ but at https://inaturalist-open-data.s3.amazonaws.com/ instead. Is inaturalist-open-data.s3.amazonaws.com going to be the new domain name for all images going forward? Are older images going to be moved to this domain name?

When I open the following URL:
https://static.inaturalist.org/photos/113564633/medium.jpeg?1613910715

I receive the following error:

403 ERROR

The request could not be satisfied.

Request blocked. We can’t connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.
If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.

Generated by cloudfront (custom)

OS: Android / Windows
Browsers: Chrome

There have been some intentional changes to the photo URLs, but I don’t know the whole story behind them. Hopefully someone from staff will chime in.

Thanks for explaining that the static URL change was intended.
I also tried to import a picture to Wikimedia Commons.

I created a fix for this:
https://github.com/kaldari/iNaturalist2Commons/pull/8

This is also needed:
https://phabricator.wikimedia.org/T275318#6846865

Tony has been notified of this issue previously. That post was turned into a private message, which is why you can’t find it.

I confirm that the “static” URLs mentioned above don’t work, but some still do, for example:
https://static.inaturalist.org/photos/83192849/medium.jpg?1594183306
(at least it works for me at this time).

Both tasks I mentioned earlier are done, and it now works again. It looks to me like this issue can be closed.

@tiwane - Do you know if inaturalist-open-data.s3.amazonaws.com is a reliable domain name to use for the future, or is it possible it will change again? If it is likely to change, do you know how? I’m asking because the domain name has to be whitelisted on Wikimedia’s end, and we can use wildcards in the whitelist entry, for example, inaturalist-open-data.*.amazonaws.com. What would you suggest we add as the whitelist entry?
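
To illustrate how a wildcard entry like the one above would behave, here is a minimal Python sketch. It assumes shell-style wildcard matching via `fnmatch`, which may not be how Wikimedia's whitelist actually evaluates entries. The names `WHITELIST` and `host_is_whitelisted` are hypothetical and only for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical whitelist; the single entry is the wildcard example from this post.
WHITELIST = ["inaturalist-open-data.*.amazonaws.com"]


def host_is_whitelisted(url, patterns=WHITELIST):
    """Return True if the URL's hostname matches any wildcard pattern.

    Assumption: shell-style matching, where "*" matches any run of
    characters, including dots.
    """
    host = (urlsplit(url).hostname or "").lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


print(host_is_whitelisted("https://inaturalist-open-data.s3.amazonaws.com/"))
print(host_is_whitelisted("https://static.inaturalist.org/photos/83192849/medium.jpg?1594183306"))
```

Under these assumptions the first URL matches and the second does not. Because `*` also matches dots here, a host with extra labels between the bucket name and `amazonaws.com` would match as well, which may be broader or narrower than what the real whitelist allows.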

Hi all - this is because we’re in the process of making use of the Amazon Open Data Sponsorship Program - here’s some background:

All photos on iNaturalist are hosted on servers we rent from Amazon (specifically using their S3 service). You can think of this as a big bucket of images where each image is identified by a specific public URL. The iNaturalist database keeps track of which photo is filed under which URL, allowing the platform to fetch and display photos as needed when you’re using the app or website. As iNaturalist grows, photo storage costs are growing too. We’re currently paying over $10,000 a month in photo storage costs and expect that these costs would double in the next couple of years under iNat’s growth trajectory and current system. This includes the cost of storing the photos (storage) and the cost of sending the photos over the internet whenever anyone loads a page that displays one or otherwise downloads one (bandwidth).

Luckily, we’ve successfully applied to the Amazon Open Data Sponsorship Program (AODSP) which means Amazon will kindly pick up the bill associated with some of the photos in the bucket, under some conditions. We’re super excited about this because it will help us control costs and, as we’ll describe shortly, will make it a lot easier for groups hoping to access and use iNaturalist data to do so.

The big caveat is that only openly licensed (e.g. Creative Commons) and public-domain (CC0) data are eligible to be hosted in the AODSP. To accommodate this, we set up a new bucket that only contains this openly licensed and public-domain content, while photos without such a license (all rights reserved) remain in the old bucket. We’re also in the process of moving around 70 million licensed photos from the old bucket to the new bucket.

We wanted to let you all know about this now because you’ve already noticed that the URLs associated with photos are changing as they are moved to the new bucket. Besides this, nothing else should change.
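
For tool authors affected by the URL change, here is a minimal Python sketch of how a script might tell which bucket a photo URL points to. The two hostnames are the ones mentioned in this thread. The function name `classify_photo_url` and the labels it returns are hypothetical, and the sketch makes no assumption about how paths map between the two buckets.

```python
from urllib.parse import urlsplit

# Hostnames taken from this thread.
OPEN_DATA_HOST = "inaturalist-open-data.s3.amazonaws.com"
STATIC_HOST = "static.inaturalist.org"


def classify_photo_url(url):
    """Label a photo URL by the host it is served from (hypothetical labels)."""
    host = (urlsplit(url).hostname or "").lower()
    if host == OPEN_DATA_HOST:
        return "open-data"
    if host == STATIC_HOST:
        return "static"
    return "unknown"


print(classify_photo_url("https://static.inaturalist.org/photos/113564633/medium.jpeg?1613910715"))
```

A tool could use a check like this to decide which hosts it needs to be allowed to fetch from, rather than hard-coding a single domain.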

Currently about 66% of iNaturalist photos have compatible licenses and thus are eligible to go into this bucket. Down the road we’d love to encourage people to license their photos if they haven’t already, so that these data are more widely usable and so we can take advantage of the cost savings. But since changing your account photo license now triggers a job that moves all your photos from one bucket to another, please hold off on that until we announce this more formally in March.

Another goal of enrolling in this program is to make it easier for others to access iNaturalist photos for research and other uses. Right now this new AODSP bucket is hard to make sense of, but we’re planning on including additional metadata to make it a usable standalone dataset. We will start posting a weekly snapshot of associated metadata and provide documentation on how you can make use of these data. We really hope we can build a more visible community of iNaturalist data users around these new features. Keep in mind that nothing is changing about who can access these data. Rather, we are reorganizing openly licensed photos into their own bucket to make them easier to use for research, and to take advantage of Amazon’s support in helping iNaturalist offset costs.

There’s some more information on the upcoming Amazon Open Data Sponsorship Program launch in today’s blog post.
