Download 10,000+ pictures from an iNat user

Hello iNat users,

This is my first post on this platform, so please bear with me. I have a question about downloading pictures using iNaturalist’s API. I’ve written an R script using the rinat package, and it works well, but there are two issues I need help with.

Currently, I am trying to save my pictures to my computer using the iNaturalist API. Here is the script:

# Packages used (assumed): rinat for the API call, RCurl to fetch the image bytes, jpeg to read/write JPEGs
library(rinat)
library(RCurl)
library(jpeg)

# Get observations
obs <- get_inat_obs_user(username = "louis_aureglia", maxresults = 10000)

# Download pictures
for (i in 1:nrow(obs)) {
  url <- obs$image_url[i]
  picture_raw <- readJPEG(getURLContent(url), native = TRUE)
  # optional: display the picture in the plot window
  plot(0:1, 0:1, type = "n", ann = FALSE, axes = FALSE)
  picture_reel <- rasterImage(picture_raw, 0, 0, 1, 1)
  # save as "scientific_name_id.jpg"
  picture_name <- paste0(obs$scientific_name[i], "_", obs$id[i])
  writeJPEG(picture_raw, file.path("C:/Users/laureglia/Desktop/Projets/9. iNaturalist save/Pictures", paste0(picture_name, ".jpg")))
}

However, I am facing two problems with this script:

  • Firstly, I have more than 10,000 observations, and I couldn’t download my older ones. The function get_inat_obs_user() doesn’t seem to let me filter by date when retrieving observations from the API. Do you have a solution for this?

  • Secondly, the pictures that were downloaded are lower quality than the pictures I originally uploaded. Do you know another way to download these pictures in better quality?

This is the first time that I am using the API, so feel free to correct me on any points. Thanks.

I can’t help with any of the API stuff, but a note about picture quality: I think iNat compresses all images to a maximum of 2048 pixels on the longest side, so if you uploaded pictures larger than that, those extra pixels were lost.


It looks like rinat won’t let you add a date or taxon restriction to a user-based call, nor add a user restriction to the general obs call, so I’m not sure there’s a good way around the 10000 restriction with rinat. I think you would have to call the API yourself directly to get more than 10000.
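Just to illustrate (this is a rough, untested sketch, not something rinat provides), paging the v1 API directly from R could look roughly like the following, using httr and jsonlite. The endpoint and parameters (user_id, per_page, order_by, order, id_above) come from the v1 API; walking id_above upward is the usual way to get past the 10,000-result page ceiling:

library(httr)
library(jsonlite)

# Sketch: fetch all observations for one user by repeatedly asking for the next
# batch of ids above the last one seen (200 results per request is the API maximum).
fetch_user_obs <- function(user) {
  pages <- list()
  last_id <- 0
  repeat {
    resp <- GET("https://api.inaturalist.org/v1/observations",
                query = list(user_id = user, per_page = 200,
                             order_by = "id", order = "asc",
                             id_above = last_id))
    stop_for_status(resp)
    page <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
    if (length(page$results) == 0) break
    pages[[length(pages) + 1]] <- page$results
    last_id <- max(page$results$id)  # resume after the last observation seen
    Sys.sleep(1)                     # stay under ~60 requests/minute
  }
  pages
}

obs_pages <- fetch_user_obs("louis_aureglia")

The function returns a list of data frames (one per request); they can be combined afterwards, though some columns come back nested and may need flattening first.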

It’s true that iNat downsamples photos to 2048x2048 maximum, but rinat isn’t even returning the highest quality image that iNat stores. It’s returning the “medium” image instead of the “original” image. Again, I don’t think rinat lets you select the image quality, but you could swap them yourself, e.g. https://inaturalist-open-data.s3.amazonaws.com/photos/353131176/medium.jpeg to https://inaturalist-open-data.s3.amazonaws.com/photos/353131176/original.jpeg.
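For example, a minimal sketch of that swap, assuming the obs data frame from the rinat call above. Saving the file bytes directly with download.file() also avoids the readJPEG()/writeJPEG() round trip in the original script, which re-encodes the JPEG and loses a bit more quality:

# Swap "medium" for "original" in the photo URLs (assumes rinat's obs$image_url column)
obs$image_original_url <- gsub("medium", "original", obs$image_url, fixed = TRUE)

# Save each original file as-is; mode = "wb" keeps the binary content intact on Windows
for (i in seq_len(nrow(obs))) {
  dest <- paste0(obs$scientific_name[i], "_", obs$id[i], ".jpg")
  download.file(obs$image_original_url[i], destfile = dest, mode = "wb", quiet = TRUE)
}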

I would personally not recommend using rinat in general because it’s using the v0 API instead of the v1 API (and seems unlikely to use the v2 API which is in development).


@jwidness

Thank you for your feedback. I didn’t realize that the pictures I downloaded were only medium size and not the original size. Additionally, I now understand the reason for the 10,000-observation restriction. I find it more efficient to download the pictures directly from the CSV export, and the script is simpler to use. Thank you again for your help; I feel more confident in tackling this task now.

# 1. DL pictures from the observation export ----------------------------------------------------------------------

# Packages used (assumed): openxlsx for read.xlsx, RCurl for getURLContent, jpeg for readJPEG/writeJPEG
library(openxlsx)
library(RCurl)
library(jpeg)

obs_xlsx <- read.xlsx("C:/Users/laureglia/Desktop/Projets/9. iNaturalist save/2024_01_28_iNaturalist_LA.xlsx")
str(obs_xlsx)
head(obs_xlsx$image_url)

# Replace "medium" with "original" in the image URLs
obs_xlsx$image_original_url <- gsub("medium", "original", obs_xlsx$image_url)
head(obs_xlsx$image_original_url)

# Download pictures
for (i in 1:nrow(obs_xlsx)) {
  url <- obs_xlsx$image_original_url[i]
  picture_raw <- readJPEG(getURLContent(url), native = TRUE)
  # optional: display the picture in the plot window
  plot(0:1, 0:1, type = "n", ann = FALSE, axes = FALSE)
  picture_reel <- rasterImage(picture_raw, 0, 0, 1, 1)
  picture_name <- paste0(obs_xlsx$scientific_name[i], "_", obs_xlsx$id[i])
  writeJPEG(picture_raw, file.path("C:/Users/laureglia/Desktop/Projets/9. iNaturalist save/Pictures", paste0(picture_name, ".jpg")))
}

"Thank you very much for your response. We can close the topic now :wink:


It may also be worth noting here the API rate restrictions:

“Please note that we throttle API usage to a max of 100 requests per minute, though we ask that you try to keep it to 60 requests per minute or lower, and to keep under 10,000 requests per day. If we notice usage that has serious impact on our performance we may institute blocks without notification.”

For images not in the AWS data set:

“Downloading over 5 GB of media per hour or 24 GB of media per day may result in a permanent block”
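For scripts like the ones above, the simplest way to respect those limits is to pause between downloads. A minimal sketch, reusing the column names from the earlier export script:

# Sketch: throttle to roughly one request per second (~60/minute)
for (i in seq_len(nrow(obs_xlsx))) {
  download.file(obs_xlsx$image_original_url[i],
                destfile = paste0(obs_xlsx$id[i], ".jpg"),
                mode = "wb", quiet = TRUE)
  Sys.sleep(1)
}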

