I’m hoping to do some species distribution modelling of my favourite species, but I worry that some areas will have smaller samples simply because they have fewer iNaturalist users. I could use human population as a rough proxy, but I imagine some areas have a higher proportion of users than others. Is there a way to download just the geographic distribution of observation density as a way of controlling for this?
Cheers and thanks!
Welcome to the Forum, @katieemarshall! Intriguing question. I’m sure there are people on the Forums who have experience with this sort of modeling who can help.
Just a word of caution. Even with a geographic distribution of observation density, I would still take iNaturalist data with a grain of salt. Most observers on iNaturalist make observations that are quite skewed toward the novel. So where a species is rare because it is near or outside its natural range, it has a much higher chance of being recorded than in the core of its range, where it is common and likely to be overlooked.
when you go to the map view in the Explore page on the iNaturalist website, you’ll see that at low zoom levels, it is essentially a gridded density map. if you would be happy with being able to get the observation count associated with each of the grid cells in that map, then yes, it’s possible to get that data using the iNaturalist API’s Grid Tiles UTFGrid endpoint (https://api.inaturalist.org/v1/docs/#!/UTFGrid/get_grid_zoom_x_y_grid_json), though the finer the grid (the higher the zoom), the more data you would have to get and the more likely you are to run into iNaturalist’s API limits.
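for anyone who wants to try the UTFGrid route, here is a rough sketch in Python of the two fiddly bits: working out which tile to ask for, and reading counts back out of a response. it assumes the tiles use the usual web-mercator x/y numbering and that the response follows the standard UTFGrid layout (`grid`, `keys`, `data`). the function names are my own, `taxon_id` is just an example filter, and the `cellCount` field name is an assumption you should check against a real response.

```python
import math
from urllib.parse import urlencode

API_ROOT = "https://api.inaturalist.org/v1"

def latlon_to_tile(lat, lon, zoom):
    """Web-mercator tile (x, y) containing a point (assumed tile scheme)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the tile range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def grid_tile_url(zoom, x, y, **filters):
    """Build the UTFGrid tile URL; filters are ordinary observation search params."""
    url = f"{API_ROOT}/grid/{zoom}/{x}/{y}.grid.json"
    return url + ("?" + urlencode(filters) if filters else "")

def utfgrid_key_index(ch):
    """Decode one UTFGrid character into an index into the 'keys' list."""
    code = ord(ch)
    if code >= 93:
        code -= 1
    if code >= 35:
        code -= 1
    return code - 32

def counts_by_key(utfgrid, count_field="cellCount"):
    """Map each key used in the grid to its count (count_field is an assumption)."""
    used = {
        utfgrid["keys"][utfgrid_key_index(ch)]
        for row in utfgrid["grid"]
        for ch in row
    }
    return {k: utfgrid["data"][k][count_field] for k in used if k in utfgrid["data"]}
```

counts are collected per key rather than per character because one density cell can cover several positions in the `grid` array, and summing over positions would count it more than once.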
the alternative is to download your set of observations, define your own grid, and then figure out density within that grid yourself, but again, you’re limited, this time to something like 200,000 observations per download. (maybe you have more flexibility if you get the data out of GBIF though…)
you have a little more control if you go with the latter approach, but if you’re dealing with, say, millions of observations in your set of data or are working with lots of different sets of data, it might be easier to go with the first approach, at least as a quick way to visualize data at a high level for further exploration via the latter approach.
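the define-your-own-grid approach is simple enough to sketch in a few lines of Python. this is only an illustration: it assumes you have already pulled latitude/longitude pairs out of your download, the function names are hypothetical, and it uses plain square cells in degrees, which are not equal in area (cells shrink toward the poles), so an equal-area grid may be a better choice for real modelling.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_deg):
    """Return the (row, col) index of the cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def density_grid(points, cell_deg=1.0):
    """Count observations per grid cell; points is an iterable of (lat, lon)."""
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)

# Example with made-up coordinates (not real observations):
points = [(49.2, -123.1), (49.8, -123.9), (50.1, -123.5)]
density = density_grid(points, cell_deg=1.0)
```

each key in `density` is the cell's south-west corner divided by the cell size, so multiplying a key by `cell_deg` gives you back the corner coordinates.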
i’ve previously posted a few things that sort of touch on how to exploit the UTFGrid data to get density info. you can take a look at these, if you’re interested, and let me know if you would like to know more about the general concept:
taking a look at the items in the list above, you might be able to adapt the second thing in the list to compare a set of observations against all iNaturalist observations (instead of comparing the same set of data from one period to another), if that’s sort of what you’re trying to see…
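once you have per-cell counts for both your set and all iNaturalist observations, the comparison itself is just a per-cell ratio. a minimal sketch, assuming both sets of counts are dictionaries keyed by the same (hypothetical) grid cell ids and built on the same grid; `min_total` is my own addition for skipping cells with too little overall effort to say anything:

```python
def relative_density(target_counts, all_counts, min_total=1):
    """Share of all observations in each cell that belong to the target set."""
    ratios = {}
    for cell, total in all_counts.items():
        if total < min_total:
            # Too few observations overall to judge this cell.
            continue
        ratios[cell] = target_counts.get(cell, 0) / total
    return ratios

# Example with made-up counts:
target = {(49, -124): 2}
everything = {(49, -124): 8, (50, -124): 4, (51, -124): 0}
ratios = relative_density(target, everything)
```

cells that have effort but none of your species come out as 0.0, while cells with no observations at all are left out, which keeps "looked and didn't find it" separate from "nobody looked".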