that would be interesting – a map that shows species diversity. should be doable for small sets of data (either small sets of taxa or observation sets <= 200K), but it probably would not be super efficient at scale via the iNat API.
If you have access to ArcGIS you can do this without too much problem. It’s an interesting step for further analysis too. Arc runs spatial stats and actually generates “hot spots” where density of species is outside the expected value.
It’s worth noting here that the iNaturalist ‘heat maps’ aren’t real heat maps - they don’t show the density of observations, they only shade regions in which there is at least one observation. They do look nice though!
i don’t see where it’s not visualizing something like observation density.
the true algorithm probably does something like spatial aggregation of observations (summary) within a grid → (effectively truncate at some upper limit within each grid cell) → translate to grayscale → gaussian blur (for smoothing) → translate to color gradient, but it’s effectively a density visualization.
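as a rough illustration of those steps, here's a minimal python sketch. this is my guess at the general shape of such a pipeline, not iNat's actual code: the function names, the `cap` parameter, the blur radius, and the purple-to-red gradient are all assumptions made up for the example.

```python
import math

def bin_counts(points, origin, cell_size, n_rows, n_cols):
    """step 1: count (lat, lng) observations per grid cell."""
    lat0, lng0 = origin
    grid = [[0] * n_cols for _ in range(n_rows)]
    for lat, lng in points:
        r = int((lat - lat0) // cell_size)
        c = int((lng - lng0) // cell_size)
        if 0 <= r < n_rows and 0 <= c < n_cols:
            grid[r][c] += 1
    return grid

def to_gray(grid, cap):
    """steps 2-3: truncate counts at cap (assumed value), scale to 0..1."""
    return [[min(v, cap) / cap for v in row] for row in grid]

def gaussian_kernel(radius, sigma):
    """normalized 1-d gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """convolve one row or column; cells outside the grid count as 0."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(values):
                acc += values[j] * w
        out.append(acc)
    return out

def blur(gray, radius=1, sigma=1.0):
    """step 4: separable gaussian blur (rows first, then columns)."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [blur_1d(row, kernel) for row in gray]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def to_color(value, low=(128, 0, 128), high=(255, 0, 0)):
    """step 5: interpolate from purple (low) to red (high)."""
    t = max(0.0, min(1.0, value))
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))

def heatmap(points, origin, cell_size, n_rows, n_cols, cap):
    """run all steps and return a grid of (r, g, b) tuples."""
    grid = bin_counts(points, origin, cell_size, n_rows, n_cols)
    smooth = blur(to_gray(grid, cap))
    return [[to_color(v) for v in row] for row in smooth]
```

note that the blur step spreads each occupied cell into its empty neighbors, which is where the halo comes from, and the truncation step means any cell at or above `cap` looks the same before blurring.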
here’s a version that shows my own observations in my home area:
the places where i have the most observations show up as red, and the places where i observe the least show up as purple. the granularity is not very fine, the color gradient scaling can feel somewhat arbitrary, and there’s a bit of a halo around each blob (because of the gaussian blurring step), but at the end of the day, it looks like a heatmap depicting density to me.
what do you see that makes you think it’s just depicting absence / presence?
To my eye, these three areas look around the same size and colour so in a real heatmap I would expect them to have roughly the same number of observations. But they all have wildly different numbers - 1 has 188 observations, 2 has 57 observations, and 3 has 14 observations. To me that suggests it’s just depicting presence/absence because I can’t work out why else they would all be the same, other than that there’s a similar area covered by the observations in all of them.
first, let’s compare your blob 1 and blob 2. each is comprised of 2 cells, but they are actually slightly different. blob 1 is just slightly larger than blob 2 and slightly more yellow, reflecting blob 1’s greater number of observations. there’s a blob just to the northwest of blob 1 that is also comprised of 2 cells, but each of those cells has only one observation. so that blob is darker and smaller than blob 1.
the reason blob 3 is as big as blobs 1 and 2, even with fewer observations, is because it’s comprised of 3 cells vs 2 cells.
you can also see some blobs that are more obviously comprised of single cells. there are 3 surrounding your blob 1, and there are 2 to the west of blob 3. note that the ones near blob 3 are darker than the ones near blob 1. this is because the cells for the ones near blob 3 have multiple observations in them, while the ones near blob 1 all have just single observations.
so observation density definitely is captured in the heatmap visualization, although it’s not always easy to tease density apart from the other things going on in the map (which i would characterize as artifacts of the heatmap creation process that i noted in my earlier post).
what you’re seeing as absence / presence is probably a function of the way things are aggregated in a grid + the scaling of the individual cells in the grid (there’s relatively low color separation between the cells in the example because at this map zoom level, you need a lot more observations in a cell to approach the red side of the scale) + the smoothing. the first 2 of these are effectively granularity artifacts.
i know that’s probably not a satisfying answer, but at the end of the day, i think the heatmap visualization is good enough to be considered a “true” heatmap. you probably shouldn’t use it in a scientific paper, but it’s good enough to help you find where you have observation hotspots most of the time.
yes. one of the challenges for creating a good visualization is choosing a scale that fits with your data – because on the one hand, you don’t want to drown out the peaks, but at the same time, you don’t want to lose the variations that occur within the rest of your data.
just for example, below are three views of your observations around Cairns in my own custom visualization that represents observation density in a grid with a color gradient that goes from purple (low) to magenta (high) on a linear scale.
what’s the best scale for this particular set of data? is it better to be able to see all the gradations in some of the less dense areas? or is it better to be able to see the true peaks? or is it better to fall somewhere in between? (it just sort of depends on the use case, right?)
so because it’s impossible to say in advance what will be the best scale to use in any given case, iNat seems to just use a somewhat arbitrary scale as the base for creating the heatmap, and you just have to accept that it’ll work out better in some cases than in others.
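to make the scale tradeoff concrete, here's a small python sketch of a linear scale that truncates at an upper limit. the per-cell counts and the three `cap` values are made up for illustration, and `linear_scale` is a hypothetical helper, not anything from iNat.

```python
def linear_scale(counts, cap):
    """map counts to 0..1 linearly, truncating anything above cap."""
    return [min(c, cap) / cap for c in counts]

# made-up per-cell observation counts, from sparse to dense
counts = [1, 3, 14, 57, 188]

# a low cap separates the sparse cells but flattens the peaks;
# a high cap preserves the peaks but squashes the sparse cells toward 0
for cap in (10, 50, 200):
    print(cap, [f"{v:.2f}" for v in linear_scale(counts, cap)])
```

with a cap of 10, the three densest cells all land on the same top value, so the peaks become indistinguishable. with a cap of 200, the peaks stay distinct, but the sparsest cells end up very close to zero and are hard to tell apart by eye.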