Where do atlas boundary definitions come from?

iNaturalist Standard places are mostly buffered GADM shapes. Where do the shapes for the atlases come from? Below I have screenshots of the standard places for Baja California Sur and Mexico (which don’t quite match each other), and also of the atlas outlines for these same places (which also don’t match each other or the place shapes).

Baja California Sur place

Mexico

Atlas showing the Baja California Sur outline in black and the Mexico outline in orange

@loarie I think you might be the only one who knows…

The reason they look different is that the display does some on-the-fly simplification of the boundaries for performance reasons. But it is annoying because we lose some precision that exists in the actual geometries.

So all 4 things above (Baja California Sur place, Mexico place, Baja California Sur atlas, Mexico atlas) have the same geometry in the database, they’re just displayed differently?

Would changing the polygon for the Baja California Sur place change any of the other 3? I’m asking because there are known issues with the atlas shapes and I’d be happy to work on some of them, but I don’t really know the process for updating atlas shapes. Would it be better to contact GADM and ask them to update on their end?

Hi Jane,

The issue with atlases is that the ‘seams’ need to be perfect (i.e. no cracks) so that we can run queries like “show me all observations of this species not in the US, MX, CA” without getting observations that fall in a crack between MX and the US. This makes editing these boundaries hard, because if you change one boundary it no longer lines up perfectly with its neighbor.
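To make the ‘crack’ problem concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the rectangles stand in for two neighboring countries, the helper names are invented for illustration, and a simple ray-casting test stands in for whatever spatial query the real database runs. It only shows why a seam that is off by a hair leaks observations into a “not in any of these places” result.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def outside_all(observations, places):
    """Return the observations that fall in none of the given polygons."""
    return [
        obs for obs in observations
        if not any(point_in_polygon(obs[0], obs[1], poly) for poly in places)
    ]


# Hypothetical toy places that are supposed to share a border at y = 10.
north = [(0, 10.0), (10, 10.0), (10, 20), (0, 20)]
south_good = [(0, 0), (10, 0), (10, 10.0), (0, 10.0)]
south_cracked = [(0, 0), (10, 0), (10, 9.9), (0, 9.9)]  # border drawn slightly short

observations = [(5, 15), (5, 5), (5, 9.95)]

leaked_good = outside_all(observations, [north, south_good])
leaked_cracked = outside_all(observations, [north, south_cracked])
print(leaked_good)     # nothing falls outside when the seam is exact
print(leaked_cracked)  # the observation sitting in the crack shows up
```

With the exact seam every observation belongs to one of the two places. With the cracked seam, the observation at (5, 9.95) is reported as being in neither.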

Some issues with atlases:

  1. Natural Earth has admin 0 (countries) and admin 1 (states) data but no admin 2 (counties). Also, at 1:10m it’s not very high resolution, which opens up lots of issues like “I was in Mexico, why is my observation showing up in the US?” due to a too-coarse approximation of the Rio Grande.

  2. That’s why we’ve been using GADM, which is much higher resolution and has admin level 2 data. But the main problem with GADM is that they maintain a very high-res coastline associated with their places. Because we want to (a) include a bit of ocean and (b) make these geometries as simple as possible, we have to do a ton of processing of the GADM shapes to create a coastal buffer while still having nice seams between the places.

  3. GADM level 2 data often seems to be ‘wrong’, so we’ve had to swap out the data for some countries (like MX) with alternative datasets, which is a ton of work. That ‘official’ MX dataset also introduced all the issues you’ve mentioned, like islands being in the wrong spot.

  4. There are ongoing requests to include some sort of atlas places that cover the ocean. But what would these look like? See issue 5 here https://www.inaturalist.org/pages/atlases

Part of me regrets trying to take on admin level 2 (county) data globally as standard places / in atlases. Most of the issues seem to come from these data. Maintaining all this for just states and countries would be a much much easier task. And I think 90% of the atlas stuff could be done with just states and countries.

One idea is to do a more careful audit of the GADM data, i.e. first make sure everyone is on board with the countries, then for each country try to make sure the community is on board with the level 1 (states) data. And then make a call about whether the level 2 data is ‘correct’ and worth including? It would be a lot of work, but it might be better to find these issues up front rather than discover them down the road.

Also Robert Hijmans behind GADM is excited to work with us within his busy schedule.

The tricky thing about atlases is that they are meant to be essentially a perfect grid of polygons covering the earth’s surface, which people can toggle on and off to approximate a species range. Doing so with a grid would be 1000% easier, but the idea is that by using political boundaries as ‘grid cells’, the cells will have more cultural meaning and we can tap into other datasets like the distribution map here http://www.plantsoftheworldonline.org/taxon/urn:lsid:ipni.org:names:77155191-1
But using real political boundaries also creates a lot of issues. Do people think this basic vision of atlases (i.e. using political boundaries to create a ‘grid’ that covers the earth’s surface in order to approximate species ranges) is useful? If so, I’d like to shore up the above issues and come up with a better process for maintaining these places.
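For comparison, here is roughly what the plain-grid alternative could look like. This is only a sketch under assumptions: the 1-degree cell size, the `grid_cell` helper, and the sample coordinates are all made up for illustration, and it is not how atlases are implemented. It shows why a regular grid is easy: the cell lookup is arithmetic, and there are no seams to maintain.

```python
import math


def grid_cell(lat, lon, size_deg=1.0):
    """Return the (row, col) index of the size_deg x size_deg cell containing a point."""
    return (
        math.floor((lat + 90) / size_deg),
        math.floor((lon + 180) / size_deg),
    )


# A hypothetical atlas: the set of cells toggled "on" for a species.
atlas = {grid_cell(24.1, -110.3), grid_cell(25.5, -111.2)}

# Hypothetical observations as (lat, lon) pairs.
observations = [(24.6, -110.9), (25.9, -111.5), (31.2, -115.0)]

# Observations whose cell has not been toggled on are "out of atlas".
out_of_atlas = [obs for obs in observations if grid_cell(*obs) not in atlas]
print(out_of_atlas)
```

The trade-off is the one described above: the cells are trivial to compute but carry no cultural meaning, and they do not line up with existing distribution datasets that are keyed to political units.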

Ok, lots of great information here, thanks!

The difficult part of the processing is making sure only the coastline gets buffered, correct? If that’s done manually, do you continue to pull new shapes from GADM as they update the files? The standard places page says iNat uses GADM 2.8. Have iNat’s files been updated since then? (GADM is currently at 3.6.)


Is there a long list of places not sourced from GADM? In other words, if I find an issue with an iNat atlas, can it be traced back to GADM 90% of the time? For example, the Itapiranga problem is actually an issue with the file from GADM:

But the problem with Isla Tortuga in Baja California Sur is not from GADM – their shape perfectly covers the island:


So if I want to correct a GADM atlas issue, what’s the best thing to do? Give a list to you to give to Robert? Use the contact form on the GADM website?


We could do only state level atlases, but I think that would be a lot of lost information, especially with states like Amazonas (~1,500,000 square km).


The other benefit is that frequently political boundaries are also geographic boundaries – things that historically prevented humans from moving around also prevented other things from moving around, so the boundaries often line up fairly nicely with actual ranges. From a user perspective, a political grid is preferable to an evenly spaced grid, despite the implementation issues. There are other grids I could imagine would be useful, e.g. biomes, or with major river boundaries. They would also be hard to implement, but would change less frequently than political boundaries.


Would the best course of action in terms of correcting iNat atlas shapes be to wait and see how this plays out first?

Thanks for all the thoughts. Should we start by doing a little audit of GADM 3.6, starting with admin level 0? I made this little doc with a row for each of the 256 GADM admin level 0 places:
https://docs.google.com/spreadsheets/d/1fbGmQD5lXCVcD_f8Qzn31J7P8Ex-y5DyRG96eIfOFfw/edit?usp=sharing
I also added a column with any discrepancies with Natural Earth.

I think we should follow Natural Earth in treating the Caspian Sea as ocean (not an admin level 0 place), but other than that I am happy continuing to follow GADM for level 0. What do you think?

Would it be possible to see differences between what iNat currently uses (I’m assuming still 2.8) and 3.6? The GADM changelog doesn’t really say much…

I don’t see any major issues with 3.6, but generally the problems I’ve been looking at are at a fine scale – missing islands, open seams, etc. I doubt many have been corrected in 3.6.

Hmmm, I’m still not totally sure I understand. I downloaded the KML files for the Baja California Sur place and the Mexico place via the API, and what I got looked like the simplified atlas view, not the more precise shapes that display on the place pages.

It seems like what’s happening is that a simplified version of place shapes gets calculated and stored. This is the version that the API will return, and also the version used to display atlases. But it’s not in fact the version used to calculate whether a particular observation is within the atlas bounds or not – the true and more precise version is used for that.
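Here is a small sketch of how that mismatch can arise. I don’t know which simplification method iNat actually uses, so this assumes Douglas-Peucker purely for illustration, and the polygon, tolerance, and function names are all made up. The point is that a location can be inside the precise shape and outside the simplified one, so the displayed outline and the membership test disagree.

```python
def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    return abs(dy * (px - ax) - dx * (py - ay)) / (dx * dx + dy * dy) ** 0.5


def simplify(points, tolerance):
    """Douglas-Peucker simplification of a list of (x, y) points."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical place: a square with a small peninsula on its northern edge.
precise = [(0, 0), (10, 0), (10, 10), (6, 10), (5, 10.8), (4, 10), (0, 10), (0, 0)]
simplified = simplify(precise, tolerance=1.0)  # the peninsula is below tolerance

obs = (5, 10.4)  # an observation on the peninsula
print(point_in_polygon(*obs, precise))     # tested against the precise shape
print(point_in_polygon(*obs, simplified))  # tested against the simplified shape
```

In this toy case the simplified ring is just the square, so the observation on the peninsula counts as inside the precise place but appears outside the outline that gets drawn.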

I’m pretty sure this is what’s causing the issue with the raccoon atlas. The atlas view shows all the “out-of-atlas” observations within the green boundaries, but piecing together the shapes from the individual places (via screenshots, not the API), there’s a part that isn’t covered. This is exactly where the out of bounds observations are from.

Is it really necessary to simplify the shapes for display? The precise shapes display on the individual place pages. And would it be possible to have the API return the precise shapes? This is an issue both for people who download iNat’s simplified range maps and for people who download the simplified place boundaries. Any editing that’s done on the simplified version and then re-uploaded will overwrite the true, more precise shapes.