Where do atlas boundary definitions come from?

Hi Jane,

The issue with atlases is that the ‘seams’ need to be perfect (i.e. no cracks) so that we can run queries like “show me all observations of this species not in the US, MX, CA” without getting observations that fall in a crack between MX and the US. This makes editing these boundaries hard, because if you change one boundary it no longer lines up perfectly with its neighbor.
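To make the ‘crack’ problem concrete, here is a minimal sketch in plain Python. It is only an illustration, not how iNaturalist actually stores or queries places: the two rectangular ‘countries’, the function names and the coordinates are all made up. It uses a standard ray-casting point-in-polygon test and shows that once one side of a shared border is edited on its own, an observation near the border belongs to no place at all.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) is inside the polygon.

    Points lying exactly on an edge are not handled specially.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def places_containing(x, y, places):
    """Names of all places whose polygon contains the point."""
    return [name for name, poly in places.items()
            if point_in_polygon(x, y, poly)]


# Hypothetical toy places that are meant to share the border y = 1.0
north = [(0, 1.0), (2, 1.0), (2, 2), (0, 2)]
south_clean = [(0, 0), (2, 0), (2, 1.0), (0, 1.0)]
# Same place after someone nudged its border without touching the neighbor
south_edited = [(0, 0), (2, 0), (2, 0.9), (0, 0.9)]

observation = (1.0, 0.95)  # just south of the intended border

clean = places_containing(*observation, {"north": north, "south": south_clean})
cracked = places_containing(*observation, {"north": north, "south": south_edited})
print(clean)    # the observation lands in "south"
print(cracked)  # the observation lands in no place at all
```

With the edited border, a “not in north, not in south” query would return this observation even though it clearly sits inside the combined area.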

Some issues with atlases:

  1. Natural Earth has admin 0 (countries) and admin 1 (states) data but no admin 2 (counties). Also, at 1:10m it’s not very high resolution, which leads to lots of issues like “I was in Mexico, why is my observation showing up in the US?” because the Rio Grande is approximated too coarsely.

  2. That’s why we’ve been using GADM, which is much higher resolution and has admin level 2 data. But the main problem with GADM is that they maintain a very high-res coastline for their places. Because we want to (a) include a bit of ocean and (b) make these geometries as simple as possible, we have to do a ton of processing of the GADM shapes to create a coastal buffer while still keeping clean seams between the places.

  3. GADM level 2 data often seems to be ‘wrong’, so we’ve had to swap out the data for some countries (like MX) with alternative datasets, which is a ton of work. And that ‘official’ MX dataset introduced all the issues you’ve mentioned, like islands being in the wrong spot.

  4. There are ongoing requests to include some sort of atlas places that cover the ocean. But what would these look like? See issue 5 here https://www.inaturalist.org/pages/atlases

Part of me regrets trying to take on admin level 2 (county) data globally as standard places / in atlases. Most of the issues seem to come from these data. Maintaining all this for just states and countries would be a much easier task, and I think 90% of the atlas stuff could be done with just states and countries.

One idea is to do a more careful audit of the GADM data, i.e. first make sure everyone is on board with the countries, then for each country try to make sure the community is on board with the level 1 (states) data, and only then make a call about whether the level 2 data is ‘correct’ and worth including. It would be a lot of work, but it might be better to find these issues up front rather than discover them down the road.

Also, Robert Hijmans, who is behind GADM, is excited to work with us as his busy schedule allows.

The tricky thing about atlases is that they are meant to be essentially a perfect grid of polygons covering the earth’s surface that people can toggle on and off to approximate a species range. Doing so with a regular grid would be far easier, but the idea is that by using political boundaries as ‘grid cells’, the cells will have more cultural meaning and we can tap into other datasets like the distribution map here http://www.plantsoftheworldonline.org/taxon/urn:lsid:ipni.org:names:77155191-1
But using real political boundaries also creates a lot of issues. Do people think this basic vision of atlases (i.e. using political boundaries to create a ‘grid’ that covers the earth’s surface in order to approximate species ranges) is useful? If so, I’d like to shore up the above issues and come up with a better process for maintaining these places.
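
For comparison, here is a rough sketch of why a plain lat/lon grid would be so much simpler. This is not something iNaturalist does; the 10-degree cell size, the function names and the example coordinates are all assumptions made for illustration. Every point maps to exactly one cell by arithmetic alone, so there are no seams to maintain, and a range is just the set of cells toggled on.

```python
import math

CELL_DEGREES = 10  # hypothetical cell size, chosen only for this example


def grid_cell(lat, lon, size=CELL_DEGREES):
    """Return the (row, col) index of the grid cell containing a point."""
    row = math.floor((lat + 90) / size)
    col = math.floor((lon + 180) / size)
    return (row, col)


def in_range(lat, lon, cells):
    """True if the point falls in one of the cells toggled on."""
    return grid_cell(lat, lon) in cells


# A made-up species range: the cells containing two example points
species_range = {grid_cell(19.4, -99.1), grid_cell(34.0, -118.2)}

print(in_range(15.0, -95.0, species_range))  # same cell as the first point
print(in_range(50.0, 10.0, species_range))   # a cell that was never toggled on
```

The cost is the one described above: cells like these carry no cultural meaning and don’t line up with datasets that are organized by country or state.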
