I am a botanist at the Royal Botanic Garden Edinburgh, currently working on a checklist of the flora of the high Andes of Peru, defined here as areas above 4,500 m elevation.
I am hoping to create and curate an iNaturalist project called “High Andean Flora of Peru”, which would build enthusiasm and serve as a hub for Peruvian taxonomists to identify and discuss observations from above 4,500 m.
To do this, I would like to create a place called High Andes of Peru, defined using an elevation-based polygon outlining all areas above 4,500 m. This area is approximately 80,000 km², well within the size limit for new places. However, likely due to the high resolution of the polygon I generated, the KML file I attempted to upload is 55 MB, well above the file size limit for creating a place. I am reluctant to downgrade the resolution of the file as this would incorporate observations outside my study area.
Does anyone have suggestions for how to address this issue? Alternatively, would another curator be willing to create this place on my behalf? I would love to be able to use iNaturalist data in my research!
Considering that some error will already be introduced by the large accuracy radii of some observations, an overly precise polygon might not help, especially if you’ll be studying species that are endangered and thus have their locations obscured.
Thank you for your reply. For my work, I can only use observations with precise geolocations. None of my checklist species are formally protected, so their locations will not be obscured unless it is by the user.
The size limit for curators is 5 MB, and 55 MB is an order of magnitude larger, so I don’t think having a curator add it is an option. Staff have noted that highly complex places put a major strain on iNat’s infrastructure. If high-resolution filtering is needed, I’d suggest adding a buffered, less complex place to iNat that includes all observations of interest, and then filtering by your own high-resolution criteria offline.
The size limit applies to everyone, so curators don’t have any special abilities here.
If you really don’t want to adjust the resolution of your file, the easiest solution might be to split your KML into multiple smaller KMLs that can be used collectively in searches or a project.
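As a rough illustration of the splitting idea, here is a minimal Python sketch. It assumes the KML uses the standard `http://www.opengis.net/kml/2.2` namespace and stores the area as many separate `<Polygon>` elements; the function name `split_kml_polygons` and the "N polygons per file" grouping are my own choices for illustration, not anything iNat prescribes, and I have not checked how iNat treats the resulting `MultiGeometry` files.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"  # assumption: standard KML 2.2 namespace
ET.register_namespace("", KML_NS)


def split_kml_polygons(kml_text, polygons_per_file):
    """Return a list of KML strings, each holding at most polygons_per_file polygons."""
    root = ET.fromstring(kml_text)
    polygons = list(root.iter(f"{{{KML_NS}}}Polygon"))
    chunks = []
    for start in range(0, len(polygons), polygons_per_file):
        # Build a fresh, minimal KML document for this group of polygons.
        kml = ET.Element(f"{{{KML_NS}}}kml")
        doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        name = ET.SubElement(placemark, f"{{{KML_NS}}}name")
        name.text = f"part {len(chunks) + 1}"
        multi = ET.SubElement(placemark, f"{{{KML_NS}}}MultiGeometry")
        multi.extend(polygons[start:start + polygons_per_file])
        chunks.append(ET.tostring(kml, encoding="unicode"))
    return chunks
```

Grouping by polygon count is only a crude proxy for file size. Grouping by region would need a spatial step that this sketch does not attempt.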
Alternatively, you can download the iNat data and process it in your own GIS software. That might not be practical if you have a lot of observations on your checklist, though.
As a possible solution to your challenge with the KML file, perhaps you could break up the 4,500 m polygons into smaller departmental places (Junín above 4,500 m, Pasco above 4,500 m). Then you could have a “child” collection project for each iNat place and an umbrella project for “High Andean Flora of Peru”.
If you did create places and projects of this sort, I’d be very interested to identify some of those observations.
If this works well, you might be able to create places for lower elevation slices (e.g. 3,500–4,500 m).
The polygon size limits are there so that no one can add a place that will have a severe negative impact on site performance. A 55 MB file is far too complex to be added, I’m sorry.
To contribute a useful observation, there has to be x and y geolocation information. As far as I know, if x and y are available, why shouldn’t z (altitude) be too? Then you could filter your observations by altitude, which, as I understand your request, is what you require.
The interplay of observational accuracy and elevational gradients reduces the utility of your assumption on elevations. We can all think of examples, especially in the Andean region, where an x-y difference of just a hundred meters or so could make a difference of hundreds or thousands of meters in elevation! Two workarounds: (a) assume, possibly falsely, that locations are accurate to within some acceptable bounds and accept the elevational errors which might result, OR (b) download the data and cull any observations with an unacceptably large accuracy circle (e.g. > 100 m, > 500 m, etc.), then assume the elevational placements of the remaining subset are sufficiently accurate.
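Workaround (b) could look something like the Python sketch below. It assumes each downloaded observation is a dict (for example a row from `csv.DictReader`) with a `positional_accuracy` value in meters. That column name is my assumption about the export, so check it against your own file. The function name and the 100 m default are just illustrative.

```python
def filter_by_accuracy(rows, max_accuracy_m=100.0):
    """Keep only observations whose accuracy circle is at most max_accuracy_m."""
    kept = []
    for row in rows:
        # Assumed column name; adjust if your export labels it differently.
        value = (row.get("positional_accuracy") or "").strip()
        if not value:
            continue  # no accuracy recorded: treat as unknown and drop it
        if float(value) <= max_accuracy_m:
            kept.append(row)
    return kept
```

Dropping observations with no recorded accuracy is a conservative choice. You may prefer to keep them and review them by hand.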
Tony, could you comment on @rupertclayton’s suggestion of creating a set of, say, 10 or 20 small polygons, each representing a localized collection project, then creating an umbrella project (or not) for the amalgamation of those?
Based on the size of your KML file, I imagine your multi-polygon has a very large number of vertices. Have you considered compromising by creating a simpler convex polygon (bounding or internal) for each of your areas? You would have to experiment with such configurations to minimize the eventual KML file, and be comfortable with the expected loss (or gain) of data points resulting from such simplified “study areas”.
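To make the bounding-polygon option concrete, here is a minimal sketch of a convex hull (Andrew's monotone chain) over the vertices of one area. It treats longitude and latitude as flat x-y coordinates, which is an approximation, and the function names are my own.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull of (x, y) tuples in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

A hull only ever adds area, so it will pull in some terrain below 4,500 m. An internal polygon would need a different approach.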
Definitely what I would do: download (precisely georeferenced) iNat observations, then automatically select only the points falling within the bounds of the 4500m+ polygons.
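For anyone without GIS software to hand, the selection step can be sketched in plain Python with a ray-casting point-in-polygon test. This assumes each polygon is a list of (longitude, latitude) vertices for its outer boundary only (holes are ignored) and that each observation is a dict with numeric `longitude` and `latitude` keys. Those names are illustrative assumptions.

```python
def point_in_ring(lon, lat, ring):
    """Ray casting: count how many ring edges a ray heading east from the point crosses."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def select_observations(observations, rings):
    """Keep observations that fall inside at least one outer ring."""
    return [
        obs for obs in observations
        if any(point_in_ring(obs["longitude"], obs["latitude"], ring) for ring in rings)
    ]
```

With a polygon as detailed as a 55 MB KML this brute-force loop would be slow, so a spatial index or a proper GIS library would be the more practical choice for real data.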
But… @magsylombard is looking to create a place-based project to “build enthusiasm and serve as a hub for Peruvian taxonomists to identify and discuss observations from above 4,500 m.” Using a downloaded dataset would be fine for one individual’s research and analysis, but won’t allow a community of iNat users to collaborate on the same live data.
Yes, I think the best solution probably lies in combining a few approaches. First, we should admit that DEMs are particularly imprecise in steep terrain. Then, we should accept the idea of creating “roughly accurate” places that cover all terrain in a department above 4500 m as well as some below that.
We do need a way to generate the smaller KML files. That could be via copying and editing the 55 MB KML in another tool: breaking it into regional polygons and removing vertices to simplify. Or maybe it’s as basic as drawing new polygons by tracing the 4500 m contour.
The result could still be quite successful. I don’t think anyone is going to be shocked to find a few observations included that actually were observed at 3800 m.
That sounds (to me) like a very different endeavour, one that is best dealt with at the researcher level, not the platform level. If it’s a matter of organizing social interactions between enthusiastic taxonomists, a forum thread or journal post (with comments) goes a long way, assuming iNaturalist is a suitable platform for that. Even a project using coarse/national boundaries would still allow discussing the topic, even if some invalid/low-elevation/misplaced data would have to be filtered out later by the researcher (as is usual).
Does the place really need to be so precise? It seems to me that for your purposes it would not be the end of the world if a few patches of land that are, say, 4,450 m above sea level made it into the place, especially given the standard error in location pins on iNat (which I think will be present even if you focus on observations with small accuracy circles). It sounds like your gpx is extremely detailed, so my guess is that you could save a lot on file size, with small losses in overall accuracy, by running an algorithm like this one: https://en.wikipedia.org/wiki/Ramer–Douglas–Peucker_algorithm. Implementations for gpx files already seem to exist online, which should make this very easy: https://gpx.studio/help/toolbar/minify
If you were to take this route, I personally would probably first redo the process you used to make the original 4,500 m gpx track, this time with a slightly lower elevation threshold. An algorithm like RDP is as likely to add areas as subtract them, and I imagine that for something like a checklist you may want to err on the side of including things just below the threshold rather than excluding things above it.
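For reference, the Ramer–Douglas–Peucker idea fits in a few lines of Python. This is a minimal sketch rather than a replacement for the existing tools: it works on a list of (x, y) tuples, and `epsilon` is in the same units as the coordinates, so with longitude and latitude it is in degrees (0.001° of latitude is roughly 111 m).

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def rdp(points, epsilon):
    """Drop vertices that lie within epsilon of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the two endpoints.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep that vertex and simplify the two halves independently.
        left = rdp(points[:index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]
```

The recursive form is easy to read but can hit Python's recursion limit on very long rings, so an iterative version or an established library is safer for a file of this size.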
It should be OK as long as it’s within the restrictions we have for new places (area, number of observations, file size). Note that a user can only add three new places a day.
There are several major issues with elevation data in observations.
One is in the observation itself. Elevation is the least precise portion of GPS data and can vary substantially on the same device in the same location over just a few seconds.
Another is the location accuracy. Even if the uncertainty circle is not large, people placing locations after the fact from non-GPS-enabled devices are often less than precise about the locations, and this can easily result in observations being marked at very different elevations than they actually are.
Another is the inaccuracy of elevations in the basemap itself. Having worked at high elevations in the Andes and compared the GPS elevation, the map topographic elevation, and the Google Earth (and other dataset) elevations, I have found that they can all vary quite a bit for the same point.
All that said, it is worth noting that there are some high-elevation Andean projects with large and complex shapes. Search for projects with páramo in the title for examples. Most of the large ones are in Colombia and Ecuador, but there is one in Peru.
Also, iNat observations have their locations linked with Macrostrat. It might be possible to come in from that direction instead, but if it is possible I don’t know how you’d go about it. @pisum might be able to offer some suggestions on that angle.