Where to draw the line between helpful and hurtful places regarding burden on servers

jfox16 · July 4, 2024, 6:35pm

Despite the definite need for improvements, like how places on maps stay orange when they should turn green, I enjoy the places feature of iNat. Sometimes, when deciding which state parks to visit, I’ll take a look at their checklists to get an idea of what to look for while there, and sometimes I even factor in whether I’m likely to find many new species to add to my life list.

However, I am very aware of the burden this feature places on iNat’s servers. Because of that, I’ve been very careful about creating places; to my recollection, I’ve only created one or two. One of them, for example, is a riverside park on 110 acres of land with ~500 observations.

But what would you consider to be the boundary between places with a net positive effect, which help track the species present in a specific location, and those with a net negative effect, which burden the servers without serving a practical purpose? Do places this small have a negligible effect on servers, or not?

jdmore · July 4, 2024, 8:42pm

iNaturalist staff have pretty much answered this question through the limitations they have put on place creation:

A new Place can’t contain more than 100,000 observations in its boundary

A new Place can’t be larger than about the size of the US state of Texas (about 695,662 sq. km or 268,596 sq. mi.)

You can only make 3 new Places per day

KML files can be no larger than 1 MB in size (5 MB for Curators)

As for having a net positive effect, I think that is highly case-specific, and probably impossible to measure.

cthawley · July 5, 2024, 12:28am

I think if a user is going to regularly make use of a place (make a project, track species there, work with others, use it as motivation to observe more) and it meets the iNat guidelines above, they should feel free to create it.

mferreira · July 5, 2024, 9:33am

Personally I have been rather liberal in defining new places, abiding by the aforementioned guidelines. I have defined new places:

to intersect other places (for example, https://www.biodiversity4all.org/places/rede-natura-2000-na-lousa is the intersection of https://www.biodiversity4all.org/places/serra-da-lousa-natura-2000-sci and https://www.biodiversity4all.org/places/lousa--3),
to add administrative/management boundaries that do not fit within traditional categories (are not countries, districts, municipalities, etc. - for example https://www.biodiversity4all.org/places/aigp-serra-da-lousa),
to delimit regions which might be particularly rich in biodiversity and thus deserve special attention (for example https://www.biodiversity4all.org/places/mata-do-sobral-co-pt),
to match geographic units that are used by several organizations in their maps and databases (for example UTM grid zones such as https://www.biodiversity4all.org/places/ne63-lousa-miranda-do-corvo-sul-trevim-gondramaz-co-pt).

Sometimes it’s frustrating not to be able to add more than 3 new places per day (I wish I could define several UTM grid zones at once, now that I am trying to add more records to the database https://flora-on.pt - each such place is defined by just 4 points, so it shouldn’t be much of a burden) but I’m fine with that restriction if it is necessary to prevent spam and abuse.

mferreira · July 5, 2024, 9:42am

Perhaps a user should be prompted to justify the creation of a new place and perhaps provide the URL for a detailed description of that place, in the same way that a user must provide some references to justify changes in the taxonomy…

cthawley · July 5, 2024, 12:19pm

The number of points used to define a place does affect “cost”, but so do the size of the place and the number of observations it contains, with larger places having a higher cost. So even a “simple” place that is large would place a burden on the system.

Specifically for UTMs, I think that it would be a large burden if many UTM grids were added to iNat. I also have a hard time seeing why UTM grids are particularly useful as iNat places specifically. For formal analysis or use with databases, I imagine that any points could be IDed by/assigned to UTM zone after download quite easily.
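As a rough illustration of assigning downloaded points to UTM cells outside iNat, here is a minimal sketch. It assumes WGS84 coordinates, the standard 6-degree zones (ignoring the Norway/Svalbard exceptions), and a 10 km cell size; the function names are hypothetical, and the cell is returned as plain numeric indices rather than MGRS-style letter codes. The datum and cell size a particular database uses may differ, so treat the output as something to check against known cells.

```python
import math

# WGS84 ellipsoid constants and the UTM scale factor.
A_AXIS = 6378137.0
F = 1 / 298.257223563
E2 = F * (2 - F)          # first eccentricity squared
EP2 = E2 / (1 - E2)       # second eccentricity squared
K0 = 0.9996


def utm_zone(lon):
    """Standard 6-degree UTM zone number (no Norway/Svalbard exceptions)."""
    return int((lon + 180) // 6) % 60 + 1


def to_utm(lat, lon):
    """Return (zone, easting_m, northing_m) via the usual series expansion."""
    zone = utm_zone(lon)
    lon0 = math.radians((zone - 1) * 6 - 180 + 3)  # central meridian
    phi = math.radians(lat)
    n = A_AXIS / math.sqrt(1 - E2 * math.sin(phi) ** 2)
    t = math.tan(phi) ** 2
    c = EP2 * math.cos(phi) ** 2
    a = (math.radians(lon) - lon0) * math.cos(phi)
    # Meridian arc length from the equator to latitude phi.
    m = A_AXIS * (
        (1 - E2 / 4 - 3 * E2**2 / 64 - 5 * E2**3 / 256) * phi
        - (3 * E2 / 8 + 3 * E2**2 / 32 + 45 * E2**3 / 1024) * math.sin(2 * phi)
        + (15 * E2**2 / 256 + 45 * E2**3 / 1024) * math.sin(4 * phi)
        - (35 * E2**3 / 3072) * math.sin(6 * phi)
    )
    easting = 500000 + K0 * n * (
        a
        + (1 - t + c) * a**3 / 6
        + (5 - 18 * t + t**2 + 72 * c - 58 * EP2) * a**5 / 120
    )
    northing = K0 * (
        m
        + n * math.tan(phi) * (
            a**2 / 2
            + (5 - t + 9 * c + 4 * c**2) * a**4 / 24
            + (61 - 58 * t + t**2 + 600 * c - 330 * EP2) * a**6 / 720
        )
    )
    if lat < 0:
        northing += 10_000_000  # false northing in the southern hemisphere
    return zone, easting, northing


def grid_cell(lat, lon, cell_m=10_000):
    """Return (zone, easting index, northing index) of the enclosing cell."""
    zone, easting, northing = to_utm(lat, lon)
    return zone, int(easting // cell_m), int(northing // cell_m)
```

Applied to the latitude and longitude columns of an exported observation file, `grid_cell` would give each record a cell key that can be grouped or joined against another database, without any place having to exist in iNat.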

Regarding the suggestion that users be prompted to justify the creation of a new place: for smaller places, I don’t think this is necessary myself. For one thing, there isn’t really a process whereby places are evaluated by curators or staff and acted upon. A place doesn’t change anything (like an ID) that other users depend on. Additionally, for many places, there might not be an online source for their boundaries, either because they are from parts of the world where online sources for places often don’t exist or because they are solely informal/user-defined (yard projects, parts of school grounds, etc.)

pisum · July 5, 2024, 1:00pm

the more often the place is actually used, the more justification for its existence. if you’re not going to use it, and nobody else is likely to use it either, or if you’re only going to use it once or twice, it probably doesn’t need to exist in the system.

to the extent that you can use the Explore page instead of a checklist, creating a place without a checklist should reduce the burden on the system.

you can get this information without needing a place.

you could approximate the boundaries of a place using a box or a circle, and then you could look for the species in that area that you haven’t seen. for example: https://www.inaturalist.org/observations?lat=34.43156871694205&lng=-81.6890998058282&place_id=any&radius=6.753734887265693&subview=map&unobserved_by_user_id=jfox16
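a link like the one above can also be generated rather than typed by hand. this is only a sketch: the parameter names (`lat`, `lng`, `radius`, `place_id`, `subview`, `unobserved_by_user_id`) are taken from that link, the function name is made up, and the radius being in kilometers is my understanding rather than something stated here.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.inaturalist.org/observations"


def explore_circle_url(lat, lng, radius_km, unobserved_by=None):
    """Build an Explore link for a circle instead of a saved place."""
    params = {
        "lat": lat,
        "lng": lng,
        "radius": radius_km,   # assumed to be kilometers
        "place_id": "any",
        "subview": "map",
    }
    if unobserved_by:
        # Only show species this user has not yet observed.
        params["unobserved_by_user_id"] = unobserved_by
    return BASE_URL + "?" + urlencode(params)


print(explore_circle_url(34.5, -81.7, 6.75, "jfox16"))
```

the center and radius in the example call are rounded placeholder values, not the exact ones from the link.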

here’s a somewhat taxonomic view of the information above (which is effectively the same information that you would get from the unobserved species view in your Dynamic Life List, specifying circle or box instead of a place): https://jumear.github.io/stirfry/iNat_observations_taxonomy?verifiable=true&lat=34.43156871694205&lng=-81.6890998058282&radius=6.753734887265693&unobserved_by_user_id=jfox16

this describes another way to visualize this kind of information without having to define any boundaries: https://forum.inaturalist.org/t/identifriday-is-the-happiest-day-of-the-week/26908/1631.

here’s another way to get similar information: https://forum.inaturalist.org/t/a-tool-to-help-you-fill-local-data-gaps-easily-missed/37575.

mferreira · July 5, 2024, 2:02pm

I use iNat extensively for mapping plant species that are either rare on their own (e.g. Potentilla montana) or indicative of rare habitats (e.g. Arenaria montana). Much of that information is then uploaded to the platform https://flora-on.pt which is maintained by the Portuguese Botanical Society and is the reference for plant distributions in Portugal. That database is organized by UTM grid zones. The assessment of the conservation status of a particular species takes into account the number of UTM grid cells where that species was detected (among several other parameters, of course). Therefore, it is relevant to “swipe” UTM grid cells in search of endangered species.

That’s where iNaturalist comes in. In my (voluntary) field work I use the iNaturalist app all the time to check where I am, where I’ve been, where I saw that species before, where I am most likely to find it - in a particular UTM grid cell that I am exploring that day. For this purpose, I use the “explore” feature and search for observations in a given place, that place being the UTM grid cell whose boundary has been previously defined and uploaded. By doing that, not only do I see the boundaries on the map but I also see what has already been observed in that cell.

Does this represent an excessive and unjustifiable burden on iNat servers? I don’t know. But this seems legitimate to me, it dramatically increases the efficiency of my field work, so I’ve been doing it.

pisum · July 5, 2024, 2:30pm

i don’t know what you’re doing exactly, but i suspect you might be able to do your analysis more efficiently outside of the system. i think this might describe something similar to what you’re doing: https://forum.inaturalist.org/t/self-made-distribution-maps-in-qgis/33891.

for UTM grid cells that you don’t use often you can always approximate the cell using a box instead of a place.
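as a sketch of the box approach: take the corner points of a cell, compute the smallest latitude/longitude box that contains them, and put the SW and NE corners in the link. the `swlat`, `swlng`, `nelat`, `nelng` parameter names are the ones i understand the website to use for boxes, the corner coordinates below are invented, and the function names are made up. because UTM cell edges are not exactly aligned with meridians and parallels, the box will slightly overshoot the cell, and this does not handle boxes that cross the antimeridian.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.inaturalist.org/observations"


def bounding_box(points):
    """Smallest lat/lng box containing all (lat, lng) corner points."""
    lats = [lat for lat, _ in points]
    lngs = [lng for _, lng in points]
    return {
        "swlat": min(lats),
        "swlng": min(lngs),
        "nelat": max(lats),
        "nelng": max(lngs),
    }


def explore_box_url(points):
    """Build an Explore link for the box around the given corners."""
    return BASE_URL + "?" + urlencode(bounding_box(points))


# Invented corners of a roughly 10 km cell, listed as (lat, lng).
cell_corners = [(40.0, -8.3), (40.09, -8.3), (40.09, -8.18), (40.0, -8.18)]
print(explore_box_url(cell_corners))
```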

mferreira · July 5, 2024, 2:54pm

In the comment immediately before yours I described what I have been doing.

Defining boxes is trivial on the website, but that functionality does not exist in the mobile app (to my knowledge). I do all this mapping voluntarily with the resources available, which means using the cell phone rather than a tablet, for example. On the small screen of a cell phone it is not practical to open iNaturalist in the browser, so I only use the app during my field work - and the app only allows me to search for observations in a specific UTM grid cell if I previously defined that cell as a “place”. That’s the issue.

If the “explore” feature in the app is updated to allow the definition of SW and NE corners, as can be done in the browser, then the definition of UTM grid cells becomes almost irrelevant, because the boundaries of those cells are approximately (though not exactly) along meridians and parallels.

jasonhernandez74 · July 5, 2024, 9:37pm

How regularly is “regularly”? If the place is a specific school grounds created for a class project, does conducting the project every school year make the place a net positive?

mferreira · July 12, 2024, 10:04am

Perhaps we would be able to make better decisions about creating or not creating new places if we understood better the computational burden that they represent.

When does a new place require the most computational time? Is the burden high only when the new place is created, because the system needs to check all records for those that belong to that place? Is it high every day / every hour / … because the assignment of observations to places is updated periodically? Is it still high each time a new observation is uploaded or a new search is performed? If so, how high is it compared with the computational burden of the CV algorithm (which I expect to be immensely higher), the upload of the images, etc.? To what extent does it depend on the number of points in the place’s boundary? To what extent does it depend on the number of observations in that place?
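To make the last two questions concrete, here is a naive sketch of how observations could be matched to a place boundary. This is not how iNaturalist does it, as far as I know; real systems typically rely on spatial indexes and bounding-box prefilters. It is a plain ray-casting test with hypothetical function names, treating latitude and longitude as flat coordinates, and it only shows that in the simplest approach the work grows with both the number of boundary points and the number of observations checked.

```python
def point_in_polygon(lat, lng, polygon):
    """Ray-casting test; polygon is a list of (lat, lng) vertices.

    Each call looks at every edge once, so the cost per observation
    grows with the number of boundary points.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lng1 = polygon[i]
        lat2, lng2 = polygon[(i + 1) % n]
        # Does this edge straddle the point's latitude?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that latitude.
            cross = lng1 + (lat - lat1) * (lng2 - lng1) / (lat2 - lat1)
            if lng < cross:
                inside = not inside
    return inside


def observations_in_place(observations, polygon):
    """Naive assignment: one polygon test per (lat, lng) observation."""
    return [obs for obs in observations if point_in_polygon(obs[0], obs[1], polygon)]


# A 4-point place, like a grid cell, and two made-up observations.
square = [(0, 0), (0, 10), (10, 10), (10, 0)]
print(observations_in_place([(5, 5), (15, 5)], square))
```

Under this naive model, a 4-point place is about as cheap as a boundary can be per observation, but the total still scales with how many observations have to be tested, which matches the point that size and observation count matter as well as boundary complexity.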

We need to understand this better in order to weigh the pros and cons of creating a new place - and only the programmers can enlighten us about this. @cthawley any clues?

Even if the programmers don’t disclose that information, perhaps they can estimate the burden of the places already created by a user and define a limit to that burden. Once the limit is exceeded, the user will not be allowed to create new places unless he/she deletes old places (but then, we also need to know whether the deletion process would decrease or increase the computational burden…).

By the way, so far I believe there’s no practical way to get a list of all the places that I have already defined: I have to remember their names and then search for each of them. Either I personally keep a record of all the places that I created, or at some point I might forget some of them and lose track of how many there are. Making it easier to see that list might discourage users from creating new places. Like: “Wow, I already defined 100 different places, I will refrain from defining a new one!”

system · September 10, 2024, 10:04am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
