Published article on approaches to reporting sensitive species

cmcheatle · January 3, 2021, 2:20pm

Placed in the curators section simply because only curators can update these data on iNat, but by all means open to discussion for all.

Referenced in the year end GBIF newsletter.

https://docs.gbif.org/sensitive-species-best-practices/master/en/

An interesting read. To me the biggest takeaway was ‘document, document and then document some more’, something that iNat sorely lacks in terms of any framework for managing its workflow and decisions related to obscuring. Interesting too that the strong recommendation is for generalization, not randomization of locations, which is the opposite approach to what is done on iNat.

jakob · January 3, 2021, 3:10pm

Thanks for sharing the link, looks like a very useful read!

colincroft · January 3, 2021, 3:45pm

Thanks for sharing this. I’m sure your two concerns (‘something that iNat sorely lacks in terms of any framework…’ and ‘generalization, not randomization of locations, which is the opposite approach to what is done on iNat…’) are obvious to curators and iNat experts. Any chance you could elaborate a bit on these for the rest of us (or direct us to threads where these are discussed)? Thanks!

cmcheatle · January 3, 2021, 3:52pm

The discussions are really scattered and not centralized, but to summarize (others can elaborate if they wish):

Any curator can change the obscuring status of any species for any reason they see fit (this is banned in Canada, with a note to that effect, but not technically prevented). There is no record of who made the change, no documentation or discussion is required, and any ‘review’ that takes place can happen in any number of non-standard places (forum, flags, working group projects, etc.).
Generalization means showing the observation as being in a standard defined place, be that a grid square, political division, etc., e.g. showing this observation as being in Toronto county, Canada. Randomization means showing the observation at a randomly calculated point. iNat does this for obscured species: it calculates a random location within 0.2 degrees of latitude and longitude, which works out to roughly a 22 × 22 km box depending on exactly where on the planet you are, and displays that.
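As a rough check on the "roughly 22 × 22 km" figure, here is a minimal sketch that converts a 0.2 degree box to kilometres at a few latitudes. It assumes a spherical Earth with about 111.32 km per degree of latitude; `KM_PER_DEGREE` and `box_size_km` are illustrative names, not anything from iNat.

```python
import math

# Assumption: spherical Earth, about 111.32 km per degree of latitude.
KM_PER_DEGREE = 111.32

def box_size_km(latitude_deg, side_deg=0.2):
    """Return (north_south_km, east_west_km) for a box side_deg degrees on a side."""
    north_south = side_deg * KM_PER_DEGREE
    # Degrees of longitude shrink with the cosine of the latitude.
    east_west = side_deg * KM_PER_DEGREE * math.cos(math.radians(latitude_deg))
    return north_south, east_west

for lat in (0.0, 42.0, 60.0):
    ns, ew = box_size_km(lat)
    print(f"latitude {lat:>4}: {ns:.1f} km north-south by {ew:.1f} km east-west")
```

Under these assumptions the north-south side stays near 22 km everywhere, while the east-west side narrows away from the equator, which is the "depending on exactly where on the planet you are" part.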

colincroft · January 3, 2021, 7:13pm

This is very helpful–thank you!

cthawley · January 3, 2021, 10:29pm

I would say that iNat does both generalization and randomization for obscured data. The current obscuring process is pretty much generalization to a grid box. The randomization element is mostly for display on maps and such. I do think that obscured data could be more useful if they were generalized in a way that allowed standardization for some analyses (0.2 degree grids aren’t really used in most analyses that I know of).

The obscuring process also does generalize to locations to some degree. For example, if the whole obscuring box is in a location (eg Ontario), I think iNat does show that observation as occurring in Ontario. But there’s obviously a lot of noise in terms of whether whole boxes fall inside areas or not, and smaller areas won’t get any obscuring boxes inside them, so they’ll be data deficient.

Also, I agree this is a really good/interesting read. Thanks for posting!

cmcheatle · January 3, 2021, 11:19pm

The text string associated with the record is generalized. The coordinates displayed are randomized. As an example: https://www.inaturalist.org/observations/66071087

To be clear, it does not use predefined 0.2 degree grids; it calculates a grid of 0.2 degrees from the exact location as submitted for each individual record and randomizes within that.

astra_the_dragon · January 3, 2021, 11:28pm

really? if that were the case, it would be easy to calculate the true location. Am I misunderstanding?

cmcheatle · January 3, 2021, 11:30pm

How? It calculates the maximum and minimum coordinates possible using the 0.2 degrees and then chooses a random number within that range. So say, for ease, your observation is at 42.2 north by 100.6 west.

It determines the range to use is from 42.0 to 42.4 north and 100.4 to 100.8 west and calculates a random latitude and longitude within that range, using more decimal places than I typed here.
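The procedure described in this post can be sketched as follows. This is only an illustration of the description above, not iNat's actual code; `randomize_around` and its parameters are hypothetical names, and west longitude is written as a negative number.

```python
import random

def randomize_around(lat, lon, half_width=0.2, rng=random):
    """Pick a uniformly random point within half_width degrees of (lat, lon)."""
    lat_min, lat_max = lat - half_width, lat + half_width
    lon_min, lon_max = lon - half_width, lon + half_width
    return rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max)

# The example from the post: 42.2 north, 100.6 west.
shown_lat, shown_lon = randomize_around(42.2, -100.6)
print(f"displayed at {shown_lat:.5f}, {shown_lon:.5f}")
```

Each call yields a different point somewhere between 42.0 and 42.4 north and between 100.4 and 100.8 west.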

astra_the_dragon · January 3, 2021, 11:38pm

yes, but the center of that rectangle is the original location. surely the rectangle shown on the map, in which the pin is randomly located, is not centered on the true location…
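The concern here can be made concrete with a small sketch: if the displayed rectangle really were the per-record range from the example (42.0 to 42.4 north, 100.4 to 100.8 west), its midpoint would give back the original coordinates regardless of where the random pin landed. `box_center` is a hypothetical helper written for this illustration.

```python
def box_center(lat_min, lat_max, lon_min, lon_max):
    """Return the midpoint of a latitude/longitude rectangle."""
    return (lat_min + lat_max) / 2, (lon_min + lon_max) / 2

# Rectangle from the example above, with west longitude as negative numbers.
center_lat, center_lon = box_center(42.0, 42.4, -100.8, -100.4)
print(f"{center_lat:.1f}, {center_lon:.1f}")
```

This is why a box centred on the true location would defeat the obscuring, and why it matters how the displayed box is actually chosen.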

cmcheatle · January 3, 2021, 11:50pm

The box displayed to the user is not centered on the original location. I’m not sure how it is calculated, perhaps around the randomized location, but it is not centred on the true one.

For example I know exactly where both these records are (in both cases I took the observer there to see it in case someone thinks I am using some other means to determine the actual location) and can assure you the true spot is not at the centre of either box.

https://inaturalist.ca/observations/9359274
https://inaturalist.ca/observations/58834480

bouteloua · January 3, 2021, 11:52pm

The rectangle grids themselves are fixed for everyone. You can see this when browsing around on the map in observation-dense areas:

cthawley · January 4, 2021, 1:33pm

To the best of my knowledge, this is incorrect. The grids are predefined. You can test this by making two observations relatively close to each other. They should then have the same obscuring box.

For example, take two of my obscured observations taken at more or less the same location (within 100m of each other):
https://www.inaturalist.org/observations/17295294
https://www.inaturalist.org/observations/13261346

The obscuring box has the same boundaries for both.
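A minimal sketch of why a predefined grid would give this result: every point is assigned to the cell that contains it, so two nearby points almost always share a box. The 0.2 degree cell size comes from this thread; aligning cells to multiples of 0.2 degrees is an assumption, and `grid_cell`, `cell_bounds` and the coordinates are made up for illustration.

```python
import math

CELL_DEG = 0.2  # cell size from the thread; the grid alignment is an assumption

def grid_cell(lat, lon, cell_deg=CELL_DEG):
    """Return the (row, column) index of the fixed grid cell containing a point."""
    return math.floor(lat / cell_deg), math.floor(lon / cell_deg)

def cell_bounds(cell, cell_deg=CELL_DEG):
    """Return (lat_min, lat_max, lon_min, lon_max) for a grid cell index."""
    row, col = cell
    return (row * cell_deg, (row + 1) * cell_deg,
            col * cell_deg, (col + 1) * cell_deg)

# Two hypothetical observations well under 100 m apart.
first = grid_cell(35.1230, -80.4560)
second = grid_cell(35.1235, -80.4563)
print("same box:", first == second)
print("box bounds:", cell_bounds(first))
```

The exception is a pair of points that straddles a cell edge: they land in different boxes even when they are only metres apart.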

cmcheatle · January 4, 2021, 3:25pm

From looking at the code (with the caveat that I don’t know Ruby, so I am trying to decipher a language I don’t know, and assuming I found the right code), it seems I was more correct in what I initially wrote than I thought. I wrote an example with 1 decimal place but said I assumed they use more.

However, it appears they do just that: round the box used to 1 decimal place, thus effectively reusing the same grids.

I tried to find an example where nearby observations would happen to fall into different rounded grids.

I found this : https://inaturalist.ca/observations?place_id=6883&taxon_id=27137

All these observations are truly located on Pelee Island; that’s the only place in the province the species is found (which is completely open knowledge too, hence writing it). Yet there is roughly 45 km between the northernmost and southernmost obscured coordinates. The whole island is about 11 km long, which is roughly 0.1 degrees. And the northern reported points are about 20 km north of the north end of the island.

So what appears to be happening is that records at 42.8xxx are being rounded down to 42.8 and going into that grid, while ones at 42.7xxx are being rounded down to 42.7 and presumably going into that grid.

Interestingly there are effectively no records obscured to the east of the island which supports the rounding.

I’d suggest a compromise in saying we both have parts of the approach right: it is calculating the box based on the actual location, but doing it at a low level of precision (1 decimal place), which leads to the reuse of the same boxes.
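One reading of this description, sketched under the assumption that "rounding down" means truncating the coordinate to one decimal place: records on either side of a 0.1 degree line get different anchors, while records between two lines share one. `truncate_to_tenth` and the sample latitudes are hypothetical and not taken from iNat's code.

```python
import math

def truncate_to_tenth(value):
    """Round a coordinate down to one decimal place."""
    return math.floor(value * 10) / 10

# Two hypothetical records roughly 200 m apart, either side of a 0.1 degree line,
# plus a third record a few kilometres north of the second.
south_record = 42.799
north_record = 42.801
far_north_record = 42.850

for latitude in (south_record, north_record, far_north_record):
    print(latitude, "->", truncate_to_tenth(latitude))
```

If the box is then built from the truncated value rather than the exact one, the same boxes get reused for every record that shares an anchor, and the box is no longer centred on the true location.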

Most importantly to confirm the original question, the box itself is not centred on the actual location.

system · March 5, 2021, 3:25pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
