Published article on approaches to reporting sensitive species

Placed in the curators section simply because only curators can update these data on iNat; by all means open to discussion for all.

Referenced in the year end GBIF newsletter.

https://docs.gbif.org/sensitive-species-best-practices/master/en/

An interesting read. To me the biggest takeaway was ‘document, document and then document some more’, something that iNat sorely lacks in terms of any framework for managing its workflow and decisions related to obscuring. Interesting too that the strong recommendation is for generalization, not randomization of locations, which is the opposite approach to what is done on iNat.

Thanks for sharing the link, looks like a very useful read!

Thanks for sharing this. I’m sure your two concerns (‘something that iNat sorely lacks in terms of any framework…’ and ‘generalization, not randomization of locations, which is the opposite approach to what is done on iNat…’) are obvious to curators and iNat experts. Any chance you could elaborate a bit on these for the rest of us (or direct us to threads where these are discussed)? Thanks!

The discussions are scattered rather than centralized, but to summarize (others can elaborate if they wish):

  • any curator can change the obscuring status of any species (with a note that it is banned in Canada but not technically prevented), for any reason they see fit. There is no documentation of who made the change, no documentation or discussion is required, and any ‘review’ that takes place can happen in any number of non-standard places (forum, flags, working group projects, etc.).

  • generalization means showing the observation as being in a standard defined place, be that a grid square, political division, etc. I.e., show this observation as being in Toronto county, Canada. Randomization means showing the observation at a randomly calculated point. iNat does this for obscured species: it calculates a random location within 0.2 degrees of latitude and longitude, which works out to roughly a 22 × 22 km box depending on exactly where on the planet you are, and displays that.
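As a rough check on the 22 km figure, here is a small sketch that converts 0.2 degrees into kilometres at a few latitudes. It assumes a spherical Earth with about 111.2 km per degree of latitude; the function name and constants are illustrative only and are not taken from iNat.

```python
import math

KM_PER_DEGREE_LAT = 111.2  # assumption: spherical Earth, mean radius ~6371 km
CELL_DEGREES = 0.2         # size of the obscuring box quoted in this thread


def box_size_km(latitude_deg, cell_degrees=CELL_DEGREES):
    """Return (north-south km, east-west km) of a cell at a given latitude."""
    north_south = cell_degrees * KM_PER_DEGREE_LAT
    # East-west distance per degree shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for lat in (0, 42, 60):
    ns, ew = box_size_km(lat)
    print(f"lat {lat:>2}: {ns:.1f} km north-south by {ew:.1f} km east-west")
```

Under these assumptions the box is about 22 km tall everywhere, but only about half as wide at 60 degrees latitude as at the equator, which is the "depending on where on the planet you are" part.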

This is very helpful–thank you!

I would say that iNat does both generalization and randomization for obscured data. The current obscuring process is pretty much generalization to a grid box. The randomization element is mostly for display on maps and such. I do think that obscured data could be more useful if they were generalized in a way that allowed standardization for some analyses (0.2 degree grids aren’t really used in most analyses that I know of).

The obscuring process also generalizes to locations to some degree. For example, if the whole obscuring box is in a location (e.g. Ontario), I think iNat does show that observation as occurring in Ontario. But there’s obviously a lot of noise in terms of whether whole boxes fall inside areas or not, and smaller areas won’t get any obscuring boxes inside them, so they’ll be data deficient.

Also, I agree this is a really good/interesting read. Thanks for posting!

The text string associated with the record is generalized. The coordinates displayed are randomized. As an example: https://www.inaturalist.org/observations/66071087

To be clear, it does not use predefined 0.2 degree grids; it calculates a grid of 0.2 degrees from the exact location as submitted for each individual record and randomizes within that.

Really? If that were the case, it would be easy to calculate the true location. Am I misunderstanding?

How? It calculates the maximum and minimum coordinates possible using the 0.2 degrees and then chooses a random number within that range. So say, for ease, your observation is at 42.2 north by 100.6 west.

It determines the range to use is from 42.0 to 42.4 north and 100.4 to 100.8 west and calculates a random latitude and longitude within that range, using more decimal places than I typed here.
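To make the description above concrete, here is a minimal sketch of randomizing within a box built around the submitted point. It follows the ranges in the example (0.2 degrees either side of the true location) and is only an illustration of the idea as described in this post, not iNat's code; `randomize_around` and `HALF_WIDTH` are made-up names.

```python
import random

HALF_WIDTH = 0.2  # degrees either side of the true point, as in the example above


def randomize_around(lat, lon, half_width=HALF_WIDTH, rng=random):
    """Pick a uniformly random point in a box centred on the true location."""
    lat_min, lat_max = lat - half_width, lat + half_width
    lon_min, lon_max = lon - half_width, lon + half_width
    return rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max)


# True location from the example: 42.2 north, 100.6 west (west is negative).
shown_lat, shown_lon = randomize_around(42.2, -100.6, rng=random.Random(1))
print(shown_lat, shown_lon)
```

Passing a seeded `random.Random` just makes the sketch repeatable; a real system would not want a predictable seed.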

Yes, but the center of that rectangle is the original location. Surely the rectangle shown on the map, in which the pin is randomly located, is not centered on the true location…
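The concern can be shown in a few lines: if the displayed rectangle really were centred on the true point, anyone could recover that point from the rectangle's corners. This is a hypothetical sketch of that scenario only, with made-up function names.

```python
def centred_box(lat, lon, half_width=0.2):
    """Box corners if the box were centred on the true location (hypothetical)."""
    return (lat - half_width, lon - half_width, lat + half_width, lon + half_width)


def box_centre(box):
    """Midpoint of a (south, west, north, east) box."""
    south, west, north, east = box
    return (south + north) / 2, (west + east) / 2


box = centred_box(42.2, -100.6)
# The midpoint matches the true location up to floating-point rounding.
print(box_centre(box))
```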

The box displayed to the user is not centered on the original location. I’m not sure how it is calculated, perhaps around the randomized location, but it is not centred on the true one.

For example, I know exactly where both these records are (in both cases I took the observer there to see it, in case someone thinks I am using some other means to determine the actual location) and can assure you the true spot is not at the centre of either box.

https://inaturalist.ca/observations/9359274
https://inaturalist.ca/observations/58834480

The rectangle grids themselves are fixed for everyone. You can see this when browsing around on the map in observation-dense areas.

To the best of my knowledge, this is incorrect. The grids are predefined. You can test this by making two observations relatively close to each other. They should then have the same obscuring box.

For example, take two of my obscured observations taken at more or less the same location (within 100m of each other):
https://www.inaturalist.org/observations/17295294
https://www.inaturalist.org/observations/13261346

The obscuring box has the same boundaries for both.
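The test described above can be sketched as follows. The assumption here (not confirmed from iNat's code) is a fixed grid of 0.2-degree cells aligned to whole degrees; `grid_cell` is a made-up helper and the coordinates are invented for illustration.

```python
import math

CELLS_PER_DEGREE = 5  # assumption: fixed 0.2-degree cells aligned to whole degrees


def grid_cell(lat, lon):
    """Return (south, west, north, east) of the fixed cell containing a point."""
    # The tiny epsilon guards against floating-point error right on a cell edge.
    row = math.floor(lat * CELLS_PER_DEGREE + 1e-9)
    col = math.floor(lon * CELLS_PER_DEGREE + 1e-9)
    return (row / CELLS_PER_DEGREE, col / CELLS_PER_DEGREE,
            (row + 1) / CELLS_PER_DEGREE, (col + 1) / CELLS_PER_DEGREE)


# Two invented points roughly 100 m apart.
a = grid_cell(44.2310, -76.4810)
b = grid_cell(44.2318, -76.4805)
print(a, b, a == b)
```

With a fixed grid, nearby points share a box unless a cell edge happens to run between them; with a box computed separately around each record, the two boxes would differ slightly.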

From looking at the code (with the caveat that I don’t know Ruby, so I am trying to decipher a language I don’t know, and that’s assuming I found the right code), it seems I was more correct in what I initially wrote than I thought. I wrote an example with 1 decimal place but said I assumed they use more.

However, it appears they do just that: round the box used to 1 decimal place, thus effectively reusing the same grids.

I tried to find an example where nearby observations would happen to fall into different rounded grids.

I found this: https://inaturalist.ca/observations?place_id=6883&taxon_id=27137

All these observations are truly located on Pelee Island; that’s the only place in the province the species is found (which is completely open knowledge too, hence writing it). Yet there is roughly 45 km between the northernmost and southernmost obscured coordinates. The whole island is about 11 km long, which is roughly 0.1 degrees. And the northern reported points are about 20 km north of the north end of the island.

So what appears to be happening is that records at 42.8xxx are being rounded down to 42.8 and going into that grid, while ones at 42.7xxx are being rounded down to 42.7 and presumably going into that grid.
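A quick sketch of the arithmetic behind this, assuming (my assumption, not something confirmed from the code) fixed 0.2-degree bands with an edge at 42.8 and about 111.2 km per degree of latitude. `latitude_band` is a made-up helper.

```python
import math

KM_PER_DEGREE_LAT = 111.2  # assumption: spherical Earth


def latitude_band(lat):
    """Return (south, north) of the fixed 0.2-degree band containing lat."""
    # Multiply by 5 rather than divide by 0.2; epsilon guards cell-edge rounding.
    row = math.floor(lat * 5 + 1e-9)
    return row / 5, (row + 1) / 5


south_a, north_a = latitude_band(42.79)  # just south of the 42.8 line
south_b, north_b = latitude_band(42.81)  # just north of it
# Combined north-south extent over which the obscured points can be displayed.
span_km = (north_b - south_a) * KM_PER_DEGREE_LAT
print((south_a, north_a), (south_b, north_b), round(span_km, 1))
```

Two records a couple of kilometres apart on either side of the line end up in adjacent bands, so their displayed points can spread over 0.4 degrees of latitude, which is in the same ballpark as the roughly 45 km observed.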

Interestingly there are effectively no records obscured to the east of the island which supports the rounding.

I’d suggest a compromise in saying we both have parts of the approach right: it is calculating the box based on the actual location, but doing it at a low level of precision (1 decimal place), which leads to the reuse of the same boxes.

Most importantly to confirm the original question, the box itself is not centred on the actual location.