Obscure It or "Fudge It"?

We held an iNat event recently and there were a few experts on hand to help those who were new. At one point, there was a discussion about using “obscure” and different circumstances when it might be appropriate. Two circumstances discussed were if an observation was at your own residence or if it was a plant sensitive in your area.

A botanist in our group who is also an iNat expert (and who has used iNat extensively for land management research) said that, rather than obscuring, he preferred that people move the “dotted” location by a few blocks (or more) and then draw a larger circle on the map so that the actual location falls within that circle. If I remember his reason correctly, it was because it gave those doing research a better idea of the location (vs. the larger obscured option which, here on the coast, many times shows things out in the ocean).

Wondering what other researchers think and prefer? Before I knew about the obscure option (when I was new), I did exactly what he is suggesting to protect plants that (here) are sensitive, animals that I know are hunted, and also my residence (which has a couple wild acres on it), but then learned about obscure and now use that…now I’m wondering if doing it the other way is more helpful for researchers?

I completely disagree with this. The true locations of obscured observations still have the potential to be available to researchers through multiple trust options. These generally require the researcher to know how to use them and the user to be responsive, but I think overall this is a better system. This is how it was designed. Sure, if you’re a researcher who doesn’t have an iNat account, “fudging it” may give a better general idea, but research isn’t quantified on general ideas.

How much this matters depends on the group, but for what I work on, a difference of a few meters can be a huge difference. If someone is fudging data but I don’t know because I download from GBIF, that could really throw off my results. Or I would just exclude those records because of the large accuracy value. If it’s obscured, that’s at least noted in a standardized way.

For observations made at my urban apartment/balcony, I use a pinned location with a largish location circle (~500m) not centered on my residence. This gives a better idea of the location and habitat than obscuring (which would likely put some observations in agricultural areas) and keeps all observations made at the same site together on the map while concealing my exact address. Since this is about observer privacy rather than taxon privacy and my residence isn’t likely to be home to rare organisms with extremely limited habitat requirements, this seems like a better compromise than obscuring. Even if researchers can request access to the true coordinates of obscured observations, it requires an extra effort that isn’t going to be worthwhile in every case, so using a moderately large accuracy circle is likely to make the observations more useable for more research purposes. It is fairly obvious from my profile notes that my apartment observations do not include my exact address; if for some reason someone might need more specific information, they can ask, which would not be more work than asking me to trust them with true coordinates for obscured observations.

(This method might not be as feasible if you have a large rural property with several acres, as a circle large enough to obscure the location of your residence might also be large enough that valuable habitat information is lost.)
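For anyone who wants to see the offset-circle idea in concrete terms, here is a minimal sketch in Python. It is not an iNat feature or API: the function names `offset_pin` and `haversine_m`, the 30% to 90% offset range, and the example coordinates are all assumptions made up for illustration. It moves the pin a random distance in a random direction, but always less than the chosen circle radius, so the true location stays inside the circle.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def offset_pin(true_lat, true_lon, radius_m, rng=random):
    """Return a pin displaced from the true location by 30-90% of radius_m.

    The displacement is always smaller than radius_m, so a circle of
    radius_m drawn around the returned pin still contains the true location.
    The 30-90% range is an arbitrary choice for this sketch.
    """
    bearing = rng.uniform(0, 2 * math.pi)
    distance = rng.uniform(0.3, 0.9) * radius_m
    delta = distance / EARTH_RADIUS_M  # angular distance in radians
    phi1, lmb1 = math.radians(true_lat), math.radians(true_lon)
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(bearing)
    )
    lmb2 = lmb1 + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lmb2)


# Arbitrary example coordinates, not a real residence.
true_lat, true_lon = 47.6, -122.3
pin_lat, pin_lon = offset_pin(true_lat, true_lon, radius_m=500)
print(haversine_m(true_lat, true_lon, pin_lat, pin_lon) < 500)
```

Reusing the same pin for every observation from the site (rather than drawing a new random one each time) is what keeps them grouped on the map, and it also avoids letting the average of many pins drift back toward the true spot.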

On the other hand, I would probably use the obscured function if the privacy concerns are related to the taxon being observed (rare species or a find for which I don’t want to reveal the true location at present), as this means the true coordinates are still recorded and can be made available by trusting users or changing the privacy settings of the observation.

I think there are several problems with that approach, especially if there is no further indication that the location is not precise.

  1. The data is simply wrong and corrupts scientific research;

  2. If the circle is too big, iNat will automatically label the location as too imprecise and the observation will not show up on distribution maps anyway;

  3. You’re deceiving people who think the location is precise and wasting their time as they travel to the location and find nothing there.

When I travel, I always have some target species that I want to see. Since I don’t have time to just try my luck like I would around my home town, I do a thorough preparation. This includes studying other people’s observations to maximize my chances once I get there. I absolutely hate it when I find an observation of an extremely rare butterfly, only to open the observation and find the pointer is in the middle of a city square. Some people, instead of selecting the exact location on the map, just type the name of the closest city and have the software put the pointer in the city center. But at least in this case, I know the location is imprecise and cannot be trusted. If the pointer is deliberately set at a random false location, I’d have no way of knowing that it cannot be trusted.

I couldn’t agree more. I know of a case where a prolific observer gave deliberately wrong locations for plants on the basis that “I know where it is”, without consideration for the fact that they were corrupting a public database.

Not at all. The situation being considered explicitly mentioned using a circle large enough to encompass the true location (i.e., the size of the circle clearly indicates that the pin is not precise). Nothing deceptive here. There is no reason to assume that an organism is located at the center of a large circle. If that were the case, the circle could have been made much smaller.

Is it not also possible to download the true metadata from the photos? I believe this option becomes unavailable if the observation is obscured.

I must add that in the UK, where records from iNat are collected by the national recording schemes, the ‘fudge it’ option is officially preferred and promoted. Most recording schemes simply discard observations with obscured locations, because they cannot access the true location from the downloaded data.

Such data is not ‘wrong’ at all, in any sense. All scientific data has uncertainty, and all scientists must assess that. If they do not, it is the scientist, not the data, who is corrupting research. If I give a measurement as “10 +/- 2” and the real value is 8.2, then my measurement is correct, a scientist who says it is ‘simply wrong’, is simply not reading the data correctly.
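To spell out the arithmetic in that example, here is a tiny Python sketch. The function name `within_uncertainty` is made up for illustration. A reported value with a stated uncertainty is a claim about an interval, and the claim holds whenever the true value falls inside it.

```python
def within_uncertainty(reported, uncertainty, true_value):
    """True if true_value lies within reported +/- uncertainty."""
    return abs(reported - true_value) <= uncertainty


print(within_uncertainty(10, 2, 8.2))  # True: 8.2 is inside [8, 12]
print(within_uncertainty(10, 2, 7.5))  # False: 7.5 is outside [8, 12]
```

A pin with an accuracy circle works the same way in two dimensions: the claim is that the organism was somewhere within the circle, not at its center.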

Agreed.

Those who argue that “researchers can always access the true location/date of obscured observations” fail to account for the fact that their hands would then be tied in how they use that data. It has to be anonymized or subsumed with other observations to maintain the obscuration.

I maintain a database of observation data where users can access individual observations. It doesn’t display lat/long coordinates on the website, but it’s still possible that a user could glean more information about an obscured observation than would be available on the iNat website. We came to the conclusion that the only “safe” course of action is to exclude any observations that have the geoprivacy set to obscured/private. With observations that have taxon geoprivacy set to obscured, we do our own version of obscuration. In some ways, this goes far beyond what iNat does (for example, I obscure all associated observations made by the observer on the same date that might provide clues about where the sensitive observations were made).

I much prefer if folks “fudge” their locations for sensitive observations (within reason), provided they adjust the uncertainty to an appropriate value. The same does not apply to dates, however.

What stops someone with ill intent from creating a “fake” project in order to get access to obscured observations? I see a number of very similar lepidoptera related projects in our area. I have to wonder if some of them weren’t created just to give the creators access to obscured observations (not to imply that their purposes are nefarious).

A project creator can only access the true coordinates of obscured records from a given user if a) said user has actively joined the project, and b) that user actively agreed to trust the project admin with their hidden coordinates.

I often find myself double checking the veracity of the stated location/date for observations. My first stop is to check the EXIF via the iNat website. I find that the lat/long is rarely displayed, either because the camera did not record it (many don’t), or because the submission method (e.g., iPhone app) suppresses it.

But as technology evolves, this may become more of a concern.

There’s a separate discussion in progress around how to edit photo EXIF data in cases where the lat/long in the EXIF is misleading (ie. where a specimen is photographed at a location that doesn’t coincide with where it was originally collected).

Isn’t the second factor a blanket setting - ie. it applies to all projects that the observer has joined?

If so, someone setting up a “fake” project just has to pick a legitimate sounding name, and then send a reasonable sounding request to an observer asking them to join the project.

I’ve done this many times. Yes, the receiver of my request can find our website and see that we’re actually using observation data as advertised, but I wonder how many of these observers bother to check. I’m guessing if one were to set up a project purely for the purpose of getting access to obscured observations, more often than not, folks would join when asked.

no it isn’t; you give separate permission each time you join a different project

for sure, but if a bad actor wanted to do this, they needn’t even bother wasting their time setting up a project; they could just directly message a user, either asking the user to send their coordinates or asking the user to ‘trust’ their account directly within iNat to access them

I think the answer will ultimately depend on each user’s case, what level of data obscuration/protection they need, and how they hope their data will be used. I use both methods myself. For observations at my residence, I fudge it (circle <1km which contains the true location). For most other sensitive organisms, I obscure (and take some additional precautions to prevent the true location from being inferred). For things I consider more sensitive, I fudge more on a case by case basis (much larger accuracy circle).

There’s always a tradeoff. For research use, common cutoffs are 30m, 100m, 500m, and 1km. Some researchers may be excluding observations at/above any of those accuracies. For some uses, obscured observations might be fine (e.g., county level). But if you can fudge in the lower end of those values, then that data might still be useful to researchers. If you have projects/users that are the main users of the type of data you collect that you are willing to trust, obscuration may still be very good for sharing the data for research. One advantage of obscuration others have noted is that the true original location is retained - one doesn’t need to worry about remembering it or storing it somewhere else, and it is easier to share with other users as needed.
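As a rough illustration of how those cutoffs play out, here is a minimal Python sketch of the kind of filter a researcher might apply to downloaded records. The field name `positional_accuracy` (accuracy in meters) and the sample records are assumptions for illustration only; real exports may name the field differently. For example, Darwin Core data such as GBIF downloads use `coordinateUncertaintyInMeters`.

```python
CUTOFFS_M = (30, 100, 500, 1000)  # the common cutoffs mentioned above, in meters


def usable_records(records, cutoff_m):
    """Keep records whose accuracy is known and strictly below cutoff_m.

    Records at or above the cutoff, or with no accuracy value, are dropped.
    """
    return [
        r for r in records
        if r.get("positional_accuracy") is not None
        and r["positional_accuracy"] < cutoff_m
    ]


def survivors_by_cutoff(records, cutoffs=CUTOFFS_M):
    """Count how many records survive each cutoff."""
    return {c: len(usable_records(records, c)) for c in cutoffs}


# Made-up sample records.
sample = [
    {"id": 1, "positional_accuracy": 8},      # precise pin
    {"id": 2, "positional_accuracy": 95},     # small fudge
    {"id": 3, "positional_accuracy": 500},    # moderate fudge
    {"id": 4, "positional_accuracy": 28000},  # very large value, standing in for an obscured record
    {"id": 5, "positional_accuracy": None},   # no accuracy recorded
]

print(survivors_by_cutoff(sample))
# {30: 1, 100: 2, 500: 2, 1000: 3}
```

In this sketch the record fudged to 95 m survives the 100 m cutoff and the one fudged to 500 m only survives the 1 km cutoff, while the record with the very large accuracy value is excluded at every cutoff.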

I think an off center accuracy circle large enough to be truthful would make the data more valuable to science long term. Maybe in 50 or 100 years when you aren’t around to trust anyone with your obscured locations, and the organism in question hasn’t been reported in the county in decades, someone doing restoration work will want to know which areas it was once found in with much greater precision than the obscured rectangle.

Also, if you’re observing several of the same species over time (for example, returning to the same patch of rare flowers yearly) the “fudging it” method helps make it clear that these observations do represent just one patch.

There are other reasons for obscuring to a much wider area (protecting the private property of friends from trespassers seeking to verify or collect being number one for me) but from a purely “how useful will this be to science” perspective, I think an offset accuracy circle as small as one feels comfortable making it would be the most useful.

i’m not sure what additional precautions you’re taking, but if you input the true underlying coordinates on your observation, they’re never truly hidden in the system, even if you use the system’s obscure function, and even if you’re not participating in one of the affiliate networks that get the underlying information.

as far as i know, very few people know how to get to the underlying coordinates for any obscured observation, but it was still possible last time i checked.

this can be an advantage, especially when the user gets the information from GBIF rather than from iNaturalist directly.

the real advantage in my mind though, is that because you’re effectively lowering the precision of your location, it’s one additional layer of protection for the true location of your observation.

to do this successfully though, you need to make sure the photos you upload don’t contain the original true coordinates in their metadata, since those are recorded and made public for photos associated with observations that have not been obscured or made private using the system’s geoprivacy functions.

+1 for obscuring

I understand your point and I agree that the larger the circle, the bigger the chance the pinned location is not exact. However, until now I had always assumed people did that mainly because they simply couldn’t remember the exact location, even though they tried their best. I have never assumed that people would deliberately put the pin in a false location (instead of simply obscuring the observation) and I’m quite shocked that this apparently happens a lot.

The mere size of the circle absolutely does not indicate to me whether the pin is precise or not. People have all kinds of reasons to deliberately make the circle bigger, even if the pin is exactly at the right spot. Just last week I heard of someone who would make the circles around bird and mammal observations extra large “to account for the fact that the animal moved around and could be found anywhere within a radius of several kilometres.” To me, there is no reason to assume that an organism is not located at the center of a large circle.

What is within reason and an appropriate value, though? Wouldn’t this be different for everyone? If you’re talking about big distances like several kilometres, why not just obscure the observation altogether? Or if it’s a much smaller distance, like within a short walking distance from the true location, you’d have to consider that someone who really wants to find the organism could just start looking around the false pin and find what they’re looking for anyway.

By doing thorough research like looking at old descriptions, Google Maps and height maps, I managed to pinpoint locations for rare butterflies in the vast emptiness of the Turkish steppe and the Arabian desert and found the butterflies within an hour. And that was without the help of any nearby iNat observations. If you’d put the pin of an organism that someone really wants to find at only a stone’s throw away from the actual location, that wouldn’t protect the true location at all.

How is an observation with a large location circle not centered on the location where the organism was seen “deliberately putting the pin in a false location”? It is merely not providing precise information about the exact location. This is the same regardless of whether I do not know exactly where I saw it or whether I do not wish to reveal exactly where I saw it. In neither case do I guarantee that the organism was in the middle of the circle; rather, I am claiming that the organism was somewhere in the circle – something that is true regardless of the reason for the size of the circle.

Nor does the fact that people sometimes choose a circle that is larger than necessary mean that the organism is at the center in all cases. As you note, people have lots of reasons for using large circles. It would be incorrect to assume that the center point is both known and correct and that the circle was simply made “unnecessarily” large.