The vanishing of a fellow iNatter

But this “gaping hole” is essentially true of almost all content posted on the internet and later deleted by the OP. Another user could have downloaded/copied it. The downloading process is made more efficient by the availability of an API, but is not exclusive to it.

So this “gaping hole” is really a feature of data on the internet in general - I don’t think it really says anything specific to iNaturalist that should strongly influence the discussion about this topic. People in general shouldn’t post anything online that they want to be sure isn’t accessible by others.

Despite this feature of the internet, almost all sites I know of allow users control over deleting their profile and most/all of their content. I looked into what other social networks allow for in terms of deletion. Here’s a summary for Twitter and Facebook based on the info I could find:

Twitter allows you to delete all of your content after a 30-day waiting period. You must deactivate your account and not log in for 30 days. If you log in during those 30 days, the account is reactivated. If you don’t, after 30 days all your content (tweets, pics, likes, everything) is deleted.

Facebook allows deactivation but also immediate deletion. With deactivation, your profile disappears and most of your content becomes invisible to other users, but the account can be reactivated. Deletion removes your profile and all content that you posted. However, it apparently doesn’t delete comments on other people’s content, and maybe some other things (this is a little fuzzy). Facebook explicitly reminds you that other people might have downloaded your content when you delete your account. There is also a 30-day grace period during which Facebook can recover your account after deletion.

7 Likes

My point is that iNat’s principle to always allow users to ‘control’ the use of their data is a hollow one while this feature exists (and, as you point out, while the Internet exists.)

Comparing the data integrity of iNat identifications to Tweets and Facebook posts seems a very long bow to draw. Just because they do it, it doesn’t mean that it’s suitable for iNat.

And I am not saying stop the API doing what it does. I am just pointing out the inconsistency. iNat’s principle may be well-meaning, but it is also unrealistic, unachievable and damaging to the body of knowledge stored here. Bulk data deletion should not be an option. Keep the data and anonymise the user.

10 Likes

I know people who have left the site but didn’t delete their account.

3 Likes

Should this process become a reality, don’t spend your time deleting observations. This can be automated by software that needs only a token to access the account whose observations are to be deleted. Proof of concept:

But in order to spare server resources, this should be done directly on the server side, with one or a few SQL queries against the database. Should this process be approved, it wouldn’t make sense to have to do it by hand, or with software using the API, even if the staff is busy. You are not asking for a sophisticated user interface; on the contrary, this is something very simple, like:

That would have to be a question for our lawyers. I should note they told us that if we are contacted by next-of-kin, we are to forward that to our lawyers and go from there. As far as I know this hasn’t happened, and neither has any sort of will involving iNat come to our attention.

We have a merge tool that staff use for these requests. It’s easy to use, although depending on scale a merge can take a long time. The most time consuming part is getting people to email/message from both accounts to verify they own both of them.

We can’t control that, just like if I delete a tweet it might have been already screenshotted or archived by others. We can respond to requests for deletion to the best of our abilities, which involves what we control - that’s it.

10 Likes

What if, when a user deletes their account, all of their observations, IDs, etc. get merged into a “deleted user” account that only the staff can access? This way the account itself is deleted but all of that person’s postings still exist, just in a slightly different form.

For observations, there would be an issue with regard to the licence terms chosen by the author.

4 Likes

I get that. In which case why bother deleting data at all? “Because we can” doesn’t seem a great argument. The principle appears to serve nobody effectively, creates a rod for your own backs and threatens the integrity of the data collected here.

6 Likes

As I stated earlier, because some of us are quite happy having that option available to us and don’t want it taken away. If it makes you feel better about it, think of it as the person who is deleting the account that is deleting the data. But then that’s the heart of the problem really.

1 Like

The point isn’t that GBIF will delete it, the point is that once at GBIF, anyone can download a dataset and keep it. It’s in the wild - you don’t have control and you can’t delete once it has been picked up by someone. Russell notes the same is true with the iNat API.

There is a fundamental misconception that someone can post material on an open website (e.g. iNat) and later ‘reclaim’ all rights. But that is not how it works. I have GBIF downloads from several years ago that include iNat observations. Hypothetically, if every one of those observations was subsequently deleted or removed from the RG dataset, I would still have the information from that point in time. The download could not be re-created, but it still exists as a research tool.

Internet Archive
Here is an example of an archived observation: https://web.archive.org/web/20230110001533/https://www.inaturalist.org/observations/7897082
Again, this will not be removed if the original iNat post is deleted.

Fully agree with this sentiment.

1 Like

As I said, iNat has created a rod for its own back. :man_shrugging:

I disagree that there is any “inconsistency.” iNat’s consistent principle is basically that it allows user control over what is and is not posted from their accounts within certain limits. Users can post content that meets the Terms of Use and, depending on the licenses that users select, iNat publishes it in different ways/to different partners. Users are fully informed of this data sharing. Users can also delete content from their account, in which case iNat doesn’t publish it anymore. iNat doesn’t promise or imply that users have ultimate control over their data beyond iNat, which seems (based on my reading) to be part of the objection.

I’ve looked for statements about ultimate control of data in iNat’s Terms of Use, Privacy Policy, and account creation process and not found any obvious inconsistencies. The ToU states: “If You delete Content, iNaturalist will use reasonable efforts to remove it from the Platform, but You acknowledge that caching or references to the Content may not be made unavailable immediately.” Which is specific to iNat and only promises reasonable efforts.

On the forum, tiwane has noted:

That statement is pretty clearly restricted to iNat, and provides both a principle and a legal reason for doing so. iNat is not just deleting data

which is a straw man. This is one of the reasons that information about how other websites handle account deletion is relevant: because they are subject to the same laws as iNat and working within a similar set of rules/constraints.

I don’t think either of iNat’s statements/principles above are:

You ask:

but other users have already offered several potential reasons for why users might want to delete data previously, both in this thread and elsewhere, which you do not respond to specifically or acknowledge. Just because it was possible at some point in time for someone to have accessed/downloaded data a user posted (which we shouldn’t assume is a given), doesn’t mean that deleting that specific data later has no utility for a specific poster. At a broad level there have been many court cases and laws passed worldwide in the past five years explicitly giving users the ability to delete content that they post (or even content about them that they did not! in some cases). Evidently both users and governments consider rights to control/delete data that has been posted/gathered to be quite important.

To be clear, I personally do think that there are ways to improve the way that iNat approaches users who become/want to become inactive or wish to delete some or all of their data. However, I don’t think that arguments of “inconsistency” above are valid reasons to entirely remove the options for users to delete content they post.

12 Likes

I do think this is a useful exploration of alternatives for handling IDs associated with a deleted account. We should probably start by acknowledging that whatever approach we propose must comply with data protection legislation such as GDPR. This summary of GDPR’s right to be forgotten makes clear that there is a presumption that “an individual has the right to have their personal data erased” unless an exception applies. There is one exception that seems potentially relevant for iNat IDs:

  • The data represents important information that serves the public interest, scientific research, historical research, or statistical purposes and where erasure of the data would likely to impair or halt progress towards the achievement that was the goal of the processing.

iNat’s mission is “to connect people to nature and advance biodiversity science and conservation.” iNat staff have always made clear that the scientific value of the data never overrules the pre-eminence of connecting people to nature, and I interpret that to require respecting the individual choices of iNat users. But perhaps we can look at various ways to respectfully implement the right to be forgotten and choose the one that preserves the most scientific value. (It’s possible that other approaches could be appropriate where a user is closing their account but doesn’t want to invoke a right to be forgotten.)

[Edited to add this para] One side note… for those who have said “Why bother deleting data at all?” the two simple answers are (a) it is legally required in several jurisdictions and (b) it’s consistent with iNat’s commitment to respect users’ control over their own data. If a user says “I want to delete my data”, iNat should respect that for the data it controls. The existence of copies of that data made by others is not iNat’s responsibility.

As others have mentioned, some platforms provide the option to deactivate a user profile, and that might have some relevance for handling IDs by iNat users who have died, but it’s not an appropriate response to the right to be forgotten, where the departed identifier’s identity needs to be thoroughly obscured. I think this means that the user’s user name, real name and profile must be deleted. That may also be true in most instances for contributed content such as observations, photos and journal posts, although it may be worth considering how to treat material that was expressly provided under an open license.

IDs would seem to lack the type of creative component needed for copyright (although undoubtedly they require considerable skill). But if we see substantial scientific value in retaining ID records in some way, these must no longer be attributable to a user who wants to be forgotten. Given that, let’s consider our options for how such a user’s IDs might be handled.

Do we choose to include such IDs in the standard Community Taxon logic? Keeping them part of the ID logic would be least disruptive in the short term as we would avoid the phenomenon of many observations losing RG status or moving to higher taxon IDs. But we’re baking in a long-term problem where these orphaned identifications remain part of the logic forever regardless of whether they’re right or wrong. Of course, that’s a problem we already have on a larger scale with inactive users, but here we can’t kid ourselves that the user is going to return after a really long iNat break.

Because of the problems outlined above, I prefer the option where the departed user’s IDs are deleted and replaced with an auto-generated comment, perhaps like this:

  • This observation was identified as Genus specio on 9 January 2023 by a user who has since deleted their account.

The user identity has been “forgotten”. The Community ID won’t be based on an identification that the user is never going to reconsider. But the scientific information that someone once thought this thing was Genus specio has been preserved. Active iNat users can then include this piece of unattributed information in their own ID processes: “Do I think there’s good evidence that this is Genus specio or maybe something else?”
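As a rough illustration of that replacement step, here is a minimal Python sketch. The `Identification` record and the `tombstone_comment` function are hypothetical names made up for this example, not iNat’s actual data model or code. The only point is that the generated comment keeps the taxon and the date and drops the username.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Identification:
    # Hypothetical record for illustration; not iNaturalist's real schema.
    user_login: str
    taxon_name: str
    created_on: date


def tombstone_comment(ident: Identification) -> str:
    """Build an unattributed comment that keeps only the taxon and the date."""
    # Day without a leading zero, then the full month name and the year.
    day = f"{ident.created_on.day} {ident.created_on:%B %Y}"
    return (
        f"This observation was identified as {ident.taxon_name} on {day} "
        "by a user who has since deleted their account."
    )


ident = Identification("some_user", "Genus specio", date(2023, 1, 9))
comment = tombstone_comment(ident)
print(comment)
```

In this sketch the original identification would be deleted once the comment has been written, so nothing in the stored text links back to the account.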

I was also somewhat tempted by the idea of using an auto-generated pseudonym for the username. That might preserve some value in terms of seeing that “deleted_user_12345” identified 47 other observations in the same genus. But we need to consider that users invoking the right to be forgotten may be motivated by safety concerns and anything that serves to create a composite portrait of the person’s activity is probably a bad idea.

Lastly, on a different point, for those people who choose to delete all their photos from their phone/camera/computer after uploading them, please remember that iNat is NOT a way to back up your photos.

10 Likes

Data from iNaturalist is used in state and local biodiversity record keeping projects. Would this data be lost if an account is deleted or closed due to death of the account holder?

If someone dies we don’t do anything to their data. Someone would have to log into the account and perform deletions.

I would assume that people/organizations that use data export it, but that’s up to them.

7 Likes

Thanks for this, it’s truly appreciated. While we may not agree it’s good to have the conversation. There may be some light amongst all the heat.

Yes I am aware of this. However this is a continuing sticking point - you appear to be conflating personal data (which GDPR covers) and data in general (which is explicitly out of GDPR’s scope.) I have indicated the difference in previous posts, and will do so once again. Here’s its definition:

“‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person”

So I don’t believe GDPR is forcing you to delete all the other non-personal data such as observations, identifications, etc. Data anonymisation/pseudonymisation (including obfuscation of data such as obs location) provide well-established solutions to the PII situation you find yourselves in. These are not novel - they are well-understood and already widely applied across the IT industry. And no, I Am Not A Lawyer but I have 30 years of experience in tech, data and avoiding getting sued.

Mea culpa. I liked its pithiness. :upside_down_face:

My opinion is on the basis that GDPR doesn’t apply if you fix the data in such a way that removes the personal information (as above, anonymise/pseudonymise/obfuscate). If you remove that legal basis, you are left with a principle that you have decided upon and will live up to it in terms of your own servers, and quite narrowly in terms of its effect. I suppose that’s fine, achievable and realistic although it doesn’t strike me as very useful, and it seems quite Pyrrhic in nature. Still, that’s up to you and I’ll stop banging on about it :man_shrugging:

iNat, not other users, is the arbiter of the functionality of the system, which is why I tend to focus my messages on iNat’s representatives. Also I assume other users are more than capable of positively or negatively responding to my posts to the iNat team. And if they do, I will respond. Granted, I have seen a couple of other users saying things like “we want to be able to delete our own data so we support the feature remaining” but a) I don’t consider that a good reason; b) my responses would probably not survive moderation, and c) I honestly can’t recall any others. But perhaps I just haven’t been looking hard enough. I will endeavour to do so in the future, to avoid further aspersions that I am singling out official iNat people for opprobrium.

Once again, both the links you provided were about personal data. As above, anonymise/pseudonymise/obfuscate the PII data, keep the non-PII data and the problem goes away. That straw man is certainly taking a beating today…

I agree - I don’t believe inconsistency is a good reason to remove the ability to delete. It was just an incidental observation.

As I have said here and in other threads I believe the main reasons why the data shouldn’t be deleted are because of the impacts on data integrity, data completeness and preservation of the body of knowledge.

Am I being super-altruistic? Not entirely although I do think those things are very important in terms of platform trust. I also know that some time in the future a deletion will result in yet another mess to clean up.

And yes you can argue that being able to delete data is a pro-trust issue too - I don’t envy the iNat product managers, but it’s good to provide them alternative perspectives, and perhaps some hard-won advice.

Thanks again!

6 Likes

This thread is starting to resemble the “Feature Requests” forum – you know, the forum where people suggest that their particular way of using iNat should be imposed on everyone.

1 Like

GDPR only applies to personal data - name, address, phone, email, etc. It doesn’t apply to the user’s observation data, or any identification data they have created. If you take the industry standard approach of personal data anonymisation (e.g. replacing “Russell Clarke” with “adlfkjhwdfsdalkjhdlfakj” in the database), you remove the personal data elements and GDPR ceases to apply, so you don’t need to use the exception for scientific research.
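A minimal sketch of that kind of replacement, in Python. The `UserRecord` fields and the `anonymise` function are hypothetical, invented for this example, and say nothing about how iNat’s database is actually laid out. Whether doing this is enough to take the remaining data out of GDPR’s scope is a legal question the code can’t answer.

```python
import secrets
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UserRecord:
    # Hypothetical columns for illustration; not iNaturalist's real schema.
    login: str
    name: str
    email: str


def anonymise(user: UserRecord) -> UserRecord:
    """Return a copy with the personal fields blanked or replaced.

    The token is random, not derived from the original values, and no
    mapping from token to original user is kept anywhere.
    """
    token = secrets.token_hex(12)  # 24 hexadecimal characters
    return replace(user, login=f"deleted_{token}", name="", email="")


original = UserRecord("russell", "Russell Clarke", "russell@example.org")
anon = anonymise(original)
print(anon.login)
```

Using a random token rather than a hash of the old login is a deliberate choice in this sketch: a hash could be recomputed by anyone who knows the original login, which would undo the anonymisation.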

Which jurisdictions, and for what purpose? If the data is altered such that GDPR no longer applies, is this still necessary?

That’s exactly my preference too.

Yep, I prefer your “a user who has since deleted their account” rather than having a random string on the UI and API although internally in the iNat code there may be some need for some of that placeholder data - depends how it’s written.

3 Likes

I love the Internet.

Respectfully, I wonder if there is a way for some of those who have already expressed their valid beliefs rather fully to step back for a moment, in order to provide space for anyone else who may have questions, thoughts or ideas to step forward.

For example, and taking no position, I wonder on a logistical basis whether “a user who has since deleted their account” would be far easier to code and take less time for Staff or Volunteers, and I wonder how @sedgequeen responds to that suggestion.

I also think @rupertclayton has an excellent point that individualized anonymous identifications have the potential to become non-anonymized, especially if data about which anonymous ID corresponds to which original user’s account is stored anywhere after deletion.

4 Likes