Change account deletion functionality to allow account anonymisation and prevent deletion of IDs

danielmorton · July 9, 2024, 10:17pm

There has to be an algorithm that adds a bit of randomness to the queue. Something that bumps anything that falls below nth place back to the top with some small probability p. With n and p chosen by the engineer. Probably wouldn’t be hard to implement.
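A minimal sketch of that idea, assuming the "queue" is just an ordered list of observations. The function name `bump_queue` and its parameters are hypothetical and illustrate the suggestion only; they are not anything iNaturalist actually runs.

```python
import random

def bump_queue(queue, n, p, rng=None):
    """Return a new ordering in which each item ranked below position n
    is moved to the top with probability p (n and p chosen by the engineer)."""
    if rng is None:
        rng = random.Random()
    head = queue[:n]          # the first n items are never bumped
    bumped, rest = [], []
    for item in queue[n:]:
        # Each lower-ranked item gets an independent chance of being bumped.
        if rng.random() < p:
            bumped.append(item)
        else:
            rest.append(item)
    # Bumped items go first; everything else keeps its relative order.
    return bumped + head + rest
```

With `p = 0` the order is unchanged, and with `p = 1` everything below nth place moves to the front, so small values of `p` give the "bit of randomness" described above.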

Anyone who makes a habit of that will get caught pretty easily. (I hope.)

pisum · July 9, 2024, 10:52pm

there’s not really a queue for these observations to go to the front of.

you can find observations using any number of screens – the Explore screen, the Identify screen, the Project screen, etc., and each screen provides the user various options for finding observations. for example, a user X might use the Explore screen to search for family Cicadidae observations, ordered by update date descending; and another user Y might use the Identify screen to search for Insects, ordered by creation date descending.

suppose you have an observation that had the following identifications prior to the deletion of user D’s account:

A: Eastern Carpenter Bee
B: Swamp Cicada (disagreement)
C: Swamp Cicada
D: Swamp Cicada

prior to the deletion, the observation taxon would have been Swamp Cicada (Research Grade), and after the deletion, it would be Winged and Once-Winged Insects (Needs ID).
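The flip in this example can be reproduced with a simplified sketch, assuming the observation taxon is the most specific taxon that more than two thirds of the identifications support. The lineages below are abbreviated for illustration, and the function ignores explicit disagreements and other details of the real calculation, so treat it as an approximation only.

```python
from collections import Counter

# Abbreviated, illustrative lineages (root first); real lineages have more ranks.
LINEAGES = {
    "Swamp Cicada": [
        "Insects", "Winged and Once-Winged Insects", "Hemiptera", "Swamp Cicada"],
    "Eastern Carpenter Bee": [
        "Insects", "Winged and Once-Winged Insects", "Hymenoptera", "Eastern Carpenter Bee"],
}

def community_taxon(identifications):
    """Return the deepest taxon supported by more than 2/3 of identifications."""
    total = len(identifications)
    support = Counter()
    for taxon in identifications:
        # An ID of a species also counts as support for each of its ancestors.
        for ancestor in LINEAGES[taxon]:
            support[ancestor] += 1
    best, best_depth = None, -1
    for taxon in set(identifications):
        for depth, ancestor in enumerate(LINEAGES[taxon]):
            # Integer comparison avoids floating-point trouble at exactly 2/3.
            if support[ancestor] * 3 > total * 2 and depth > best_depth:
                best, best_depth = ancestor, depth
    return best

before = ["Eastern Carpenter Bee", "Swamp Cicada", "Swamp Cicada", "Swamp Cicada"]
after = before[:-1]  # user D's identification is gone
```

Here 3 of 4 identifications clear the two-thirds threshold for Swamp Cicada, while 2 of 3 do not, so the result falls back to the nearest taxon that all remaining identifications share.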

so in your proposal, does the observation after user D deletion appear at the front of user X’s Explore page search (it was recently updated, but it’s no longer a Cicada)? Does it appear at the front of user Y’s Identify page search (it’s an Insect, but it’s not recently created)?

my opinion is that it should not go at the front of either user X’s or user Y’s searches (unless user Y had already reviewed all needs ID Insect observations created after this observation) because it’s not what either X or Y asked for.

this is such a niche use case to create brand new functionality like this. for a case where a research grade observation reverted to needs ID (meaning that it would still likely include an identification of the pre-account-deletion observation ID), i don’t see how this is much different from a search for needs ID observations containing an identification of that taxon, ordered by update date descending (ident_taxon_id=000&order_by=update_date). even if there were other observations updated after the date a particular identifier’s account was deleted, all the observations associated with that particular identifier would generally be clumped together with respect to update date.
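As a rough sketch, that search could be assembled as a query string like this. `ident_taxon_id` and `order_by=update_date` are taken from the post as written; `quality_grade=needs_id`, `order=desc`, and the helper name are assumptions added for illustration, and `000` remains a placeholder taxon ID.

```python
from urllib.parse import urlencode

def needs_id_search_query(ident_taxon_id):
    """Build the query string for a needs-ID search on a given ID taxon."""
    params = {
        "quality_grade": "needs_id",       # assumption: restrict to needs ID
        "ident_taxon_id": ident_taxon_id,  # from the post (placeholder value)
        "order_by": "update_date",         # from the post
        "order": "desc",                   # assumption: most recent first
    }
    return urlencode(params)

query = needs_id_search_query("000")
```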

the problem isn’t that they don’t show up at the front of an identify queue. it’s that folks aren’t notified of these kinds of changes. the lack of notification is really what needs to be solved, and there are many ways to do this. for example:

taylorse · July 9, 2024, 11:07pm

That’s kind of what I meant, though. Most people are mostly going to use the default sort. So, for example, you could theoretically change the “date added” to the date when the confirming ID was deleted. Or you could change the “date updated”, although that would not get it before nearly as many people (because again, many people are not changing the default sort).

I’m not sure that notifying the person who created the observation is going to be helpful, because what are they supposed to do about it? What you need is to get the attention of someone who can replace the lost identification. If there were some way to search for that, it would help, but it would still be a limited group of identifiers who would be even aware that was a thing they should look for. (Still, it might at least make some people happier.)

pisum · July 9, 2024, 11:17pm

such a person will already see those kinds of records via any search that includes needs ID. if they don’t get to the formerly RG ones because they’ve already identified what they’re willing to spend effort on in that particular sitting, i’m not sure how putting those records at the front of the queue helps the overall situation.

cthawley · July 10, 2024, 12:41am

I disagree that all users have not given iNat (or anyone else) rights to their work in perpetuity. Some, indeed most, users certainly have - namely any users who have posted their content under one of the various CC licenses or as public domain. The CC licenses are perpetual. Anyone who accessed the user’s content when it was available under the CC license (which iNat certainly did as the user’s content was licensed as part of the process of posting it to iNat) is free to use the content in perpetuity under the terms of license they accessed it under.

This would be different for users who kept their content as “All Rights Reserved”. Though even then, they did grant iNat a license to post and host the content as outlined in the Terms:
“By submitting Content to iNaturalist for inclusion on the Platform, You grant iNaturalist a world-wide, royalty-free, and non-exclusive license to reproduce, modify, adapt, and publish the Content solely for the purpose of displaying, distributing, and promoting Your observations and journal via iNaturalist, and for the purpose of displaying or promoting the Content or iNaturalist itself in other venues, such as social media or software distribution platforms. We may repackage publicly available information associated with the Content in a machine-readable format for a handful of partners, including the Global Biodiversity Information Facility (“GBIF”) and the Amazon Web Services (“AWS”) Open Data Sponsorship Program, and others.”

Of course, the Terms also say
“If You delete Content, iNaturalist will use reasonable efforts to remove it from the Platform.” This is the only “internal” condition/statement (by my reading) that actually binds iNat to not host a user’s content in perpetuity. External factors such as GDPR or other laws may of course have an effect as well, but my limited understanding is that how and whether those laws would apply to iNat content is uncertain and “would need to be determined in litigation” which isn’t very satisfying.

danielmorton · July 10, 2024, 11:48am

And it’s enough. iNaturalist has contractually obliged itself to remove all data from deleted accounts. That would certainly be the argument of any irate ex-user who decided to sue if iNaturalist were to hold on to their data.

Unless iNaturalist changes that clause in the Terms, their hands are tied. By their own choice.

opisska · July 10, 2024, 2:31pm

I mean yes, but as you say, it’s by iNat’s own choice. Obviously, when one thing is changed, others have to follow for consistency - so I would implicitly assume that if any form of non-deletion of data is implemented, the Terms are changed accordingly.

I don’t want to make several posts, but I would like to come back to something you said in another reply:

“But the information on iNaturalist isn’t recorded physically.” - This is something I really disagree with. Making a difference between “physical” and “digital” record is hypocritical - the entire world is moving towards everything being digital, why should we suddenly treat this information differently? It’s simply a different technology accomplishing the same thing.

m_whitson · July 10, 2024, 4:24pm

Forgive me if this was covered earlier, but right now the focus of argument seems to be whether users should be allowed to totally delete their accounts. If the iNat policy when they signed up was “yes”, then that shouldn’t change. (Changing the policy for new users would be a different feature request).

BUT, I don’t see any problem with allowing users more options when they delete their accounts. It might be worth a try to see if a few options change behavior any, and if they do, maybe adding other, finer options after that.

People may delete accounts for different reasons (time management, loss of interest, angry at iNat, personal safety, trying to correct errors, etc.).

Options might be:
Close my account and…
Delete everything
Just delete my observations/journals/projects/comments but leave my IDs
Just delete my IDs but leave my observations/journals/projects/comments
Leave everything but anonymize it

Or something like that. It’s hard to know what exactly users want when they close their accounts if there’s only one option. If there were other options, then people might take advantage.

pisum · July 10, 2024, 6:59pm

you can’t delete the account but leave some things behind that are still tied to the account. it’s just not technically possible from a system perspective. because of licensing terms (BY = attribution), you can’t anonymize most observations and media. one could make the argument that things licensed as CC0 could be moved to an anonymous bucket, but realistically, because of the way data is shared / distributed, it’s not actually possible to fully anonymize things like observations. so conceptually, you can’t retain any observations anonymously.

so effectively, your choices are:

delete nothing (provide users a way to just inactivate their account rather than delete)
delete everything
delete everything, but strategically create other traces of the lost information in some cases. examples:
- replace lost IDs of consequence with generic comments describing the effect on the affected observations
- replace lost IDs with new IDs tied to a generic account that are dated as of the date the original IDs are lost
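The second example could look roughly like the sketch below. It assumes a hypothetical `Identification` record and a hypothetical generic account name; nothing here reflects iNaturalist's actual data model.

```python
from dataclasses import dataclass, replace
from datetime import date

GENERIC_USER = "generic account"  # hypothetical placeholder account

@dataclass(frozen=True)
class Identification:
    user: str
    taxon: str
    created_on: date

def reassign_on_deletion(identifications, deleted_user, deletion_date):
    """Swap the deleted user's IDs for new ones tied to a generic account,
    dated as of the day the originals were lost."""
    return [
        replace(ident, user=GENERIC_USER, created_on=deletion_date)
        if ident.user == deleted_user
        else ident
        for ident in identifications
    ]

ids = [
    Identification("C", "Swamp Cicada", date(2024, 1, 5)),
    Identification("D", "Swamp Cicada", date(2024, 2, 1)),
]
kept = reassign_on_deletion(ids, "D", date(2024, 7, 10))
```

The taxon survives, but neither the original author nor the original date does, which is the point of tying the replacement to a generic account.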

that’s it. if anyone is trying to argue for any other options, it’s sort of pointless because it won’t be done.

m_whitson · July 10, 2024, 7:29pm

That makes sense, then.
Good summary of options, and those might still be worthwhile for iNat to try offering.

jasonhernandez74 · July 10, 2024, 9:16pm

Consider what other platforms do. Whenever they change their TOS, they send out a notification: “By continuing to use our service, you agree to the new Terms of Use,” or some clause to that effect. Always accompanied by a link to the new TOS page. Your choices as a user are either to leave the platform or accept the change.

Of course, that doesn’t sit well with all users. I’m sure a lot of us have seen those social media posts that say words to the effect, “[Name of platform], you do NOT have my permission to share my images/content/data/whatever.” These posts are entirely without legal standing because you cannot, as a user, unilaterally accept only part of a platform’s stated Terms of Service.

ItsMeLucy · July 10, 2024, 9:32pm

I wonder if we aren’t missing a workflow option.

Would it be possible to store identification histories on Observations, with or without dates, so that at a downward click, a detailed hx would appear? Something like:

upload date: Bees (Anthophila)
(xx/xx/xxxx:) Stingless Bees (Meliponini)
(xx/xx/xxxx:) Pectoral Robust-Stingless bee (Scaptotrigona pectoralis)
Current date: Stingless Bees (Meliponini)

In the above instance, even if the user who provided the more detailed ID has left iNaturalist, so that the Observation has returned all the way back to Tribe: Meliponini, the history of the species having once been there is still available, and the Observer can now review leaderboards and tag in other experts and hopefully get someone to see that perhaps there is enough info within the Observation to make a species level identification.
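One way to picture this is an append-only list of the taxa an observation has carried, kept without dates or usernames. The `Observation` class and its fields below are hypothetical, a sketch of the idea rather than how iNaturalist stores anything.

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    taxon: str
    history: list = field(default_factory=list)

    def __post_init__(self):
        # The taxon at upload time is the first history entry.
        self.history.append(self.taxon)

    def set_taxon(self, taxon):
        """Record a change of observation taxon; the history is never trimmed."""
        if taxon != self.taxon:
            self.taxon = taxon
            self.history.append(taxon)

obs = Observation("Bees (Anthophila)")
obs.set_taxon("Stingless Bees (Meliponini)")
obs.set_taxon("Pectoral Robust-Stingless bee (Scaptotrigona pectoralis)")
# The species-level identifier leaves, so the taxon falls back to the tribe.
obs.set_taxon("Stingless Bees (Meliponini)")
```

After the fallback, `obs.history` still contains the species-level entry even though `obs.taxon` is back at the tribe.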

The above is simplified, because as I understand it, it is the loss of the “what was it?” that is frustrating, not the date it was named something or the who specifically named it. Those things don’t matter, correct? (I am married to a UNIX Engineer and a smidge of what he says about keeping things simple does get through.)

If I have misunderstood what people wish to retain from those departing, I apologize.

danielmorton · July 10, 2024, 10:23pm

How well would such a change to the TOS be received? iNaturalist isn’t a shady actor like Google or Facebook and they don’t change TOS’s very often. (At least I can’t remember them doing so in the last four years.) Any change would have to be open and above board with ample time for users to jump ship. How many would, taking their data with them, one wonders.

pisum · July 10, 2024, 11:00pm

although storing a history of the observation-level id would provide information in situations where identifications were deleted for some reason, i’m not sure this is the path that leads to “keeping things simple”.

storing sort of an alternate record of identifications would be expensive in terms of both storage and ongoing performance, since each new or deleted identification would have to write out (or delete) an identification record, recalculate the observation id, and then – new in your proposed workflow – potentially write the new observation id history record.

most of the time, people would never look at that history. so it’s a very minor benefit for such a major implementation.

it also doesn’t solve the lack of notification issue when IDs are deleted.

that’s why i think the simplest solution is still just:

ItsMeLucy · July 10, 2024, 11:18pm

Sorry, I missed that you brought this up. I cannot see where anyone else did, though? So I am not sure how widespread this concern would be, especially if the previous information were viewable in a drop-down hx.

If people noted an Observation reverted to Needs ID, they could look and see if it remained at species and just needed another ID or if it needed to be re-identified to species. (Some people complain they get too many notifications, they get lost as is, so I am sort of of the mindset we cannot solve all the problems anyway.)

My husband threw out that it would be a simple code so I was going by what he said. He does significant coding so his use of “simple” may be erroneous in this case. I took his word for it, which may have been poor practice on my part.

I am always concerned about storage too, but I also wonder if it might not be time to clean out some of the years old, one day only accounts with zero observations that predate the “you must confirm your account by email” thing a year or two back. (There are quite a lot.)

Anyway, that was my idea, which ended with an apology in case I had misunderstood. Clearly I had. I’ll go back to lurking now.

pisum · July 10, 2024, 11:44pm

in the original post, it’s a bit indirect, but there is mention of lack of notification of change:

…

the core code that would drive your proposed workflow would not be difficult to code, but it has significant consequences and prerequisites. so the overall impact is significant.

we’re in agreement that we should “keep things simple”, but how to do that is often the hardest path to discover.

arboretum_amy · July 11, 2024, 1:23am

I apologize that this is slightly off topic, but if anyone wishes to compare iNat policy to that of Discourse (the software this forum runs on), see
https://forum.inaturalist.org/t/can-an-inatforum-account-be-deleted/1529/30
And
https://meta.discourse.org/t/anonymizing-users-in-discourse/86929

Interestingly what they have to say about Europe’s Right to Erasure is that administrators should ask a legal professional whether they are meeting the requirements.

lappelbaum · August 3, 2024, 7:45pm

An alternate idea I don’t think has been mentioned yet: if you personally think you would feel a loss when IDs are removed because you will no longer know what species that person put as an ID, you can take steps to avoid this becoming a problem in the future. For example, if someone gives an ID of Yellow-Banded Leptosteges (Leptosteges flavicostella) to a moth you observed* but you don’t know if that is correct, you can add a higher-level ID that you are comfortable with (if you haven’t already), such as Pyralid and Crambid Snout Moths, and add a comment saying “I don’t know if this is Leptosteges flavicostella” or “I can’t confidently agree with the ID of Leptosteges flavicostella.”

*or one in a project you curate or a taxon group you are interested in, etc.

OR
If the ID that might be deleted is the community ID for the observation, you could export (download CSV) for that group of observations and that would allow you to save those IDs for your records. I think you can only do this for your own observations or those in traditional projects. Correct me if I’m wrong.

maxkirsch · August 4, 2024, 1:09am

I don’t think it’s been mentioned here yet, but regarding the possibility of iNat keeping information created by deleted accounts in an anonymized form, it’s probably worth noting that iNat already does that for a few content types, including flags (the flagger’s username is replaced by “deleted user”, but the flag and its reason remain) and anything recorded in taxon history.

thebeachcomber · December 26, 2024, 3:01am

Are there any updates on this, @tiwane? I just discovered that another high-volume identifier with thousands of IDs has deleted their account in the last couple of weeks, causing more chaos with lost IDs. I still strongly think doing something about this is a massive priority, literally anything other than maintaining the status quo.
