Platform(s): website, I think
URLs: mostly https://www.inaturalist.org/flags, but also any place where content can be flagged
Description of need:
As of now, content (in various contexts) can be flagged as
- copyright infringement
Also, some people upload the same content more than once. This causes many users to create flags for these duplicate observations. A quick perusal of the flags page shows that such duplicate flags for observations as a rule don’t get any follow-up. As of today, there are 142 pages of flags that match “dup”, 30 pages of which have been resolved. That’s 21%. In comparison, of all observation flags for any reason excluding “dup”, 81% have been resolved.
I sort of understand this, since for a “duplicate” flag, there is nothing for curators to do, and many observers don’t curate their own observations.
Feature request details: Create an additional category that allows users to flag observations explicitly as “duplicate”
This way, the process is structured. Users wouldn’t flag observations as “dupe”, “duplicado”, “repeat”, etc., and curators can then treat that flag in the same way spam and copyright flags are treated, namely filter against it, since there is nothing for them to do anyway.
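To make the difference concrete, here is a minimal sketch of filtering flags by free-text reason versus by a structured category. It is not how iNaturalist stores or filters flags. The `Flag` class, the `DUPLICATE_HINTS` list, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical free-text fragments used to guess at duplicate flags today.
DUPLICATE_HINTS = ("dup", "repeat")


@dataclass
class Flag:
    category: str  # e.g. "spam", "other", or the proposed "duplicate" (assumed names)
    reason: str    # free text typed by the flagger


def looks_like_duplicate(flag):
    """Heuristic: guess from the free-text reason of an "other" flag."""
    text = flag.reason.lower()
    return flag.category == "other" and any(hint in text for hint in DUPLICATE_HINTS)


def hide_categories(flags, hidden):
    """Structured filter: drop every flag whose category is in `hidden`."""
    return [flag for flag in flags if flag.category not in hidden]
```

The heuristic catches “dupe”, “duplicado”, and “repeat”, but misses a reason like “same photo as your other observation”. With a dedicated category, `hide_categories(flags, {"duplicate"})` needs no guessing at all.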
This is similar to an older request that I closed, but we approved this one because a) sadly the merge/split tool that I was hoping we’d get to is still on the drawing board and b) I do think this would help with filtering flags.
However, if implemented we would either have to carve out an exception for observations flagged as duplicates and not automatically make them casual grade, or officially make observations marked as duplicates casual grade.
This is the main reason I flag observations as duplicates, but I’m not sure how appropriate that is. Otherwise I agree that flagging them isn’t super helpful, because curators can’t do anything about them.
Am I the only one who doesn’t really understand why duplicate observations are a problem?
Removal of duplicates will reduce the pool of organisms to be identified, so volunteers identifying can focus on legitimate observations.
Should it be a flag, though? Is that heavy-handed, and could it mess up a legitimate duplicate observation? (like if someone wants to document a plant and a virus on the plant, but doesn’t get their identifications in fast enough?)
Often the observations flagged as duplicates have multiple individuals in the photo. Isn’t that perfectly valid, then?
It’s cheating with your data, and it’s wrong data: one specimen with specific markings gets recorded as multiple (up to ten or more times). It also disregards the work other observers do to avoid such accidents.
But if multiple people had been present and seen the same thing, they could have all uploaded it at the same place and time and there would have been no issue. So what’s the difference if one person does it?
An observation is what one person encountered. Multiple people can see one thing even without knowing that they do, so we can’t say that one uploads and another one doesn’t. I’m not a fan of big groups observing the same specimen, but it’s within the rules; duplicates are not.
Plus you can use an observation field to connect different observations by multiple users who saw the same specimen, which is useful for wintering birds, trees and so on.
Maybe @schoenitz can add a clarification, but I would assume this is intended to apply to observations made by the same observer of the same organism. Not to the same photo used by different observers, and not to the same photo used more than once for different organisms (e.g. a bee on a flower or a male and female together). Any observation flagged in error can have its flag resolved.
Yes, this would only work for one observer uploading multiple observations of the same individual at the same (approximate) time and place. This happens when someone uploads the same picture as multiple observations and also when users upload a series of pics of the same individual organism individually and not combined into one observation.
Multiple users uploading their own observations of the same individual organism in the same time and place is ok (if annoying for IDers). That shouldn’t be flagged.
One other potential issue that intersects here is the current difficulty in the interface of deleting pics from or adding pics to an existing observation (which has also been discussed on the forum, feature request here). This probably also contributes to @schoenitz’s observation that dupe flags aren’t addressed by users. Though to be fair, I think that there would still be a ton of dupes even if that process were easier.
I think that in general a flag for this would be a positive and would certainly help out IDers. One suggestion would be that flaggers could add the address or observation ID of the legit observation that has been duplicated to the flag or in a comment to help with understanding. My current practice is to post this info in a comment, but if a flag is available it would be better there. Having flagged observations go to casual to take them out of the ID pool would be my preference. Without that action, I think that the benefits of the proposal would be limited.
The flag should have a required field for the “legit” observation and, if there are any, the other duplicates.
No, this is strictly for separate observations that use the same identical picture. That is because this is the case that already gets flagged by the community in large numbers.
A discussion of closely related observation (same individual, place, time, but not literally the same picture) is a distraction for the purpose of this feature request. That situation rarely gets flagged.
Having said that, I also do not (as a rule) flag duplicates and if I do anything at all I leave a comment. I do not think duplicate observations are a problem that needs to be “dealt with”. I expect (judging from experience, not data) that the frequency of duplicates is comparable to human observations, or test observations of food, etc., and those are officially tolerated.
My motivation calling for this flagging option is so that I can uncheck that box on the flags page. I want to look at the flags that maybe I can help with. I don’t want to see the large number of flags that aren’t actionable.
Tony, thanks for linking that old thread. I had searched the forums but missed that somehow. It helps reading what has been discussed before.
I can see that perhaps instead of making a flagging option, the discussion could be what the official guidelines should be w/r/t flagging of duplicates. If duplicates weren’t being flagged by the community my problem would also go away.
Could you please change the title a little so that it mentions the same picture? In practice, 90%+ of duplicates are photos from the same series, uploaded by users who don’t know that it’s possible to add several photos to one observation, so this type of flag would be used differently than in the previous request.
An alternative approach would be to add duplicate to Data Quality. Flags often imply something is wrong (which is why many of us haven’t been flagging duplicates currently, although some do). By comparison, making observations casual via the DQA may be more fitting, since many duplicates are unintentional. On the other hand, flagging would draw more attention to duplicates and may do more to prevent them, so it may be ideal. Some previously also mentioned ways the system might be able to automatically detect identical images (although duplicates only include those uploaded by the same observer), which if possible may be the best approach (no manual effort needed) or a good one to combine with manual approaches. Finally, when there are two identical photos, are both considered duplicates, or only the later one?
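For what automatic detection of identical images could look like, here is a rough sketch under stated assumptions. It is not anything iNaturalist is known to do. The function name and the `(observation_id, observer, image_bytes)` input shape are made up for illustration. It only catches byte-identical files, so a resized or re-encoded copy of the same picture would not match.

```python
import hashlib
from collections import defaultdict


def find_identical_uploads(uploads):
    """Group observations that reuse a byte-identical image by the same observer.

    `uploads` is an iterable of (observation_id, observer, image_bytes) tuples
    in upload order (a hypothetical input shape). Returns a dict mapping
    (observer, sha256_hex) to the list of observation ids, keeping only
    groups that contain more than one observation.
    """
    groups = defaultdict(list)
    for observation_id, observer, image_bytes in uploads:
        digest = hashlib.sha256(image_bytes).hexdigest()
        # Keying on the observer too means the same file uploaded by two
        # different users is never grouped together.
        groups[(observer, digest)].append(observation_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Each group keeps its ids in upload order, so either answer to the last question is possible: treat the whole group as duplicates, or treat only the ids after the first one as duplicates.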
Those kinds of observations aren’t being flagged as “duplicates”. The feature request is about “duplicate” flags specifically, which as of now are in the “other” category.
They’re a “problem” in the sense that they’re usually caused by genuine mistakes, which we’d like to help users fix. A comment is often enough to achieve this - but when it isn’t, it’s helpful to the rest of the community to temporarily mark the observation as casual, so as to avoid unnecessary duplication of effort.
When multiple people upload the same organism, there is an opportunity to interact and engage with multiple people, and get them more interested in the natural world.
When one person does it, there is no such opportunity. There is just redundancy and drudgery.
And if they are going to become a regular iNat user, the sooner they learn not to burden the system with duplicates the better.