Duplicates are a major problem for observers, identifiers, and project managers alike.
Every image on the site has a checksum.
Every image a user tries to upload also has a checksum.
If a user tries to upload a photo whose checksum is identical to that of an image already on the site, you could simply ask the user to double-check whether they’ve already uploaded this image.
This could prevent a whole lot of unnecessary duplicates.
Sounds like a great idea, as long as it is just a notification, and can be ignored when needed. For example there are sometimes instances where I have used the same exact image for multiple observations of different organisms (e.g., an insect on a flower).
I’d add that a link to the existing observation should be included in the warning message.
There are valid reasons for uploading the same image (one for a butterfly and one for the plant it’s on, for example), so a link provided would allow the user to determine if they want to continue with the upload or are making an unintentional duplicate.
[quote=“earthknight, post:15, topic:258”]
There are valid reasons for uploading the same image (one for a butterfly and one for the plant it’s on, for example),
[/quote] I do not think it is valid. In that case the image should be reused.
It might be months to years later and you’re uploading for different reasons. It’s not realistic to expect everyone to remember every single past photo they’ve taken, plus which ones might be suitable for duplication.
Making duplication easier is exactly why I suggested that, if this is implemented (which I support in a slightly different form), it is vital to provide a link to the observation that uses the same image. That way the user can duplicate the observation instead of uploading a new image.
I’m not sure how it works and if this is specific to Flickr links, but if I choose a photo to import from Flickr that I’ve already imported earlier, it tells me: “Heads up: this photo is already associated with an existing observation.” I thought that was a smart alert and it would be nice if this could be expanded from Flickr imports to manual photo uploads as well. I don’t know anything about the technical side of how to possibly make it work though.
An efficient way to implement it: compute the checksum (for instance, MD5) of every uploaded photo and store the checksums in a database. When the next photo is uploaded, compare its checksum with those already in the database; if there is a match, that photo has been uploaded already.
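To make the checksum idea concrete, here is a rough Python sketch. It is only an illustration, not how the site actually works: the table name `photo_checksums`, the helper functions, and the example URL are all made up for this post, and SQLite stands in for whatever database the site really uses. Storing the checksum as the primary key means the lookup is a single indexed query rather than a comparison against every stored checksum one by one, and keeping the observation URL next to it lets the warning link back to the existing observation.

```python
import hashlib
import sqlite3


def file_checksum(data: bytes) -> str:
    """Return the MD5 hex digest of the raw image bytes."""
    return hashlib.md5(data).hexdigest()


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical checksum index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS photo_checksums ("
        "checksum TEXT PRIMARY KEY, "        # primary key gives an indexed lookup
        "observation_url TEXT NOT NULL)"     # where the photo was first used
    )
    return conn


def find_duplicate(conn: sqlite3.Connection, data: bytes):
    """Return the URL of the observation already using this photo, or None."""
    row = conn.execute(
        "SELECT observation_url FROM photo_checksums WHERE checksum = ?",
        (file_checksum(data),),
    ).fetchone()
    return row[0] if row else None


def register_photo(conn: sqlite3.Connection, data: bytes, observation_url: str) -> None:
    """Record a newly uploaded photo; an existing checksum is left untouched."""
    conn.execute(
        "INSERT OR IGNORE INTO photo_checksums VALUES (?, ?)",
        (file_checksum(data), observation_url),
    )
    conn.commit()


# Example flow: warn, but let the user continue if the reuse is intentional.
conn = open_index()
photo = b"raw bytes of an uploaded image"
register_photo(conn, photo, "https://example.org/observations/1")

existing = find_duplicate(conn, photo)
if existing is not None:
    print(f"Heads up: this photo is already used in {existing}")
```

One caveat: a checksum only matches files that are byte-for-byte identical, so a resized, cropped, or re-saved copy of the same picture would not trigger the warning.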