C2PA Content Credentials and media assets in iNaturalist

it looks like more and more companies are adopting C2PA Content Credentials to store metadata about image, video, audio, and other files in a tamper-evident way. among other things, Content Credentials can apparently capture information about the creator and the creation process (including, for example, whether something was created via AI). for image files, Content Credentials seem like a more modern and robust version of EXIF.

since it seems like more folks are starting to adopt this specification, i was curious to see how iNat handles Content Credentials, if at all. so i uploaded one of the image files from C2PA’s test file repository to the system, and it looks like the Content Credentials information from the file does not appear in the metadata shown on the iNat photos page (although EXIF information does). also, when i download the file from iNat, the Content Credentials have been stripped (just as EXIF is stripped).

(just for reference, there are web-based tools to inspect Content Credentials, such as https://contentcredentials.org/verify and https://contentauthenticity.adobe.com/inspect. there’s also a CLI tool that allows folks to inspect and add info to files: https://opensource.contentauthenticity.org/docs/c2patool .)
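
for anyone who wants to repeat the before/after comparison locally without one of those tools, here is a minimal python sketch. it assumes (going by how the C2PA spec describes JPEG embedding, not by anything iNat documents) that a JPEG carries its Content Credentials as JUMBF boxes inside APP11 segments, so it only reports whether such a segment is present. it does not parse the manifest or validate any signature, and the function name `has_jumbf_segment` is made up for this example.

```python
import struct

APP11 = 0xEB  # assumption: marker segment where C2PA/JUMBF data lives in a JPEG
SOS = 0xDA    # start of scan: compressed image data follows, stop looking
EOI = 0xD9    # end of image


def has_jumbf_segment(data: bytes) -> bool:
    """Return True if the JPEG bytes contain an APP11 segment with a 'jumb' box.

    Rough presence check only; it does not verify the credentials.
    """
    if data[:2] != b"\xff\xd8":  # not a JPEG (missing SOI marker)
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # lost sync with the marker structure, give up
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (SOS, EOI):
            return False  # reached image data or end without finding anything
        # segment length is big-endian and includes its own two bytes
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            return False  # malformed segment
        payload = data[pos + 4:pos + 2 + length]
        if marker == APP11 and b"jumb" in payload:
            return True
        pos += 2 + length
    return False
```

calling `has_jumbf_segment(open("photo.jpg", "rb").read())` on the original test file and then on the copy downloaded from iNat should give `True` and then `False` if the credentials really were stripped. c2patool or one of the verify sites is still needed to actually read or validate them.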

i can see there being privacy concerns related to recording and retaining Content Credentials, just as when dealing with EXIF metadata. but i wonder if there are some new applications of Content Credentials that would not have been possible with EXIF? for example, if AI-generated images include Content Credentials, then if the process information is captured when uploading such images to iNat, it might make it easier for folks to spot fake images loaded to the system. or it seems like Content Credentials can serve as a tamper evident alternative to watermarking (which i don’t think was possible in EXIF), although this function is defeated if Content Credentials are stripped. or maybe even if one were to purposely strip the original Content Credentials to preserve privacy, maybe new ones could be added back in such a way that downloaded images could link back to the observations or photo records in iNaturalist?

i’m not super familiar with this sort of stuff, but i wanted to put it out there, and i’d love to hear what others more knowledgeable about this stuff think about Content Credentials in the context of iNaturalist.

7 Likes

I have one knee-jerk reaction to the idea so far, and it is this: I automatically side-eye anything that’s supported by Getty Images* and / or Alamy. (Alamy wasn’t mentioned in the supporter scroll, but Getty was. Granger also has a questionable record, but they weren’t mentioned either.)

Getty Images has a history of claiming rights management for images that they don’t, and can’t possibly, own. They sell ‘licenses’ for their digital copies of images in the public domain (not illegal, just sleazy), along with offering a free service that allows you to embed an image, and blow your website data privacy wide open at the same time.

As far as iNat goes, I know that there are a whole lot of pro and semi-pro photographers who use this site, and also have personal sites from which they sell prints, or offer licensing. Content scraping is a constant danger, for which Creative Commons is only a partial remedy: people who just assume that an image is there for the taking usually aren’t deterred by any rights that the creator retains. (Some CC photos have been found on at least one of the big photo archive sites, and weren’t uploaded by the photographer, but I can’t find the links right now.)

Something that creates a digital passport, or chain of provenance, is a good idea; particularly if it can document alterations to the original image. On the other hand, I’m not certain that this is the right something. Anytime a company with such a well-documented pattern of bad behavior signs on to something that’s ostensibly a way to keep a visual artist’s work from being misused, someone at that company thinks that they’ve found a profitable way around it.

The ability to identify AI created and enhanced photos is much needed, but my reservations aren’t about that issue.

Honestly, I’d like to know more about the people behind the Content Credentials, not just who’s signed on to the idea.

*Addendum: (Since Getty and Shutterstock announced a merger earlier this year, Shutterstock is on the naughty list, too.)

2 Likes

from https://c2pa.org:

C2PA unifies the efforts of the Adobe-led Content Authenticity Initiative (CAI) which focuses on systems to provide context and history for digital media, and Project Origin, a Microsoft- and BBC-led initiative that tackles disinformation in the digital news ecosystem.

here’s some additional context from a research engineer at BBC: https://www.bbc.com/rd/blog/2024-03-c2pa-verification-news-journalism-credentials.

i think the C2PA spec will actually make it harder for images to be laundered through stock image sites, social media sites, etc. if image users prefer to use image files that they can trace all the way from camera/photographer through editing to distributor, then i think that cuts down on the revenue streams for launderers. similarly, if someone tries to launder a file by removing content credentials, and someone subsequently loads one with the original credentials, i think that makes it much easier to show that the one with the original credentials is the legitimate one.

although i don’t think it’s common, iNaturalist itself could be used to launder images, since it strips metadata (for privacy reasons). so i wonder if there’s a good way to preserve the provenance of images while still maintaining privacy? maybe in the future, iNat can offer folks a way to explicitly opt in to retain content credentials that provide authorship provenance but still remove geoprivacy related stuff? (i’m not sure how that works, or if it could even work like that.)

2 Likes

My initial reaction is “Ughh, yet another privacy headache I have to deal with”. I’ve spent countless hours figuring out how to strip EXIF/IPTC data from my photos, which you would think would be easy, but every camera, file format, and image editing program stores the EXIF data differently, which is a huge failure of the protocol (or maybe a failure to adhere to the protocol). I’m extremely thankful that iNat strips EXIF data (and apparently Content Credentials) for me and I wish more websites did that or at least provided the option. I don’t really see how Content Credentials would help iNaturalist, other than maybe helping to filter out AI-generated media. Commercial photographers are still going to watermark their images regardless. The main driver of this technology is commercial rights enforcement, which is fine, but I don’t think it really has much to offer for iNaturalist.

2 Likes

i think this is fair. privacy is something that C2PA in its current form doesn’t seem to handle very well. i’m not sure if there ever will be anything that perfectly balances privacy against openness / transparency (just as there’s not a perfect way to protect true coordinates recorded in iNaturalist), but late last year, Meta joined the C2PA steering committee, and i’m hopeful that since they run a few of the most important social networks out there, they have the incentive, resources, and expertise to figure out a reasonable way to make privacy protection a future part of the specification.

1 Like