I agree, this is subjective and I never felt 100% comfortable with it. This was language I was advised to use by a member of the Swiss Digital Law Center. I suppose there is case law with examples of instances where a judge or jury had to decide whether or not something was art, which led to this “standard”.
While I agree that your usage of iNat images for neural net training likely falls under fair use in this case (as the usage is ephemeral, does not involve long-term storage or reproduction, and is for educational use), I’m very suspicious of the assertion that most iNat photos are not creative works protected by copyright in the first place.
Switzerland is party to the Berne Convention, which, in addition to specifying that each signatory state must afford works from other states the same protection as works by its own nationals, also specifies minimum standards for the protection of works (e.g. a minimum term of protection of 25 years for photographs and applied art).
It also defines what constitutes a “work” (emphasis mine):
(1) The expression “literary and artistic works” shall include every production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression, such as books, pamphlets and other writings; lectures, addresses, sermons and other works of the same nature; dramatic or dramatico-musical works; choreographic works and entertainments in dumb show; musical compositions with or without words; cinematographic works to which are assimilated works expressed by a process analogous to cinematography; works of drawing, painting, architecture, sculpture, engraving and lithography; photographic works to which are assimilated works expressed by a process analogous to photography; works of applied art; illustrations, maps, plans, sketches and three-dimensional works relative to geography, topography, architecture or science.
— The Berne Convention, Article 2 Paragraph 1
The threshold of originality you’re likely referring to when talking about “individual character” and “particular choices about the lighting, angle, or timing” does indeed exist, but as far as I know it has mostly been used to argue that photos produced by continuous or automated photography (e.g. CCTV cameras, trail cams, etc.) aren’t protected.
As far as I know, in every case where a human is responsible for pressing the shutter, the resulting image is recognised under the Berne Convention as a photographic work, and afforded all the protections which come along with that.
I am not a lawyer, and usually I would say trust the opinion of your contact at the Swiss Digital Law Center, but I’m extremely suspicious about their conclusion and would personally go looking for some second opinions if I thought the usage didn’t clearly fall under fair use!
Linking to them isn’t allowed, and they’re hidden/deleted now anyway, but you could imagine they might have looked something like one of these:
Ok that fishing duck is amazing
Grebes aren’t ducks.
You see? This is exactly the kind of thing I was referring to a month ago when I said:
I entered the prompt “iNaturalist Theronia atalantae” into Stable Diffusion online. The results looked nothing like Theronia atalantae. The AI did generate insect-like images but they were clearly not of this world. Even a simple “Mallard” prompt produced some odd-looking ducks.
It seems that, at least for Stable Diffusion, the algorithm is not intended to create/reproduce accurate representations of existing creatures.
there’s a page that allows you to search the training set used by Stable Diffusion. here are the results from 2 of 3 available sets for ‘leucomonia bethia’, which i think will shed light on why your generated images look the way they do:
there’s information about Stable Diffusion’s training here: https://github.com/CompVis/stable-diffusion/blob/main/Stable_Diffusion_v1_Model_Card.md#training.
just from my very limited poking at it, it doesn’t look to me like iNat’s observation dataset was used as part of the data set, but i didn’t actually try to analyze the set because it would be billions of records to plow through.
Thanks for your insight. Our project has concluded, but I’ll revisit this threshold with my colleagues the next time it comes up. I appreciate the context re: the Berne Convention.
I didn’t comment on this earlier because I was just flabbergasted by this assumption. When identifying stuff, I routinely come across images taken by professional photographers who use iNaturalist to figure out what they’ve photographed. Two in particular come to mind who are posting images mainly of particular taxa as they are working on putting together a field guide to be published once they’ve got them all covered. It’s not just all amateurs. Nature photography is a career choice for some and an art that takes a certain amount of patience and skill. That’s why there are so many nature photography contests out there. So to justify fair use, I would definitely look for other types of arguments and certainly pay attention to image licenses.
Ahhhh I think I figured out how the question of ‘individual character’ and ‘documentary in nature’ is relevant to copyright infringement of inat observations! But not in the way it is described as being applied:
Case 1: I post an observation of a plant with a specific sequence of shots in a specific order (say, shot of plant, wider context shot, close up of flowers, close up of leaves, underside of inflorescence). Another user posts a picture of the exact same kind of plant with the same sequence of shots in the same order. Because my shot sequence is documentary in nature and lacks individual character, and because they didn’t steal my actual photos, I probably can’t sue them for violating copyright on my shot sequence.
Case 2: A professional photographer posts an album which is artistic in nature and has a highly distinctive and unusual sequence of shots, locations, and editing style. A second photographer posts their own album with a virtually identical sequence of shots, locations, and editing style. The first photographer probably can sue them for violating the copyright on their shot sequence, because it does have an individual character, and is not primarily documentary in nature, even though the specific individual photographs are not being stolen.
Whether the ‘idea’ of a photo is distinctive enough to be copyrighted on its own is not relevant in the case of scraping photos, because we are talking about a use of the actual specific photos which for sure are protected by copyright, not about use of the idea of the photos, which is a fuzzy distinction that you’d have to determine on a case-by-case basis.
In terms of copyright, reproducing is very different to “stealing”.
I am not stealing 10% of the actual Mona Lisa, I am reproducing it.
What would you be comfortable with? 1%? 0.1%?
Someone only copying Mona Lisa’s eyes? A single pixel of a digital reproduction?
There has to be a threshold at which it becomes arbitrary.
If I use 1000 images of zebras to generate 1 new one using a neural network, is this so different from me as an artist drawing a zebra from memory using the 1000 images of zebras I’ve seen over my lifetime?
Everything an artist ever produces is to some extent copying a % of the work of the people they have seen before them. We do not exist in a mental vacuum, free from external influence.
So yes, to me it makes sense that reproducing some % of an image is fine by law.
I agree there need to be better legal protections to cover the current shift with AI though.
I agree. What’s even the point of uploading AI images in the first place? This completely destroys the actual data on the entire platform. I’m hoping someone was just trying to see how accurately AI can generate different species without realizing that the AI images are detrimental to the data collected. Let’s hope this is not something that becomes commonplace. I love studying bumblebees and I would be very disappointed if people started adding endangered ones that are actually just AI generated. It defeats the whole purpose of this great app. It would be great if there was an AI button that could be used whenever someone uploads an AI image, much like the Captive/Cultivated one. (As long as the people using AI were adamant about marking them.) That being said, I still don’t agree with using AI on this platform, but I don’t believe that it will stop anytime soon.
There’s no need for an AI button because it is just the existing ‘no evidence of organism’ flag. Unless you mean AI image generators should be required to use a nearly undetectable and difficult to destroy watermark, something like https://www.digimarc.com/products/brand-protection (such watermarks are, for example, how they make Photoshop refuse to let you edit pictures or scans of high-denomination US currency). I could be on board with making something like that either a regulatory requirement or industry standard.
We didn’t assume that all iNat photos were taken by amateurs, we assumed that most were. I know there are exceptions. FWIW, in my area of taxon expertise (snakes), it’s very rare that I come across a photo that has been taken by a professional nature photographer (probably <1/100) and those photos almost universally have ‘All rights reserved’ licenses (so we didn’t use them for our project).
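To make the licence-based exclusion concrete, here is a minimal sketch of that kind of filter. It is not the code we actually used; the record layout, the `license_code` field name, the licence strings, and the choice of which licences count as usable are all assumptions made for illustration.

```python
# Hypothetical photo records. The "license_code" field and its values are
# assumptions for this sketch, not a confirmed export format.
ALLOWED_LICENSES = {"cc0", "cc-by", "cc-by-nc"}  # assumed set of usable licences


def usable_photos(photos, allowed=ALLOWED_LICENSES):
    """Keep only photos whose licence code is in the allowed set.

    A missing or empty licence code is treated as 'All rights reserved'
    and is therefore excluded.
    """
    return [
        p for p in photos
        if (p.get("license_code") or "").lower() in allowed
    ]


photos = [
    {"id": 1, "license_code": "cc-by-nc"},
    {"id": 2, "license_code": None},      # no licence: treated as all rights reserved
    {"id": 3, "license_code": "cc0"},
    {"id": 4, "license_code": "CC-BY"},   # case differences are normalised
]

kept_ids = [p["id"] for p in usable_photos(photos)]
print(kept_ids)
```

Treating a missing licence as ‘All rights reserved’ is the conservative choice here: a photo is only kept when it carries an explicit licence from the allowed set.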
The project, I assume, is fine as long as you’re only using photos that are licensed under Creative Commons (especially if it’s not for profit, which it seems to be).
But I’m also absolutely flabbergasted by the assertion that ‘individual character’ matters at all when it comes to the individual copyright of photographs. The only thing that matters is what the photo is licensed as, not a group trying to determine if a photo is professional or not.
EDIT: like, this sort of justification is how we’re getting AI art generators just carte blanche stealing people’s art for their algorithms.
hey y’all, i suspect this isn’t really the place (yet) to debate the ethics / appropriateness of using any particular set of photos for generative AI purposes. as far as i can tell, very few people really understand the mechanics of how generative AI works, even in broad strokes, which is somewhat important to even beginning to have a full discussion.
moreover, the tech has advanced well beyond the existing moral understanding and legal codes, and the debate in this space has only really just begun. i’m thinking that joining conversations in other forums (ex. legislatures, courts, etc.) will provide much more satisfying debate and impactful results, should you choose to continue the debate.
Honestly, you’re probably correct
For a relevant discussion about copyright, see Stable Diffusion having been asked 93,000 times (as of that article’s writing) to generate the work of one specific artist: https://www.technologyreview.com/2022/09/16/1059598/this-artist-is-dominating-ai-generated-art-and-hes-not-happy-about-it/
I’m not sure how a useful conversation can be had if restricted only to people who understand how it works technically. My understanding of how it works is, in detail, quite poor, but I would guess it is nevertheless in approximately the 95th percentile. Actual policy and legal decisions will be made almost exclusively by people with almost no technical understanding; I doubt most judges have the faintest technical clue how email works.
I re-read my original post and realized that I should have written “we were advised” instead of “we reasoned that” because I was also surprised to learn that this was relevant. I want to emphasize that to the best of my knowledge, this standard is only applied in Swiss copyright law, and it’s only one of several standards that might be applied, the others being the ones I listed in my original post.