iNaturalist data on GBIF shows only CC BY-NC (excluding CC0 and CC BY)

Wouter: some very practical implications of using NC as a default (in case the link hasn’t already been shared):

Kimpel (2013), linked from that page, is an excellent read.


And very much agree that the prevalence of NC is not an explicit choice of the majority of users, but the effect of the default at scale.


Thanks very much @radrat!
I would like to add one more resource while we’re at it: the 2011 paper by Hagedorn et al., “Creative Commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information”, which explains that NonCommercial is not what most people think it is, and is a greater obstacle than one may think.

I can’t find a working link to the Kimpel paper, by the way :(

Found a copy on

For me it started out as an explicit choice, but I removed it as soon as it became apparent that it was unknowingly restricting non-commercial use right along with commercial use.


This is true: the CC-BY-SA license is not compatible with the CC BY-NC or CC BY-ND licenses.

The CC-BY-SA license is the only license that effectively guarantees that your work will always remain free.

If I understand correctly, they are excluding the observations with the CC-BY-SA license for this work then… which is bad, because I think this is actually the best license to use: it’s the best at ensuring the data stays free and open.

If they are including CC-BY-SA then they are committing a copyright violation.

Is it correct for me to assume that you are excluding all records that use CC BY-SA and CC BY-ND then?

Why doesn’t GBIF just respect the individual licenses? This is clearly going against the spirit of the original licensors.

Even if it is legal, alienating or angering original copyright holders is never a good idea. This is the sort of thing that irritates me a lot, especially when people use the NC clause, one that is highly problematic for a long list of reasons.

This presumably won’t affect me because I’ve chosen the CC BY-SA license for all my contributions (for exactly this sort of reason.) But this is bad behavior that I would like to see GBIF do something about.

I also would encourage others to change to using the CC BY-SA license. It would prevent these sorts of charades if everyone used that license.

In case you missed the response, they’re actively working on fixing it:


I do respect your choice for CC BY-SA. Licenses are a personal choice and iNaturalist is great for supporting this personal choice.

Having said that, I disagree with your encouragement to use CC BY-SA, since CC-BY-SA also hinders scientific re-use, as you acknowledge by mentioning that CC-BY-SA observations are not included in GBIF.

Creative Commons makes total sense in the arts, where the composition of an art form is key. However, this is not the case on iNaturalist, where one deposits recordings of observations with the sole objective of getting them annotated by the iNaturalist community at large. Then the question becomes how you license those annotations. If I, a proponent of CC0, annotate one of your observations, you relicense my annotation as CC-BY-SA, which does not respect my choice of CC0 as the leading license: by annotating your CC-BY-SA observation, I have to share that annotation under your license. That is also why ND never makes sense: the ND part forbids derivatives, which an observation with annotations is.

But let us stick with your default choice of CC-BY-SA. I have indicated at iNaturalist that all my contributions should be CC0, and this also applies to my hypothetical identifications of your CC-BY-SA observations. So basically, by doing so I am breaching your requirement to apply CC-BY-SA to your observations.

Then again, there is the claim that one cannot apply copyright to facts. I have been trying to get this confirmed by different legal scholars, whose default answer is “it depends”, and they never go on to say what it depends on. The closest answer I got is that these things are only resolved through litigation.

Long story short, the available license choices are a big mess. That is exactly why I am releasing all my contributions on iNaturalist under the CC0 license. Maybe some will earn some money from my work, but the alternative is to litigate any breach of the license in court. For me that is not worth the effort.

I am in this game to have fun, not to empower a legal game.

Again, I don’t expect you to change to CC0; I am just objecting to your suggestion that everyone switch to CC-BY-SA. The issue with GBIF seems set to be solved soon, so we can all just use our preferred license, which is a policy I hope more citizen science platforms will adopt.


How so? This is not at all intuitive to me, and I would argue the opposite.

CC-BY-SA ensures that republishing of the material happens only in freely accessible work, which in the long-run advances science. I think science is greatly held back by closed-access journals.

And no one is preventing people from citing CC-BY-SA work in closed-access articles, or using it under any use considered fair use. They just can’t republish it.

All CC-BY-SA does is prevent people from directly republishing the full material. The data itself can always be used freely.

Data is not copyrightable (i.e. the pieces of data themselves, the direct information), at least not under US and EU law. The copyright is only for the observation itself, including the text of it, and the photos. And if it’s published on a website (as it is with iNaturalist), it can easily be cited by a link. iNat has good and simple permanent URL schemes to make this stuff really easy to cite and link to. And quotes are protected under fair use, so someone could easily reference a small excerpt, legally, where relevant.

I see zero need for a less restrictive license than CC-BY-SA, for “scientific” purposes. If I am wrong, please by all means explain how and why.

I think @andrawaag is referring to the problem of license stacking, which CC-BY-* licenses are subject to. Generally speaking, CC0 is the least restrictive license in the CC family, CC BY is slightly more restrictive, and CC BY-SA is a bit more restrictive still. So if you believe people should be able to remix and reuse biodiversity data with the least amount of legal headaches/restrictions, CC0 is the licensing option you should go for. Of course, your mileage may vary etc etc.


I’d argue that BY-SA prevents others from building on your work and then licensing the result under a different license (either more open or more closed). You would not be protected from what happened here (re-licensing) any more or less than with the other licenses, because re-licensing is either illegal or meaningless: a license builds on intellectual property rights, so it’s only you, as the holder of those rights, who can put a license on the work. The best-case scenario for GBIF here is that there was no IPR to be had anyway, so both the original and the changed licenses were meaningless. Either way, good that this is being solved. I think it is great that a forum post can lead to improved practices; kudos to all involved.

The reason that SA is not among the licenses GBIF accepts is that the SA part refers to derivative works. It is very hard to define what a derivative work of an observation would even be. (The photo has its own license; it’s not a part of the observation.) A larger data set or database would not be derivative of an observation in it, so using it in a scientific paper would not affect the license of that paper either. So even if a fact were copyrightable, the only practical effect would be some added confusion. Generally, more rules mean more potential for work not being reused, just to be on the safe side.

I personally like the idea of SA and other viral licenses, but not the consequences. It’s the most restrictive of the open licenses. CC0 says “go ahead, use this, you really won’t get in trouble” which is exactly what I want my data to say. Also, I want my data on GBIF so SA would never be an option for me here. (Side note: my pictures are CC BY, as receiving credit is not a right I can waive under Norwegian law. No one would get sued over not giving me credit, but I’m okay with reminding people in this way that giving credit is a good thing regardless.)

My own choices aside, I totally respect that others can have other goals with their data, and applaud anyone giving this thought and actually actively choosing a license :)


The license issue for the GBIF dataset has now been fixed!

Screenshot from 2020-02-19 17-51-23


Yeah, that’s right—thanks to a ‘hot-fix’ by our own @nvolik!

So, for instance, if you only want CC0 records from iNat and eBird, you can combine datasets and filter by licence:
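As a rough illustration of what combining datasets and filtering by licence might look like programmatically, here is a minimal Python sketch that only builds a query URL for GBIF's public occurrence search API. It assumes the `datasetKey` and `license` query parameters and the `CC0_1_0` licence code as I understand them from the GBIF API documentation; the function name and the two dataset keys are hypothetical placeholders, so look up the real iNaturalist and eBird dataset UUIDs on gbif.org before using it.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"

# Placeholder values (assumptions): replace with the real dataset UUIDs.
INAT_DATASET_KEY = "INAT-DATASET-KEY-PLACEHOLDER"
EBIRD_DATASET_KEY = "EBIRD-DATASET-KEY-PLACEHOLDER"


def build_license_query(dataset_keys, license_code="CC0_1_0", limit=20):
    """Return a search URL for records from the given datasets under one licence.

    `build_license_query` is a hypothetical helper, not part of any GBIF client.
    Repeating `datasetKey` asks for records from any of the listed datasets.
    """
    params = [("datasetKey", key) for key in dataset_keys]
    params.append(("license", license_code))
    params.append(("limit", str(limit)))
    return GBIF_OCCURRENCE_SEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the URL; fetching it is left to the reader's HTTP client of choice.
    print(build_license_query([INAT_DATASET_KEY, EBIRD_DATASET_KEY]))
```

The sketch deliberately stops at building the URL so it runs without network access; the same filter is also available through the search interface on the GBIF website.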

Thanks for the feedback. Feeling especially grateful for the work @nvolik and the rest of the Secretariat’s informatics team have done over the past 2-3 years to upgrade the backend infrastructure and make this technically feasible.

h/t as well to @pigeonspotters1888 for his analysis


I am really grateful for this solution, also that observations from eBird can be included now. Thanks to all involved.


By way of conclusion, my colleague @pigeonspotters1888 just shared a post detailing the findings of his analysis:


CC-BY-SA does allow derivative works, and the derivative works can be distributed under the same or any less-restrictive license. The derivative works can be included in a larger publication without requiring the larger publication to be licensed under CC-BY-SA.

All true, except for the “any less-restrictive license” part. It has to be the same license (or the compatible Free Art License 1.3 or the GPLv3 license).


