Questions re. use of iNaturalist data in research: GBIF exports, licenses of observations

Hello

We (myself & colleagues) are planning to use iNaturalist data in a research project. We are interested only in the observation data (locations, dates, taxon ID), not in the photos/sounds. The analysis involves thousands of observations.

We were planning to export the dataset through GBIF (as recommended here) but have realised there are substantially fewer Research Grade observations on GBIF than on iNaturalist. For example, RG observations of amphibians, as of now:

Questions:

  1. Is the difference simply because observations with more restrictive licences (than CC0, CC BY or CC BY-NC) are not exported to GBIF? Or is there some additional filtering happening from iNaturalist to GBIF (besides what is mentioned here)?

  2. What, in practice, does it mean for an observation to be licensed, in terms of implications for its use in research? In this thread, some suggested that raw data are not copyrightable, in which case any observation data posted on iNat can be used in research without restrictions. If on the other hand one assumes the licences mean something, then what? For example:

  • observations with a -BY licence (attribution): to give credit to the observers, would it be appropriate to include (in annex) a table with all observations used, linking to the iNaturalist URL and listing the observers?
  • observations with a -ND licence (no derivatives or adaptations): shall we conclude that they cannot be used in any data analyses (which can be seen as transformations of the observation data)?

Many thanks in advance.

8 Likes

For your first question, I think that yes, it is in large part due to observations with all rights reserved. See also this discussion https://forum.inaturalist.org/t/all-rights-reserved-observations/20390. The question of how to give credit is also addressed and it looks like it could be as simple as citing the DOI associated with a GBIF download.

1 Like

For your first question, I believe that the answer is yes, only those licenses are exported to GBIF and that accounts for the difference.

As far as an observation license, there isn’t 100% clarity here. It’s likely that the data underlying the observation would not qualify for copyright (as far as they are just facts). However, there could be other content associated with the observation that would (comments, etc.).

GBIF (and other entities) obviously believe that the license for the observation does mean something based on their usage of it. On the forum, folks have noted that some organizations won’t use data (or derivatives) from unlicensed sources with one example being conservation organizations or governments that won’t use single observations or reports created from unlicensed data (I think that’s what I remember).

Another reason not to use restrictively licensed observations is that the observers have basically stated their desire that their data shouldn’t be used without their express permission (depending on the license). So while it might be legally acceptable to do so, it isn’t necessarily nice (or ethical). (And yes, some people may not be aware of exactly how the license they chose will restrict usage of their observation data, but I don’t know that it’s fair to make that assumption for them.)

One potential downside to using these data anyways could be if use of data that isn’t licensed becomes widespread, it could discourage observers from using iNat, if they are afraid that their observations will be used when they didn’t give permission.

For observations whose license allows sharing with GBIF, citing them should be covered by the DOI that GBIF produces for a download. If you do want to use data that aren’t licensed for GBIF, you could always contact the observer directly by DM or email (if available) to request permission, but this obviously doesn’t work for hundreds of thousands of observations.
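For the attribution table suggested in the original question, here is a minimal Python sketch of one way it could be assembled. The record layout (`observer` and `url` keys) and the sample logins are hypothetical placeholders, not the actual export format from iNaturalist or GBIF.

```python
from collections import defaultdict


def attribution_table(observations):
    """Group observation URLs by observer for a supplementary credit table."""
    by_observer = defaultdict(list)
    for obs in observations:
        by_observer[obs["observer"]].append(obs["url"])
    # Sort observers and their URLs so the table is reproducible.
    return [(observer, sorted(urls)) for observer, urls in sorted(by_observer.items())]


# Hypothetical records: the keys, logins and observation IDs are placeholders.
records = [
    {"observer": "user_b", "url": "https://www.inaturalist.org/observations/2"},
    {"observer": "user_a", "url": "https://www.inaturalist.org/observations/3"},
    {"observer": "user_b", "url": "https://www.inaturalist.org/observations/1"},
]

table = attribution_table(records)
for observer, urls in table:
    # One row per observer: login, number of observations, links.
    print(observer, len(urls), "; ".join(urls))
```

A table like this would sit alongside the GBIF DOI rather than replace it.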

1 Like

As frousseu noted, it’s about the licensing, and the details can be found on GBIF.

Every GBIF dataset has an explanation of how the data were harvested. The GBIF harvest of iNat data is explained here https://www.gbif.org/dataset/50c9509d-22c7-4a22-a47d-8c48425ef4a7#description, and includes information about how to filter data through the iNat portal to create the same data set.
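If you already have an export in hand, a rough sketch of the same license filter in Python might look like the following. It covers only the two criteria mentioned in this thread (Research Grade plus a CC0, CC BY or CC BY-NC observation license); the GBIF dataset description is the authoritative list and may include further conditions. The field names `license_code` and `quality_grade`, and the lowercase license codes, are assumptions about how the records are laid out.

```python
# Licenses this thread says are exported to GBIF (codes are an assumed spelling).
GBIF_ELIGIBLE = {"cc0", "cc-by", "cc-by-nc"}


def split_by_gbif_eligibility(observations):
    """Split records into (eligible, excluded) by license and quality grade."""
    eligible, excluded = [], []
    for obs in observations:
        # A missing or empty license is treated as "all rights reserved".
        code = (obs.get("license_code") or "").lower()
        is_research_grade = obs.get("quality_grade") == "research"
        if code in GBIF_ELIGIBLE and is_research_grade:
            eligible.append(obs)
        else:
            excluded.append(obs)
    return eligible, excluded


# Hypothetical sample records.
sample = [
    {"id": 1, "license_code": "cc-by", "quality_grade": "research"},
    {"id": 2, "license_code": None, "quality_grade": "research"},
    {"id": 3, "license_code": "cc-by-nc-nd", "quality_grade": "research"},
    {"id": 4, "license_code": "cc0", "quality_grade": "needs_id"},
]

eligible, excluded = split_by_gbif_eligibility(sample)
print(len(eligible), "eligible;", len(excluded), "excluded")
```

Comparing the size of the two lists gives a quick check on how much of the iNaturalist vs GBIF gap is explained by licensing alone.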

3 Likes

Many thanks @frousseu , @cthawley and @pholroyd for your thoughtful replies, which answer my questions completely.

1 Like

Am I understanding correctly that if an observation is “all rights reserved”, I would need to get explicit permission from the user to include it in (e.g.) a database for a peer-reviewed paper? In other words, would I not be allowed to publish a link to an “all-rights-reserved” observation in a peer-reviewed article, even if I’m not transforming or manipulating the data in any way, without getting explicit permission from the user?

As an example, say I’m collating records of a specific insect feeding on different flowering plants, the goal being to provide a list of all known host records of this insect, including a literature review and a synthesis of community/citizen science records (like iNat). I would not be using any iNat pictures without explicit permission from the observer, but I would plan on including a table with a list of all the plant hosts and citations of either the literature reference or iNaturalist of these records (with a full list of iNat observations in a supplemental table); I wouldn’t be manipulating the data in any way otherwise, except possibly to identify one of the species (the plant or the insect) in the photo. Is this a violation of “all rights reserved” observations?

For the record the majority of users I’ve reached out to about things like this are very willing to allow their data to be used for scientific/research purposes, but in some cases users are no longer active and/or are not responsive to messages and comments, which is my main concern.

1 Like

While the exact legality is unclear, yes, I would argue that using observations whose licenses don’t permit that use, without asking explicit permission, would violate the spirit of their license (and potentially the letter of the law as well).

You could publish the link for sure, but not use the observation in analyses (or a new creation), etc. I think publishing the table would fall under this. It’s very unlikely that a user would try to enforce their copyright on observation data (and from what I’ve read, it’s unlikely they’d win if they did), but with the license they are saying they don’t want someone else to use that data.

I think the new blog post on photo licenses explains some of the potential issues well (even though it wasn’t written about the observation license): “licensing your photos with a CC license lets a scientist publish your photo in a paper describing a new species or a novel phenomenon. It also allows scientists to use your photos and your data in ways that probably don’t require a license but where the laws in various places are vague or inconsistent. Imagine having to understand the laws in the country of origin of every single person who made every single piece of data you plan to use in your research. CC licenses help scientists avoid those kinds of headaches.”

4 Likes

Thanks for the response. Those are restrictions that I assumed applied to the photos or sounds uploaded by a user (which I had no intention of using without explicit permission) but I have a hard time understanding how that also applies to data underlying an observation - especially if some of those data (like IDs of things that aren’t the main focus of the observation) aren’t provided by the observer, but other users of the platform (or even not provided at all in any way on the link, just through the photo itself). I guess if I look at an image of an insect (to use my example from earlier) and can identify the plant it’s sitting on, writing that down somewhere counts as making a derivative of it (or “changing” it)?

That renders a lot of observations on iNaturalist useless for research if the user is no longer active on the site.

Venturing into realms that I am less familiar with here, but it seems to me that if you are making your own conclusions from the photo (and not really using any of the other observation data per se) it should be fine, even with the license.

The observation license seems to cover things like the location info, date, other tags the observer makes, description, etc. So if you’re not using that information, I don’t think it would be an issue (probably, I’m definitely not an authority).

But I think it’s key to emphasize that this is a really gray area without a clear answer that I am aware of. I would suggest checking out the beginning of this thread if you haven’t seen it before.

A little way down @kueda notes “I should also point out that while photos are subject to copyright in most jurisdictions, observations may not be, as they represent facts about the world and not necessarily the kinds of creative works copyright was designed to protect (unless you write a description). On iNat we assume observations are copyrightable, but the only way to really test this is in court.”

So I agree that it’s likely that that observation data wouldn’t be eligible for copyright, BUT there isn’t a definitive answer on this given the complexity of the law (potentially in many different countries) right now. I do think an important point is respecting the wishes of users however. In that thread, the OP discusses how iRecord harvesting their observations (when they had explicitly set their license to prevent others using the OP’s info) was discouraging them from participating in iNat further. So I think the negative consequences of using people’s data when they’ve said they don’t want others to do so via their license choice are real.

4 Likes

I appreciate the in-depth reply and discussion!

“I do think an important point is respecting the wishes of users however.”

Agreed completely; I guess I had not thought about how saying what an observation contains could (theoretically) violate a user’s wish of absolutely no reproduction or derivatives (though again, still hard for me to reconcile that wish with posting something to a public site). I have little to no copyright law expertise though, so this has also made me rethink how I view observations and how I post my own observations (as in both cases I am hoping to make things useful for research, rather than being lost in the millions of observations posted).

1 Like

Yes, I originally had my own observation license set to the iNat default, but after convos with others on the forum I changed my observation license to CC-BY to be more “open” and useful to others. And I hope other users do too!

I also agree that I don’t personally get why people would post things online to a cit sci website and then restrict to All Rights Reserved. I think that a lot of users may do this unintentionally for their observation license and just set it to match their photo license which is a shame since it keeps a lot of useful data from going out to GBIF. Hopefully that can change over time.

3 Likes

On this note, see this feature request:
https://forum.inaturalist.org/t/provide-separate-checkboxes-for-media-and-observation-data-licensing-upon-sign-up/16404

(I know you probably already voted for it cthawley, I’m just trying to increase the visibility of the feature request)

1 Like

Many thanks to all, this discussion and associated threads have been an eye-opener on the issues of licensing. I am still puzzled that 40% of observations are licensed in such a way that excludes most uses in research, and I suspect most people did not intend it like that. I had myself not realised that by having photos with a CC-BY-NC licence I was excluding them from Wikimedia, which was never my intention. I selected Non-Commercial not because I need to protect any rights (I am not planning to make any money from this) but from a vague feeling of “I don’t want mean people making money from my lovely observations”, when in fact there are many good commercial ways of using these data (e.g. in field guides or environmental impact assessments).

So like @cthawley I just went back to my own observations/photos/sounds and changed the licence to CC-BY, so that they can be used not only in GBIF but also in Wikimedia.

The change can be easily done in: Account Settings → Content & Display → Licensing
(covering either just the new observations or changing the license in the existing ones)

6 Likes

With respect to citing an observation on iNat in a table as illustrating an example of an interaction, that should fall under “Fair Use” and/or is similar to citing a copyrighted book or journal article. Licensing and copyright primarily deal with redistribution or replication, not with simple citation. If you don’t feel confident about it, I suggest talking to a university reference librarian. Most are well-versed in fair use for scholarly work, or can direct you to more resources.

2 Likes

I agree, I think you can definitely cite observations. I think the issue with the license would be if you are making use of the data that go along with the observation (you’re reprinting coords/location/time, making a map, analyzing phenology using the date, whatever). I think this type of thing would violate the “spirit” of the license since you’d essentially be redistributing those data.

To be clear though, I don’t personally think those observation data are actually eligible for copyright since they are just facts (total layman’s opinion). My only real concern is users feeling that their wishes for their observations haven’t been respected and disengaging with iNat/the community.

There’s also a question of whether it would be most appropriate to let the user/s know that you are using their observations/data in a publication. I think most scientists have an expectation that anything they publish is fair game for others to build on as long as they cite, but the general public may not have this expectation. I’ve coauthored short notes with multiple iNat users based on their observations which has been a positive for both me and them, but this isn’t really feasible at the scale of a large table of observations, etc.

4 Likes
