Pros and cons of iNat regional partnerships? (The CalBioDEx test project)

Fact sheet: https://www.calacademy.org/sites/default/files/assets/docs/cbcs_california_biodiversity_data_exchange_flyer_260109.pdf

Tony mentioned CalBioDEx (the California Biodiversity Data Exchange) project. The fact sheet linked above states that this is “A first-of-its-kind partnership to make iNaturalist data more available to California state entities and to serve as a model for other states”.

It looks like iNaturalist is using this as a test project to see how it works to integrate directly with FLADs (First-Level Administrative Divisions, which are entities like States, Provinces, Departments, and Autonomous Regions, each with its own government). California is a great place to test it IMO, since it is the most biodiverse state in the USA.

Many of the pros of this approach are already listed in the fact sheet. But looking at the cons, I can imagine there being a high administrative burden for all parties if dozens of FLADs around the world eventually want their own data exchanges.

What other pros, or cons, or perspectives, did I miss?

In general I’m supportive of better coordination with state governments and similar entities. I do suspect this initiative may raise a few concerns (some hypothetical at this time). Some questions that come to mind:

  1. The CalBioDEx announcement mentions there will be a “forum to discuss important topics regarding community-derived California biodiversity data, such as issues related to data sensitivity/obscuration, data sharing, taxonomy, and expert engagement through data curation and identification communities.” Is this forum public? Or is it only available to project participants?
  2. What non-public data is being shared with California Department of Fish and Wildlife and California Academy of Sciences? (I think the only additional data being shared are non-public observation locations, but I’m not sure from the announcement.)
  3. Are obscured locations only being shared when this is due to taxon geoprivacy, or is iNat also sharing locations where observers individually chose the “Obscured” or “Private” setting? (If individually obscured/private locations are shared, the privacy concerns become a lot more complex.)
  4. How does this data sharing interact with the iNat user agreement and the general expectation on iNat that users retain some control over what they share?
  5. What commitments have CDFW and CAS made to manage and protect this data?
  6. Currently, iNat automatically obscures the location of many species with Near Threatened or higher protected status. Only a small portion of these actually face threats (such as poaching) that are exacerbated by precise location data. Opening taxon geoprivacy where there is no poaching threat would provide valuable data to all researchers, not just state agencies. Is there a risk that an initiative such as CalBioDEx may reduce interest in “fixing” taxon geoprivacy more generally?
  7. Conservation commitment by state-level governments varies widely. The announcement implies that the CalBioDEx initiative may be extended to other U.S. states and presumably to other government entities worldwide. Does iNat plan to limit data sharing to those governments that agree to certain principles? For example, taxon geoprivacy for the Gray Wolf (Canis lupus) is open in the U.S. (due to federal delisting) but is obscured in both California and Wyoming where the species is ranked as Critically Imperiled (S1). Wyoming already allows Gray Wolf hunting (despite the listing status). Would iNat enter into a similar data sharing partnership with Wyoming Game and Fish Department? Would iNat have a problem if that data was used to allocate wolf hunting permits? How about if recent wolf observations were used to direct trophy hunters to prime locations?

That seems like plenty to get the discussion started!

I don’t have any iNat observations in California or the US. I do have observations in countries where I don’t want to share the very few coordinates I obscured (by hand) with state authorities. I don’t believe that having a job in a corrupt government or state agency will make somebody a trustworthy person.

You will have to rely on the trustworthiness of organizations on an individual basis. I agree, not every country’s official “nature” departments can be trusted. Even those which used to be trustworthy can change when new administrations come in, or when infiltrated by corrupt people.

Even so, I trend towards trust and data freedom. Overall, the sort of people who accept the low wages, sweaty work, and lack of prestige of a job in a government natural-resources agency are those who love nature and want to preserve its resources. There’s no real reason for poachers to sign up for these jobs, and who else is gonna exploit access to the data?

In many countries government jobs are exceptionally well paid, and the ministry of environment and other positions in this field of work go to political friends irrespective of knowledge, competence and motivation.

It’s a developed-country presumption that jobs in environment and conservation are filled with motivated and competent people. In the developing world these jobs are connected to big money from abroad, so political friends of the government are shoved in, even if they don’t have the tiniest clue.

Sure there might be a good apple in a barrel of bad ones… you just have to search long enough.

The only difference between collectors and poachers is the governmental permit. I have myself observed officials collecting 2 of what we then thought were the last 4 individuals of a species. Gladly, I found more individuals afterwards, but as you might guess, I no longer showed them to officials.

I should emphasize that at this point we do not know that iNat plans to share location data for observations where the observer chose to obscure the location or make it private. In fact, I would expect that sharing precise locations applies only to “taxon geoprivacy” (i.e. observations automatically obscured because the species is sensitive) and not to what we might call “personal geoprivacy”. The latter capability is often suggested as a way to preserve location privacy for people concerned about sharing the location of their home or work addresses, so I would be surprised if iNat felt able to share those location data in any way.

But we don’t know for sure that iNat is making this distinction, which is why I asked the question. Also, at this time California is the only place where iNat plans to share data with state authorities, although the announcement strongly implies this is intended as a test that could be expanded to other U.S. states, and potentially to other countries.

Concerns like these make clear why iNat should provide more clarity on what data are being shared, who will have access, how they will protect the data, and more generally on what criteria will be used to decide which government agencies qualify to receive data.

To me, the best framework might be to require data recipients to sign an enforceable agreement to adhere to strong principles for how the data are used (not shared beyond signatories, only used to protect the species concerned), and to limit the data sharing to taxon geoprivacy, with personal geoprivacy remaining unchanged. That way, any iNat user can still make a personal choice to obscure locations for particular observations.

Yes, as far as I know, iNat staff have stated that they will not share the true locations of observations with user-chosen geoprivacy with scientific bodies, only those with taxon geoprivacy.

The statement that “California is the only place where iNat plans to share data with state authorities” is actually incorrect. iNat already shares the true locations of observations with taxon geoprivacy with its portals in the iNaturalist Network, see:

https://help.inaturalist.org/en/support/solutions/articles/151000171181-what-s-the-inaturalist-network-which-site-should-i-affiliate-with-

Users can choose to affiliate with those portals to additionally give them access to their observations with user-selected geoprivacy (much like trusting a project/other user), but don’t need to.

If you have a source, I’d be interested to read that. I didn’t see this distinction clearly stated in the announcement. Given the community alarm about the Google AI grant, I was somewhat expecting iNat staff would over-communicate around future initiatives. (I appreciate that this one is not the same thing by a long shot.)

Thanks for making this clear. I probably should have written “California is the only place where iNat plans to share obscured location data with state authorities without requiring the observer to give their approval.”

I’ll say that I think I’m still in favor of this initiative, but I would like iNat to state more clearly exactly what will be shared and how it will be protected.

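I still don’t think the revised statement (“California is the only place where iNat plans to share obscured location data with state authorities without requiring the observer to give their approval”) is correct.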

If you read the Help Page I linked above it says:

“They [network sites] have access to true coordinates within their geographic areas that are automatically obscured from public view in order to better protect threatened species.”

Those sites have access to all true locations for observations obscured via taxon geoprivacy regardless of observer approval.

I guess I was interpreting the access to the true coordinates by iNaturalist Network sites as optional for the user because they don’t have to choose to affiliate with a particular iNaturalist Network site. Prior to CalBioDEx, anyone could set up their account at inaturalist.org and retain full control over observation geoprivacy (to the extent available).

Retaining full control is only the case for observations with user-selected geoprivacy, not taxon geoprivacy. My understanding based on that article is that the networks can access all true observation coordinates for observations with taxon geoprivacy in their geographic area regardless of whether users have joined the specific network or not. Maybe @thebeachcomber can confirm?

Yes, this is correct. So if, e.g., a tourist from the US visits Australia and uploads iNat records from here, the true coordinates of their taxon geoprivacy observations will be available for researchers even though they haven’t joined iNatAU.

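But the point about a portal gaining access to observations with user-selected geoprivacy when someone affiliates with it is not correct, at least for Australia. We do not give out the true coordinates for user-selected geoprivacy at all, regardless of whether someone has joined the network or not. Only observations affected by taxon geoprivacy are available.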

Hi! Thank you all for being interested in the California Biodiversity Data Exchange (CalBioDEx) and in our goals. I am going to answer your concerns below, but as I haven’t been part of this community forum before, I wanted to say hi! I am the facilitator and data manager for this collaboration, and am very excited to be part of something that allows Californians on iNaturalist to participate in species conservation at the state level.

iNaturalist is now one of the largest modern data sources for biodiversity data thanks to all of us observing and identifying, and we really want the people charged with protecting our T&E (threatened and endangered) species in California to have the most up-to-date information they can, so that they are able to make the best decisions for the species that they are charged to protect.

The CalBioDEx is not a test done on a whim. It is a pilot scheme that was born from years of discussion between the partners involved, about how biodiversity conservation cannot work at scale without the value of participatory science involvement, and especially how Natural Heritage programs such as the CNDDB (California Natural Diversity Database, managed by the California Department of Fish and Wildlife, CDFW) could better serve their function with more timely data ingestion. California is the only state in the US to do this so far and it’s a great place to work through many of the complexities of this type of partnership, because it is such a biodiverse area, and has an enormous number of iNat observations!

It is similar to how country-level partnerships, such as iNaturalist.ca, already work as iNaturalist Networks. However, because there is currently no place where an iNat user can choose to opt in to associate with us, the sharing is limited to taxon geoprivacy. I think of the CalBioDEx as a “lite” version of being a network partner!

The California Biodiversity Data Exchange collaboration consists of staff of iNat, CDFW and California Academy of Sciences. We meet quarterly to discuss all kinds of topics of relevance to the group’s goals, not necessarily just about iNaturalist data. There is no separate public forum comparable to this iNaturalist forum.

This is correct: the only non-public data that Cal Academy receives from iNaturalist is the true coordinates of observations in California that have been obscured under taxon geoprivacy. Twice a year I, as data manager for the collaboration, receive a tabular Darwin Core Archive dataset of observation records. No image JPEGs or sound recordings are provided. Cal Academy has a legal agreement with iNat to protect that data.

The true coordinates of observations in California that have been obscured under taxon geoprivacy are being shared with the collaboration, and only if they are not obscured as well under user privacy settings. No observations that an iNaturalist user has chosen to make private or has obscured themselves will have their true coordinates shared with us.
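As a rough illustration of the rule described above, here is a minimal Python sketch. It is only one reading of the description in this thread, not iNaturalist's or CalBioDEx's actual implementation: the `Observation` class, its field names, and the string values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical fields, named for illustration only.
    in_california: bool
    taxon_geoprivacy: str  # "open" or "obscured" (applied automatically per taxon)
    user_geoprivacy: str   # "open", "obscured", or "private" (chosen by the observer)


def true_coordinates_shared(obs: Observation) -> bool:
    """Return True if, under the rule described in this thread, the true
    coordinates would be passed to the collaboration."""
    return (
        obs.in_california
        and obs.taxon_geoprivacy == "obscured"  # hidden only because of the taxon
        and obs.user_geoprivacy == "open"       # the observer did not hide it
    )
```

Under this reading, an observation that the observer marked "obscured" or "private" is never shared, even when the taxon is also auto-obscured.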

The data sharing fits within the iNaturalist terms and conditions that users agree to when they sign up for an account, in that iNaturalist can share data at its discretion with affiliated organizations for conservation and research of T&E species. iNat users with observation records in California have the same level of control over what they share with CalBioDEx as they had before this collaboration existed, because the current agreement does not include true coordinates of “private” or “user-obscured” observations.

All members of CalBioDEx are fully committed to managing the California data securely and protecting the sensitive species coordinates within it. I have many years of experience with managing sensitive species data. At the same time we are actively balancing how these data can facilitate research to further conservation efforts of our sensitive species and better protect them and their habitats. Data license agreements between the partners exist that outline the terms and conditions of use.

With regard to external researchers, making data available is a stopgap measure until the CNDDB has caught up with the influx of iNaturalist data. All data requests are handled manually, and all data recipients must sign data license agreements. The collaboration created a draft rubric as part of the evaluation process. Among other things, any potential data requester needs to verify their position within a conservation agency, one that has a mandate to manage and protect T&E species, and provide a valid, project-based reason for the request. Academics will need to provide things like a thesis proposal that links to species protection or conservation, a plan for data security, etc. All requesters are carefully evaluated.

I see this as more of a question for iNaturalist and part of the bigger conversation about taxon geoprivacy that is prevalent on many of these forum threads than just the CalBioDEx. However, I would say the value of increasing the number of true coordinate records of sensitive species in the CNDDB is that this provides a much greater understanding of true species distribution, which in turn feeds back into NatureServe rank decisions at the state level for individual species. So by allowing true coordinates to flow to the State, we may help to “fix” the delisting of species that are no longer sensitive, or offer protections for those that have become sensitive.

Sorry this post was long, but there were a lot of questions to answer! Thank you @rupertclayton for inviting me to join this discussion.

My comment was not about whether Australia (or any other iNat portal/network site) gives out true locations to others, just on whether the user can join the portal and, by doing so, give access for the portal itself to their user-obscured locations. If the user does not join the portal, the portal will not have access to their user-obscured coordinates, regardless of whether the observations are in the portal’s area or not.

right my bad, I misread; yes this is correct!

Thanks very much for these detailed responses @longhairedlizzy. They go a long way towards addressing the concerns I had about what data will be shared and how it will be managed and protected. I’m very much in favor of natural heritage management making use of timely, detailed data from platforms such as iNat.

I think only my last two questions are still “open”, and I’d welcome any input from iNat staff and other community members. My last question (“What if Wyoming uses iNat data to tell hunters where to find wolves?”) is maybe something to revisit when iNat looks to expand the data-sharing initiative.

That leaves my other question: “Will CalBioDEx reduce the incentive for iNat to review taxon geoprivacy designations?” I agree that this is primarily an issue for iNat (rather than CAS or CDFW), but I do think iNat staff needs to consider this risk.

As background, let me state that California (and other regions) have a great number of threatened and endangered species. Most of these species face a sad range of familiar threats (land development, climate change, habitat changes) that are not exacerbated by sharing precise location data. Where precise location data do increase the overall threat (e.g. poaching of some cacti, orchids, carnivorous plants) taxon geoprivacy is very much the right way to address this. But assessing poaching risk takes some work, and understandably everyone wants to err on the side of caution. Consequently, we have the current situation on iNat, where almost every taxon with NT or higher protected status is auto-obscured. These decisions are reviewed only when an iNat curator chooses to invest the time, and reviews typically take a few weeks of discussion to ensure consensus. As a wild guess, I’d say that 90% of NT, VU, EN and CR taxa are currently auto-obscured and of those only 10% are really appropriate for taxon geoprivacy.

It’s good that CalBioDEx removes that barrier for state wildlife managers and accredited researchers. But one major value of iNat is that anyone, including citizen scientists with an interest, can view and use detailed data about the natural world. I’m concerned that CalBioDEx will remove much of the incentive to fix the current situation in which many taxa have locations unnecessarily auto-obscured, because state biologists will say “Well, our team makes the decisions and we now have detailed location data, so everything is fixed.” What remains broken in this scenario is that everyone else sees obscured location data for a large number of threatened species that really don’t need it.

What I would like to see is some mechanism, maybe as part of a state agency or non-profit assessment process, that assesses geoprivacy risk and determines whether location data needs to be protected. This should be independent of listing status and based on real-world answers to questions like:

  1. Does the availability of precise location data pose a realistic threat to this species? Example: Documented poaching incidents for this species or a similar species. Counter-example: Hypothetical poaching risks not supported by actual cases.
  2. Does sharing location data increase that risk beyond what’s already available from public data? Example: Sharing nest location data for an endangered bird. Counter-example: Sharing location data for an endangered but well-known plant that was described from a single, protected location.
  3. Does the risk from sharing location data outweigh any benefit that precise location data might bring? Knowing precise locations can help uncover new populations, track population dynamics and help activists oppose/mitigate developments on land where threatened species are present.
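
To make the three questions concrete, here is a small Python sketch of how they could combine into a single decision. This is just one possible reading: it assumes taxon geoprivacy is warranted only when all three answers are "yes". The class and field names are hypothetical and do not come from any existing iNat or agency process.

```python
from dataclasses import dataclass


@dataclass
class GeoprivacyAssessment:
    # One hypothetical yes/no answer per question in the list above.
    realistic_threat: bool              # Q1: documented, not merely hypothetical, threat
    adds_risk_beyond_public_data: bool  # Q2: sharing reveals more than is already public
    risk_outweighs_benefit: bool        # Q3: risk exceeds the value of precise data


def needs_taxon_geoprivacy(a: GeoprivacyAssessment) -> bool:
    """Assumed rule: obscure only if all three questions are answered yes."""
    return (
        a.realistic_threat
        and a.adds_risk_beyond_public_data
        and a.risk_outweighs_benefit
    )


# Counter-example from question 2: a well-known plant described from a single,
# protected location, so sharing locations adds nothing beyond public data.
well_known_plant = GeoprivacyAssessment(True, False, False)
# Example from question 2: nest locations for an endangered bird.
nesting_bird = GeoprivacyAssessment(True, True, True)
```

Because the assessment is just data, it could be re-run whenever circumstances change, with a new answer leading to a new designation.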

I would see it as being a designation that can change over time, too. So that when a new risk, such as Dudleya poaching, comes up additional protection can be applied. I realize that once data are public, they’re never truly private again, but adding geoprivacy can still mitigate new threats when they appear.

I share this concern and have seen at least one comment from a governmental biologist with access to obscured data making this argument on a taxon flag.