How confident can we be that iNaturalist data will be preserved?

Having it on GBIF is definitely reassuring. Do you know if GBIF is backing up the images and the identification records? From what I can tell, GBIF links to the iNaturalist photos, but maybe they are saving the photos as well.

Thanks!

Andy

1 Like

I’m afraid I can’t give you a definite answer on that. From what I see, they do present the record with pictures, but I am not certain whether the pictures are just linked or actually backed up.

Additionally - I forgot to mention this earlier - because GBIF is a scientific data infrastructure, only research-grade observations are imported into their database.

1 Like

I’m sorry to hear about that Janet. It’s all too common for these projects to get abandoned. I’m glad they have preserved the data for now at least.

1 Like

I believe that GBIF is data only; photos are not archived by them. Importantly, only some observation data is sent to GBIF by iNat. That means that, currently, things like Annotations (e.g. Insect Life Stage) and fields that might map directly to GBIF (e.g. “Count”) do not go to GBIF.

1 Like

i think GBIF gets only the Research Grade observations, and if there’s an obscured observation, GBIF shows the fake obscured coordinates.

Yeah, and the obscuring has been especially concerning as more things are being auto-obscured for various reasons, in some cases without a clear reason.

GBIF does not appear to get updated IDs (or it just takes very long to do so), and would only reflect a fraction of most people’s data. I suspect the iNat staff will refrain from commenting here for obvious reasons – no one knows if/when funding may run out, and what would happen in that eventuality.

I hope that in that unfortunate scenario, we’d at least be able to get a file that represents all our sightings, their photos, tags, comments, and so on. So that if the API ever resurfaces, we can just import our iNat data completely.

2 Likes

Good points. It seems like there should be a contingency plan for the community as well as one for the staff. Having people preserve their own observations seems like a good way to preserve and then reassemble the data.

1 Like

there’s nothing that prevents you from being able to download your data and photos now, if you want to.

Just like I don’t rely on Facebook to store my family pics in perpetuity, I definitely don’t rely on iNaturalist for photo storage. They’re compressed versions anyway. People should be using other cloud and physical back-ups for photo storage.

FWIW, similar questions have been asked and responded to by staff in the past, e.g. here’s what Scott said in August 2017 (which, as such, may now be outdated):

@loarie: iNat’s assets are currently stored on Amazon Web Services, and the database is stored at Rackspace; we have backups at Datapipe.

iNat is owned by the California Academy of Sciences, a museum that has been responsible for maintaining one of the world’s largest natural history collections for over 100 years. iNaturalist has been online since 2008 and we certainly expect to be around for another 10 years. The iNat program is a ‘core’ part of the Museum budget (i.e. not soft money) but I’d be lying if I said program-focus in the non-profit world isn’t volatile. However, we have 3 years of funding ‘in hand’ which is about as good as anyone can expect in the non-profit world. Furthermore, much of our past, current, and future work is to ensure the long term sustainability of iNat through fund-raising and partnerships.

And, as of June 2017, iNat is “jointly supported” by CAS and National Geographic Society. There is also a handy donate button at the bottom of each page on the website and on the apps’ settings pages. ;)

7 Likes

Is there an iNat store yet? You know…shirts, mugs, stickers…must be worth something funding-wise. Or maybe that would somehow violate the funding guidelines that iNat already has (being a “non-profit” system).

4 Likes

No, but they’re working on it.

3 Likes

@andy71 – I had the same concern as you when I first started using iNat and hesitated for a few years in putting too much effort into posting records because I wanted to see if it would really last. I’m still not certain of its likelihood of being around long-term although I obviously hope it is. But I will continue to post records occasionally. Nothing lasts forever and we have yet to see how viable such internet database systems are over decades or longer.

2 Likes

Yeah, I download my own observations every year or two so I can put them on an ArcGIS map. That functions as a backup. I have also kept all my photos, though more because it’s easy than because I am worried.

2 Likes

I appreciate this question being asked. I have been thinking about this for the past few months in light of recent events (MySpace, Flickr, Google+, etc.). I was going to post a similar question but found this in a quick search before posting.

Users are entrusting exorbitant amounts of precious and valuable data to this platform and I am very curious to know what kind of assurances are in place that this data is secure in the short and long term. Yes, it is prudent for users to back up their own data, but I’d love to know what measures are being taken by the platform to ensure the security and longevity of their content.

Larry

3 Likes

Thanks for all the great responses to the questions I asked above.

Nothing lasts forever, and I don’t expect iNaturalist data or any museum collections to last forever. However, museums and collections make a commitment to preserve and protect their physical collections in perpetuity, and I believe they write up plans in case their collections need to be transferred, moved, etc.

As a researcher, I have used the Data Dryad repository to preserve data from research publications. I dug a little deeper into Dryad to get a sense of its preservation policies:

https://datadryad.org/pages/policies#preservation

Dryad follows the Open Archival Information System (OAIS) reference model, which is a standard model for preserving digital data. I don’t know if iNaturalist has a long-term preservation policy, but replicating what Dryad has committed to would essentially be the kind of commitment or assurance that I’d like to see from iNaturalist.

I think it would give us users peace of mind if the Cal Academy could explicitly commit to providing the same level of preservation for the iNaturalist photos, identifications and metadata as they do for their physical specimens. Perhaps they already have, I’m not sure.

I don’t want to presume to speak for other users here, so I’ll leave it as a question: what kind of institutional data preservation commitment would you like to see going forward? Is this a reasonable thing to ask for?

4 Likes

the best kind of institutional commitment comes in the form of an endowment dedicated to whatever purpose you want to keep going. a $1MM fund with a 4% per year target withdrawal rate should last indefinitely if managed reasonably and throw off at least $40,000 per year most years.
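To make the arithmetic in that suggestion concrete, here is a minimal sketch of the endowment idea. The $1MM principal and 4% withdrawal rate come from the post above; the function name `simulate_endowment`, the 30-year horizon, and the 5% and 3% annual returns are assumptions chosen purely for illustration, and inflation and fees are ignored.

```python
def simulate_endowment(principal, annual_return, withdrawal_rate, years):
    """Withdraw a fixed fraction of the balance each year, then let the rest grow.

    Returns the final balance and the list of yearly payouts.
    """
    balance = principal
    payouts = []
    for _ in range(years):
        payout = balance * withdrawal_rate                   # e.g. 4% of the current balance
        balance = (balance - payout) * (1 + annual_return)   # remainder earns the (assumed) return
        payouts.append(payout)
    return balance, payouts


# Assumed scenarios: a 5% return versus a 3% return, both with 4% withdrawals.
for assumed_return in (0.05, 0.03):
    final_balance, payouts = simulate_endowment(1_000_000, assumed_return, 0.04, 30)
    print(f"return={assumed_return:.0%}  first payout={payouts[0]:,.0f}  "
          f"last payout={payouts[-1]:,.0f}  final balance={final_balance:,.0f}")
```

Under this simple model the fund only holds its value when the return exceeds roughly 4.17% (that is, 0.04 / 0.96), so "should last indefinitely" depends on the investment returns staying above the withdrawal rate.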

1 Like

This is a real issue.
But it is not just one of “preservation” but also of usability.
So iSpot still exists as a citizen science tool, but it has been so degraded by sloppy updates and a lack of interested programmers that it is unusable.
In South Africa, our virtual museums’ funding stops this year. Millions of records will be archived and available only to researchers, unless local enthusiasts find the funds to save them. We are toying with bringing some of it across to iNaturalist as one possible option (if the users buy into this and agree, though many use the sites because of features they don’t like about iNaturalist). But our (the southern African community’s) participation on iNat is on condition that if iNat folds, it will hand over all our southern African data (pictures, observations and IDs), and the software to run it, to the South African National Biodiversity Institute to keep a local version going.

6 Likes

[Edit, answered some of the questions I posed.]

If I’m reading it right, there were 1.3M observations in around the first week of May. That’s a lot of data to handle for backup, and a lot of cost.

Relevant info that’s not mentioned is the licensing of the data. I was assuming it was something like the OPL that Wikipedia uses, but I can’t see that info?

Another relevant bit of info is the accounts for the project: are they open?

Further, who in particular controls the project? Can it be sold as a commercial asset and locked away? That’s as big a risk as it being left to rot, IMO.

Are the website and app open source (allowing a community to continue the project more easily)?

Finally, how much data is there currently from the project? Are people free to acquire it at minimal cost (i.e. bandwidth cost, or the cost of actually administering the copying)? Can it be torrented, for example?
