iNaturalist is Down

I was just on the iNaturalist website on my iPhone when it said “iNaturalist website is down”. There was no scheduled maintenance today, so I’m a little confused.
Has this happened before? And is it just me, or is this happening to other people as well?

13 Likes

This has happened before, so we shouldn’t worry about it. Staff will be on it as soon as possible.

You can check if it’s just you here: https://downforeveryoneorjustme.com

2 Likes

Same here, Opera on desktop.

1 Like

I audibly gasped when I saw that. I think I might have a teensy tiny iNat addiction. :weary:

26 Likes

We’re aware of it and looking into it. We’ll update when we can.

22 Likes

Hopefully the staff are aware of it and looking into it. And will update when they can.

3 Likes

Unbelievable! Now I have to wait to upload my moths :face_with_head_bandage:

6 Likes

You should go observe more moths to occupy yourself in the meantime.

23 Likes

Hey folks. We are indeed down and we may not get back on our feet for a while. Our deepest apologies to those of you who had events planned or were otherwise relying on iNat being up this weekend. Some important points:

  1. No critical data has been lost. If you were in the middle of an upload when the outage started, it’s possible that didn’t get through, but to our knowledge the database is fine and all the media are too.
  2. Some notifications may be gone. Not all notification data is in the database, so it’s possible notifications from the past week may be gone. We’ll see what we can do about regenerating them, but no promises.
  3. We may not be operational until Sunday. We don’t like it either, but that’s our current estimate.

This one came out of left field. We rent servers from Microsoft Azure, and unfortunately, those servers have never been as stable as we would like, so we’ve built redundancy into our cluster as anyone would, with replica databases, even more replica search indices, many servers providing web results, and many serving the API. If any one or even a few of these servers go down, you shouldn’t notice because of this redundancy, and the servers that go down should get back to their jobs when they come back up.

But when the backups and the backups to the backups all go down… you get this: all of our database servers went down at once, and so did the majority of our search index servers. If it had been just the database machines, it wouldn’t have been so bad. The site would have gone down, but we could have got it back again in an hour or so. But the search indices hold a lot of ephemeral data that takes a long time to rebuild. If one or two go down, the cluster can recover, but in this case all but one went down, and we can’t recover from that without a long and slow re-indexing process. Hence the extensive probable downtime.

Again, we are really sorry about this. We’re doing everything we can to get things running again as soon as we can, and when we’re out of panic mode we’ll be investigating ways to prevent this in the future.

101 Likes

Thanks for the detailed update and best of luck with it! Hopefully it doesn’t eat up your whole weekend!

4 Likes

Have you tried turning it off and turning it on again? :laughing:

Sorry, as a CTO for a tech startup I feel your pain. Best of luck herding those felines back to life.

14 Likes

That’s pretty funny :) Thanks for the laugh haha

2 Likes

Azure mysteriously doing this for us is what got us into this mess, so… no, we’re not flicking the power switches.

16 Likes

Microsoft always knows what’s best for you…

1 Like

Goodness! I hadn’t realised I had become so dependent on iNaturalist until I opened the site this morning and was aware of a creeping sense of dread and almost panic at seeing it unexpectedly down! Best of luck, guys. I just don’t know what I (and numerous others) would do without you.

18 Likes

I’ve automatically tried to open the website five times since I saw it was down. I think I have a problem.

25 Likes

Darn, I was about to upload my first cetacean observations :(

Good luck Staff and thanks for all your hard work!

4 Likes

I guess you’re in West US 2? From a sysadmin: Thank you for your service, and sorry about your weekend :(

8 Likes

Yup, that’s us. I guess Azure is finally reporting the incident. Some of our machines are back up, others coming up and going back down again. All thanks go to @pleary.

13 Likes

Thanks everyone for bearing with this outage.

It was indeed caused by a major outage at Microsoft Azure in the West US 2 region, where we rent our servers. Unfortunately, we have to rebuild the search indices, which will take a bit of time, and it looks like Azure has not yet fully recovered, so we’re still partially waiting. We also may need to get some sleep, since this happened around midnight EST, where our mighty dev-ops team of 1 is located.

We hope we’re on track to be up sometime Saturday or Sunday, but we’re not yet prepared to give a proper estimate. Again, our apologies for this inconvenience. We’ll keep this thread updated as we know more.

Happy Friday… and thank you @pleary

45 Likes