Hardware failure and unable to boot, support gives up

Linode has really failed me this time around. After 5+ years of service, they have left me in an irrecoverable state, and I'm not sure I can continue with Linode.

17 days ago, a support ticket was opened in which Linode administrators reported detecting "an issue" affecting the physical hardware on my Linode. I was on vacation at the time, so I had no idea this had happened. Of course, this brought down all my websites. They attempted to resolve it, and their last message was:

__"At this time, our administration team has fully resolved this issue on the host and your Linode is currently booting. There is no need to issue boot jobs for your Linode at this time.

Once this has completed, an emergency migration will be configured to another host immediately. Once on that new host, your Linode should boot successfully and be fully accessible."__

Of course, what actually happened was that my Linode was not able to boot successfully, and it was left in that state for the 2 weeks I was on holiday without internet access.

Over the last week, I've exchanged countless messages with Support trying to bring it back up, and have tried everything. After 7 days, Support gave up and told me to go to the forums, as they claimed this had gone "beyond the scope of their support services."

I've tried everything in Rescue mode to mount and boot up the Linode, but it fails with the message: "Linode failed to boot for unknown reason."

The sequence before it fails is:

Key type dns_resolver registered

Key type ceph registered

libceph: loaded (mon/osd proto 15/24)

mce: Unable to init device /dev/mcelog (rc: -5)

Loading compiled-in X.509 certificates

registered taskstats version 1

I'm using an old Debian release and have tried changing the configuration profile to the 3.15 kernel, but it still does not boot.

I'm about to give up completely, and am just hoping someone out there can help with this.

3 Replies

I'm amazed that after 5+ years you don't have a viable emergency recovery plan. Hardware/software/network/people bork all the time; any decent sysadmin plans for such events and has a bare metal recovery plan laid out, tested, and ready to deploy.

What you should be doing is spinning up a new Linode, running your automated installation/configuration scripts, and copying your latest data from your last off-system backup.

Doesn't sound like you have any of those options available.

Oh well, live and learn. Next time, hopefully, you'll be better prepared.

Create a new disk, install a new Debian system there, and use it to mount and recover the information in your old system. 1 GB should be enough for a base system with ssh and rsync.

If you can't create a new disk because you have already used up your 24576 MB of disk space, you could upgrade your Linode temporarily while you recover the info in your old system.
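For anyone following along, the mount-and-copy step could be scripted roughly like this. It is only a sketch: the device name `/dev/sdb`, the mount point `/mnt/old`, and the destination `user@backup.example.com:/srv/recovered/` are assumptions for illustration, and the function names are hypothetical. Check which device your old disk is actually attached as before running anything. The old disk is mounted read-only so nothing on the damaged filesystem gets modified.

```python
import subprocess
from typing import List


def build_recovery_commands(old_device: str, mount_point: str,
                            destination: str) -> List[List[str]]:
    """Return the commands that mount the old disk read-only and copy it off."""
    return [
        ["mkdir", "-p", mount_point],
        # Read-only mount: avoid writing to a possibly damaged filesystem.
        ["mount", "-o", "ro", old_device, mount_point],
        # Trailing slash makes rsync copy the directory's contents,
        # not the directory itself.
        ["rsync", "-a", mount_point.rstrip("/") + "/", destination],
    ]


def run_recovery(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # All three values below are placeholders, not values from this thread.
    cmds = build_recovery_commands(
        "/dev/sdb", "/mnt/old", "user@backup.example.com:/srv/recovered/"
    )
    run_recovery(cmds, dry_run=True)
```

With `dry_run=True` the script only prints the commands, so you can review them before running them for real as root on the new system.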

Thanks for the suggestion. I managed to mount it and retrieve my content files. However, all the database files are gone, as it appears the whole var directory got taken out by the disk failure.

Sigh, lesson learned indeed: always have a backup. Complacency got the better of me.
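For future readers, a very small starting point for a backup routine might look like the sketch below. It assumes you just want a timestamped `tar.gz` of one directory; the function name `make_backup` and the example paths are hypothetical. It is not a complete backup plan: the archive still has to be copied off the Linode, and a live database is better exported with its own dump tool than by archiving its files.

```python
import tarfile
import time
from pathlib import Path


def make_backup(source_dir: str, backup_dir: str) -> Path:
    """Archive source_dir into a timestamped .tar.gz inside backup_dir."""
    source = Path(source_dir)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Timestamp in the file name so older archives are not overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = target_dir / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(str(archive), "w:gz") as tar:
        # Store paths relative to the directory name, not the full path.
        tar.add(str(source), arcname=source.name)
    return archive


if __name__ == "__main__":
    # Placeholder paths; adjust to whatever you actually need to keep.
    print(make_backup("/var/www", "/root/backups"))
```

Running something like this from cron and then pushing the archive to another machine would cover the "off-system backup" mentioned above.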
