Downtimes due to server maintenance? Avoidable?
Are they announced ahead of time?
I'm specifically referring to planned maintenance - for example, the server (not the VPSes, but the server that hosts them) needs to be rebooted to install patches, upgrade the kernel, or for whatever other reason.
I'm thinking of hosting a forum on my Linode, and if there's going to be something like a couple of hours of downtime every six months, I'd like to know ahead of time and think about ways to plan around it.
What is Linode's recommendation for keeping a site visible during planned maintenance?
I'm not asking about true HA with a cluster, failover, etc. I'm just hosting a forum that I'd prefer not to have down for a few hours, not running the New York Stock Exchange trading floor.
14 Replies
In my experience, host outages are very infrequent. For my Atlanta Linode, the last hiccup I'm aware of is a forced reboot a couple of years ago to upgrade from User Mode Linux to Xen. You can see status announcements at
> 99.99% uptime, or your lost time is refunded back to your account.
@Vance:
> 99.99% uptime, or your lost time is refunded back to your account.
Yes, I realize that, but three nines is still 8.76 hours per year.
Is systems maintenance part of that 8.76 hours, or outside it? (Alas, what an SLA actually covers is not standardized across the industry.)
I'm not questioning Linode's excellent reputation. I'm just wondering if they periodically say "you're on a server that will be down for 3 hours this Sunday while we do maintenance".
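For anyone who wants to check the numbers being thrown around here, the downtime an uptime percentage allows is simple arithmetic. This is only a rough sketch (the function name is made up for illustration, and it assumes a 365-day year): three nines (99.9%) works out to 8.76 hours per year, while the 99.99% figure quoted above would allow under an hour.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime (in hours) an uptime percentage permits over a period."""
    return period_hours * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for pct in (99.9, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime allows {hours:.2f} h/year ({hours * 60:.1f} min)")
```

Whether planned maintenance counts against that allowance is a contract question the arithmetic can't answer.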
In the past year I have 99.54% uptime (5h 34 mins downtime). The longest downtime was 1 hour 20 minutes due to a network failure in Dallas (unexpected, not planned); the rest were short 5-minute intervals while I did maintenance on the server myself.
We give a minimum of 7 days (usually more) notice for planned maintenance and you are notified via support ticket.
We open support tickets for all emergency maintenance events as well. These are rare and are usually resolved in 5-30 minutes. We've seen just about everything that can go wrong and have procedures in place to minimize their impact.
Any widespread issues or network maintenance announcements are posted on
-Tom
From http://en.wikipedia.org/wiki/Xen#Virtualmachinemigration:
> Administrators can "live migrate" Xen virtual machines between physical hosts across a LAN without loss of availability. During this procedure, the LAN iteratively copies the memory of the virtual machine to the destination without stopping its execution. The process requires a stoppage of around 60–300 ms to perform final synchronization before the virtual machine begins executing at its final destination, providing an illusion of seamless migration.
I've assumed Linode uses this feature to do host maintenance without causing downtime for us. Anybody know for sure?
@obs:
In the past year I have 99.54% uptime (5h 34 mins downtime)
5h 34min downtime in a year == 99.936% uptime.
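The conversion is easy to reproduce. A quick sketch, assuming a 365-day year (both function names are just illustrative): it turns a downtime total into an uptime percentage, and also runs the calculation backwards to see what monitoring window would make 5h 34min equal 99.54%.

```python
from datetime import timedelta

YEAR = timedelta(days=365)  # assumption: non-leap year


def uptime_percent(downtime: timedelta, period: timedelta = YEAR) -> float:
    """Uptime percentage for a given total downtime over a period."""
    return 100.0 * (1 - downtime / period)


def implied_period(downtime: timedelta, reported_percent: float) -> timedelta:
    """Monitoring window that would make `downtime` equal the reported uptime."""
    return downtime / (1 - reported_percent / 100.0)


if __name__ == "__main__":
    down = timedelta(hours=5, minutes=34)
    print(round(uptime_percent(down), 3))        # uptime over a full year
    print(implied_period(down, 99.54).days)      # days of monitoring implied by 99.54%
```

Run backwards, the 99.54% figure corresponds to a window of roughly 50 days rather than a full year.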
@hybinet:
@obs: In the past year I have 99.54% uptime (5h 34 mins downtime)
5h 34min downtime in a year == 99.936% uptime.
That was from Pingdom... probably not a year then
@funkytastic:
I've assumed Linode uses this feature to do host maintenance without causing downtime for us. Anybody know for sure?
I'm pretty sure they don't. I think I recall the feature (or a request for it) being discussed in the past in the forum and the answer was negative. Personally I'm glad anyway - less complexity involved, and I find the straightforward design of the Linode setup (standalone host, local BBU RAID 10 array, etc…) attractive. No magic involved :-)
I'm sure there's a constant stream of swappable parts being replaced (drives in the arrays, maybe power supplies, etc…) but I don't think guests are ever transparently migrated.
I think host maintenance that intrudes on the guest is just pretty rare. Of course, my view is limited to my relatively small sample set (5-7 Linodes), but in the past year, for example, there definitely hasn't been any scheduled host maintenance affecting them.
In the unscheduled category, one Linode had two host reboots for an issue (a reboot can be a 30-40 minute outage depending on where in the boot sequence your guest is). I also had one instance when data center power maintenance was going to take one of my hosts offline for several hours, so, in advance of the outage, Linode set up migrations to a different host that was not going to be impacted. The migration itself was a brief outage, but I got to choose the time, and that was the data center's doing and not really Linode maintenance.
For my main 3 production Linodes that have been constantly monitored for the past year, they have been reachable (both ping and service checks) 99.953%, 99.966% and 99.932% of the time, under a 5-minute polling granularity. That should be conservative due to a few brief outages on the monitoring network connection that aren't completely factored out.
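For reference, here is roughly how a figure like that falls out of polled checks. This is a sketch only: the function names and the sample data are made up, and it simply counts each failed poll as one full interval of downtime, so an outage shorter than the 5-minute interval can be missed entirely or overcounted.

```python
def availability_percent(checks: list[bool]) -> float:
    """Percentage of polls that succeeded (True = reachable)."""
    if not checks:
        raise ValueError("no checks recorded")
    return 100.0 * sum(checks) / len(checks)


def estimated_downtime_minutes(checks: list[bool], interval_minutes: int = 5) -> int:
    """Crude downtime estimate: each failed poll counts as one whole interval."""
    return checks.count(False) * interval_minutes


if __name__ == "__main__":
    # Hypothetical year of 5-minute polls: 105120 checks, 49 of them failed.
    polls_per_year = 365 * 24 * 60 // 5
    checks = [True] * (polls_per_year - 49) + [False] * 49
    print(round(availability_percent(checks), 3))
    print(estimated_downtime_minutes(checks))
```

At this granularity, a year with 99.95% availability corresponds to around four hours of failed polls in total.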
All three nodes have had system uptimes of over a year at some point, and in general anything less has been my doing, such as when I restarted one Linode late last month to finally get the memory upgrade from 360 to 512 (until then, it had been up continuously since being created in Jan 2010).
I'm sure there are exceptions, but in my own experience, the Linode hosts themselves (and whatever processes Linode uses to manage them) are simply very reliable.
-- David
@obs:
@hybinet:
@obs: In the past year I have 99.54% uptime (5h 34 mins downtime)
5h 34min downtime in a year == 99.936% uptime.
That was from Pingdom... probably not a year then
1/1 to now maybe?
If uptime is so critical that you can't survive the odd scheduled maintenance, then the logical approach is to build an HA setup involving Linodes in two data centers. Such a setup can be done starting at $34/month (when prepaid). That should allow you to achieve a few more nines.