Alternative to Linode's backup service

Linode's backups are too unreliable to be considered as our primary backup option.

Presumably the 'homebrew' version (assuming a fairly standard web server) is to copy home directories, the web root, and database files to another location on a (nightly) basis. Any recommendations for where to put these backup files, and how to make sure services are not interrupted when copying them across?

Unless there are compelling reasons to store data locally (HDD in office), I'd prefer 'cloud' storage for easy remote access.

I had a look in the library, but didn't see anything. As usual I probably missed something blindingly obvious.

Ta

12 Replies

Take a look at duplicity. I've also heard good things about tarsnap.

Usual procedure is the same as you'd do for Linode's backup service: a nightly database dump (e.g. mysqldump --single-transaction), followed by a backup of everything you hold dear.
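For a fairly standard LAMP box, a minimal sketch of that nightly job might look like the following (paths and names are placeholders, not a recommendation):

```bash
#!/bin/bash
# Minimal sketch of a nightly dump-then-archive job; paths are placeholders.
# Assumes MySQL credentials are available via ~/.my.cnf.
set -e

BACKUP_DIR=/var/backups/mysql
mkdir -p "$BACKUP_DIR"

# Consistent dump of InnoDB databases without locking them for the duration.
mysqldump --single-transaction --all-databases \
  | gzip > "$BACKUP_DIR/all-databases-$(date +%F).sql.gz"

# Then grab everything you hold dear: home directories, web root, and the dump.
tar czf "/var/backups/files-$(date +%F).tar.gz" /home /var/www "$BACKUP_DIR"
```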

I use Duplicity with Amazon S3 as my backup's backup. Simple and easy to set up.
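For anyone wondering what that looks like in practice, a nightly run can be roughly this simple (bucket name and paths are made up; duplicity picks up the AWS keys and GPG passphrase from the environment):

```bash
#!/bin/bash
# Hedged sketch of a nightly duplicity-to-S3 run; bucket and paths are placeholders.
export AWS_ACCESS_KEY_ID='...'
export AWS_SECRET_ACCESS_KEY='...'
export PASSPHRASE='...'   # used by duplicity to encrypt the archive volumes

# Incremental backup, forcing a fresh full backup once a month.
duplicity --full-if-older-than 1M /var/backups s3+http://my-backup-bucket/linode

# Keep only the two most recent full chains.
duplicity remove-all-but-n-full 2 --force s3+http://my-backup-bucket/linode
```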

@Serial Cookie:

I use Duplicity with Amazon S3 as my backup's backup. Simple and easy to set up.
I'm curious… are you using a regular account with Amazon, or their free start-up account?

I, for one, am using a normal paid account. Two big reasons:

1) AWS didn't extend the freebies to existing customers. (Which is annoying, but hey, Amazon is a ruthlessly efficient company.)

2) I have a lot of data up there: last month, we were billed for about 270 GB-months, for a total bill of about $30.

If you've got the free tier, there you go.

Our approach is a two-fold solution:

1) Linode internal backups, for maximum restore speed

2) Off-site nightly incremental data-only backups over a fast residential pipe (20 Mbps upstream), for relatively fast repairs or rebuilds (ironically, my home internet connection has faster upstream than our office's)

If there is a disaster of some kind on the host, we can spin up a new Linode from our nightly backup and be back up and running in minutes. If something is wrong with the Linode backups (an entire Linode DC disappears, for example, or Linode the company goes poof), we can just deploy a new machine, configure the SQL/web/mail servers, and restore all the data. It takes a few hours, but we're back up and running without an extended outage. Of course, in either case, we'd have to figure out what to do with credit card transactions that happened after the most recent backup; the last time we had this sort of problem, we went back to the credit card processing logs and re-keyed the data manually, but that's not a big deal.

The offsite backup is pretty simple: a nightly cron job on my file server triggers a MySQL dump on the Linode, waits for it to finish, rsyncs the changes, then makes a ZFS snapshot (the file server is Ubuntu Server 12.04 LTS running the ZFS kernel module from zfsonlinux, with 2x5 disks in raidz arrays in a single pool). I'm thinking of setting up a third tier of backups on our office server. No RAID or incremental or anything, just an emergency third tier that automatically keeps two or three tarballs, just in case.
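Written out, that nightly job is essentially the sketch below (the host, paths, and dataset names here are invented, not my actual setup):

```bash
#!/bin/bash
# Sketch of the pull-style nightly backup described above.
# Host, paths, and dataset names are placeholders.
set -e

LINODE=backup@my-linode.example.com
DEST=/tank/backups/linode        # directory on the local ZFS pool
DATASET=tank/backups/linode      # matching ZFS dataset

# 1) Trigger the MySQL dump on the Linode and wait for it to finish.
ssh "$LINODE" "mysqldump --single-transaction --all-databases | gzip > /var/backups/nightly.sql.gz"

# 2) Pull down only what changed since last night.
rsync -az --delete "$LINODE:/var/backups/" "$DEST/db/"
rsync -az --delete "$LINODE:/var/www/"     "$DEST/www/"

# 3) Snapshot the dataset so every night survives as a point-in-time copy.
zfs snapshot "$DATASET@$(date +%F)"
```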

@Guspaz:

Our approach is a two-fold solution:

1) Linode internal backups, for maximum restore speed

2) Off-site nightly incremental data-only backups over a fast residential pipe (20 Mbps upstream), for relatively fast repairs or rebuilds (ironically, my home internet connection has faster upstream than our office's)
I take the same approach, at least for my primary Linodes. For development or test boxes I just use (2). While I've had few issues with Linode's backup service myself, if it does fail (short of a complete DC loss), this approach also lets me use whatever the most recent successful backup is as an (older) baseline, so I don't have to do a full re-install, and then do a restore from (2) to catch up.

For (2) in my case, I use Bacula (operating from a server at my home office), which handles backups for all of my distributed machines (including Linodes) under a classic daily incremental, weekly differential, monthly full approach, with varying retention periods depending on the server. Bacula has a definite learning curve, but once you're over the hump it's extremely flexible. Storage-wise, everything ends up in local file "volumes" on the storage server, which, if desired, can be efficiently distributed to other machines (or S3) for redundant copies.
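For reference, that rotation maps more or less onto the stock Schedule example from the Bacula manual (the name and times are just illustrative):

```
# Monthly full, weekly differential, daily incremental.
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Differential 2nd-5th sun at 23:05
  Run = Incremental mon-sat at 23:05
}
```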

– David

It's a hand-written bash script for me. I've been meaning to add support for automatically thinning out older backups, because as it stands I've got about two years' worth of nightly ZFS snapshots, and that can get rather slow… in fact, at this point it would probably take hours to delete an older snapshot. I actually migrated the entire ZFS pool from Solaris to Linux, and all the snapshots came with it.
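The thinning itself shouldn't be much more than a loop over zfs list, something like this sketch (the dataset name and 90-day cutoff are made up):

```bash
#!/bin/bash
# Hedged sketch: destroy nightly snapshots older than a cutoff.
# Dataset name and the 90-day retention are illustrative only.
DATASET=tank/backups/linode
CUTOFF=$(date -d '90 days ago' +%s)

# -p prints the creation time as an epoch timestamp, -H drops the headers.
zfs list -H -p -t snapshot -o name,creation -r "$DATASET" |
while read -r name created; do
    if [ "$created" -lt "$CUTOFF" ]; then
        echo "destroying $name"
        zfs destroy "$name"
    fi
done
```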

Thanks for the replies.

Duplicity looks interesting, but what does it offer beyond rsync (am I going a bit sledgehammer-to-crack-a-nut with my simple requirements)?

And it seems that building a local machine with a wad of HDDs is the preferred way? I must admit I'm a bit 'meh' when it comes to AWS.

duplicity stores the state of the remote side locally, so it does not have to interact heavily with the remote side. This is critical with things like S3, which charge for operations and outgoing transfer. (It also supports S3 and a bunch of other things, even FTP, which rsync does not.)
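As a rough illustration (bucket name invented), once the local signature cache under ~/.cache/duplicity is current, even asking what's on the remote doesn't need to pull the archives back down:

```bash
# Summarises the backup chains for this target using duplicity's local cache.
duplicity collection-status s3+http://my-backup-bucket/linode
```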

You can back up to a local machine if you want. However, restore speed will be atrocious, unless you live in a datacenter. What use are backups if you can only restore from them at 1 Mb/s?

I installed duplicity after reading your post. It looked interesting. For me it's a decent fit, since my home office has a 50M/10M business cable connection, so restores wouldn't be that big of a deal.

I only did a small test to see what it looked like and it seemed to work well. I did a restore test and that wasn't too slow. Certainly reasonable if needed.

I determined that AWS was going to be more pricey than I thought after I added up what I would be sending through.

@hoopycat:

duplicity stores the state of the remote side locally, so it does not have to interact heavily with the remote side. This is critical with things like S3, which charge for operations and outgoing transfer. (It also supports S3 and a bunch of other things, even FTP, which rsync does not.)

You can back up to a local machine if you want. However, restore speed will be atrocious, unless you live in a datacenter. What use are backups if you can only restore from them at 1 Mb/s?

Lots of people have more than 1 Mbps of upstream these days. Bell Canada's standard connection is 15/10 VDSL2, for example, although most of the cable companies aren't doing DOCSIS 3 upstream bonding and still limit upstream to two or three megs.

@Dweeber:

@Serial Cookie:

I use Duplicity with Amazon S3 as my backup's backup. Simple and easy to set up.
I'm curious… are you using a regular account with Amazon, or their free start-up account?

It's the free 12-month starter service. I'll re-evaluate after the 12 months are up, but as Amazon doesn't charge for incoming bandwidth and I won't be storing more than 5-10 GB, the storage costs won't be that much.
