Linode + S3 backup? – Idle musings

I've just installed S3fs as per this article on the Linode Wiki and got one of my Amazon S3 storage buckets automatically mounting as a volume on my Linode.

Now I'm wondering: could I use this bucket [in its guise as a volume on my Linode] as a destination for backing up my laptop at home? I'm thinking of something along the lines of a cron job using rsync [or similar] to ssh into my Linode and make an incremental backup to the mounted S3 volume. As I said, I'm a bit vague on some of the nitty-gritty details and security implications, so any thoughts gratefully received. Here are some of the issues I'm thinking about so far:

1: Security: How secure is the mounted S3 volume from outside access? The wiki article instructions mount it as root and it's outside the obvious publicly accessible parts of my Linode, but I'm not quite clear on how easy it would be for some ne'er-do-well to access such a mounted volume on my Linode.

2: Sync'ing: I've not used rsync before, and my reading is turning up contradictory opinions as to its suitability for this kind of backup: some people claim that rsync cannot do block-level synchronisation if the backup is encrypted [which obviously it would have to be], and other people claim it can.

3: Speed: I'm presuming there would be a performance hit, given that any such backup would be sending data from my laptop to my Linode, which would then in turn be sending that data on to S3. The question is, would the convenience of having an S3 bucket mounted as a 'regular' volume, rather than having to deal with Amazon's proprietary API and data format, outweigh the inconvenience of the slower access?

4: Bandwidth: Piping everything to S3 via my Linode will obviously involve using a fairly huge chunk of bandwidth for the first backup but, assuming some block-level sync'ing mechanism can be put in place, subsequent backups shouldn't be too hungry, should they?
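To make the block-level question in points 2 and 4 a bit more concrete, here is a minimal Python sketch. It is not rsync's actual algorithm (rsync uses rolling checksums; this just compares fixed-offset blocks), and `toy_encrypt` is a made-up stand-in for "encrypt the whole file with a fresh nonce each run", not real encryption. The assumption being illustrated is that a small edit to plaintext touches very few blocks, whereas re-encrypting the whole file changes nearly all of them.

```python
import hashlib


def block_digests(data: bytes, block_size: int = 1024) -> list:
    """Hash each fixed-size block, roughly what a block-level sync compares."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 1024) -> int:
    """Count blocks that would need re-sending (same-length inputs assumed)."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    return sum(1 for a, b in zip(old_d, new_d) if a != b)


def toy_encrypt(data: bytes, key: bytes, nonce: bytes) -> bytes:
    """Toy XOR stream cipher for illustration only; NOT real encryption."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))


original = bytes(range(256)) * 256   # 64 KiB of sample data
edited = bytearray(original)
edited[5000] ^= 0xFF                 # change a single byte
edited = bytes(edited)

# Plaintext versions: compare block by block.
plain_changed = changed_blocks(original, edited)

# "Encrypted" versions: each backup run uses a different nonce.
enc_changed = changed_blocks(
    toy_encrypt(original, b"key", b"nonce-1"),
    toy_encrypt(edited, b"key", b"nonce-2"),
)

print(plain_changed, enc_changed)
```

If the backup tool encrypts the whole file afresh on every run, a block-level sync has nothing to match against, which would explain the "it can't" camp. Tools that encrypt per chunk, or that compute the delta before encrypting, can avoid this, which may be where the "it can" opinions come from.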

I'm currently using Jungle Disk to do my backups to S3 but, especially given that JD uses proprietary encryption and data storage methods, I'm a bit concerned about the long-term viability of this setup. I'd much rather try and 'roll my own' in some way. I did check out SparkleShare but I got put off by the fact that it uses the Mono framework, which is an implementation of Microsoft .NET [shudder!]

Any thoughts?

4 Replies

Remember, you can reach your bucket from home, too. No need to pipe stuff through your Linode.

I personally use duplicity, which works remarkably well with S3 (or other massive-but-faraway devices) as a backend. It also avoids the whole S3fs thing.

Yeah. I read up on Duplicity too. I was kind of put off by this page, where the devs seem to be acknowledging that tar isn't really the ideal format for this kind of backup, but don't seem to have any viable alternative.

@hoopycat:

Remember, you can reach your bucket from home, too. No need to pipe stuff through your Linode…

True. I use ExpanDrive to mount an S3 bucket as a local volume on my laptop [used to use macfuse (https://code.google.com/p/macfuse), but it seems to have been abandoned] so, you're right, there's no reason to make my Linode play 'piggy-in-the-middle'.

I guess I just got a bit carried away with the excitement of having grafted potentially infinite storage onto my Linode via S3fs and was looking round for something to use it for! :oops:

Duplicity does work around a number of tar's limitations, by using external indexes and by chunking stuff across many archives (which improves the seekability of a single file). The use of tar isn't a problem at all for backups, since the info needed to do an effective rsync is stored in the indexes and cached locally. It's a bit annoying for a restore, since you'll need to pull at least one whole chunk to get a file.

Fortunately, it handles that for you, and reads from S3 are pretty spry. And one good thing about tar under the hood: if all else fails, you can probably get at your data using gpg and tar.
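As a rough illustration of the "many archives plus an external index" idea, here is a minimal Python sketch. It is not Duplicity's actual format or code: `build_volumes` and `restore_file` are hypothetical helpers, the volumes are plain unencrypted tar archives held in memory, and the index is just a dict. It only shows why a single-file restore needs one chunk and not the whole backup.

```python
import io
import tarfile


def build_volumes(files: dict, max_volume_bytes: int):
    """Pack files into several in-memory tar volumes and record which volume holds each."""
    volumes = []   # finished tar volumes, as bytes
    index = {}     # file name -> volume number (the "external index")
    buf, tar, used = None, None, 0

    def start():
        nonlocal buf, tar, used
        buf = io.BytesIO()
        tar = tarfile.open(fileobj=buf, mode="w")
        used = 0

    def finish():
        tar.close()
        volumes.append(buf.getvalue())

    start()
    for name, data in files.items():
        # Roll over to a new volume once the current one would exceed the limit.
        if used and used + len(data) > max_volume_bytes:
            finish()
            start()
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
        index[name] = len(volumes)   # number of the volume currently being written
        used += len(data)
    finish()
    return volumes, index


def restore_file(name: str, volumes: list, index: dict) -> bytes:
    """Fetch one file by opening only the volume the index points to."""
    with tarfile.open(fileobj=io.BytesIO(volumes[index[name]]), mode="r") as tar:
        return tar.extractfile(name).read()


files = {
    "a.txt": b"alpha" * 100,
    "b.txt": b"bravo" * 100,
    "c.txt": b"charlie" * 100,
}
volumes, index = build_volumes(files, max_volume_bytes=1000)
restored = restore_file("c.txt", volumes, index)
```

Because each volume is an ordinary tar archive, the same data stays readable with standard tools even without the index, which is the fallback described above.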

To be honest, I don't generally think too much about it. It just works, it can do an incremental off-site backup of a Linode with 18 GB of stuff in about 10 minutes, and I can easily do single-file restores. It is part of a complete backup strategy, alongside the Linode Backup Service. :-)

Thanks. Maybe I'll take another look at Duplicity.
