Linode hardware question
Backups are still not easy, and that's especially true when one has limited drive space.
The largest drive I own, for instance, is a 100GB disk. I need to figure out a method of backup that just works, or else wait until Linode provides backups for all datacenters.
Any suggestions on what I should do?
I don't want to move my server from the Dallas, TX datacenter.
Thanks.
Regards, –Keith
24 Replies
dallas165:
dallas158:
host15:
These are only situations in which data was lost due to a double disk failure. They don't include human error: people mess up and unintentionally destroy data on their Linodes on a regular basis. You don't hear about those, but they demonstrate a very important point: RAID is not a backup. It will merely ensure that your mistakes are instantly and reliably copied across multiple physical disks.
You can either play the odds, or you can back up your data using one or more methods. Linode's backup service is being beta-tested in all datacenters as we speak (open a ticket to participate), but even that won't protect against datacenter loss, a protracted network outage, or Linode being suddenly unable to provide service to you. Also, it's in beta, so if you rely on it as your only solution right now, you're a fool.
There are many options:
1) I use something called BackupPC on a local server, which is overkill for 90% of situations, but it is useful if you're backing up a lot of computers and have the hardware to spare. Downside: my workstation and the backup server are less than a foot apart, so I must carry backups offsite on thumb drives. Also, my upstream bandwidth is quite slow, so my restoration plan treats this as secondary to Linode's backup service.
2) Automatic rsync to Amazon S3, rsync.net, or another third-party provider is a fairly decent way to do it. This will cost money, but it's probably money well spent. There are numerous tutorials out there on it (a bare-bones sketch appears after this list).
3) Manual rsync or tar to your local workstation. I did this back when I ran Windows at home, but while it's better than nothing, a backup solution that isn't automatic isn't a solution. Data integrity is too important to leave to humans.
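For option 2, the bare-bones version is just an rsync-over-ssh command driven by cron. The hostname, username, and paths below are placeholders for whatever provider and layout you end up with:

# Push /etc, /var/www and your database dumps to the remote account.
# -a preserves permissions and timestamps, -z compresses in transit, and
# --delete keeps the remote copy in sync with what still exists locally.
rsync -az --delete -e ssh /etc /var/www /var/backups/mysql \
    backupuser@backuphost.example.net:linode-backup/

Stick that in a script, call it from cron every night, and you're done.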
James
It's as easy to use as tar, it's cheap and it's secure.
I back up a bunch of Linodes and retain a bunch of snapshots. Even though I don't have 100GB of data, there's block-level compression and de-duplication, so they're surprisingly compact.
I don't have a Sidekick cell phone to place backups on.
The resulting backup would need to be compressed as much as possible, since the drives I own are smaller than most of the stuff I would need backed up.
I know that RAID is no substitute for proper backups. But backups are often a pain, and are not easy to configure.
The last thing I want to have happen, however, is for my Linode to suddenly disappear and have data wiped out that I really cannot afford to reinstall at any cost, that kind of thing.
I try not to reinstall my servers constantly like most of you hard-core Linoders do.
Once I set it up, I leave it working unless an act of God renders my Linode shattered permanently. Or if I forget to backup the Linode.
While syncing the Linode is fine, I don't know the best way to do it.
I would wish to back up the entire profile with all attached images etc., so that if something happened to the Linode, it would be a mere matter of moments to fix it.
Not that I think it will, but still.
I am paranoid about my Linode hosted on dallas87.linode.com exploding, or worse, beyond the ability of Chris or anyone else on Linode staff to fix.
Is there a set price for the Linode backup service once it is not in beta anymore?
If Linode staff wish to jump into this post, I welcome it.
I'm a blind computer user, and backups are difficult enough to get working as it is.
Automatic backups are important. But if they cripple one's budget…
Regards, –Keith
Thanks.
Apparently they got back "most" of the data.
I have two backups that I run on a regular basis:
1. Manual rdiff-backup daily to my home machine, and
2. Automatic gmail backup script.
Keep in mind that neither of these solutions cost any money.
I don't have the room for an extra computer in my dorm at college so I run my rdiff-backup from a virtual machine on my home computer. I simply have a VMWare installation with an Ubuntu 9.04 server install. Every morning I boot the VM, run a bash script which runs the rdiff-backup command, and then shuts down the VM. It's not automatic but could be made so easily if I had a spare machine to keep online 24/7. I have never had to restore a backup from my VM, but it appears to be as simple as running a command to revert to a specific date/time.
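The rdiff-backup command at the heart of that script is basically a one-liner; roughly something like the following (the hostname and paths are placeholders, and rdiff-backup has to be installed on both ends):

# Pull /etc, /home and /var/www from the Linode into a local backup directory,
# keeping reverse increments so older versions of files remain restorable.
rdiff-backup --include /etc --include /home --include /var/www --exclude '**' \
    root@your-linode.example.com::/ /srv/backups/linode

Restoring uses the same tool with the --restore-as-of option and a date or interval (e.g. 10D for "as of ten days ago").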
The other backup method I use is a custom-built one that can back up directories to a gmail account. The script does the following (a rough sketch follows the list):
1. Compresses all the files in the specified directory using tar. It exports the file list to a text file in /tmp/.
2. Encrypts the .tar.gz using GnuPG.
3. Splits the encrypted file into 24.5MB chunks (file.aa, file.ab, etc).
4. Starts mutt and attaches the file list file and the encrypted file to an email which is sent to a storage gmail account.
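Roughly, the script looks like this. It's a sketch rather than the exact thing: the directory, passphrase file, and destination address are placeholders, and the gpg flags may need adjusting for your GnuPG version:

#!/bin/bash
SRC_DIR="/var/www/htdocs"            # directory to back up (placeholder)
WORKDIR="/tmp/gmail-backup"
STAMP=$(date +%Y-%m-%d)
DEST="storage.account@gmail.com"     # storage gmail account (placeholder)
mkdir -p "$WORKDIR"

# 1. Compress the directory; tar's verbose listing becomes the file list.
tar -czvf "$WORKDIR/backup-$STAMP.tar.gz" "$SRC_DIR" > "$WORKDIR/filelist-$STAMP.txt"

# 2. Encrypt the archive with GnuPG (symmetric, passphrase read from a file).
gpg --batch --yes --symmetric --passphrase-file /root/.backup-passphrase \
    --output "$WORKDIR/backup-$STAMP.tar.gz.gpg" "$WORKDIR/backup-$STAMP.tar.gz"

# 3. Split the encrypted file into 24.5MB chunks (file.aa, file.ab, and so on).
split -b 24500k "$WORKDIR/backup-$STAMP.tar.gz.gpg" "$WORKDIR/backup-$STAMP.gpg."

# 4. Mail the file list, then one message per chunk, to the storage account
#    (newer mutt wants the -- before the address).
echo "File list for $STAMP" | mutt -s "backup $STAMP file list" \
    -a "$WORKDIR/filelist-$STAMP.txt" -- "$DEST"
for chunk in "$WORKDIR"/backup-$STAMP.gpg.*; do
    echo "$(basename "$chunk")" | mutt -s "backup $STAMP $(basename "$chunk")" \
        -a "$chunk" -- "$DEST"
done

rm -rf "$WORKDIR"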
I currently dump all htdocs, MySQL databases, and configuration files to gmail accounts. MySQL databases are backed up nightly and htdocs are backed up weekly. I end up using about 300MB or so per week, so I have to manually sign up for a new gmail account every month or so.
I have successfully restored MySQL databases and htdocs from this backup method. It is a bit tedious, but could be simplified with a restore script.
It's not very professional, but the data is secure and backed up in three places (Home, Google, and Linode Backup Beta).
Thanks,
Smark
PS. Sorry about any typos, writing from my iPhone and still getting used to the keyboard.
Duplicity
Duplicity takes all the pain out of doing compressed, incremental, and (if you so choose) encrypted backups. I'm currently backing up to Amazon S3 with it, but it can use a number of different storage back-ends, including FTP, local disk, SCP, rsync, etc. I have mine configured to do a full backup on the first day of each month, and then do incremental backups each day in-between the fulls.
Like I said, you have the option of backing up to S3, but if you want a near zero-cost option, just fire up a linux box at home with some disk and then use Duplicity's scp backend option to back up to your home server.
I'm currently using the following script for my duplicity backups. I found this example script posted somewhere online, but for the life of me, I can't locate the original author now.
#!/bin/bash
# Set up some variables for logging
LOGFILE="/var/log/duplicity.log"
DAILYLOGFILE="/var/log/duplicity.daily.log"
HOST=`hostname`
DATE=`date +%Y-%m-%d`
MAILADDR="user@example.com"
# Clear the old daily log file
cat /dev/null > ${DAILYLOGFILE}
# Trace function for logging, don't change this
trace () {
stamp=`date +%Y-%m-%d_%H:%M:%S`
echo "$stamp: $*" >> ${DAILYLOGFILE}
}
# Export some ENV variables so you don't have to type anything
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET"
export PASSPHRASE="YOUR_GPG_PASSPHRASE"
# Your GPG key
GPG_KEY=YOUR_GPG_KEY
# How long to keep backups for
OLDER_THAN="6M"
# The source of your backup
SOURCE=/
# The destination
# Note that the bucket need not exist
# but does need to be unique amongst all
# Amazon S3 users. So, choose wisely.
DEST="s3+http://your.s3.bucket/"
FULL=
if [ $(date +%d) -eq 1 ]; then
FULL=full
fi;
trace "Backup for local filesystem started"
trace "... removing old backups"
duplicity remove-older-than ${OLDER_THAN} --force ${DEST} >> ${DAILYLOGFILE} 2>&1
trace "... backing up filesystem"
duplicity ${FULL} --volsize=250 --include=/etc --include=/home --include=/root --include=/var/log --include=/var/lib/mailman --exclude=/** ${SOURCE} ${DEST} >> ${DAILYLOGFILE} 2>&1
trace "Backup for local filesystem complete"
trace "------------------------------------"
# Send the daily log file by email
cat "$DAILYLOGFILE" | mail -s "Duplicity Backup Log for $HOST - $DATE" $MAILADDR
# Append the daily log file to the main log file
cat "$DAILYLOGFILE" >> $LOGFILE
# Reset the ENV variables. Don't need them sitting around
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export PASSPHRASE=
You'll obviously need to modify this script to suit your own environment.
If you want/need to periodically back up your linode disk images, you can do that, but you'll need server downtime. This process involves shutting down the server, booting up the finnix recovery image, and then using dd and scp to copy the image down to a local drive. It works, but can take a long time depending on how fast your internet connection is.
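For what it's worth, one way to do the copy step is to pipe the image straight over ssh instead of scp'ing a finished file (which would need scratch space you don't have in the recovery environment). The device name and destination below are assumptions; check which device actually holds your disk image before running anything like this:

# Run from the Finnix recovery console on the Linode; /dev/xvda is assumed here.
dd if=/dev/xvda bs=1M | gzip | ssh you@your.home.example.net "cat > linode-image.img.gz"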
A final note - yes, backups are a pain. I don't think you'll find a single sysadmin on this planet that would refute this claim. My view, however, is that ensuring that you have good, reliable backups (and restore procedures) is the primary task for a sysadmin. There will surely be some pain points when you're getting backups set up, but you'll certainly learn through the process, and will be able to more quickly get backups set up for any server you work on in the future.
automysqlbackup
I've used duplicity before too, but there are two major differences that make me choose rdiff-backup. First, the most recent backup is stored as a normal file system. This makes for quick and easy restores, and also makes it easy to diff or otherwise compare current files against the last backup. A restore of the most recent backup doesn't even require the rdiff-backup software; you could do an scp or the like. Second, rdiff-backup can automatically drop old backups without requiring a new full backup, so you can do a one-time sync and never have to download all XX GB of your Linode again. With duplicity you need to do a full backup every once in a while, because otherwise a restore will take a really long time, and if you want to remove a full backup you must also delete the incremental backups that depend on it. With rdiff-backup you can just say "delete backups older than 1 year".
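That last part is a one-liner, run on the machine holding the backup (the path is a placeholder):

rdiff-backup --remove-older-than 1Y /srv/backups/linode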
Is it extremely costly?
Just curious.
Also, what do many of you think about the Linode backup service beta so far? I haven't seen posts in that forum since April of 2009. One from July of 2009, I think, but…
I'm still wondering whether I should open a support ticket to turn that on or not. I rarely open support tickets.
But I could, if trying it would be worthwhile.
I think having options to back up a Linode's disk images from within the Linode itself is nice.
The only question I have is:
What is the easiest way to back up an image?
That doesn't back up attached profiles, does it?
I'd still have to reinstall MySQL etc., right? Or would backing up the image save me from having to do that?
But just that alone wouldn't save my dedicated IP and all that.
I'm also curious: what have you folks noticed
if, say:
Your Linode in your datacenter has a disk crash.
Then you are moved to another Linode.
Would you still have to create a new DNS entry? Or would that stuff remain the same?
Just curious.
Linode is really nice.
I haven't experienced issues thus far, but that isn't to say they don't happen.
I read those links that HoopyCat posted earlier in this thread.
Linode rocks!
Keep up the good work, Linodeness!
I love hosting my stuff on Linode.
Have any of you tested the self-serve Linode resize function yet?
I'm curious whether I still need to shut down the Linode before doing one of those, or whether the resize process shuts it down for you.
Thanks again!
Keep those posts coming.
@Keith-BlindUser:
How much does S3 cost?
Is it extremely costly?
Just curious.
Storage: $0.150 per GB per month
Transfer inbound: $0.100 per GB
Transfer outbound: $0.170 per GB
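At those rates, a quick back-of-the-envelope example: keeping, say, 20GB of backups in S3 works out to 20 × $0.15 = $3.00 a month for storage, plus $0.10 per GB you upload, so even fairly generous backups stay at a few dollars a month.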
@Keith-BlindUser:
I'm also curious: what have you folks noticed
if, say:
Your Linode in your datacenter has a disk crash.
Then you are moved to another Linode.
Would you still have to create a new DNS entry? Or would that stuff remain the same?
Just curious.
As long as you stay in the same datacenter, your IP address does not change.
@Keith-BlindUser:
How much does S3 cost?
Is it extremely costly?
I personally think that S3 pricing is pretty decent for moderate amounts of data. Amazon has a pricing page that details this info, but the short version is that they charge $0.15 per GB per month. There are also fees associated with data transfer in and out of S3, but in my experience, those charges never amount to much.
I host a dozen or so websites on my linodes, mostly wordpress and drupal installs. Backing up all of their document roots, MySQL databases, and a few other random directories, I have ~1.5GB in S3 and my daily incremental backups are about 5MB or so. My monthly backup bill is under a dollar for that server, so not too bad.
But I guess my point is that I wish to back up the entire Linode disk image itself.
So: what steps would be the best for this?
Thanks.
In your case, instead of setting up another linode for the "receiving end", you'll be using a local linux box at home (or wherever).
That answers my question entirely, but the one thing that I still need to know is:
Does Linux have any general compression tools to reduce the size of the resulting backup itself?
Suppose that you have a drive that is, for instance, a little smaller than your Linode image (but you still need that entire image), in case you ever decide to restore/copy it over to a new Linode, etc.
What could you do in that case? Thanks!
If there is no compression, that won't work, as: 1. my money budget is tight; 2. I don't have enough to purchase a lot of drives; 3. I can't do so for now anyway (yet I need the backups).
So whatever you can tell me about compressing a backup, so that it takes up less space than it otherwise would, is great.
dd produces an image as large as the disk, so if you have a 700GB Linode or whatever and your local disk is smaller than that, you can't back it up without compressing.
One thing to consider - due to the fact that the image backups require downtime for your server, you really don't want to rely on these image backups as your primary backup method. You really ought to have another backup method that will back up your data in between your "full" image backups. These backups will take the form of using rdiff-backup, duplicity, or whatever. You certainly don't want to be having to take down your server for hours at a time each day to back up the disk image, do you?
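If it helps, the scheduling side is just cron. Assuming, for example, that you saved the duplicity script above as /root/bin/duplicity-backup.sh, a line like this in root's crontab (crontab -e) runs it every night at 3:15am:

15 3 * * * /root/bin/duplicity-backup.sh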
Desktop:
nc -l 9001 > /media/megalith/Backups/my_image.img.gz
Server:
pv /dev/sda | gzip --fast | nc mydesktopip_here 9001
Alternatively, you can decompress on the remote end if you intend to mount it. Additionally, since you will be CPU limited trying to do any gzip at all, it's highly recommended to use "pigz" for the compression instead of gzip.
pigz is a parallel (multithreaded) version of gzip. It's a drop-in replacement (supports the exact same options, produces the exact same output, messages, etc) except it will use the four cores in the linode whereas gzip will only use one.
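So on the server side, the same pipeline with pigz swapped in would be (-1 is what gzip spells --fast):

pv /dev/sda | pigz -1 | nc mydesktopip_here 9001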
I also recommend filling the disk with a single large zeroed file (i.e., cat or dd /dev/zero to a file until you have very little disk space left).
This is because if you make a disk image, it will store every sector of the disk as-is, regardless of whether the data has been deleted or not. So your "empty space" takes up real space. By zeroing it out, gzip can compress that to almost nothing and save a ton of space.
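Something like this, run on the Linode right before you take the image, does the trick (the filename is arbitrary; dd will stop on its own when the disk fills up):

# Fill the free space with zeros, then delete the file.
dd if=/dev/zero of=/zerofill bs=1M
sync
rm -f /zerofill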
@anderiv:
@Keith-BlindUser: How much does S3 cost?
Is it extremely costly?
I have ~1.5GB in S3 and my daily incremental backups are about 5MB or so. My monthly backup bill is under a dollar for that server, so not too bad.
Similar situation here. I have a few hundred megs of backups on S3 with daily incremental backups and it costs me about $0.30 a month. Totally worth it…
@OverlordQ:
What do you guys use to back up to S3, though?
As stated earlier in this thread, I use Duplicity