Using volumes for backup
We have a setup where we use one Linode instance as a backup server for our other nodes.
The backup server is a plain Ubuntu 20.04 server with 6 volumes of 1.5TB each and little to no extra configuration or tooling. The volumes are mounted at:
/mnt/daily
/mnt/day-1
/mnt/day-2
/mnt/day-3
/mnt/day-4
/mnt/day-5
All our nodes use rsync to send data to /mnt/daily.
A script on the server then rotates the backups on weekdays, so we have easy access to the last 5 days of backups.
We also send the data off to S3 via s3cmd for long-term storage.
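For anyone trying to picture the weekday rotation, here is a rough sketch of how such a script might be structured. The mount points are the ones listed above; the copy order (oldest slot first), the rsync flags, and the function names are assumptions for illustration, not our actual script. Because each slot is a separate volume, a rename is not enough, so the sketch copies with rsync.

```python
import subprocess

# Mount points from the post; the rotation order below is an assumption.
SLOTS = [
    "/mnt/daily",
    "/mnt/day-1",
    "/mnt/day-2",
    "/mnt/day-3",
    "/mnt/day-4",
    "/mnt/day-5",
]


def rotation_plan(slots=SLOTS):
    """Return (source, destination) pairs, starting with the oldest slot.

    Copying day-4 into day-5 first means no slot is overwritten
    before its contents have been moved one step further back.
    """
    pairs = []
    for i in range(len(slots) - 1, 0, -1):
        pairs.append((slots[i - 1], slots[i]))
    return pairs


def rsync_command(src, dst):
    # Trailing slashes make rsync copy the contents of src into dst.
    # -a preserves metadata, --delete removes files no longer in src.
    return ["rsync", "-a", "--delete", f"{src}/", f"{dst}/"]


def rotate(dry_run=True):
    """Print the rsync commands, or run them when dry_run is False."""
    for src, dst in rotation_plan():
        cmd = rsync_command(src, dst)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    rotate(dry_run=True)
```

Running it with the default dry_run=True only prints the five rsync commands, which is a cheap way to check the order before letting it touch any volume.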
The setup works, but we have issues with the backup server randomly shutting down without any warning. It is then restarted via Lassie.
At first we thought it was due to an fstrim issue, since the service runs once a week, but stopping it did little to fix the problem.
I ran a disk check on all volumes, but they reported no errors.
Then we installed a new server from scratch, and moved the volumes to that server - still the same pattern.
Now it looks like an issue with the "rotate" rsync, but at this point we're just guessing.
Any pointers to the "random shutdown" issue here would be greatly appreciated.
2 Replies
The best advice we can give you to investigate why Lassie may have rebooted your Linode can be found in the responses to this post:
That said, I'm seeing an error in your Linode's console before the Lassie reboot indicating that the kernel was upset. I'm unable to share the specific error message on this public page, but I'm leaving a note on your account in case you'd like to open a ticket; our Support folks will happily provide it to you. If it's the same issue each time, you can likely reproduce the error by disabling Lassie and checking your LISH console when the Linode fails.
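If you'd rather look from inside the Linode first, something along these lines may help pull kernel complaints out of the previous boot's log. This is only a sketch: the pattern list is a guess at common kernel trouble signs and is not the specific error seen on your console, the function names are made up for illustration, and it assumes the systemd journal is persistent. After a hard crash the last messages may never reach disk, in which case the LISH console remains the better source.

```python
import re
import subprocess

# Assumed signs of kernel trouble; not the actual error from the console.
PATTERNS = re.compile(
    r"out of memory|oom-killer|kernel panic|blocked for more than|call trace|BUG:",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return only the log lines that match one of the patterns."""
    return [line for line in lines if PATTERNS.search(line)]


def previous_boot_kernel_log():
    """Read kernel messages from the previous boot via journalctl.

    -k selects kernel messages, -b -1 selects the boot before the
    current one. Returns an empty list if nothing was recorded.
    """
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in suspicious_lines(previous_boot_kernel_log()):
        print(line)
```

If this prints nothing after a Lassie reboot, that does not rule out a kernel problem; it may only mean the journal was not flushed before the Linode went down.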
Are there any other scheduled jobs or resource intensive tasks being performed at around the same time as the rsync? This could be the cause of the issue.