Amazon S3 as backup solution

A complete writeup on setting up Amazon S3 automated backup from your Ubuntu server.

Software versions might be outdated when you read this, but the setup should still work.

Case: You have an Ubuntu server (will also work on CentOS with a few modifications to commands). You need cheap and reliable backup for your files.

Solution: Amazon S3 and a fairly simple server configuration.

NOTE:

  • All commands require sudo access.
  • Decryption keys are only available when logged in as root (sudo su).
  • If your Amazon S3 bucket is not in Europe, you need to leave out the "--s3-european-buckets" part of the commands.

Install needed packages

python-dev and build-essential may not be pre-installed; fetch them via apt-get.

apt-get update
apt-get install python-dev build-essential

Install librsync

Get librsync from http://downloads.sourceforge.net/project/librsync/librsync/0.9.7/librsync-0.9.7.tar.gz and install with ./configure, make and make install.

wget http://downloads.sourceforge.net/project/librsync/librsync/0.9.7/librsync-0.9.7.tar.gz
tar -zxvf librsync-0.9.7.tar.gz
cd librsync-0.9.7
./configure
make && make install

Install Duplicity

Download source from http://code.launchpad.net/duplicity/0.6-series/0.6.21/+download/duplicity-0.6.21.tar.gz and install with Python. This guide is tested with version 0.6.21.

wget http://code.launchpad.net/duplicity/0.6-series/0.6.21/+download/duplicity-0.6.21.tar.gz
tar -zxvf duplicity-0.6.21.tar.gz
cd duplicity-0.6.21
python setup.py install

(In case of an fPIC error, librsync usually needs to be rebuilt with CFLAGS=-fPIC.)

Install Boto

(http://github.com/boto/boto)

git clone git://github.com/boto/boto.git
cd boto
python setup.py install

NOTE: If the script (further down) cannot pull PASSPHRASE from the Boto config, insert this line near the top of the script:

export PASSPHRASE=YourPassword

Set up GPG encryption and Amazon authentication codes

GPG encryption

NOTE: Jump to next header if you already have a key pair on another server!

gpg --gen-key

Choose key type: (1) DSA and Elgamal (default).

Accept default values. Then use this passphrase:

YourPassword

Key generation requires a lot of random data. Type or paste a big block of text or two, and keep going until the system seems to hang; then wait for generation to complete. Afterwards, find your key:

gpg --list-keys

The value you want is the 8-character hexadecimal string after "pub 1024D/". This is later used as the $KEY value in the backup script.
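The ID can also be pulled out of the listing with a little text processing; a minimal sketch, assuming a single key and the hypothetical sample output shown below:

```shell
# Hypothetical sample output of `gpg --list-keys`
list_keys_output='pub   1024D/A1B2C3D4 2010-09-22
uid                  Backup Key <backup@example.com>
sub   2048g/11223344 2010-09-22'

# Extract the 8-character ID after the slash on the "pub" line
KEY=$(printf '%s\n' "$list_keys_output" | sed -n 's#^pub .*/\([0-9A-Fa-f]\{8\}\).*#\1#p' | head -n 1)
echo "$KEY"   # A1B2C3D4
```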

Already have encryption key pair on another server?

We want to use the same encryption keys on all servers; that way we always have a backup of the keys, and we can restore to another server if one is completely lost. Export like this from a server which already has the key pair:

gpg --export -a YourKeysUID > public.key
gpg --export-secret-key -a YourKeysUID > private.key

(You will find YourKeysUID in the key listing above; use the name without the attached email address.)

Then import to new server:

gpg --import public.key
gpg --allow-secret-key-import --import private.key

The imported key must then be edited to set a trust level:

gpg --edit-key YourKeyID

Command> trust

Choose level 5 (ultimate trust).

Command> save

Amazon credentials

vim /etc/boto.cfg

Add these values; they work with the previously set up passphrase and include our Amazon account's keys:

[Credentials]
PASSPHRASE = yourpassword
aws_access_key_id = yourkey
aws_secret_access_key = yoursecretkey
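Since the file holds the AWS keys and passphrase in plain text, it is a good idea to make it readable by root only. A sketch, using a stand-in path so it can be tried safely (on the server the path would be /etc/boto.cfg):

```shell
# Stand-in path for illustration; on the server this would be /etc/boto.cfg
cfg=/tmp/boto.cfg.example
printf '[Credentials]\nPASSPHRASE = yourpassword\n' > "$cfg"

# Restrict the file to its owner (root on the server)
chmod 600 "$cfg"
```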

Set up script for running backups

This is a template script for your backups. The following variables MUST be customized:

  • SERVER (never the same, or else structure on Amazon will be a mess)

These values can be customized at will:

  • DUPLICITY
  • FOLDER
  • KEY
  • LOG (Adapt to your log location and naming convention)
  • --timeout (In seconds, may be adjusted to allow for larger backups)
  • -v (Verbosity, 1-9, gives enough info on 5)

#!/bin/sh
# Script runs Duplicity for nightly backups to Amazon S3
#
# Define location of Duplicity
DUPLICITY=/usr/bin/duplicity
#
# Root backup folder (everything in folder will be synced)
FOLDER=/home/user
#
# Server name will also be folder name in Amazon S3 file bucket
SERVER=serverx
#
# Encryption key from GPG
KEY=yourkeyID
#
# Log location
LOG=/var/log/backup/backup_to_amazon_s3.$(/bin/date +%Y%m%d).log
#
# Run Duplicity
$DUPLICITY --encrypt-key="$KEY" "$FOLDER" s3+http://backup_bucket/$SERVER --s3-european-buckets --s3-use-new-style --timeout=7200 -v 5 2>&1 | tee "$LOG"

Add script to crontab:

# Backup to Amazon S3
00 01 * * * /root/backup_to_amazon_s3.sh > /var/log/backup/backup_to_amazon_s3.log 2>&1
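The script writes one date-stamped log per night, so the log folder grows forever. A hedged sketch of a cleanup that could run from the same crontab; the 30-day cutoff is an assumption, and a stand-in folder is used here so the sketch can be tried safely:

```shell
# Stand-in for /var/log/backup so the sketch can be run anywhere
logdir=/tmp/backup-logs-example
mkdir -p "$logdir"

# Simulate an old log and a fresh one (GNU touch)
touch -d '40 days ago' "$logdir/backup_to_amazon_s3.20100101.log"
touch "$logdir/backup_to_amazon_s3.$(date +%Y%m%d).log"

# Delete date-stamped logs older than 30 days
find "$logdir" -name 'backup_to_amazon_s3.*.log' -mtime +30 -delete
```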

Restoring folder structures or single files

Because the backup is stored as encrypted volumes, restores must be done via the command line. It is recommended to restore to a temporary folder on the server before moving files to where they belong.
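Creating the temporary folder first keeps a partial restore from mixing with live files; a minimal sketch (the path template is an assumption):

```shell
# Create a fresh, empty restore folder
RESTORE_DIR=$(mktemp -d /tmp/restore.XXXXXX)
echo "$RESTORE_DIR"
# ...restore into $RESTORE_DIR, then move files into place and remove the folder
```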

NOTE: Restore requires the secret key which was used when encrypting the backup.

In this example these are the values:

  • Remote folder: s3+http://backup_bucket/serverx
  • Local temporary folder: /path/to/restore

duplicity s3+http://backup_bucket/serverx /path/to/restore --s3-european-buckets --s3-use-new-style -v 5

You will need a passphrase (it is always the same):

GnuPG passphrase: YourPassword

Restore one single file

--file-to-restore <path/to/folder>/<filename>

Example:

duplicity --file-to-restore example.com/public_html/index.php s3+http://backup_bucket/serverx /path/to/restore/index.php --s3-european-buckets --s3-use-new-style -v 5

Restore a specific path with subfolders

--file-to-restore <path/to/folder>

Example:

duplicity --file-to-restore example.com/public_html s3+http://backup_bucket/serverx /path/to/restore --s3-european-buckets --s3-use-new-style -v 5

Restore from a point in time

-t YYYY-MM-DDTHH:MM:SS

Example:

duplicity -t 2010-09-22T01:10:00 s3+http://backup_bucket/serverx /path/to/restore --s3-european-buckets --s3-use-new-style -v 5


Point in time and Path can be combined!
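As a sketch of how the two combine, here is the command assembled from its parts; the helper function and the sample values are hypothetical, only the flags come from the examples above:

```shell
# Build the combined restore command as a string so each piece is visible
build_restore_cmd() {
    # $1 = point in time, $2 = path inside backup, $3 = server, $4 = local target
    printf 'duplicity -t %s --file-to-restore %s s3+http://backup_bucket/%s %s --s3-european-buckets --s3-use-new-style -v 5\n' \
        "$1" "$2" "$3" "$4"
}

build_restore_cmd 2010-09-22T01:10:00 example.com/public_html serverx /path/to/restore
```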

Restore Step-by-step

1) Log into the server where you want to restore your data.

2) Make sure you are root (sudo su), or else decryption will not work and the restore will fail.

3) Create an empty temporary folder where you want to restore data to.

4) Select an example from above ("Single file", "Specific path with subfolders" or "Point in time"). NOTE: "Point in time" can be combined with one of the others!

5) Edit "s3+http://backup_bucket/serverx" in your selected example to reflect which server the backup was taken from, exchanging "serverx" with the correct input. If unsure, all folders can be seen on https://console.aws.amazon.com/s3/ under "backup_bucket".

6) Edit "/path/to/restore" to your temporary folder path, or just "." if you are at that path.

7) Edit what and/or when you want to restore from (the first variable in the examples).

8) Run command. When asked for password, enter: YourPassword

That's it, you are now the master of this!

Got input?

I appreciate any comments on this solution: whether anything could be done better, whether anything is outdated and no longer works (and what works instead), etc.

Jan-Helge Hansen



Project coordinator at Frontkom, keenly interested in SEO and social media. Can be found, among other places, on . Made happy by fast websites.