Best practices for secure rsync backups

security, sysadmin

If we've learned anything from the massive OVHcloud fire and the recent rise of ransomware attacks, it's that backups can no longer be left as an afterthought. When done right, backup systems can be set up quickly without compromising security. When done wrong, as they often are, they fail to protect their hosts in an incident.

Incremental backups ftw

After a machine is fully backed up once, subsequent backups can either repeat the full backup or copy only what has changed. The latter is called an incremental backup. If very little has changed since the last run, this dramatically reduces the time and bandwidth cost of the backup. Running full backups on a moderately-sized server can be so resource-intensive that backups are only performed weekly or monthly. However, losing a week or a month of work can be devastating. Incremental backups make it economical to run backups much more frequently: daily, or even every few hours.

rsync is a powerful command-line remote sync utility. It performs incremental transfers by comparing file sizes and modification times (and, optionally, checksums) against what is already on the destination, so only changed files get copied. Rsync is well-vetted, widely trusted open-source software. It's all we need for a sophisticated backup system, without relying on bulky third-party backup software that could leave the system open to supply-chain attacks.
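As a quick illustration of the incremental behaviour, an itemized dry run previews what the next pass would transfer. The hostname and paths here are placeholders:

root@host:~ # rsync -a /data/ /backup/data
root@host:~ # rsync -ain /data/ /backup/data

The first run copies everything. The second, thanks to -n (dry run) and -i (itemize changes), only lists the files whose size or modification time differ, which is exactly what a subsequent incremental pass would transfer.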

Hardware

It may be tempting to use your existing infrastructure for backups, but it's well worth dedicating a separate machine to this purpose. Imagine having two servers that need to be backed up, so your I.T. department backs them up to each other. If one server is breached in a ransomware attack, the attackers gain access to the backups of the second server. Those backups could contain keyfiles and passwords that are then used to infect the second server with the same ransomware.

Servers with public-facing services are the most vulnerable, because every public website and mailserver is a potential attack vector for zero-day exploits. And since our backups contain secrets, they need to be kept in the most secure environment we can manage: a dedicated device, heavily firewalled from the internet, with no ports open to the outside.
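One rough sketch of that, on the backup machine (which we'll name bkp shortly): a default-deny inbound policy with ssh allowed only from an admin subnet. The 192.168.10.0/24 network below is a placeholder for your own management LAN, and ufw is just one convenient front end:

root@bkp:~ # apt install ufw
root@bkp:~ # ufw default deny incoming
root@bkp:~ # ufw default allow outgoing
root@bkp:~ # ufw allow from 192.168.10.0/24 to any port 22 proto tcp
root@bkp:~ # ufw enable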

The backup server can be hosted in the cloud, but it's much cheaper and more secure to self-host. Cloud hosts tend to overcharge for large amounts of storage; it's far more economical to run SSD storage on locally-hosted bare metal. Since the backup server runs no public services, it can sit safely on a local network without any open ports. A single-board computer like the Raspberry Pi, or even an old laptop with a few SSDs plugged in, can host large amounts of storage with very low overhead.

Access

To ensure that the backup server is safe from network traversal, we'll limit access to it to administrators only. Not even the backed-up devices should have access to the backup server. Imagine that a public-facing server is breached and its data is ransomed. If that server has access to the backup server, the attackers can reach the backups and delete them. Without the backups, the data can't be restored without paying the hackers.

The simple way to prevent this is to run the backups in reverse. Instead of commanding the public-facing server to push its files to the backup server, the backup server will log into the public-facing one and pull them. In other words, the backup server has access to the public-facing server, but never the other way around.

Preliminary hardening

So, we will set up a dedicated local machine with large storage drives. The drives can be LUKS-encrypted for extra security (a rough sketch follows below). For this example, we will use two Debian machines. We'll use the root user on the backup server, which we'll call bkp. The public-facing server, srv, needs its /srv directory backed up; we'll store it on bkp under /mnt/ssd/srv. On srv we will use the backup account, which ships with Debian systems or can be created with sudo useradd backup.
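If you go the LUKS route, preparing the backup drive looks roughly like this. /dev/sdb and the mapper name backup_ssd are placeholders for your actual disk and whatever name you prefer:

root@bkp:~ # cryptsetup luksFormat /dev/sdb
root@bkp:~ # cryptsetup open /dev/sdb backup_ssd
root@bkp:~ # mkfs.ext4 /dev/mapper/backup_ssd
root@bkp:~ # mkdir -p /mnt/ssd
root@bkp:~ # mount /dev/mapper/backup_ssd /mnt/ssd

Keep in mind the passphrase (or a keyfile) has to be supplied to open the volume after every reboot.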

On Debian, the default backup user has its shell set to /sbin/nologin. We need to change that to a real shell for rsync to work.

root@srv:~ # usermod -s /bin/sh backup

Next, we need the root user on bkp to be able to log in over ssh as the backup user on srv. This should be done with public-key authentication, and PasswordAuthentication should be turned off in srv's ssh daemon configuration. The details are outside the scope of this article and easy to find online.
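Still, for reference, the usual shape is a keypair generated on bkp whose public half is installed for the backup user on srv. The RSA key type here just matches the example key shown later; ed25519 works just as well:

root@bkp:~ # ssh-keygen -t rsa -b 4096
root@bkp:~ # cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1y... root@bkp

Then, on srv, install that public key for the backup user with the usual permissions, set PasswordAuthentication no in /etc/ssh/sshd_config, and reload the ssh daemon:

root@srv:~ # mkdir -p ~backup/.ssh
root@srv:~ # echo 'ssh-rsa AAAAB3NzaC1y... root@bkp' >> ~backup/.ssh/authorized_keys
root@srv:~ # chown -R backup:backup ~backup/.ssh
root@srv:~ # chmod 700 ~backup/.ssh
root@srv:~ # chmod 600 ~backup/.ssh/authorized_keys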

For something as vital as a backup server, it's a good idea to implement additional security layers, like single packet authentication or whitelisting a VPN subnet. This is also outside the scope of this article.

Permissions

For fine-grained control, we're going to use POSIX access control lists (ACLs) on the server. First, install the tools:

root@srv:~ # apt install acl

Next, we'll specify that the backup user has no access to anything on the entire filesystem. We apply this as a regular access ACL on everything that already exists, and as a default ACL so that newly created files inherit the same restriction:

root@srv:~ # setfacl -R -m user:backup:0 / 2>/dev/null
root@srv:~ # setfacl -d -R -m user:backup:0 / 2>/dev/null

Then, we can give it read-only access to the directories we want backed up.

root@srv:~ # find /srv -type d -exec setfacl -m user:backup:5,default:user:backup:5 {} \;
root@srv:~ # find /srv -type f -exec setfacl -m user:backup:4 {} \;

The backup user needs to be able to run its own shell, as well as rsync (and, notably, nothing else!). We'll allow this by simply removing the ACL from those two binaries.

root@srv:~ # setfacl -bn $(which rsync)
root@srv:~ # setfacl -bn $(which sh)

SSH and GPG private keys are expected to be accessible by their owner only, and stray ACL entries on them can interfere with those programs' permission checks. Remove the ACLs from these files too to prevent this.

root@srv:~ # setfacl -bn /etc/ssh/*_key
root@srv:~ # setfacl -bn /home/*/.ssh/id_*
root@srv:~ # setfacl -R -bn /etc/apt/trusted.gpg*
root@srv:~ # setfacl -R -bn /root/.gnupg
root@srv:~ # setfacl -R -bn /home/*/.gnupg

This means ssh and gpg keys won't be backed up and will need to be backed up separately.

If Docker is running, you'll also need to remove the ACLs from the Docker root directory. Otherwise, new containers will inherit the ACLs from the host machine and can start throwing permission errors.

root@srv:~ # docker info | grep "Docker Root Dir"
Docker Root Dir: /var/lib/docker
root@srv:~ # setfacl -R -d -bn /var/lib/docker

Then again, you might need to back up your Docker volumes. In that case, reapply the ACLs to the volumes directory:

root@srv:~ # find /var/lib/docker/volumes -type d -exec setfacl -m user:backup:5,default:user:backup:5 {} \;
root@srv:~ # find /var/lib/docker/volumes -type f -exec setfacl -m user:backup:4 {} \;

That should do it. If you get permission-denied errors anywhere, remove the ACL as above, and check that backup has access wherever it needs it.
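A quick spot check from a root shell on srv confirms the ACLs are doing their job; /etc/hostname here is just an arbitrary world-readable file outside the backup scope:

root@srv:~ # sudo -u backup cat /etc/hostname
cat: /etc/hostname: Permission denied
root@srv:~ # sudo -u backup ls /srv

The first command should be denied; the second should list the contents we granted read access to.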

Initial backup

Now we're ready to run the initial backup. Use -vv and -e 'ssh -vv' with your rsync command if debugging is necessary.

root@bkp:~ # rsync -aXvz --delete backup@srv:/srv/ /mnt/ssd/srv

Don't forget the trailing slash in backup@srv:/srv/! If you omit it, the directory itself will be copied over as /mnt/ssd/srv/srv.

Hardening

Once the initial backup has completed, we'll configure the server to restrict what the backup user can do on its side. Granted, backup can only run sh and rsync already, but we don't want it to be able to try anything funny with sh. So we'll restrict the user to a single command. First, we need to find out what that command is: not the one that we run on bkp, but the one that is executed on srv's end.

This is actually not too hard. We just need ssh's verbose output, which includes the exact command sent to the server. Get it by adding -e 'ssh -v':

root@bkp:~ # rsync -aXvz --delete -e 'ssh -v' backup@srv:/srv/ /mnt/ssd/srv

This will produce a lot of output. Find the line that looks like this:

debug1: Sending command: rsync --server --sender -vlogDtprz . /srv

You can cancel the backup once that output is acquired, or let it run to completion if you want. It should run as an incremental sync and be dramatically faster than the initial backup.

This command can be put in our /etc/ssh/sshd_config on srv. We'll put it in a match block that only applies to the backup user. This will restrict that user to only be able to run that one command through ssh.

...
# Limit backups to rsync
Match User backup
  ForceCommand rsync --server --sender -vlogDtprz . /srv
  X11Forwarding no
  AllowTcpForwarding no
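After editing, check the configuration syntax and reload the ssh daemon so the Match block takes effect:

root@srv:~ # sshd -t
root@srv:~ # systemctl reload ssh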

What if a clever hacker gets hold of bkp's ssh key and finds some way around the Match block? We can restrict the keypair itself to this command too. Modify the line in ~backup/.ssh/authorized_keys to include the restrictions before the public key:

command="rsync --server --sender -vlogDtprz . /srv",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding ssh-rsa AAAAB3NzaC1y... root@bkp

Automation

Our system is working as intended now, but we're not done yet. We still have to automate the process using systemd. This can be done in a number of ways with our working rsync command. At its most basic, the command can be dropped into a new backup.service unit file on bkp, like this:

[Unit]
Description=Pull backups from srv

[Service]
Type=oneshot
ExecStart=/usr/bin/rsync -aXz --delete backup@srv:/srv/ /mnt/ssd/srv

[Install]
WantedBy=multi-user.target
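Assuming the unit is saved as /etc/systemd/system/backup.service, tell systemd to pick it up:

root@bkp:~ # systemctl daemon-reload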

Then, you can run backups with systemctl start backup. Neat. Now we can add the service to a new backup.timer to have it run automatically:

[Unit]
Description=Pull backups tuesdays at 1am

[Timer]
OnCalendar=tuesday *-*-* 01:00:00
Persistent=true
Unit=backup.service

[Install]
WantedBy=timers.target

Don't forget to enable the timer at boot with systemctl enable --now backup.timer!
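To confirm the schedule took effect, list the timer; the next elapse time should match the OnCalendar expression above:

root@bkp:~ # systemctl list-timers backup.timer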

Monitoring

Next, some service monitors must be configured. This is not optional: there are many stories of sysadmins needing to restore backups after an incident, only to find that the backup service threw an error years ago and hasn't run since! We need to be notified when our backups stop backing up.

A good way to get these notifications is by email. Laeffe at Northern Light Labs wrote a great article on how to set up a systemd service that emails you when another service fails. You can even have the error logs piped into the email.
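The general shape of that setup is an OnFailure= hook on the backup service pointing at a templated email unit, roughly like the sketch below. The status-email@.service name, the script path, and the address are placeholders you'd replace with whatever that article (or your own mailer) uses:

# In backup.service, add a failure hook to the [Unit] section:
OnFailure=status-email@%n.service

# /etc/systemd/system/status-email@.service
[Unit]
Description=Email the status of a failed unit

[Service]
Type=oneshot
# Hypothetical helper script that mails the output of "systemctl status --full %i"
ExecStart=/usr/local/bin/systemd-email admin@example.com %i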

Redundancy

As a last step, get another backup server and repeat all these steps on it. We need two copies of our backups, just in case a catastrophe wipes out one of them along with the original data. It's not unheard of. The two backup servers should be geographically isolated from each other, if possible.
