Title: My Backup Strategy
Date: 2021-04-08
Category: Writing
Summary: Details about the backup system for all of my data.
Wide: true
Short: 1

[TOC]

Regularly backing up all the data I care about is very important to me. This
article outlines my strategy to make sure I never lose essential data.

## Motivation

Backups should be as automatic as possible. This ensures laziness and
forgetfulness won't interfere with the regularity.

All software used to create and store the backups should be free and open source
so I'm not depending on the survival of a company.

Backups need to be tested to ensure they are correct and happening regularly.

Multiple copies of the backups should exist, including at least one offsite to
protect against my building burning down.

Backups should also be incremental when possible (rather than mirror copies) so
an accidental deletion isn't propagated into the backups, making the file
irrecoverable.

## Strategy

I have one backup folder `/mnt/backup` on my media server at home that serves as
the destination for all my backup sources. All scheduled automatic backups write
to their own subfolder inside of it.

This backup folder is then synced to encrypted 2.5" 1 TB hard drives which I
rotate between my bag, offsite, and my parents' house.

## Backup Sources

I use the tool `rdiff-backup` extensively because it allows me to take
incremental backups locally or over SSH. It works much like `rsync` and needs
no configuration.

### Email

I have every email since 2010 backed up continuously in case my email provider
disappears.

I use `offlineimap` to sync my mail to the directory `~/email` on my media
server as a Maildir. Since offlineimap is only a syncing tool, the emails need
to be copied elsewhere to be backed up. I run `rdiff-backup` from a weekly cron
job:

<span class="aside">I'll explain what backup_check.txt does below</span>

```
*/15 * * * * offlineimap > /var/log/offlineimap.log 2>&1
00 12 * * 1 date -Iseconds > /home/email/email/backup_check.txt

20 12 * * 1 rdiff-backup /home/email/email /mnt/backup/local/email/
40 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/email/
```

Here's my `.offlineimaprc` for reference:

```
[general]
accounts = main
[Account main]
localrepository = Local
remoterepository = Remote
[Repository Local]
type = Maildir
localfolders = ~/email
[Repository Remote]
type = IMAP
readonly = True
folderfilter = lambda foldername: foldername not in ['Trash', 'Spam', 'Drafts']
remotehost = example.com
remoteuser = mail@example.com
remotepass = supersecret
sslcacertfile = /etc/ssl/certs/ca-certificates.crt
```
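
The `folderfilter` line is ordinary Python that offlineimap evaluates for each
remote folder name; folders for which it returns `True` get synced. A quick way
to see its effect, using made-up folder names (the list below is hypothetical,
not my actual mailbox):

```python
# Same expression as the folderfilter line in the .offlineimaprc above.
folderfilter = lambda foldername: foldername not in ['Trash', 'Spam', 'Drafts']

# Hypothetical folder names, for illustration only.
remote_folders = ['INBOX', 'Sent', 'Trash', 'Spam', 'Drafts', 'Archive']

# Keep only the folders the filter accepts.
synced = [name for name in remote_folders if folderfilter(name)]
print(synced)  # ['INBOX', 'Sent', 'Archive']
```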

### Notes

I use Standard Notes to take notes and wrote the tool
[standardnotes-fs](https://github.com/tannercollin/standardnotes-fs) to mount my
notes as a file system to view and edit them as plain text files.

I take weekly backups of the mounted file system on my media server with cron:

```
00 12 * * 1 date -Iseconds > /home/notes/notes/backup_check.txt
15 12 * * 1 rdiff-backup /home/notes/notes /mnt/backup/local/notes/
```

### Nextcloud

I self-host a Nextcloud instance to store all my personal documents (non-code
projects, tax forms, spreadsheets, etc.). Since it's only syncing software,
the files need to be copied elsewhere to be backed up.

I take weekly backups of the Nextcloud data folder with cron:

```
00 12 * * 1 rdiff-backup /var/www/nextcloud/data/tanner/files /mnt/backup/local/nextcloud/
30 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/nextcloud/
```

### Gitea

I self-host a Gitea instance to store all my git repositories for code-based
projects. My home folder is also a git repo so I can easily sync my config files
and password database between servers and machines.

I take weekly backups of the Gitea data folder with cron:

```
00 12 * * 1 date -Iseconds > /home/gitea/gitea/data/backup_check.txt
10 12 * * 1 rdiff-backup --exclude '**data/indexers' --exclude '**data/sessions' /home/gitea/gitea/data /mnt/backup/local/gitea/
35 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/gitea/
```

### Telegram

Telegram Messenger is my main app for communication. My parents, most of my
friends, and friend groups are on there so I don't want to lose those messages
in case Telegram disappears or my account gets banned.

<span class="aside">Saves the messages to a sqlite db</span>

Telegram includes a data export feature, but it can't be automated. Instead I
run the deprecated software
[telegram-export](https://github.com/expectocode/telegram-export) hourly with
cron:

```
0 * * * * bash -c 'timeout 50m /home/tanner/opt/telegram-export/env/bin/python -m telegram_export' > /var/log/telegramexport.log 2>&1
```

It likes to hang, so `timeout` kills it if it's still running after 50 minutes.
Hasn't corrupted the database yet.
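
For comparison, the same kill-it-if-it-hangs idea can be sketched in Python with
`subprocess.run` and its `timeout` argument. This is only an illustration, not
what my cron job uses, and `run_with_timeout` is a made-up helper name. One
difference worth noting: coreutils `timeout` sends SIGTERM by default, while
`subprocess.run` kills the child outright once the limit is reached.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd and return its exit code, or None if it exceeded the time limit."""
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child at this point.
        return None


# Stand-in commands so the sketch runs anywhere; a real job would pass the
# exporter's command line instead.
quick = [sys.executable, "-c", "pass"]
hung = [sys.executable, "-c", "import time; time.sleep(30)"]

print(run_with_timeout(quick, 10))  # 0
print(run_with_timeout(hung, 1))    # None
```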

### Phone

[Signal
Messenger](https://play.google.com/store/apps/details?id=org.thoughtcrime.securesms&hl=en_CA&gl=US)
automatically exports a copy of my text messages database, and
[Aegis](https://play.google.com/store/apps/details?id=com.beemdevelopment.aegis&hl=en_CA&gl=US)
allows me to export an encrypted JSON file of my two-factor authentication
codes.

I mount my phone's internal storage as a file system on my desktop using
[adbfs-rootless](https://github.com/spion/adbfs-rootless). I then rsync the
files over to my media server:

```
$ ./adbfs ~/mntphone
$ time rsync -Wav \
    --exclude '*cache' --exclude nobackup \
    --exclude '*thumb*' --exclude 'Telegram *' \
    --exclude 'collection.media' \
    --exclude 'org.thunderdog.challegram' \
    --exclude '.trashed-*' --exclude '.pending-*' \
    ~/mntphone/storage/emulated/0/ \
    localmediaserver:/mnt/backup/files/phone/
```

Unfortunately this is a manual process because I need to plug my phone in each
time. Ideally it would happen automatically while I'm asleep and the phone is
charging.

### Miscellaneous Files

The directory `/mnt/backup/files` is a repository for any kind of files I want
to keep forever: my phone data, old archives, computer files, Minecraft worlds,
files from previous jobs, and so on.

All of these files are included in the 1 TB hard drive backup rotations.

### Web Services

Web services that I run like [txt.t0.vc](https://txt.t0.vc) and
[QotNews](https://news.t0.vc) are backed up daily, weekly, and monthly depending
on how frequently the data changes.

I run `rdiff-backup` on the remote server with cron:

```
00 14 * * * date -Iseconds > /home/tanner/tbot/t0txt/data/backup_check.txt

04 14 * * * rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/daily/t0txt/
14 14 * * * rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/daily/t0txt/

24 14 * * 1 rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/weekly/t0txt/
34 14 * * 1 rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/weekly/t0txt/

44 14 1 * * rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/monthly/t0txt/
55 14 1 * * rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/monthly/t0txt/
```

The `tbotbak` user has write access to the `/mnt/backup/remote/tbotbak`
directory only. It has its own passwordless SSH key that's only permitted to run
the `rdiff-backup --server` command for security.

### Protospace

I run a lot of services for [Protospace](https://protospace.ca/), my city's
makerspace.

The member portal I wrote called [Spaceport](https://my.protospace.ca/) creates
an archive I download daily:

```
40 10 * * * wget --content-disposition \
    --header="Authorization: secretkeygoeshere" \
    --directory-prefix /mnt/backup/remote/portalbak/ \
    --no-verbose --append-output=/var/log/portalbackup.log \
    https://api.my.protospace.ca/backup/
```

The website and [wiki](https://wiki.protospace.ca) that I sysadmin get
backed up weekly:

```
0 12 * * 1 mysqldump --all-databases > /var/www/dump.sql
15 12 * * 1 date -Iseconds > /var/www/backup_check.txt
20 12 * * 1 rdiff-backup /var/www pshostbak@remotebackup::/mnt/backup/remote/pshostbak/weekly/www/
```

The Protospace [Minecraft
server](http://games.protospace.ca:8123/?worldname=world&mapname=flat&zoom=3&x=74&y=64&z=354)
I run gets backed up daily:

```
00 15 * * * date -Iseconds > /home/tanner/minecraft/backup_check.txt
00 15 * * * rdiff-backup --exclude '**CoreProtect' --exclude '**dynmap' /home/tanner/minecraft psminebak@remotebackup::/mnt/backup/remote/psminebak/
30 15 * * * rdiff-backup --remove-older-than 12B --force psminebak@remotebackup::/mnt/backup/remote/psminebak/
```

I also back up our Google Drive with rclone:

```
45 12 * * 1 rclone copy -v protospace: /mnt/backup/files/protospace/google-drive/
```

## Backup Copies

My backup folder `/mnt/backup` now looks like this:

```
/mnt/backup/
├── files
│   ├── docs
│   ├── phone
│   ├── protospace
│   ├── telegram
│   ├── usbsticks
│   └── ... and so on
├── local
│   ├── email
│   ├── gitea
│   ├── nextcloud
│   └── notes
└── remote
    ├── portalbak
    ├── pshostbak
    ├── psminebak
    ├── tbotbak
    └── telebak
```

This directory tree is the master backup and I make a copy of the entire tree
every Saturday to a hard drive.

The directory is copied over with the following script:

```text
#!/bin/bash

cryptsetup luksOpen /dev/sdf external
mount /dev/mapper/external /mnt/external

time rsync -av --delete /mnt/backup/local/ /mnt/external/backup/local/
time rsync -av --delete /mnt/backup/remote/ /mnt/external/backup/remote/
time rdiff-backup --force -v5 /mnt/backup/files/ /mnt/external/backup/files/

python3 /home/tanner/scripts/checkbackup.py

umount /mnt/external
cryptsetup luksClose external
```

I wrote a Python script `checkbackup.py` that goes through each backup and
compares the timestamp in its `backup_check.txt` file to the current time. This
makes sure that cron ran, the backups were taken, and they were transferred
over correctly.
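
The script itself isn't shown here, so the following is only a minimal sketch
of the idea, not my actual `checkbackup.py`. It assumes each check file holds a
single `date -Iseconds` timestamp; the function names, the example path, and
the eight-day threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def stamp_age(text, now=None):
    """Return how long ago the `date -Iseconds` timestamp in text was written."""
    stamp = datetime.fromisoformat(text.strip())  # e.g. 2021-04-08T12:00:01-06:00
    if now is None:
        now = datetime.now(timezone.utc)
    return now - stamp


def stale_backups(check_files, max_age, now=None):
    """Return the check files that are missing or older than max_age."""
    stale = []
    for path in map(Path, check_files):
        if not path.is_file() or stamp_age(path.read_text(), now) > max_age:
            stale.append(str(path))
    return stale


if __name__ == "__main__":
    # Hypothetical list: one backup_check.txt per backup on the external drive.
    checks = ["/mnt/external/backup/local/email/backup_check.txt"]
    for path in stale_backups(checks, max_age=timedelta(days=8)):
        print("STALE OR MISSING:", path)
```

A weekly backup should never be more than about seven days old, so a threshold
slightly above that flags a missed run without false alarms.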

## Rotating Hard Drives

I rotate through 2.5" 1 TB hard drives each Saturday when I do a backup. They
are quite cheap at [$65 CAD](https://www.memoryexpress.com/Products/MX65194)
each so I can have a bunch floating around.

I keep one connected to the server, one in my bag, one offsite, one at my
mother's house, and one at my dad's house. Every Saturday I run the script above
to take a copy and then swap the drive with the one in my bag. It then gets
<span class="aside">I go back home about twice per year</span>
swapped when I visit my offsite location. Same for when I visit my parents. This
means that all hard drives eventually get rotated through with new data and
don't sit too long unpowered.

The drives are all encrypted with full-disk LUKS encryption using a password I'm
unlikely to forget.

I run the checksumming `btrfs` file system on them with the `dup` profile, which
keeps two copies of all data and metadata on the same drive (similar in effect
to RAID-1, but on a single device), to protect against bitrot. This means I can
only use 0.5 TB of storage for my backups, but the data is stored redundantly.

Here's how I set up new hard drives to do this:

```
# One-time step for a blank drive: writes a new LUKS header, erasing the drive
$ sudo cryptsetup luksFormat /dev/sdf
$ sudo cryptsetup luksOpen /dev/sdf external
$ sudo mkfs.btrfs -f -m dup -d dup /dev/mapper/external
$ sudo mount /dev/mapper/external /mnt/external/
$ sudo mkdir /mnt/external/backup
$ sudo chown -R tanner:tanner /mnt/external/backup
$ sudo umount /mnt/external
$ sudo cryptsetup luksClose external
```

## Future Improvements

I'm working on a system to automatically back up all my home directories to my
media server. I need this to grab Bash histories and code that's
work-in-progress. I've been burned by not having this once when a server died.

I'd like to automate backing up my phone by connecting it to a Raspberry Pi when
I go to sleep.

I need to get better at fully testing my backups by restoring them on a blank
machine.