Add My Backup Strategy article
Title: My Backup Strategy
Date: 2021-04-08
Category: Writing
Summary: Details about the backup system for my data.
Wide: true

[TOC]

Regularly backing up all the data I care about is very important to me. This
article outlines my strategy to make sure I never lose essential data.

## Motivation

Backups should be as automatic as possible. This ensures laziness and
forgetfulness won't interfere with their regularity.

All software used to create and store the backups should be free and open
source so I'm not dependent on the survival of a single company.

Backups need to be tested to ensure they are correct and happening regularly.
Multiple copies of the backups should exist, including at least one offsite to
protect against my building burning down.

Backups should also be incremental when possible (rather than mirror copies) so
an accidental deletion isn't propagated into the backups, making the file
irrecoverable.

## Strategy

I have one backup folder `/mnt/backup` on my media server at home that serves as
the destination for all my backup sources. All scheduled automatic backups write
to their own subfolder inside of it.

This backup folder is then synced to encrypted 2.5" 1 TB hard drives which I
rotate between my bag, an offsite location, and my parents' houses.

## Backup Sources

I use the tool `rdiff-backup` extensively because it allows me to take
incremental backups locally or over SSH. It behaves very similarly to `rsync`
and requires no configuration.

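
For reference, poking around inside one of these backup repositories looks
something like this (the email backup is used as the example here, and the
restore target is just a scratch directory):

```
$ rdiff-backup --list-increments /mnt/backup/local/email/
$ rdiff-backup --verify /mnt/backup/local/email/
$ rdiff-backup --restore-as-of 3W /mnt/backup/local/email/ /tmp/email-restore/
```

The `12B` you'll see passed to `--remove-older-than` below means "older than
the 12 most recent backup sessions", so a weekly job keeps about three months
of history.
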
### Email

I have every email since 2010 backed up continuously in case my email provider
disappears.

I use `offlineimap` to sync my mail to the directory `~/email` on my media
server as a Maildir. Since offlineimap is only a syncing tool, the emails need
to be copied elsewhere to be backed up. I run `rdiff-backup` from a weekly cron
job:

```
*/15 * * * * offlineimap > /var/log/offlineimap.log 2>&1
00 12 * * 1 date -Iseconds > /home/email/email/backup_check.txt

20 12 * * 1 rdiff-backup /home/email/email /mnt/backup/local/email/
40 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/email/
```

Here's my `.offlineimaprc` for reference:

```
[general]
accounts = main
[Account main]
localrepository = Local
remoterepository = Remote
[Repository Local]
type = Maildir
localfolders = ~/email
[Repository Remote]
type = IMAP
readonly = True
folderfilter = lambda foldername: foldername not in ['Trash', 'Spam', 'Drafts']
remotehost = example.com
remoteuser = mail@example.com
remotepass = supersecret
sslcacertfile = /etc/ssl/certs/ca-certificates.crt
```

### Notes

I use Standard Notes to take notes and wrote the tool
[standardnotes-fs](https://github.com/tannercollin/standardnotes-fs) to mount my
notes as a file system to view and edit them as plain text files.

I take weekly backups of the mounted file system on my media server with cron:

```
00 12 * * 1 date -Iseconds > /home/notes/notes/backup_check.txt
15 12 * * 1 rdiff-backup /home/notes/notes /mnt/backup/local/notes/
```

### Nextcloud

I self-host a Nextcloud instance to store all my personal documents (non-code
projects, tax forms, spreadsheets, etc.). Since it's only a syncing tool, the
files need to be copied elsewhere to be backed up.

I take weekly backups of the Nextcloud data folder with cron:

```
00 12 * * 1 rdiff-backup /var/www/nextcloud/data/tanner/files /mnt/backup/local/nextcloud/
30 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/nextcloud/
```

### Gitea

I self-host a Gitea instance to store all my git repositories for code-based
projects. My home folder is also a git repo, so I can easily sync my config
files and password database between servers and machines.

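
For anyone curious, the home-folder repo is just the usual "ignore everything,
force-add what you want" trick; a minimal sketch of the setup looks something
like this (the remote URL is a placeholder, and your list of tracked files will
differ):

```
$ cd ~
$ git init
$ echo '*' > .gitignore
$ git add -f .gitignore .bashrc .vimrc .offlineimaprc
$ git commit -m 'track config files'
$ git remote add origin gitea@git.example.com:tanner/dotfiles.git
$ git push -u origin master
```
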
I take weekly backups of the Gitea data folder with cron:

```
00 12 * * 1 date -Iseconds > /home/gitea/gitea/data/backup_check.txt
10 12 * * 1 rdiff-backup --exclude **data/indexers --exclude **data/sessions /home/gitea/gitea/data /mnt/backup/local/gitea/
35 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/gitea/
```

### Telegram

Telegram Messenger is my main app for communication. My parents, most of my
friends, and several friend groups are on there, so I don't want to lose those
messages in case Telegram disappears or my account gets banned.

Telegram includes a data export feature, but it can't be automated. Instead, I
run the deprecated tool
[telegram-export](https://github.com/expectocode/telegram-export) hourly with
cron:

```
0 * * * * bash -c 'timeout 50m /home/tanner/opt/telegram-export/env/bin/python -m telegram_export' > /var/log/telegramexport.log 2>&1
```

It likes to hang, so `timeout` kills it if it's still running after 50 minutes.
That hasn't corrupted the database yet.

### Phone

[Signal
Messenger](https://play.google.com/store/apps/details?id=org.thoughtcrime.securesms&hl=en_CA&gl=US)
automatically exports a copy of my text messages database, and
[Aegis](https://play.google.com/store/apps/details?id=com.beemdevelopment.aegis&hl=en_CA&gl=US)
allows me to export an encrypted JSON file of my two-factor authentication
codes.

I mount my phone's internal storage as a file system on my desktop using
[adbfs-rootless](https://github.com/spion/adbfs-rootless). I then rsync the
files over to my media server:

```
$ ./adbfs ~/mntphone
$ time rsync -Wav \
    --exclude '*cache' --exclude nobackup \
    --exclude '*thumb*' --exclude 'Telegram *' \
    --exclude 'collection.media' \
    --exclude 'org.thunderdog.challegram' \
    --exclude '.trashed-*' --exclude '.pending-*' \
    ~/mntphone/storage/emulated/0/ \
    localmediaserver:/mnt/backup/files/phone/
```

Unfortunately this is a manual process because I need to plug my phone in each
time. Ideally it would happen automatically while I'm asleep and the phone is
charging.

### Miscellaneous Files

The directory `/mnt/backup/files` is a repository for any kind of file I want
to keep forever: my phone data, old archives, computer files, Minecraft worlds,
files from previous jobs, and so on.

All of these files are included in the 1 TB hard drive backup rotation.

### Web Services

Web services that I run, like [txt.t0.vc](https://txt.t0.vc) and
[QotNews](https://news.t0.vc), are backed up daily, weekly, and monthly
depending on how frequently the data changes.

I run `rdiff-backup` on the remote server with cron:

```
00 14 * * * date -Iseconds > /home/tanner/tbot/t0txt/data/backup_check.txt

04 14 * * * rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/daily/t0txt/
14 14 * * * rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/daily/t0txt/

24 14 * * 1 rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/weekly/t0txt/
34 14 * * 1 rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/weekly/t0txt/

44 14 1 * * rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/monthly/t0txt/
55 14 1 * * rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/monthly/t0txt/
```

The user `tbotbak` has write access only to the `/mnt/backup/remote/tbotbak`
directory. It has its own passwordless SSH key that's only permitted to run the
`rdiff-backup --server` command for security.

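
That kind of restriction is just a forced command in the backup user's
`~/.ssh/authorized_keys` on the media server; the entry looks something along
these lines (key shortened, options to taste):

```
command="rdiff-backup --server",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty ssh-ed25519 AAAA...
```
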
### Protospace

I run a lot of services for [Protospace](https://protospace.ca/), my city's
makerspace.

The member portal I wrote, called [Spaceport](https://my.protospace.ca/),
creates an archive that I download daily:

```
40 10 * * * wget --content-disposition \
    --header="Authorization: secretkeygoeshere" \
    --directory-prefix /mnt/backup/remote/portalbak/ \
    --no-verbose --append-output=/var/log/portalbackup.log \
    https://api.my.protospace.ca/backup/
```

The main website and [wiki](https://wiki.protospace.ca) that I sysadmin get
backed up weekly:

```
0 12 * * 1 mysqldump --all-databases > /var/www/dump.sql
15 12 * * 1 date -Iseconds > /var/www/backup_check.txt
20 12 * * 1 rdiff-backup /var/www pshostbak@remotebackup::/mnt/backup/remote/pshostbak/weekly/www/
```

The Protospace [Minecraft
server](http://games.protospace.ca:8123/?worldname=world&mapname=flat&zoom=3&x=74&y=64&z=354)
I run gets backed up daily:

```
00 15 * * * date -Iseconds > /home/tanner/minecraft/backup_check.txt
00 15 * * * rdiff-backup --exclude **CoreProtect --exclude **dynmap /home/tanner/minecraft psminebak@remotebackup::/mnt/backup/remote/psminebak/
30 15 * * * rdiff-backup --remove-older-than 12B --force psminebak@remotebackup::/mnt/backup/remote/psminebak/
```

I also back up our Google Drive with rclone:

```
45 12 * * 1 rclone copy -v protospace: /mnt/backup/files/protospace/google-drive/
```

## Backup Copies

My backup folder `/mnt/backup` now looks like this:

```
/mnt/backup/
├── files
│   ├── docs
│   ├── phone
│   ├── protospace
│   ├── telegram
│   ├── usbsticks
│   └── ... and so on
├── local
│   ├── email
│   ├── gitea
│   ├── nextcloud
│   └── notes
└── remote
    ├── portalbak
    ├── pshostbak
    ├── psminebak
    ├── tbotbak
    └── telebak
```

This directory tree is the master backup, and every Saturday I make a copy of
the entire tree to a hard drive.

The directory is copied over with the following script. (The `local` and
`remote` trees already contain rdiff-backup increments of their own, so a plain
mirror preserves their history; the `files` tree has no increments, so it gets
its own `rdiff-backup` increments on the external drive.)

```text
#!/bin/bash

cryptsetup luksOpen /dev/sdf external
mount /dev/mapper/external /mnt/external

time rsync -av --delete /mnt/backup/local/ /mnt/external/backup/local/
time rsync -av --delete /mnt/backup/remote/ /mnt/external/backup/remote/
time rdiff-backup --force -v5 /mnt/backup/files/ /mnt/external/backup/files/

python3 /home/tanner/scripts/checkbackup.py

umount /mnt/external
cryptsetup luksClose external
```

I wrote a Python script `checkbackup.py` that goes through each backup and
compares the timestamp in its `backup_check.txt` file to the current time. This
makes sure that the cron jobs ran, the backups were taken, and everything was
transferred over correctly.

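
The script itself isn't much. My version is Python, but the core check boils
down to something like this shell sketch (the eight-day threshold is just "a
weekly backup plus a little slack"):

```
#!/bin/bash
# Flag any backup_check.txt on the external drive that is older than 8 days.
now=$(date +%s)
limit=$(( 8 * 24 * 3600 ))

find /mnt/external/backup -name backup_check.txt | while read -r check; do
    stamp=$(date -d "$(cat "$check")" +%s)
    age=$(( now - stamp ))
    if (( age > limit )); then
        echo "STALE: $check is $(( age / 86400 )) days old"
    fi
done
```
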
## Rotating Hard Drives

I rotate through 2.5" 1 TB hard drives each Saturday when I do a backup. They
are quite cheap at [$65 CAD](https://www.memoryexpress.com/Products/MX65194)
each, so I can have a bunch floating around.


I keep one connected to the server, one in my bag, one offsite, one at my
mother's house, and one at my dad's house. Every Saturday I run the script above
to take a copy and then swap the drive with the one in my bag. That one then
gets swapped when I visit my offsite location, and the same happens when I visit
my parents. This means all of the hard drives eventually get rotated through
with fresh data and none of them sit unpowered for too long.


The drives are all encrypted with full-disk LUKS encryption using a password I'm
unlikely to forget.


I run the check-summing `btrfs` file system on them with duplicated data and
metadata (the `dup` profile, essentially RAID-1 on a single drive) to protect
against bitrot. This means I can only use 0.5 TB of each drive for my backups,
but the data is stored redundantly.

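
The checksums only get verified when the data is read back, so it's worth
kicking off an occasional scrub while a drive is mounted; btrfs then re-reads
every block and repairs any bad copy from its duplicate. This isn't part of the
backup script, just a manual extra:

```
$ sudo btrfs scrub start /mnt/external
$ sudo btrfs scrub status /mnt/external
```
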
Here's how I set up new hard drives to do this:

```
$ sudo cryptsetup luksOpen /dev/sdf external
$ sudo mkfs.btrfs -f -m dup -d dup /dev/mapper/external
$ sudo mount /dev/mapper/external /mnt/external/
$ sudo mkdir /mnt/external/backup
$ sudo chown -R tanner:tanner /mnt/external/backup
$ sudo umount /mnt/external
$ sudo cryptsetup luksClose external
```

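
One step is only ever done once per drive and isn't shown above: a brand-new
disk first has to be formatted as a LUKS container (which wipes it) before
`luksOpen` has anything to open:

```
$ sudo cryptsetup luksFormat /dev/sdf
```
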
## Future Improvements

I'm working on a system to automatically back up all my home directories to my
media server. I need this to grab Bash histories and code that's
work-in-progress. I've been burned once by not having this when a server died.

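
A minimal version would just be one more rdiff-backup cron job per machine,
something along these lines (untested; the `homebak` user, destination path,
and exclude list are placeholders):

```
00 04 * * * rdiff-backup --exclude **.cache /home/tanner homebak@localmediaserver::/mnt/backup/remote/homebak/desktop/
```
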
I'd like to automate backing up my phone by connecting it to a Raspberry Pi when
I go to sleep.

I need to get better at fully testing my backups by restoring them on a blank
machine.