diff --git a/content/backup-strategy.md b/content/backup-strategy.md new file mode 100644 index 0000000..17c60d8 --- /dev/null +++ b/content/backup-strategy.md @@ -0,0 +1,337 @@ +Title: My Backup Strategy
Date: 2021-04-08
Category: Writing
Summary: Details about the backup system for my data.
Wide: true

[TOC]

Regularly backing up all the data I care about is very important to me. This
article outlines my strategy to make sure I never lose essential data.

## Motivation

Backups should be as automatic as possible, so that laziness and forgetfulness
can't interfere with their regularity.

All software used to create and store the backups should be free and open source
so I'm not depending on the survival of a company.

Backups need to be tested to ensure they are correct and happening regularly.
Multiple copies of the backups should exist, including at least one offsite to
protect against my building burning down.

Backups should also be incremental when possible (rather than mirror copies) so
an accidental deletion isn't propagated into the backups, making the file
irrecoverable.

## Strategy

I have one backup folder `/mnt/backup` on my media server at home that serves as
the destination for all my backup sources. All scheduled automatic backups write
to their own subfolder inside of it.

This backup folder is then synced to encrypted 2.5" 1 TB hard drives which I
rotate between my bag, an offsite location, and my parents' houses.

## Backup Sources

I use the tool `rdiff-backup` extensively because it allows me to take
incremental backups locally or over SSH. It acts very similarly to `rsync` and
has no configuration.

### Email

I have every email since 2010 backed up continuously in case my email provider
disappears.

I use `offlineimap` to sync my mail to the directory `~/email` on my media
server as a Maildir. Since offlineimap is only a syncing tool, the emails need
to be copied elsewhere to be backed up.
The offlineimap sync itself runs every fifteen minutes, and a weekly cron job
then runs `rdiff-backup`:

```
*/15 * * * * offlineimap > /var/log/offlineimap.log 2>&1
00 12 * * 1 date -Iseconds > /home/email/email/backup_check.txt

20 12 * * 1 rdiff-backup /home/email/email /mnt/backup/local/email/
40 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/email/
```

Here's my `.offlineimaprc` for reference:

```
[general]
accounts = main
[Account main]
localrepository = Local
remoterepository = Remote
[Repository Local]
type = Maildir
localfolders = ~/email
[Repository Remote]
type = IMAP
readonly = True
folderfilter = lambda foldername: foldername not in ['Trash', 'Spam', 'Drafts']
remotehost = example.com
remoteuser = mail@example.com
remotepass = supersecret
sslcacertfile = /etc/ssl/certs/ca-certificates.crt
```

### Notes

I use Standard Notes to take notes and wrote the tool
[standardnotes-fs](https://github.com/tannercollin/standardnotes-fs) to mount my
notes as a file system to view and edit them as plain text files.

I take weekly backups of the mounted file system on my media server with cron:

```
00 12 * * 1 date -Iseconds > /home/notes/notes/backup_check.txt
15 12 * * 1 rdiff-backup /home/notes/notes /mnt/backup/local/notes/
```

### Nextcloud

I self-host a Nextcloud instance to store all my personal documents (non-code
projects, tax forms, spreadsheets, etc.). Since it's only syncing software, the
files need to be copied elsewhere to be backed up.

I take weekly backups of the Nextcloud data folder with cron:

```
00 12 * * 1 rdiff-backup /var/www/nextcloud/data/tanner/files /mnt/backup/local/nextcloud/
30 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/nextcloud/
```

### Gitea

I self-host a Gitea instance to store all my git repositories for code-based
projects. My home folder is also a git repo so I can easily sync my config files
and password database between servers and machines.
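Keeping a whole home folder in git works best when everything is ignored by
default and only chosen files are whitelisted. Here's a minimal sketch of that
pattern (the paths and file names are illustrative, not necessarily my exact
setup, and a scratch directory stands in for `$HOME`):

```
# Sketch of the ignore-everything-then-whitelist pattern; a scratch
# directory stands in for $HOME and the file names are illustrative.
set -e
home=$(mktemp -d)
cd "$home"
git init -q
printf 'export EDITOR=vim\n' > .bashrc

# Ignore everything by default...
cat > .gitignore <<'EOF'
*
!.gitignore
!.bashrc
EOF

# ...so only whitelisted files are ever tracked.
git add .gitignore .bashrc
git -c user.name=me -c user.email=me@example.com commit -qm 'track dotfiles'
git ls-files   # prints the two tracked files
```

With this, `git status` stays quiet no matter what lands in the home directory,
and new config files only become tracked once they're whitelisted (or added
with `git add -f`).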
+ +I take weekly backups of the Gitea data folder with cron: + +``` +00 12 * * 1 date -Iseconds > /home/gitea/gitea/data/backup_check.txt +10 12 * * 1 rdiff-backup --exclude **data/indexers --exclude **data/sessions /home/gitea/gitea/data /mnt/backup/local/gitea/ +35 12 * * 1 rdiff-backup --remove-older-than 12B --force /mnt/backup/local/gitea/ +``` + +### Telegram + +Telegram Messenger is my main app for communication. My parents, most of my +friends, and friend groups are on there so I don't want to lose those messages +in case Telegram disappears or my account gets banned. + +Telegram includes a data export feature, but it can't be automated. Instead I +run the deprecated software +[telegram-export](https://github.com/expectocode/telegram-export) hourly with +cron: + +``` +0 * * * * bash -c 'timeout 50m /home/tanner/opt/telegram-export/env/bin/python -m telegram_export' > /var/log/telegramexport.log 2>&1 +``` + +It likes to hang, so `timeout` kills it if it's still running after 50 minutes. +Hasn't corrupted the database yet. + +### Phone + +[Signal +Messenger](https://play.google.com/store/apps/details?id=org.thoughtcrime.securesms&hl=en_CA&gl=US) +automatically exports a copy of my text messages database, and +[Aegis](https://play.google.com/store/apps/details?id=com.beemdevelopment.aegis&hl=en_CA&gl=US) +allows me to export an encrypted JSON file of my two-factor authentication +codes. + +I mount my phone's internal storage as a file system on my desktop using +[adbfs-rootless](https://github.com/spion/adbfs-rootless). 
I then rsync the
files over to my media server:

```
$ ./adbfs ~/mntphone
$ time rsync -Wav \
    --exclude '*cache' --exclude nobackup \
    --exclude '*thumb*' --exclude 'Telegram *' \
    --exclude 'collection.media' \
    --exclude 'org.thunderdog.challegram' \
    --exclude '.trashed-*' --exclude '.pending-*' \
    ~/mntphone/storage/emulated/0/ \
    localmediaserver:/mnt/backup/files/phone/
```

Unfortunately, this is a manual process because I need to plug my phone in each
time. Ideally it would happen automatically while I'm asleep and the phone is
charging.

### Miscellaneous Files

The directory `/mnt/backup/files` is a repository for any kind of files I want
to keep forever: my phone data, old archives, computer files, Minecraft worlds,
files from previous jobs, and so on.

All the files will be included in the 1 TB hard drive backup rotations.

### Web Services

Web services that I run, like [txt.t0.vc](https://txt.t0.vc) and
[QotNews](https://news.t0.vc), are backed up daily, weekly, and monthly
depending on how frequently the data changes.
+ +I run `rdiff-backup` on the remote server with cron: + +``` +00 14 * * * date -Iseconds > /home/tanner/tbot/t0txt/data/backup_check.txt + +04 14 * * * rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/daily/t0txt/ +14 14 * * * rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/daily/t0txt/ + +24 14 * * 1 rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/weekly/t0txt/ +34 14 * * 1 rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/weekly/t0txt/ + +44 14 1 * * rdiff-backup /home/tanner/tbot/t0txt/data tbotbak@remotebackup::/mnt/backup/remote/tbotbak/monthly/t0txt/ +55 14 1 * * rdiff-backup --remove-older-than 12B --force tbotbak@remotebackup::/mnt/backup/remote/tbotbak/monthly/t0txt/ +``` + +The user `tbotbak` has write access only to the `/mnt/backup/remote/tbotbak` +directory. It has its own passwordless SSH key that's only permitted to run the +`rdiff-backup --server` command for security. + +### Protospace + +I run a lot of services for [Protospace](https://protospace.ca/), my city's +makerspace. 
The member portal I wrote called [Spaceport](https://my.protospace.ca/) creates
an archive I download daily:

```
40 10 * * * wget --content-disposition \
    --header="Authorization: secretkeygoeshere" \
    --directory-prefix /mnt/backup/remote/portalbak/ \
    --no-verbose --append-output=/var/log/portalbackup.log \
    https://api.my.protospace.ca/backup/
```

The main website and [wiki](https://wiki.protospace.ca) that I sysadmin get
backed up weekly:

```
0 12 * * 1 mysqldump --all-databases > /var/www/dump.sql
15 12 * * 1 date -Iseconds > /var/www/backup_check.txt
20 12 * * 1 rdiff-backup /var/www pshostbak@remotebackup::/mnt/backup/remote/pshostbak/weekly/www/
```

The Protospace [Minecraft
server](http://games.protospace.ca:8123/?worldname=world&mapname=flat&zoom=3&x=74&y=64&z=354)
I run gets backed up daily:

```
00 15 * * * date -Iseconds > /home/tanner/minecraft/backup_check.txt
00 15 * * * rdiff-backup --exclude **CoreProtect --exclude **dynmap /home/tanner/minecraft psminebak@remotebackup::/mnt/backup/remote/psminebak/
30 15 * * * rdiff-backup --remove-older-than 12B --force psminebak@remotebackup::/mnt/backup/remote/psminebak/
```

I also back up our Google Drive with rclone:

```
45 12 * * 1 rclone copy -v protospace: /mnt/backup/files/protospace/google-drive/
```

## Backup Copies

My backup folder `/mnt/backup` now looks like this:

```
/mnt/backup/
├── files
│   ├── docs
│   ├── phone
│   ├── protospace
│   ├── telegram
│   ├── usbsticks
│   └── ... and so on
├── local
│   ├── email
│   ├── gitea
│   ├── nextcloud
│   └── notes
└── remote
    ├── portalbak
    ├── pshostbak
    ├── psminebak
    ├── tbotbak
    └── telebak
```

This directory tree is the master backup; every Saturday I copy the entire tree
to a hard drive.
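Each scheduled backup above also drops a `backup_check.txt` timestamp via
`date -Iseconds`, which makes staleness easy to detect mechanically. A minimal
sketch of that kind of freshness check (an illustration of the idea, with an
assumed root path and threshold):

```
# Sketch: walk a backup tree and flag any backup_check.txt older than a
# threshold. Root path and threshold are illustrative assumptions.
from datetime import datetime, timedelta, timezone
from pathlib import Path

MAX_AGE = timedelta(days=8)  # weekly backups, plus a day of slack

def stale_checks(root, max_age=MAX_AGE):
    """Yield (path, age) for every backup_check.txt older than max_age."""
    now = datetime.now(timezone.utc)
    for check in Path(root).rglob("backup_check.txt"):
        # Each file holds one `date -Iseconds` stamp, e.g. 2021-04-08T12:00:00-06:00
        stamp = datetime.fromisoformat(check.read_text().strip())
        age = now - stamp.astimezone(timezone.utc)
        if age > max_age:
            yield check, age

if __name__ == "__main__":
    for path, age in stale_checks("/mnt/external/backup"):
        print(f"STALE: {path} ({age.days} days old)")
```

Anything a check like this prints points at a backup source whose cron job has
stopped running or failed to transfer.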
The directory is copied over with the following script:

```
#!/bin/bash

cryptsetup luksOpen /dev/sdf external
mount /dev/mapper/external /mnt/external

time rsync -av --delete /mnt/backup/local/ /mnt/external/backup/local/
time rsync -av --delete /mnt/backup/remote/ /mnt/external/backup/remote/
time rdiff-backup --force -v5 /mnt/backup/files/ /mnt/external/backup/files/

python3 /home/tanner/scripts/checkbackup.py

umount /mnt/external
cryptsetup luksClose external
```

I wrote a Python script `checkbackup.py` that goes through each backup and
compares the timestamp in `backup_check.txt` files to the current time. This
makes sure that the cron jobs ran, backups were taken, and everything
transferred over correctly.

## Rotating Hard Drives

I rotate through 2.5" 1 TB hard drives each Saturday when I do a backup. They
are quite cheap at [$65 CAD](https://www.memoryexpress.com/Products/MX65194)
each so I can have a bunch floating around.

I keep one connected to the server, one in my bag, one offsite, one at my
mother's house, and one at my dad's house. Every Saturday I run the script above
to take a copy and then swap the drive with the one in my bag. It then gets
swapped when I visit my offsite location. Same for when I visit my parents. This
means that all hard drives eventually get rotated through with new data and
don't sit too long unpowered.

The drives are all encrypted with full-disk LUKS encryption using a password I'm
unlikely to forget.

I run the check-summing `btrfs` file system on them with the `dup` profile,
which stores two copies of all data and metadata on the same disk to protect
against bitrot. This means I can only use 0.5 TB of storage for my backups, but
the data is stored redundantly.
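One caveat with relying on btrfs checksums: bitrot is only detected when a
block is actually read. An explicit scrub forces a full verification pass and
repairs from the duplicate copy where possible. A step like this could go into
the copy script just before the unmount (a possible improvement, not something
the current script does):

```
# Possible addition before `umount /mnt/external`:
# read and verify every block, repairing from the dup copy on checksum failure
btrfs scrub start -B /mnt/external
```

The `-B` flag makes the scrub run in the foreground so the script doesn't
unmount the drive mid-pass.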
Here's how I set up new hard drives to do this:

```
$ sudo cryptsetup luksFormat /dev/sdf
$ sudo cryptsetup luksOpen /dev/sdf external
$ sudo mkfs.btrfs -f -m dup -d dup /dev/mapper/external
$ sudo mount /dev/mapper/external /mnt/external/
$ sudo mkdir /mnt/external/backup
$ sudo chown -R tanner:tanner /mnt/external/backup
$ sudo umount /mnt/external
$ sudo cryptsetup luksClose external
```

## Future Improvements

I'm working on a system to automatically back up all my home directories to my
media server. I need this to grab Bash histories and code that's
work-in-progress. I've been burned once by not having this when a server died.

I'd like to automate backing up my phone by connecting it to a Raspberry Pi when
I go to sleep.

I need to get better at fully testing my backups by restoring them on a blank
machine. diff --git a/content/linux-flavour.md b/content/linux-flavour.md index e2548b0..5536540 100644 --- a/content/linux-flavour.md +++ b/content/linux-flavour.md @@ -2,6 +2,7 @@ Title: Choosing a Linux Flavour Date: 2020-10-31 Category: Writing Summary: A recommendation on which flavour of Linux to run.
+Wide: true

[TOC]