Linux Bootable Backup
When I set up this new server, one of the things I really wanted was a way to make sure everything was backed up regularly in case of a system issue, a hardware failure, something I broke accidentally, etc. My old Mac-based system used an excellent tool called Carbon Copy Cloner, which would make a bootable backup copy on a schedule.
After much searching, I wasn’t able to find anything comparable for Linux that could run on an active system and yield a bootable volume. There were tools to clone a drive, but only after booting into a separate live image and running the utility from there, which wasn’t something I could schedule. There were tools to make copies, but these wouldn’t end up with a bootable backup. So after a lot of head scratching and a lot of research, I came up with something that worked for me.
Like many Linux admins, I’ve been using rsync for years to copy data from one location to another, and Carbon Copy Cloner also made use of rsync for its backup strategy. The hiccup is how modern Linux versions boot (EFI, GRUB 2, etc.). Simply cloning the data to another drive will at best direct the boot process back to the original drive if it is still attached, or fail totally if that drive is no longer present.
After much research and some trial and error, I found that I could clone the boot partition as long as I didn’t overwrite the grubenv file (in my distribution, this is located at /boot/efi/EFI/rocky/grubenv) and the backup volume already had a working grubenv of its own. This dovetailed into another issue: a standard Linux install generally has multiple partitions, which wouldn’t yet exist on my backup drive.
I toyed with the idea of scripting a routine to analyze the source drive, determine the partitions used, repartition the destination drive to match, and build a new grubenv file from scratch each time for the backup drive. This is still a back-burner idea, but to keep things simple I installed a fresh copy of Linux (Rocky 8) on my backup drive, using the same automatic partitioning as my source drive, and left scripting all of this for another day.
This still left me with the task of copying all necessary data from one drive to another. This meant that the destination drive and its partitions needed to be accessible by my script. The external drive was going to be available at all times, so I simply used the Disks utility to set the mount options for each partition so that these mounted at every startup and were therefore always available.
Once this was done, I could then reference my source drive and the destination drive by their respective mount points to facilitate copying.
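Because the copies use rsync with --delete, it can be worth confirming that each destination really is a mounted volume before anything is written to it. Here is a minimal sketch of such a check, assuming a hypothetical helper named is_mounted that is not part of the script below. It reads the kernel's mount table from /proc/self/mounts, and the optional second argument exists only so the function can be tried against a sample file. It also assumes mount paths without spaces, which is true of the paths used here.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): succeed only if the
# given directory is listed as a mount point in the mount table.
is_mounted() {
    local target=$1
    local mounts_file=${2:-/proc/self/mounts}
    local device mount_point rest
    while read -r device mount_point rest; do
        [[ $mount_point == "$target" ]] && return 0
    done < "$mounts_file"
    return 1
}

# Example: report a missing backup partition instead of copying to it.
# The path is the backuproot value used in the script.
backuproot="/mnt/50da7e61-7c35-448e-a655-3fd876bc6cb4"
if ! is_mounted "$backuproot"; then
    printf 'Backup volume %s is not mounted, skipping.\n' "$backuproot"
fi
```

The util-linux command mountpoint -q performs a similar test and may be simpler where it is installed.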
Again, I took a shortcut here and hard-coded these values into my script. There are four separate copies taking place. The first is the /boot/efi partition, which is the first partition of the drive. Next is the /boot partition, which is actually a separate partition, and this is where it helps to understand Linux mount points. Next up is the /home partition, which in my case is contained in an LVM2 partition, and finally the root (/) partition, which again is on that LVM2 partition. The source paths for the script are easy enough; for the destination paths, I referred to the /etc/fstab file and copied the mount points of the backup partitions from there.
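Rather than reading /etc/fstab by eye, the lookup can be scripted. The following is a rough sketch, assuming a hypothetical function name fstab_source_for. It prints the first field (for example a UUID=... entry) of the line whose mount point matches exactly, and it accepts an alternate file path purely for experimenting.

```bash
#!/bin/bash
# Hypothetical helper: print the device field (column 1) of the fstab
# entry whose mount point (column 2) equals the requested path.
fstab_source_for() {
    local wanted=$1
    local fstab=${2:-/etc/fstab}
    local spec mount_point rest
    [[ -r $fstab ]] || return 1
    # "|| [[ -n $spec ]]" also handles a final line with no trailing newline.
    while read -r spec mount_point rest || [[ -n $spec ]]; do
        # Skip blank lines and comments.
        [[ -z $spec || $spec == \#* ]] && continue
        if [[ $mount_point == "$wanted" ]]; then
            printf '%s\n' "$spec"
            return 0
        fi
    done < "$fstab"
    return 1
}

# Example: show which device backs /boot/efi on the running system.
fstab_source_for /boot/efi || printf 'No /boot/efi entry found.\n'
```

Matching the mount point field exactly avoids picking up commented-out lines or other paths that merely contain the same text.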
The real magic comes in with the rsync commands shown in the script. In several cases, certain files or directories need to be excluded: the grubenv file, as mentioned above, from the /boot/efi rsync; the entire efi directory from the /boot rsync (thanks to Linux mount points, /boot/efi appears to be inside /boot despite being a separate partition, and we don’t want it copied since it doesn’t actually exist there); and lastly a whole slew of directories from the root volume, either because their contents are created dynamically and don’t exist on the hard drive (/dev, /proc, /tmp, etc.) or because they are populated on the fly (/mnt, etc.). The root rsync also excludes /etc/fstab, so the backup drive keeps its own copy of that file. This part took a bit of research to get right…
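One way to keep a long exclude list readable is to hold the patterns in a bash array and quote each one, so the shell never tries to expand the wildcards itself. This is a sketch of an alternative to the brace expansion used in the script, not what the script does. The array names are my own inventions, and the command is only printed here rather than executed.

```bash
#!/bin/bash
# Paths taken from the root-volume rsync in the script.
root_excludes=(
    /etc/fstab
    '/boot/*' '/dev/*' '/home/*' '/media/*' '/mnt/*'
    '/proc/*' '/run/*' '/sys/*' '/tmp/*'
)

# Same options the script passes to rsync.
rsync_args=(-ahAHX --inplace --delete --stats)
for pattern in "${root_excludes[@]}"; do
    rsync_args+=("--exclude=$pattern")
done

sourceroot="/"
backuproot="/mnt/50da7e61-7c35-448e-a655-3fd876bc6cb4"

# Print the command that would run, with shell-safe quoting.
printf 'rsync'
printf ' %q' "${rsync_args[@]}" "$sourceroot" "$backuproot"
printf '\n'
```

When trying out a new exclude list, adding rsync's --dry-run option shows what would be transferred or deleted without changing the destination.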
Next up, I wanted some sort of visual indication in case I happened to boot from the backup hard drive by mistake, or in the event of the primary drive being unavailable. I toyed with several different ideas before coming across the gdm-wallpaper utility (link in the code below). This allowed me to change the wallpaper on the login screen from the default to a custom image, in my case a bright RED background to serve as a warning that I was booted from the backup drive.
Before the copy takes place, the script sets the red wallpaper so that it is what gets copied to the backup drive. The rsync backups then run, and afterwards the original wallpaper is restored on the source drive. Some logging has been thrown in to let me check up on things from time to time; only the stats from each rsync are logged rather than the full file list.
For good measure, I also stop and then restart several processes which are likely to be updating files while the backup takes place to avoid files changing in the middle of a backup. After the initial backup, subsequent backups tend to be quite fast, so services aren’t disabled for long.
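If one of the rsync steps fails or the script is interrupted, the services should still come back up and the wallpaper should still be restored. The script runs these steps in sequence without any such guard. Here is a minimal sketch of one way to add it with a bash EXIT trap, assuming placeholder functions pre_backup and post_backup that stand in for the real wallpaper and systemctl commands.

```bash
#!/bin/bash
# Placeholders for illustration; the real steps would call
# set-gdm-wallpaper.sh and systemctl stop/start.
pre_backup()  { printf 'pre: warning wallpaper set, services stopped\n'; }
post_backup() { printf 'post: wallpaper restored, services started\n'; }

# Run a backup command in a subshell whose EXIT trap always performs
# the cleanup, whether the command succeeds or fails.
run_backup() {
    (
        trap post_backup EXIT
        pre_backup
        "$@"
    )
}

run_backup true    # cleanup runs after a successful step
run_backup false || printf 'backup step failed, cleanup still ran\n'
```

The subshell keeps the trap local to one backup run, and the exit status of the wrapped command is still passed back to the caller.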
This script is located in my /etc/cron.daily directory, and so executes each morning with no further involvement from me.
#!/bin/bash
# Determine this via: cat /etc/fstab | grep boot/efi
sourceUUID="UUID=E11C-1E84"
# Use Disks utility to configure backup drive partitions to be mounted at startup (updates fstab).
# Copy paths to the appropriate variables below.
sourcebootEFI="/boot/efi/"
backupbootEFI="/mnt/wwn-0x5000000000000001-part1"
sourceboot="/boot/"
backupboot="/run/media/wright/040f5401-3854-40dc-a0d8-04e1fc8df102"
sourcehome="/home/"
backuphome="/mnt/158748a6-a322-411f-8e6f-c5be04c49c18"
sourceroot="/"
backuproot="/mnt/50da7e61-7c35-448e-a655-3fd876bc6cb4"
#Direct all output to log file
#exec 1> >(logger -s -t $(basename $0) -f $(basename $0)) 2>&1
exec >> /var/log/nightly_backup 2>&1
printf "%(%b %_d %Y %H:%M:%S)T ** Backup Begins **\n\n"
#Find the boot partition
#Match only the fstab entry mounted at /boot/efi (fields may be separated by spaces or tabs)
read -r bootUUID < <(awk '$1 !~ /^#/ && $2 == "/boot/efi" {print $1; exit}' /etc/fstab)
printf "Boot Volume: %s\n" "$bootUUID"
printf "Source Volume: %s\n" "$sourceUUID"
printf "\n"
if [[ "$sourceUUID" != "$bootUUID" ]]
then
    printf "Boot drive does not match configured source drive, Aborting.\n\n"
    # Exit with a non-zero status so the failure is visible to cron
    exit 1
else
    printf "Boot drive matches configured source drive.\n"
fi
# Set to red login/wallpaper, when booting from the backup drive,
# this will serve as a clear warning we aren't using the normal boot drive.
# using this code: https://copr.fedorainfracloud.org/coprs/zirix/gdm-wallpaper/
# Rocky Linux required a manual install from source.
/home/wright/Documents/set-gdm-wallpaper.sh '/home/wright/Documents/Pure-Red-Wallpaper.jpg'
# Stop various processes here before backup (postfix, dovecot, mysql, etc)
systemctl stop postfix dovecot mysql
# rsync section
#sync /boot/efi
printf "Syncing /boot/efi volume.\n\n"
printf "rsync -ahAHX --inplace --delete --stats --exclude=EFI/rocky/grubenv %s %s\n" $sourcebootEFI $backupbootEFI
rsync -ahAHX --inplace --delete --stats --exclude=EFI/rocky/grubenv $sourcebootEFI $backupbootEFI | awk '/Number of/'
#sync /boot, exclude /boot/efi path as this is copied separately above to a different partition
printf "\nSyncing /boot volume.\n\n"
printf "rsync -ahAHX --inplace --delete --stats --exclude='efi/*' %s %s\n" "$sourceboot" "$backupboot"
rsync -ahAHX --inplace --delete --stats --exclude='efi/*' "$sourceboot" "$backupboot" | awk '/Number of/'
#sync /home
printf "\nSyncing /home volume.\n\n"
printf "rsync -ahAHX --inplace --delete --stats %s %s\n" $sourcehome $backuphome
rsync -ahAHX --inplace --delete --stats $sourcehome $backuphome | awk '/Number of/'
#sync /, exclude /boot and /home as these are copied above, and various Linux system paths that are created dynamically so don't need to be backed up
printf "\nSyncing / volume.\n\n"
printf "rsync -ahAHX --inplace --delete --stats --exclude={/etc/fstab,/boot/*,/dev/*,/home/*,/media/*,/mnt/*,/proc/*,/run/*,/sys/*,/tmp/*} %s %s\n" $sourceroot $backuproot
rsync -ahAHX --inplace --delete --stats --exclude={/etc/fstab,/boot/*,/dev/*,/home/*,/media/*,/mnt/*,/proc/*,/run/*,/sys/*,/tmp/*} $sourceroot $backuproot | awk '/Number of/'
# Restore normal login/wallpaper.
/home/wright/Documents/set-gdm-wallpaper.sh --uninstall
printf '\n'
#Restart services stopped earlier
systemctl start postfix dovecot mysql
printf "%(%b %_d %Y %H:%M:%S)T ** Backup Ends **\n\n"
printf '***********************************************\n\n'