I had a look around on the net and didn't find anyone who'd done this precise sort of thing before (fair few people doing it in slightly less complex setups though), so seeing as I spent a couple of days putting this together I thought it might help someone else out there. Bear in mind I've only attempted this on MBR-based systems so far, and I'm still using good ol' sysV init - I'm not sure whether systemd makes a difference to any of the below, but I doubt it since all of the changes should be happening before init kicks in anyway. Nor has this been tested on anything other than Debian stretch, although I can't see most other Debian-derived distros being a million miles apart.
Some background: some of our kit in satellite offices is less physically secure than we'd like, so we've been standardising on putting things like file server storage on LUKS crypto devices to mitigate the risk of data being physically stolen (physical theft of HDDs from servers has happened to me in a previous job, and it's a risk that likely applies to many of you at home too). But that's meant either servers coming up without the data partitions mounted, manual intervention at boot (to input the password when the volume is mounted), or storing LUKS keys on unencrypted filesystems. Seeing as all our servers are IPMI'd up the wazoo, we've agreed to start encrypting the boot drives as well as the data drives: the LUKS keys can then be stored on the root filesystem, and an operator will be there at boot to input the "master" password to open the root drive, whereupon the keys can be accessed and used to mount the other drives. Servers are never intentionally rebooted without an operator on the IPMI console anyway, so it's not a major inconvenience. The question was how to do this with as little disruption as possible - ideally by converting in place, avoiding a reinstall and partial restore (a full restore would be out of the window since too many niggly bits in /boot and /etc will have changed).
So we started looking for a way to convert a running system. The former disc geometry is two SSDs for the OS root and other system partitions.
Our general "old" device/filesystem layout is as follows:

Code:
/dev/sda1 512MB fd00
/dev/sdb1 512MB fd00
  Combined into mdadm RAID1 /dev/md0
    Formatted ext4 and mounted as /boot
/dev/sda2 96GB fd00
/dev/sdb2 96GB fd00
  Combined into mdadm RAID1 /dev/md1
    Formatted as an LVM physical volume (PV)
      Volume group (VG) "root_vg"
        Logical volumes (LV) for OS, swap, etc
If the SSDs are 240GB or bigger we'd generally also add:

Code:
/dev/sda3 96GB+ fd00
/dev/sdb3 96GB+ fd00
  Combined into mdadm RAID1 /dev/md2
    Formatted as an LVM physical volume (PV)
      Volume group (VG) "storage_vg"
        Logical volumes (LV) for dm-cache to speed up the platters used elsewhere
Essentially the plan is to insert another layer of LUKS-encrypted storage between the mdadm and LVM layers. When the device boots, the operator will be prompted to enter the password for the root partition; all other LUKS partitions will be configured to open based on a key stored on the encrypted root partition. Note that we're still keeping /boot unencrypted as I've not done enough fiddling with grub to know whether encrypting it is viable yet, and RAID1 for /boot typically only works with MBR/legacy BIOS since in this regard UEFI is fecking stupid.
I tested this on physical hardware - I could have done it in VMware Workstation or ESX easily enough, but I had spare tin around so cloning the discs was easy. Anyway, I booted into a live USB distro (specifically SystemRescueCD) on a spare chassis, plugged in the SSDs and got cracking. In the following examples the SSDs are sdb and sdc since the bootable USB was sda. sdb was partitioned as follows:

Code:
fdisk -l /dev/sdb
Disk /dev/sdb: 223.6 GiB, 240057409536 bytes, 468862128 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x7s2o3220

Device     Boot     Start       End   Sectors  Size Id Type
/dev/sdb1  *         2048   1048575   1046528  511M fd Linux raid autodetect
/dev/sdb2         1048576 202375167 201326592   96G fd Linux raid autodetect
/dev/sdb3       202375168 403701759 201326592   96G fd Linux raid autodetect
The partition layout was then cloned to the other SSD with sfdisk:

Code:
sfdisk -d /dev/sdb | sfdisk /dev/sdc
All three of the RAID1 arrays were then created:

Code:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 /dev/sdc2
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdb3 /dev/sdc3
md1 was then turned into a LUKS container using cryptsetup. This'll prompt you to create the password that'll ultimately be needed to open the root drive.

Code:
cryptsetup luksFormat /dev/md1
Once it's created, open it up to the OS and map it as md1_crypt:

Code:
cryptsetup luksOpen /dev/md1 md1_crypt
We're now ready to recreate the LVM stuff, and to make things easier for us down the line, instead of recreating from scratch we're going to copy the LVM config from the original system. I scp'd the current LVM config over from the running system (to cut a long story short, this'll almost always be the file /etc/lvm/backup/$vgname) to a working area on my bootable distro. Look inside this file for the UUID of the PV; generally that'll be in a block that looks like this:

Code:
physical_volumes {

	pv0 {
		id = "f77rlK-qxme-D6kG-SLV3-lqaF-Zs1m-iEuu1k"
		device = "/dev/md1"	# Hint only

		status = ["ALLOCATABLE"]
		flags = []
		dev_size = 251656192	# 119.999 Gigabytes
		pe_start = 2048
		pe_count = 30719	# 119.996 Gigabytes
	}
}
Now let's use a) that UUID and b) the restore file itself to recreate the PV with the same details:

Code:
pvcreate --uuid "f77rlK-qxme-D6kG-SLV3-lqaF-Zs1m-iEuu1k" --restorefile /mnt/usb/original_system/etc/lvm/backup/root_vg /dev/mapper/md1_crypt
Couldn't find device with uuid f77rlK-qxme-D6kG-SLV3-lqaF-Zs1m-iEuu1k.
Physical extents end beyond end of device /dev/mapper/md1_crypt.
Format-specific initialisation of physical volume /dev/mapper/md1_crypt failed.
Failed to setup physical volume "/dev/mapper/md1_crypt".
The "Couldn't find device with uuid..." line is normal when you're creating a PV from scratch with a UUID specified, but the error below it is down to the change in disc geometry. The original system was first set up on a 120GB SSD and later cloned to a 240GB SSD - when the partitions were expanded and the PV resized, it ended up as a ~120GB PV. As such, when we try to recreate it, it rightly refuses since that can't fit on a 96GB partition.
Easy way out here - I know for a fact that the volumes within the PV take up way less space than 96GB, so we can happily shrink the dev_size and pe_count attributes in the /mnt/usb/original_system/etc/lvm/backup/root_vg file. Cue back-of-a-fag-packet maths for a ~90GB PV: dev_size is counted in 512-byte sectors, so 90*1024*1024*2 = 188743680, and with 4MB extents pe_count is (90*1024)/4 = 23040. Amend the file with these new values for the PV and that section now looks like so:

Code:
physical_volumes {

	pv0 {
		id = "f77rlK-qxme-D6kG-SLV3-lqaF-Zs1m-iEuu1k"
		device = "/dev/md1"	# Hint only

		status = ["ALLOCATABLE"]
		flags = []
		#dev_size = 251656192	# 119.999 Gigabytes
		dev_size = 188743680	# 90 Gigabytes
		pe_start = 2048
		#pe_count = 30719	# 119.996 Gigabytes
		pe_count = 23040	# 90 Gigabytes
	}
}
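A quick unit check, since it's easy to trip over: dev_size in the backup file is counted in 512-byte sectors (hence 251656192 ≈ 120GB originally), while pe_count is in extents. The sums are quick to sanity-check in the shell - 90GB is just our chosen target size, nothing magic:

```shell
# LVM backup-file units: dev_size in 512-byte sectors, pe_count in physical extents.
target_gib=90     # desired PV size
extent_mib=4      # LVM's default extent size
dev_size=$(( target_gib * 1024 * 1024 * 1024 / 512 ))
pe_count=$(( target_gib * 1024 / extent_mib ))
echo "dev_size=$dev_size pe_count=$pe_count"
# prints: dev_size=188743680 pe_count=23040
```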
The PV now creates fine:

Code:
pvcreate --uuid "f77rlK-qxme-D6kG-SLV3-lqaF-Zs1m-iEuu1k" --restorefile /mnt/usb/original_system/etc/lvm/backup/root_vg /dev/mapper/md1_crypt
Couldn't find device with uuid f77rlK-qxme-D6kG-SLV3-lqaF-Zs1m-iEuu1k.
Physical volume "/dev/mapper/md1_crypt" successfully created.
Yes, 90GB is smaller than the 96GB we have available (I didn't want to risk overshooting), but we can grow the PV any time we want in the future with a simple pvresize.
Now we can recreate the volume group itself - along with all the child logical volumes - from the same config file:

Code:
vgcfgrestore -f /mnt/usb/original_system/etc/lvm/backup/root_vg root_vg
Restored volume group root_vg
After a quick glance at pvdisplay, vgdisplay and lvdisplay everything looked happy, so we activate the VG and LVs to allow them to be used:

Code:
vgchange -a y root_vg
lvchange -a y /dev/root_vg/*_lv
Now is the time to recreate the filesystems, and another thing I did here was to make a note of the UUIDs of the ext4 filesystems on the original system and re-use those as well. Just showing the boot and root filesystems here for brevity; we're also making sure that 64-bit support (needed for >16TB filesystems) and metadata checksums are turned on:

Code:
mkfs.ext4 -m 5 -L poghril_boot -O 64bit,metadata_csum -U rq908oqq-588p-41op-8qsr-snq887518p5o /dev/md0
mkfs.ext4 -m 5 -L poghril_root -O 64bit,metadata_csum -U 288289q0-48s0-4sq6-9p7r-9pq408sorn63 /dev/mapper/root_vg-root_lv
Now's the time to mount the new partitions and copy the data over. The new disc structure will be mounted at /mnt/poghril (that's the system name, BTW), starting with root:

Code:
mkdir /mnt/poghril
mount /dev/mapper/root_vg-root_lv /mnt/poghril
On poghril itself, I use rsync to push the data from the live system to the one running the live ISO:

Code:
root@poghril:~# rsync -axHAWXS --numeric-ids --info=progress2 / root@10.17.10.6:/mnt/poghril
Password:
1,640,635,234 99% 52.16MB/s 0:00:29 (xfr#32254, to-chk=0/41424)
The x option is important here as it stops rsync from crossing filesystems, so any additional mounts won't be copied and will need to be done separately. As such, it also avoids copying /dev, /proc and the rest of it. Anyway, now root is done we'll mount boot and do the same thing there:

Code:
mount /dev/md0 /mnt/poghril/boot
...and copy the filesystem over from poghril itself again:

Code:
root@poghril:~# rsync -axHAWXS --numeric-ids --info=progress2 /boot/ root@10.17.10.6:/mnt/poghril/boot
Password:
194,414,333 100% 86.58MB/s 0:00:02 (xfr#352, to-chk=0/358)
Repeat for whatever other partitions make up your root filesystem. Once those are done we're ready to start chrooting, but first we need to set up the mounts to be used within the chroot itself:

Code:
mount -t proc none /mnt/poghril/proc
mount -t sysfs none /mnt/poghril/sys
mount --bind /dev /mnt/poghril/dev
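Depending on what you end up running inside the chroot, a couple of extra binds can save some grief - some tools grumble without a pty or /run available. These are optional extras rather than something the run above needed:

```
mount --bind /dev/pts /mnt/poghril/dev/pts
mount --bind /run /mnt/poghril/run
```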
We're now hopefully ready to do the chroot:

Code:
chroot /mnt/poghril /bin/bash
So, hopefully, tadaa! Now we can look at reconfiguring grub and reinstalling it to the MBR to make the new drives bootable, but there's a bunch of stuff we need to check first. Firstly, make sure /etc/mtab contains the right information - it should just be a symlink to /proc/mounts, but you want to make sure the disc geometry in there looks kosher. Since we've retained the same UUIDs for both LVM and the filesystems themselves, most stuff configured in fstab should Just Work, but if you use labels or other methods of mounting, those will need to be changed now.
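For reference, the sort of fstab we're relying on mounts by UUID, along these lines (illustrative entries using the two anonymised UUIDs we re-used at mkfs time, not lifted verbatim from poghril):

```
# /etc/fstab - mounting by UUID means the cloned discs Just Work
UUID=288289q0-48s0-4sq6-9p7r-9pq408sorn63 /     ext4 errors=remount-ro 0 1
UUID=rq908oqq-588p-41op-8qsr-snq887518p5o /boot ext4 defaults          0 2
```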
One thing that does need to be added is an entry in /etc/crypttab for the new LUKS disc so that it prompts to be opened/unlocked at boot. We're sticking with md1_crypt for the mapped name and we don't want to set a key for it:

Code:
md1_crypt UUID=dc28b993-8a09-475c-b4ed-9e7aa460a791 none luks
One gotcha that I missed when I was first looking at the grub reconfiguration was this, complaining about a missing something or other:

Code:
update-grub2
Generating grub configuration file ...
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Found linux image: /boot/vmlinuz-4.9.0-7-amd64
Found initrd image: /boot/initrd.img-4.9.0-7-amd64
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
The terminology of "physical volume" was confusing, but it turns out this actually means a physical device of some sort. As near as I could tell, it was because in the chroot the grub configurator was looking for information about /boot (sitting on md0) in mdadm.conf - but mdadm.conf was wrong, since it still contained the UUIDs of the old mdadm arrays rather than the new ones. I didn't recreate it from mdadm --detail --scan since that would have removed all the arrays that aren't currently attached, so I just remmed out the old entries and added the new:

Code:
# definitions of existing MD arrays
#ARRAY /dev/md/0 metadata=1.2 UUID=68rr4q67:0r14os37:3061r653:sq7q53o0 name=poghril:0
ARRAY /dev/md/0 metadata=1.2 UUID=n72so981:136n7006:4p75404s:s6o6os0r name=poghril:0
#ARRAY /dev/md/1 metadata=1.2 UUID=q00sr243:qs8r7p05:o6pn7oo7:79sn3qrp name=poghril:1
ARRAY /dev/md/1 metadata=1.2 UUID=2478705s:22n6o8oo:9p7119r7:90p5sr12 name=poghril:1
Now the grub wizard should be able to find /boot on md0 OK - just run `dpkg-reconfigure grub-pc`, install it to the MBR of each of the new SSDs (/dev/sdb and /dev/sdc as they're seen here) and, fingers crossed, we should be good to reboot...
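If you'd rather not go through the debconf prompts, the non-interactive equivalent is just the following (device names as seen from the rescue environment, so adjust to suit):

```
grub-install /dev/sdb
grub-install /dev/sdc
update-grub
```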
...and at this point I rebooted and the system clanged to an initramfs emergency prompt. D'oh! I had forgotten to reconfigure the initramfs to make sure it's now able to boot from crypto. So I had to boot back into the USB distro, ensure all the RAID devices and LVMs were correctly detected and remounted, and then repeat the chroot process. Once back in the chroot environment, add explicit support for dm-crypt and device-mapper/LVM to the initramfs - these should be detected and added automagically, but at this point I didn't want to take any more chances (note these want the kernel module names, hence dm_mod and dm_crypt):

Code:
echo dm_mod >> /etc/initramfs-tools/modules
echo dm_crypt >> /etc/initramfs-tools/modules
Finally we can run the following to recreate the initramfs for all installed kernels:

Code:
update-initramfs -k all -u
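Before rebooting again it's worth a quick sanity check that the crypto bits actually landed in the new image - something along these lines, with the kernel version being whatever this box is running:

```
lsinitramfs /boot/initrd.img-4.9.0-7-amd64 | grep -E 'cryptsetup|dm-crypt|lvm'
```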
Laptop/desktop users will also want to double-check the file /etc/initramfs-tools/conf.d/resume - this is the device that initramfs checks for a resume image for hibernate/S2D. Since this is a server it's never used, and in any case the UUIDs of the LVMs and filesystems have been retained, so hopefully there's nothing to change here.
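For completeness, that file is a one-liner; if your swap device's UUID had changed you'd update it here, and RESUME=none is a valid setting if you never hibernate:

```
# /etc/initramfs-tools/conf.d/resume
RESUME=none
```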
This time when I rebooted into the SSDs, I immediately got the prompt asking me to provide the password to unlock the root partition. I unplugged the NIC as a just-in-case, entered the password and successfully booted into a LUKS clone of the existing system. After that's done it's a simple matter to re-enable the pre-existing RAID arrays and crypt volumes, and to add any new ones like the /dev/md2 that we'll be encrypting for use with dm-cache.
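For those follow-on volumes, the key-on-encrypted-root scheme from the start of the post looks roughly like this - the key path and mapped name here are my own placeholder choices, not gospel:

```
# Generate a key file on the (now encrypted) root filesystem and lock it down
mkdir -p /etc/luks-keys
dd if=/dev/urandom of=/etc/luks-keys/md2_crypt.key bs=512 count=8
chmod 0400 /etc/luks-keys/md2_crypt.key
# Format the data array with that key (or cryptsetup luksAddKey for an existing volume)
cryptsetup luksFormat /dev/md2 /etc/luks-keys/md2_crypt.key
# crypttab entry pointing at the key file instead of "none", so it opens unattended
echo 'md2_crypt /dev/md2 /etc/luks-keys/md2_crypt.key luks' >> /etc/crypttab
```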
Once we'd done this as a proof-of-concept and got the procedure down pat, we did a test on poghril itself. We powered it down and repeated the above process with the live USB, but when the time came to copy from the original system, we rsynced from poghril's original discs (also mounted in the live environment) to ensure data consistency, removed the old discs and then rebooted poghril into the shiny new world of encryption where it's been running fine ever since. I'm also going to be doing the same thing to my main home server when I semi-rebuild it (which was part of why I wanted to get this properly written up) as that also involves a motherboard swap-out and a change in disc geometry too.
As I'm sure most have figured out, most of the above commands and general methodology will work fine for non-RAIDed, non-LVM'd setups as well, assuming you know which device IDs you're working with, and there's no real need to do this on a whole new SSD/chassis if you're confident enough in your backups - I did so to avoid potential disruption/destruction and because I had the means available. Anyway, hope someone out there finds this useful or at least interesting.