RHEL LVM RAID does have bit-rot protection!

MrCalvin

IT consultant, Denmark
Aug 22, 2016
81
15
8
50
Denmark
www.wit.dk
To my surprise, LVM/mdadm actually seems able to store extra checksum data to protect against bit rot/soft corruption. Until now I thought the only way was to go with e.g. btrfs or sfz. But I just ran into this article over at RHEL: Using DM integrity with RAID LV
Together with thin LVM snapshots, which shouldn't have the performance penalties the first generation of LVM snapshots had, it might be a good storage solution after all.
I assume it works on any distro.
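
For anyone wondering why the extra checksums matter on a mirror: without them, a RAID1 that finds two differing copies can't tell which one is wrong. Here is a toy Python sketch of the idea, not how dm-integrity is actually implemented. The class names, the 4096-byte block size and the use of CRC32 are all assumptions made for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size for this sketch


class ChecksummedLeg:
    """One mirror leg that keeps a CRC32 next to every data block."""

    def __init__(self, n_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(n_blocks)]
        self.sums = [zlib.crc32(b) for b in self.blocks]

    def write(self, idx, data):
        self.blocks[idx] = data
        self.sums[idx] = zlib.crc32(data)

    def read(self, idx):
        """Return the block, or None if it no longer matches its checksum."""
        data = self.blocks[idx]
        return data if zlib.crc32(data) == self.sums[idx] else None


class Mirror:
    """Toy two-way mirror that repairs a bad leg from a good one on read."""

    def __init__(self, n_blocks):
        self.legs = [ChecksummedLeg(n_blocks), ChecksummedLeg(n_blocks)]

    def write(self, idx, data):
        for leg in self.legs:
            leg.write(idx, data)

    def read(self, idx):
        for leg in self.legs:
            data = leg.read(idx)
            if data is not None:
                # Rewrite the block on every leg that failed verification.
                for other in self.legs:
                    if other is not leg and other.read(idx) is None:
                        other.write(idx, data)
                return data
        raise IOError(f"block {idx}: all copies failed verification")


def flip_bit(leg, idx):
    """Simulate silent corruption: change the data, leave the checksum alone."""
    raw = bytearray(leg.blocks[idx])
    raw[0] ^= 0x01
    leg.blocks[idx] = bytes(raw)
```

Usage would look like: write a block through `Mirror.write`, corrupt one leg with `flip_bit`, and a later `Mirror.read` still returns the original data and rewrites the damaged copy.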

What are your thoughts/experiences?
 

Tinkerer

Member
Sep 5, 2020
45
14
8
I don't have experience with it, but I'm very interested, especially in the performance penalty it will bring.

I am in the process of setting up a RAID10 with mdadm. I might take the time to test with LVM and run some quick fio benchmarks against it, with and without checksum data.
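
A rough sketch of how such a comparison could be set up, assuming two otherwise identical RAID LVs, one with integrity and one without. The device paths and job names are made up; the fio options are standard ones. The script only builds the command lines and runs nothing, since a random-write test against a raw device destroys whatever is on it.

```python
import shlex


def fio_command(device, job_name, rw="randwrite", block_size="4k",
                iodepth=32, runtime_s=60):
    """Build an fio argument list for one raw-device job (nothing is executed)."""
    return [
        "fio",
        f"--name={job_name}",
        f"--filename={device}",
        f"--rw={rw}",
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        "--ioengine=libaio",
        "--direct=1",            # bypass the page cache
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
        "--output-format=json",
    ]


# Hypothetical LV paths: same RAID layout, with and without integrity.
TARGETS = {
    "plain": "/dev/vg_test/lv_raid_plain",
    "integrity": "/dev/vg_test/lv_raid_integrity",
}


def benchmark_plan():
    """Return one shell-ready fio command line per target."""
    return {
        label: shlex.join(fio_command(dev, f"randwrite-{label}"))
        for label, dev in TARGETS.items()
    }


if __name__ == "__main__":
    for label, cmd in benchmark_plan().items():
        print(f"{label}: {cmd}")
```

Keeping every option identical and changing only the target device is the point: any difference in the JSON results should then come from the integrity layer rather than from the job definition.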

LVM can be set up with SSD/NVMe caching and metadata. See this for example.

PS: out of curiosity, why did you write 'sfz'?
 

MrCalvin

'sfz' was just a typo for ZFS ;-)

I agree it will be interesting to see whether LVM with dm-integrity can provide better performance. I'm guessing it will perform better than btrfs (at least for RAID).
I also expect to give it a spin some day, and I'll share my results.
 

MrCalvin

I'm guessing so. But since btrfs is apparently particularly slow on RAID, this is where LVM + RAID + dm-integrity could potentially bring the biggest performance gain.
I'm also curious how much control you have over where those checksums are stored, and whether the array keeps running if they "disappear", e.g. if you place them on a single drive outside the array for performance reasons (I know, that can be a bad idea for other reasons, the obvious one being that they disappear :p). But if you run integrity checks e.g. once a week, it might not be that big a deal. It all depends on how expensive, performance-wise, it is to write those checksums to disk.
Also, I think in some cases it would be fine if the checksums could be written to the page cache and only flushed to disk when the system "has time".
 

MrCalvin

As it turns out, it's not an LVM feature: it has been in the kernel since 4.12, apparently introduced for LUKS, but it works with mdadm too. I don't know whether RHEL added any stability/performance tweaks for configuring it under LVM rather than natively in Linux; I'm guessing not.
But you must run at least kernel 5.4-rc1 to get a bug fix (ref. thread on GitHub)... For LTS distros I believe that narrows it down to Oracle Linux 8.4 and Ubuntu 20.04. Debian 11 runs 5.1.x and the SUSE family 5.3.x (well, you never know about backports, right?). According to the GitHub thread, RHEL did backport the fix to RHEL 8.x.
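
A quick way to check a box against that 5.4 threshold, as a minimal sketch. The function names are made up, and the check only looks at the version string, so it can't see vendor backports (a RHEL 8.x kernel would be reported as too old even if it carries the fix). The version parts are compared as integers so that 5.10 is not mistaken for something older than 5.4.

```python
import platform
import re

MIN_KERNEL = (5, 4)  # kernel series that carries the fix, per the post above


def parse_kernel_release(release):
    """Extract (major, minor) from a string like '5.10.0-8-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """True if the release is at least `minimum`; ignores vendor backports."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    rel = platform.release()
    if meets_minimum(rel):
        print(f"{rel}: at or above {MIN_KERNEL[0]}.{MIN_KERNEL[1]}")
    else:
        print(f"{rel}: older than {MIN_KERNEL[0]}.{MIN_KERNEL[1]}, check for a vendor backport")
```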
 

Tinkerer

Thanks for sharing. My adventure last night ended with me not using LVM RAID, so I never got around to testing this specific feature; I had decided against LVM RAID before I got to that point. Besides, the system it's for runs kernel 4.19 (XCP, Xen hypervisor).
 

RTM

Well-Known Member
Jan 26, 2014
868
325
63
Just a minor correction: Debian 11 "bullseye" comes with a 5.10 kernel (not 5.1).