SnapRAID and LVM for Pooling

cactus

Moderator
I would consider this a kludge at this point. I have done very little (assume "no") testing to see if things go wrong.

I have been looking into the pooling options for Linux and am dissatisfied with the current solutions. The main limitation for me is files that are larger than an underlying disk, e.g. disk images.

I start with a basic SnapRAID setup: two data disks with a single parity disk. This is an Ubuntu 12.04.1 LTS Server VM with SnapRAID from the PPA, LVM, and a few small vmdks.

Code:
$ mount
...
/dev/sdc1 on /SnapRAID/Data1 type ext4 (rw,errors=remount-ro)
/dev/sdd1 on /SnapRAID/Data0 type ext4 (rw,errors=remount-ro)
/dev/sdb1 on /SnapRAID/Parity type ext4 (rw,errors=remount-ro)
Code:
$ cat /etc/snapraid.conf
...
parity /SnapRAID/Parity/snapraid_parity_file

content /SnapRAID/Parity/content
content /home/max/content

disk disk0 /SnapRAID/Data0/
disk disk1 /SnapRAID/Data1/

exclude /lost+found/

block_size 256
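For completeness, the usual SnapRAID commands to protect whatever ends up on the data disks (a minimal example, nothing specific to this setup):

Code:
$ sudo snapraid sync    # update parity and content files after data changes
$ sudo snapraid check   # verify data against the stored hashes and parity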
I set up a loopback device backed by a sparse file. Each data disk holds one or more sparse files, which are spanned using LVM. Do not over-provision, or I/O errors will ensue when an underlying disk runs out of space. You can also add dm-crypt to get an encrypted FS (see the sketch after the commands below).

Code:
$ dd if=/dev/zero of=/SnapRAID/Data0/lvm_data0 bs=1 count=0 seek=2G
$ dd if=/dev/zero of=/SnapRAID/Data1/lvm_data1 bs=1 count=0 seek=2G
$ sudo losetup /dev/loop0 /SnapRAID/Data0/lvm_data0
$ sudo losetup /dev/loop1 /SnapRAID/Data1/lvm_data1
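For the dm-crypt variant mentioned above, a rough sketch (untested here; the crypt_data* names are just placeholders) is to encrypt each loop device and then hand the opened mappings to LVM instead of the raw loops:

Code:
# crypt_data0/crypt_data1 are placeholder mapping names
$ sudo cryptsetup luksFormat /dev/loop0
$ sudo cryptsetup luksOpen /dev/loop0 crypt_data0
$ sudo cryptsetup luksFormat /dev/loop1
$ sudo cryptsetup luksOpen /dev/loop1 crypt_data1
The pvcreate/vgcreate steps below would then use /dev/mapper/crypt_data0 and /dev/mapper/crypt_data1 instead of /dev/loop0 and /dev/loop1.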
Now, set up LVM with all the sparse files. I pretty much followed the Arch Wiki.

Code:
$ sudo pvcreate /dev/loop0
$ sudo pvcreate /dev/loop1
$ sudo vgcreate Bundle0 /dev/loop0
$ sudo vgextend Bundle0 /dev/loop1
$ sudo lvcreate -l 100%FREE -n Cavern Bundle0
$ sudo vgscan
   Reading all physical volumes.  This may take a while...
   Found volume group "Bundle0" using metadata type lvm2
$ sudo vgchange -ay
Now make a file system on the new Logical Volume (LV) and mount it.
Code:
$ sudo mkfs.ext4 /dev/Bundle0/Cavern
   *Happy output*
$ sudo mount /dev/Bundle0/Cavern /SnapRAID/Pool
$ df -h
...
/dev/sdc1                   2.0G  929M 1011M  48% /SnapRAID/Data1
/dev/sdd1                   2.0G  729M  1.2G  38% /SnapRAID/Data0
/dev/sdb1                   2.0G  1.1G  854M  57% /SnapRAID/Parity
/dev/mapper/Bundle0-Cavern  2.0G  813M  1.1G  43% /SnapRAID/Pool
$ mount
...
/dev/sdc1 on /SnapRAID/Data1 type ext4 (rw,errors=remount-ro)
/dev/sdd1 on /SnapRAID/Data0 type ext4 (rw,errors=remount-ro)
/dev/sdb1 on /SnapRAID/Parity type ext4 (rw,errors=remount-ro)
/dev/mapper/Bundle0-Cavern on /SnapRAID/Pool type ext4 (rw)
Pros: a contiguous pool across all disks; checksums and parity; this can be done on top of an existing SnapRAID setup; backup can be as simple as copying the sparse files.

Cons: multiple layers; a journaling file system inside a journaling file system (inside a vmdk inside a journaling file system); the LV can't see files on the individual data disks; if you delete a file in the LV, you won't see the space return to the data disks unless you employ shenanigans like zeroing the file before deleting it and then compacting the sparse files, though already-allocated sparse file space will be reused for subsequent files.

Edit: Actually, FALLOC_FL_PUNCH_HOLE has been supported by ext4 since kernel 3.0. This should allow deleted files to *give back* their space, but I'm not seeing it happen with just rm(1) on 3.2. I may be oversimplifying what is going on...
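If anyone wants to experiment with reclaiming that space, a sketch (untested, and assuming the loop driver on your kernel passes discards down to the backing sparse file) would be to mount the pool with discard enabled, or to run fstrim after deleting:

Code:
# either mount with online discard...
$ sudo mount -o discard /dev/Bundle0/Cavern /SnapRAID/Pool
# ...or trim manually after deleting files
$ sudo fstrim -v /SnapRAID/Pool
The discards still have to make it through ext4, LVM, and the loop device before holes actually get punched in lvm_data0/lvm_data1, so no promises.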

At the end of the day, I think I am just going to use ZFS, but this was a fun exercise.
 

rubylaser

Active Member
Can I ask why you didn't use mhddfs or AUFS to pool your disks? Either of them works very well with SnapRAID, and AUFS is very fast due to its kernel integration. I understand that you are trying to write a >2GB individual file to your array built out of 2GB disks, but I assume this is only a test case (I assume you'd be using much larger vmdks when you truly implement this). If I misunderstood or there's a logical reason for this type of use, please explain, because I'm a little confused. Otherwise, ZFS is definitely a better solution for your use case.
 

cactus

Moderator
I will be using 2TB drives, but have a >2TB file I want to store in the pool. With both mhddfs and AUFS, you are limited to file sizes smaller than the underlying disk. For a pool of files that are each smaller than the smallest disk, AUFS is the most seamless of the pooling solutions I have tested (Greyhole, mhddfs, and AUFS).

Code:
$ dd if=/dev/zero of=Pool/big bs=512k count=5000
dd: writing `Pool/big': No space left on device
3744+0 records in
3743+0 records out
1962881024 bytes (2.0 GB) copied, 9.69948 s, 202 MB/s

$ df -h
...
/dev/sdb1       2.0G  1.1G  854M  57% /SnapRAID/Parity
/dev/sdc1       2.0G   63M  1.9G   4% /SnapRAID/Data1
/dev/sdd1       2.0G   63M  1.9G   4% /SnapRAID/Data0
none            4.0G  125M  3.7G   4% /SnapRAID/Pool

$ mount
.../dev/sdb1 on /SnapRAID/Parity type ext4 (rw,errors=remount-ro)
/dev/sdc1 on /SnapRAID/Data1 type ext4 (rw,errors=remount-ro)
/dev/sdd1 on /SnapRAID/Data0 type ext4 (rw,errors=remount-ro)
none on /SnapRAID/Pool type aufs (rw,relatime,si=60714ce4679fbfc3,create=pmfs,sum)
 

odditory

Moderator
Yeah, pooling generally implies JBOD pooling, and it sits on top of the filesystem layer. It certainly has strengths in scenarios like media storage, where only one disk needs to spin up at a time and files change infrequently. There's a whole lot to like about SnapRAID in that regard.
 

john4200

New Member
I read through the thread and I am still not certain exactly what you need to accomplish. But would it be feasible to use mdadm to make RAID-0 devices of two pairs of 2TB HDDs (one for your big file, the other for your parity file)? Then you could use AUFS for pooling, and SnapRAID for parity and checksums.

It seems like it would be simpler than using LVM.
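If it helps, a minimal sketch of that idea (device names are placeholders, not from this thread):

Code:
# sdX1/sdY1/sdW1/sdZ1 are placeholder partitions
$ sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdX1 /dev/sdY1
$ sudo mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdW1 /dev/sdZ1
$ sudo mkfs.ext4 /dev/md0
$ sudo mkfs.ext4 /dev/md1
md0 would hold the big file, md1 the parity file, and snapraid.conf would point at their mount points like any other data/parity disk.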

On a slightly different subject, I like to do "pooling" using symlinks. It is basically read-only pooling. I still decide which drive to write to when I add data. But for reading, I have a pool which is just a bunch of symlinks. This works well for me since I only add new data to my media server occasionally, and I like to have the control of deciding which drive gets the new data. Actually creating the symlink pool is simple -- just use the 'cp' command with the '-s' option to make symlinks instead of copying the files.
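For example (the paths here are made up), something like this builds a read-only symlink pool:

Code:
# /mnt/linkpool and the Media directories are placeholders
$ mkdir -p /mnt/linkpool
$ cp -rs /SnapRAID/Data0/Media /mnt/linkpool/
$ cp -rs /SnapRAID/Data1/Media /mnt/linkpool/
Note that with -s, cp wants absolute source paths so the resulting symlinks resolve from anywhere.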
 