OmniOS (actually, general zfs raidz1 dumb question)


RobertCO

New Member
Jan 2, 2022
Hi all. I've been reading at length here about raidz1 vs. multiple mirrored vdevs, and I'm tired of the mediocre performance I'm seeing when writing to this pool. I've read that more vdevs = better write performance. I'm in the process of backing up nearly 30TB and am almost ready to reconfigure this pool. But I have a dumb question...

Current config (all disks are Seagate IronWolf 7200 RPM, split between two LSI 9211-8i HBAs flashed to IT mode), running omnios-r151038:

Code:
pool: media
      raidz1-0
        c0t5000C500DCEDAA8Fd0 <ATA-ST12000VN0008-2Y-SC60-10.91TB>
        c0t5000C500DCE731D6d0 <ATA-ST12000VN0008-2Y-SC60-10.91TB>
        c0t5000C500DCEDD2E3d0 <ATA-ST12000VN0008-2Y-SC60-10.91TB>
        c0t5000C500DCE714D4d0 <ATA-ST12000VN0008-2Y-SC60-10.91TB>
        c0t5000C500DCEFC615d0 <ATA-ST12000VN0008-2Y-SC60-10.91TB>
      raidz1-1
        c0t5000C500B5AAD9B9d0 <ATA-ST8000VN0022-2EL-SC61-7.28TB>
        c0t5000C500B5ADD6E3d0 <ATA-ST8000VN0022-2EL-SC61-7.28TB>
        c0t5000C500B6A2E196d0 <ATA-ST8000VN0022-2EL-SC61-7.28TB>
        c0t5000C500B6A14FB8d0 <ATA-ST8000VN0022-2EL-SC61-7.28TB>
        c0t5000C500B6A2360Ed0 <ATA-ST8000VN0022-2EL-SC61-7.28TB>
Proposed config:

Code:
    pool: media
      mirror-0
        c0t5000C500DCEDAA8Fd0  8TB
        c0t5000C500DCE731D6d0  8TB
      mirror-1
        c0t5000C500DCEDD2E3d0  8TB
        c0t5000C500DCE714D4d0  8TB
      mirror-2
        c0t5000C500DCEFC615d0  8TB
        c0t5000C500B6A2360Ed0  8TB
      mirror-3
        c0t5000C500B5AAD9B9d0  12TB
        c0t5000C500B5ADD6E3d0  12TB
      mirror-4
        c0t5000C500B6A2E196d0  12TB
        c0t5000C500B6A14FB8d0  12TB
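If I go this route, the create command would presumably be something along these lines (the angle-bracket names are just placeholders for the c0t...d0 device IDs above; the old pool would have to be destroyed first, and -f may be needed):

Code:
# placeholder device names -- substitute the real c0t...d0 IDs
zpool create media \
  mirror <8TB-disk-1> <8TB-disk-2> \
  mirror <8TB-disk-3> <8TB-disk-4> \
  mirror <8TB-disk-5> <8TB-disk-6> \
  mirror <12TB-disk-1> <12TB-disk-2> \
  mirror <12TB-disk-3> <12TB-disk-4>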
Here's the question: once at least 8TB has been written to all the vdevs, won't future writes to the pool be concentrated on mirror-3 and mirror-4? And, subsequently, reads as well?

I am in a position right now where I could return the 4 12TB drives and get 6 8TB drives for the same price. Would it make more sense to have all the drives in the pool be the same size?
 

gea

Well-Known Member
Dec 31, 2010
You are correct. Over time, as the pool fills up, it becomes unbalanced because newer writes must land on the 12TB disks. I would return the disks and switch to 8TB disks only.

Keep the 12TB disks only if you plan to replace the 8TB disks with 12TB ones at some point; in that case, use the 8TB disks for backup.
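A minimal sketch of that kind of backup, assuming a second pool named backup built from the 8TB disks and replication via send/receive (pool, snapshot and disk names are examples only):

Code:
# example names; <8TB-n> stands for the real c0t...d0 IDs
zpool create backup mirror <8TB-1> <8TB-2> mirror <8TB-3> <8TB-4>
zfs snapshot -r media@backup-2022-01
zfs send -R media@backup-2022-01 | zfs receive -Fu backup/media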

btw
Sequentially, a dual raidz pool is about as fast as a multi-mirror pool, but its random IOPS is only the sum of two disks. Only on random I/O is the multi-mirror layout faster (write IOPS and sequential performance scale with the number of mirrors; read IOPS is roughly twice the write value).
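As a rough worked example, assuming ~100 random IOPS per 7200 RPM disk (illustrative numbers only):

Code:
2 x 5-disk raidz1 : ~200 random write IOPS, ~200 random read IOPS
5 x 2-way mirrors : ~500 random write IOPS, ~1000 random read IOPS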
 

RobertCO

New Member
Jan 2, 2022
gea said:
You are correct. Over time, as the pool fills up, it becomes unbalanced because newer writes must land on the 12TB disks. I would return the disks and switch to 8TB disks only.

Keep the 12TB disks only if you plan to replace the 8TB disks with 12TB ones at some point; in that case, use the 8TB disks for backup.

btw
Sequentially, a dual raidz pool is about as fast as a multi-mirror pool, but its random IOPS is only the sum of two disks. Only on random I/O is the multi-mirror layout faster (write IOPS and sequential performance scale with the number of mirrors; read IOPS is roughly twice the write value).
Thank you.

After having used the two- (or three-) raidz1-vdev configuration for years now, I'm quite looking forward to a much easier-to-manage multi-mirror config. It's also going to be nice to be able to zpool remove a vdev if need be.

"
zpool remove [-np] pool device...
Removes the specified device from the pool. This command
currently only supports removing hot spares, cache, log devices
and mirrored top-level vdevs (mirror of leaf devices); but not
raidz.
"
 

RobertCO

New Member
Jan 2, 2022
Storage never shrink.....
Seems like it can now? omnios-r151038 shows a new zfs feature (at least I haven't seen it before):

device_removal
Top-level vdevs can be removed, reducing logical pool size.
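It can be checked per pool via the feature flags (e.g. on the media pool); the value should read enabled or active:

Code:
zpool get feature@device_removal media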

I decided to test it on a small test pool and it seems to work great.

Starting config:

Code:
root:/# df -h /test
Filesystem      Size  Used Avail Use% Mounted on
test            57G  5.2G   52G  10% /test

root:/# zpool status test
  pool: test
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:00 with 0 errors on Mon Jan  3 14:59:49 2022
config:

        NAME          STATE     READ WRITE CKSUM
        test         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c2t1d0s0  ONLINE       0     0     0
            c2t1d0s1  ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            c2t1d0s2  ONLINE       0     0     0
            c2t1d0s3  ONLINE       0     0     0
          mirror-2    ONLINE       0     0     0
            c2t1d0s4  ONLINE       0     0     0
            c2t1d0s5  ONLINE       0     0     0

errors: No known data errors
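(For anyone wanting to reproduce this: the test pool is just three mirrors built from slices of a single spare disk, created with something like the following; -f may be needed since the mirror halves share a disk.)

Code:
zpool create test \
  mirror c2t1d0s0 c2t1d0s1 \
  mirror c2t1d0s2 c2t1d0s3 \
  mirror c2t1d0s4 c2t1d0s5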
Let's remove the mirror-1 vdev:

Code:
root:/# zpool remove test mirror-1

root:/# zpool status test
  pool: test
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:00 with 0 errors on Mon Jan  3 14:59:49 2022
remove: Evacuation of mirror in progress since Mon Jan  3 15:10:54 2022
    632M copied out of 1.71G at 63.2M/s, 36.19% done, 0h0m to go
config:

        NAME          STATE     READ WRITE CKSUM
        test         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c2t1d0s0  ONLINE       0     0     0
            c2t1d0s1  ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            c2t1d0s2  ONLINE       0     0     0
            c2t1d0s3  ONLINE       0     0     0
          mirror-2    ONLINE       0     0     0
            c2t1d0s4  ONLINE       0     0     0
            c2t1d0s5  ONLINE       0     0     0

errors: No known data errors
While the evacuation is in progress, I/O looks as expected: reads only from the mirror-1 devices, and writes only to the mirror-0 and mirror-2 devices.

Code:
root:/# zpool iostat -v test 5
------------  -----  -----  -----  -----  -----  -----
                capacity     operations     bandwidth
pool          alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
test         6.03G  52.5G    130    153   129M   129M
  mirror      2.17G  17.3G      0     74      0  64.6M
    c2t1d0s0      -      -      0     37      0  32.4M
    c2t1d0s1      -      -      0     37      0  32.2M
  mirror      1.71G  17.8G    130      1   129M  1.00K
    c2t1d0s2      -      -     65      0  64.4M    514
    c2t1d0s3      -      -     65      0  64.4M    514
  mirror      2.16G  17.3G      0     77      0  64.4M
    c2t1d0s4      -      -      0     38      0  32.2M
    c2t1d0s5      -      -      0     38      0  32.2M
------------  -----  -----  -----  -----  -----  -----
Finished:

Code:
root:/# zpool status test
  pool: test
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:00 with 0 errors on Mon Jan  3 14:59:49 2022
remove: Removal of vdev 1 copied 1.71G in 0h0m, completed on Mon Jan  3 15:11:23 2022
    41.6K memory used for removed device mappings
config:

        NAME          STATE     READ WRITE CKSUM
        test         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c2t1d0s0  ONLINE       0     0     0
            c2t1d0s1  ONLINE       0     0     0
          mirror-2    ONLINE       0     0     0
            c2t1d0s4  ONLINE       0     0     0
            c2t1d0s5  ONLINE       0     0     0

errors: No known data errors
The pool has shrunk:

Code:
root:/# zpool list test
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
test    39G  5.10G  33.9G        -         -     0%    13%  1.00x  ONLINE  -
Let's see if those old mirror-1 devices are available or in use:

Code:
root:/# zpool create testbaby mirror c2t1d0s2 c2t1d0s3

root:/# zpool list testbaby
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testbaby  19.5G   102K  19.5G        -         -     0%     0%  1.00x  ONLINE  -

root:/# zpool status testbaby
  pool: testbaby
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        testbaby     ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c2t1d0s2  ONLINE       0     0     0
            c2t1d0s3  ONLINE       0     0     0

errors: No known data errors
I had saved checksums of the files in /test before the removal and took them again afterwards to compare:

Code:
root:/test/tvshow/Season 1# pwd
/test/tvshow/Season 1
root:/test/tvshow/Season 1# sum *.avi > sums.after.save
root:/test/tvshow/Season 1# ls -lh
total 1.4G
-rw-r--r-- 1 root root  168 Jan  3 15:21 sums.after.save
-rw-r--r-- 1 root root  168 Jan  3 15:07 sums.save
-rwxrwxrwx 1 root root 220M Aug 28  2009 tvshow.101.avi
-rwxrwxrwx 1 root root 233M Aug 28  2009 tvshow.102.avi
-rwxrwxrwx 1 root root 233M Aug 28  2009 tvshow.103.avi
-rwxrwxrwx 1 root root 233M Aug 28  2009 tvshow.104.avi
-rwxrwxrwx 1 root root 233M Aug 28  2009 tvshow.105.avi
-rwxrwxrwx 1 root root 257M Aug 28  2009 tvshow.106.avi
root:/test/tvshow/Season 1# diff sums*
root:/test/tvshow/Season 1#
Looks good.
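(And when the scratch pool is no longer needed, cleanup is just:)

Code:
zpool destroy testbaby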
 

gea

Well-Known Member
Dec 31, 2010
You CAN remove mirrors to shrink a pool (the mirror-only limitation is an OpenZFS restriction; Solaris can even remove raidz), but you would be about the first person who wants a smaller rather than a larger pool...
 

RobertCO

New Member
Jan 2, 2022
gea said:
You CAN remove mirrors to shrink a pool (the mirror-only limitation is an OpenZFS restriction; Solaris can even remove raidz), but you would be about the first person who wants a smaller rather than a larger pool...
Ah, I get it. Yeah, I just discovered it was possible after my annual OmniOS update. It just seems like it could come in handy in a few situations.
 

gea

Well-Known Member
Dec 31, 2010
Watch out for ashift. It is a vdev property that can only be set at creation time, and vdev removal requires that all vdevs have the same ashift. This also applies to special vdevs, which can otherwise be removed as well.
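One way to check the ashift of every vdev in an existing pool is to read its cached configuration with zdb (pool name is just an example):

Code:
zdb -C media | grep ashift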