11 disk RAIDz2 hurt performance? should it be 10 or 12 instead?

Discussion in 'Solaris, Nexenta, OpenIndiana, and napp-it' started by bp_968, Jun 11, 2013.

  1. bp_968

    bp_968 New Member

    Joined:
    Dec 23, 2012
    Messages:
    45
    Likes Received:
    0
    I saw some mention of N+P and block/stripe sizes in regards to ZFS while searching for other info. Right now my plan is to use all 11 of the 2TB drives I have in one large RAIDz2 pool and keep a 3TB drive I have as a hot spare, or just as a spare if needed. If building it with 11 drives is going to seriously gimp the system, I can either go ahead and stuff the 3TB in and make it 12 disks, or rebuild the pool with 10 drives and shunt one of them off as a hot spare. Personally I'd rather keep all the space I can, since ZFS doesn't let you widen a RAIDz vdev later. I guess I could also just turn all 11 disks into a RAIDz3 pool.

    How much performance impact are we talking about if I use 11 drives in a RAIDz2?

    If it matters, I do prefer better performance if I can get it, but it's not a primary goal for this pool. This pool is more an archive than constantly active data, though I do have 10Gb connectivity for everything, so I'm not bottlenecked by 1GbE.

    Also, I have a mix of 512B and 4K drives in the pool (and 7200rpm and 5400rpm). Am I better off setting the whole pool to ashift=12 to accommodate the 4K drives, or should I just leave it at ashift=9 (which is what was picked when I created the pool in napp-it)?

    I know mixing the 7200rpm and 5400rpm drives will hold the 7200rpm ones back, but I don't care. I don't have the cash to replace them and don't really want to sell them and buy 5400s to replace them (since only 2 of them are 7200rpm).
     
    #1
  2. gea

    gea Well-Known Member

    Joined:
    Dec 31, 2010
    Messages:
    2,338
    Likes Received:
    784
    There are some basic principles:
    - The more disks you have, the better your sequential performance.
    - You can mix 5400rpm and 7200rpm drives; average performance lands between the two.
    - If you mix 512B and 4K disks, use or force ashift=12 (modify sd.conf, or include at least one disk that reports a 4K physical sector size).
    - Even for a pure 512B pool, I would force ashift=12 to be able to replace disks later with 4K ones.
    (Caveat: slightly less usable capacity with ashift=12.)

    There are "golden numbers" of disks in a Raid-Z: 1,2,4,8,16,32,.. disks (can be divided through ZFS 128k sectorsize) These numbers do not affect stability or performance but how much spave is usable.

    ex: a 10-disk RAIDz2 (8 data disks + 2 redundancy disks) has a capacity of about 8 x one data disk.
    If you add another disk, you do not get 9 x one data disk but somewhat less, though sequential performance improves, so 12 disks in a RAIDz3 may be an option too.
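    The space math above can be sketched numerically. This is my own simplified model, not ZFS source code: with ashift=12, a 128K record is 32 data sectors; RAIDz adds one parity sector per parity level for each stripe row, and pads the allocation to a multiple of parity+1 sectors.

    ```python
    def raidz_efficiency(total_disks, parity, recordsize=128 * 1024, sector=4096):
        """Fraction of allocated space holding user data for one full record.

        Simplified model: the record is split into data sectors, each stripe
        row (up to `ndata` sectors wide) gets `parity` parity sectors, and
        the allocation is padded to a multiple of (parity + 1) sectors.
        """
        ndata = total_disks - parity
        data_sectors = recordsize // sector        # 32 for 128K records on 4K sectors
        rows = -(-data_sectors // ndata)           # ceiling division: stripe rows needed
        alloc = data_sectors + rows * parity       # data + parity sectors
        alloc += (-alloc) % (parity + 1)           # pad to a multiple of parity+1
        return data_sectors / alloc

    # 10-disk and 11-disk RAIDz2 both allocate 42 sectors per 128K record in
    # this model, so the 11th disk adds raw space but not a full data disk
    # of usable space.
    for disks, p in [(10, 2), (11, 2), (11, 3), (12, 3)]:
        print(f"{disks}-disk RAIDz{p}: {raidz_efficiency(disks, p):.1%} efficient")
    ```

    The point of the sketch is the comparison, not the absolute numbers; real ZFS allocation also depends on compression and partial records.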

    Read: ZFS array size calculation - Overclockers Australia Forums


    I would also prefer an 11-disk RAIDz3 over a 10-disk RAIDz2 + hot spare.
    The first option effectively includes a "hot spare that is already resilvered".


    Problems with pools built from vdevs with many disks:
    - long resilver times on disk failure (up to several days), with bad performance meanwhile
    - poor random I/O performance (one RAIDz vdev has the random I/O of a single disk, because every head must be positioned for every read/write)
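    The I/O point above can be captured in a rule-of-thumb sketch (the per-disk numbers are assumed round figures, not measurements): random IOPS scale with the number of vdevs, while sequential throughput scales with the number of data disks.

    ```python
    # Rule-of-thumb model with assumed round numbers for a 7200rpm disk.
    def pool_iops(vdevs, per_disk_iops=100):
        """Random IOPS: each RAIDz vdev behaves like roughly one disk."""
        return vdevs * per_disk_iops

    def pool_seq_mb_s(data_disks, per_disk_mb_s=100):
        """Sequential throughput: scales with total data disks in the pool."""
        return data_disks * per_disk_mb_s

    # One 11-wide RAIDz3 vdev vs two 6-disk RAIDz2 vdevs (8 data disks each way):
    print(pool_iops(1), pool_seq_mb_s(8))   # single wide vdev
    print(pool_iops(2), pool_seq_mb_s(8))   # two narrower vdevs: same sequential, 2x random
    ```

    This is why the two-vdev layout mentioned later in the thread doubles random I/O without giving up sequential speed.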
     
    #2
    Last edited: Jun 11, 2013
  3. bp_968

    bp_968 New Member

    I'm not concerned about performance while the pool is rebuilding; again, it's just for archiving. So if I want things to all align and not waste lots of space, I need to go with Z3 at 11 disks or Z2 at 10 disks?

    As for the 512B vs 4K thing, 2 of the drives are 512B and the rest are all 4K drives, yet napp-it built the pool with ashift=9, not 12. Should I rebuild the pool with ashift=12 when/if I switch it to Z3? If I snagged another 2TB disk I guess I could do 2 RAIDz2 vdevs of 6 disks each. Still, my primary goal for this pool is a large amount of safe storage. R/W speeds of 100-200MB/s or better would be plenty.

    Thanks for the help. The 512B and 4K space-waste thing is still throwing me off a bit. How badly does it hurt performance to stay at 512B instead of going 4K and risking losing lots of space? I mostly store large RAW images, so maybe it wouldn't end up being a big deal.

    Thanks.
     
    #3
  4. bp_968

    bp_968 New Member

    https://dl.dropboxusercontent.com/u..._48_45-OI_SAN1 __ network & nas appliance.png

    There is the current performance of the 11-disk pool of 2TB drives (2 512B, 9 4K drives). So I should just turn it into RAIDz3 so as to not waste so much space on small files or wasteful block writes? Just wanted to get it all hammered out right before I took the gloves off and was done with it.

    EDIT:

    Here is the performance of my 8 mirrored 10,000RPM 300GB drives...
    https://dl.dropboxusercontent.com/u..._31_30-OI_SAN1 __ network & nas appliance.png

    How is it that the mix-n-match of slow 5400rpm drives, 512B and 4K sectored drives, etc., all dumped into one 11-drive vdev, is turning in better numbers than the 10k drives?
     
    #4
    Last edited: Jun 11, 2013
  5. gea

    gea Well-Known Member


    High-capacity disks (even 5400rpm ones) have a much higher data density than low-capacity disks, so pure transfer rates on sequential reads or writes can be higher. Only for small random reads or latency is the high-performance disk better.
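    A quick back-of-the-envelope illustration of the density effect: sequential throughput is roughly bytes per track times revolutions per second. The track sizes below are hypothetical round numbers chosen only to illustrate the shape of the tradeoff, not specs of the actual drives in this thread.

    ```python
    # Sequential throughput ≈ bytes per track × revolutions per second.
    # Track capacities here are illustrative guesses, not real drive specs.
    def seq_mb_s(track_kib, rpm):
        return track_kib * 1024 * (rpm / 60) / 1e6

    # A dense modern 2TB 5400rpm drive vs a sparse older 300GB 10,000rpm drive:
    dense_5400 = seq_mb_s(track_kib=1500, rpm=5400)
    sparse_10k = seq_mb_s(track_kib=500, rpm=10000)
    print(f"{dense_5400:.0f} MB/s vs {sparse_10k:.0f} MB/s")
    ```

    With these assumed numbers the slower-spinning drive wins on sequential transfer, which matches the benchmark surprise above; the 10k drive would still win on seek latency.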
     
    #5
  6. bp_968

    bp_968 New Member


    So I'm going to try it with RAIDz3 so it doesn't waste space. It's still defaulting to ashift=9 (using napp-it to make the pool). I tried adding -o ashift=12 when creating the pool at the command line, but apparently -o ashift=12 is a FreeBSD thing, not an OI thing. My speeds with the 11-drive RAIDz2 at ashift=9 were plenty fast: about 80MB/s random, 427MB/s write and 925MB/s read (sequential, which is what should matter anyway, since it will be writing/reading 20MB-60MB RAW files).

    Here are the RAIDz3 pool and drives. The only drives in that list that are *not* 4K drives are the Hitachi drives (they are 512B sector drives). Should I try to edit sd.conf (or whatever the file is) and manually enter all the drives, or is there a command I'm not seeing that I should use to force ashift=12 for the whole pool? With 925MB/s reads and near 500MB/s writes, do I really need to mess with it? Will it cause me problems in the future? I just want to make sure that once it's set up and I start loading it full of data, I can replace a failed drive without massive hassle. It's just strange that 9 of the 11 disks are 4K disks and yet it keeps setting ashift to 9.

    I'm going to run a bonnie benchmark on the Z3 pool and see what I get (this pool is ashift=9, since I can't figure out how to force it to 12, or whether I should).

    pool: megapool
    state: ONLINE
    scan: none requested
    config:

    NAME STATE READ WRITE CKSUM CAP Product
    megapool ONLINE 0 0 0
    raidz3-0 ONLINE 0 0 0
    c17t4d0 ONLINE 0 0 0 2 TB WDC WD20EADS-11R
    c17t5d0 ONLINE 0 0 0 2 TB WDC WD20EFRX-68A
    c17t6d0 ONLINE 0 0 0 2 TB WDC WD20EFRX-68A
    c7t12d0 ONLINE 0 0 0 2 TB SAMSUNG HD204UI
    c7t26d0 ONLINE 0 0 0 2 TB SAMSUNG HD204UI
    c7t27d0 ONLINE 0 0 0 2 TB ST2000DL003-9VT1
    c7t28d0 ONLINE 0 0 0 2 TB ST2000DL003-9VT1
    c7t29d0 ONLINE 0 0 0 2 TB Hitachi HDS72202
    c7t30d0 ONLINE 0 0 0 2 TB WDC WD20EARS-00M
    c7t31d0 ONLINE 0 0 0 2 TB Hitachi HDS72202
    c7t32d0 ONLINE 0 0 0 2 TB WDC WD20EARS-00M

    errors: No known data errors


    EDIT:

    The RAIDz3 pool scores from bonnie (ashift=9):

    Seq-Wr-Chr Seq-Write Seq-Rewr Seq-Rd-Chr Seq-Read
    80 MB/s 313 MB/s 190 MB/s 83 MB/s 732 MB/s

    Thanks!!!
     
    #6
    Last edited: Jun 12, 2013
  7. gea

    gea Well-Known Member

    There may only be a problem in the future.
    If you ever need to replace a faulted disk, you may no longer be able to buy 512B ones, and in an ashift=9 pool you cannot replace them with true 4K ones.

    If you want to force ashift on Illumos-based systems, read:
    ZFS and Advanced Format disks - illumos - illumos wiki
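    For reference, the approach on that wiki page boils down to an sd.conf override that makes the sd driver report a 4K physical sector size, so ZFS picks ashift=12 at pool creation. A sketch of the entry (the vendor/product strings below are placeholders; the real 8-character vendor field and product ID must be copied exactly from what `format` or `prtconf -v` reports for your disks):

    ```
    # /kernel/drv/sd.conf -- force a 4K physical sector report per disk model.
    # Vendor field is 8 characters, space-padded; product follows it.
    sd-config-list =
        "ATA     ST2000DL003-9VT1", "physical-block-size:4096",
        "ATA     WDC WD20EARS-00M", "physical-block-size:4096";
    ```

    The override only takes effect after the driver rereads the config (reboot or update_drv), and only for pools created afterwards; it does not change an existing ashift=9 pool.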
     
    #7