ZFS/FreeNAS: Large pool design with an HGST 4U60 rack

Discussion in 'FreeBSD and FreeNAS' started by nephri, Jul 28, 2017.

  1. nephri

    nephri Active Member

    Sep 23, 2015
    I have an HGST 4U60 enclosure that I'm filling with 60 HGST 2 TB SAS 6 Gb/s drives.

    I would like advice on a ZFS topology to achieve:
    - 1 large pool for main storage
    - 1 large pool for backup of the main storage

    I'm thinking of something like this:
    - Main pool
      - Stripe of 15 mirror vdevs (30 TB usable with 30 disks)
    - Backup pool
      - Stripe of 5 raidz2 vdevs of 6 HDDs each (40 TB usable with 30 disks)
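
    As a rough check of the capacities quoted for these two layouts, usable space can be estimated as (data disks per vdev) x (number of vdevs) x (disk size). The sketch below is only an approximation: it ignores ZFS metadata and padding overhead and the TB/TiB difference, and the function and variable names are illustrative, not part of any ZFS tool.

```python
# Rough usable-capacity estimate for a striped pool.
# Assumption: every vdev in the pool has the same width and redundancy.

DISK_TB = 2  # each drive in the enclosure is 2 TB


def usable_tb(vdevs: int, disks_per_vdev: int, redundant_per_vdev: int) -> int:
    """Data disks per vdev, times vdev count, times disk size."""
    data_disks = disks_per_vdev - redundant_per_vdev
    return vdevs * data_disks * DISK_TB


# Main pool: 15 two-way mirrors (1 of every 2 disks is redundancy).
main_pool = usable_tb(vdevs=15, disks_per_vdev=2, redundant_per_vdev=1)

# Backup pool: 5 raidz2 vdevs of 6 disks (2 of every 6 disks are parity).
backup_pool = usable_tb(vdevs=5, disks_per_vdev=6, redundant_per_vdev=2)

print(f"main pool   (15 x mirror): {main_pool} TB")
print(f"backup pool (5 x raidz2):  {backup_pool} TB")
```

    Real pools will report somewhat less than this, so the figures are best read as upper bounds.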

    I also have S3700/S3710 SSDs to use for SLOG.
    I would put a mirrored pair of 100 GB S3700 SSDs on the main pool.
    I don't think I need an SLOG for the backup pool.

    Any advice is welcome.
    Last edited: Jul 28, 2017
  2. i386

    i386 Well-Known Member

    Mar 18, 2016
    30 drives in a RAID 10/SAME config?
    I think that's too dangerous, especially with used(?) older HDDs. Only one mirror pair has to fail and you lose the whole pool.

    Backup pool:
    Why not use 2 raidz3 vdevs of 15 drives each? Any 3 drives in each vdev can fail without losing data, and you get more capacity (48 TB).
  3. nephri

    nephri Active Member

    Sep 23, 2015
    Yes, they're used HDDs. I have 10 HDDs as cold spares.

    It's for a home lab. I have the backup if a pool failure occurs, and I can live with a pool outage.
    But I also thought that resilvering with striped mirror vdevs wouldn't stress many disks (only one).

    If you wouldn't go with striped mirrors, what would you do for the main pool?
    For the backup pool, I've often read that you should keep vdevs small and try to stay below 8 HDDs per vdev.

    Another question: the HGST 4U60 has 2 controllers that each handle 30 disks.
    Would you stripe each pool equally across both controllers (probably best performance) or dedicate each pool to one controller (probably best resiliency)?
  4. nitrobass24

    nitrobass24 Moderator

    Dec 26, 2010
    For the main pool I would do the 6-disk Z2 config since it will give you more redundancy and capacity. The cost is longer resilver times, but with 2 TB disks it's a great trade-off in my opinion.

    SLOG - I wouldn't split it across two pools, the results will be suboptimal. Also do you even need it on the backup pool?

  5. nephri

    nephri Active Member

    Sep 23, 2015
    So, everyone recommends 5x raidz2 of 6 disks for both pools (40 TB usable for each).

    I don't plan to put an SLOG on the backup pool.
    I want an SLOG only on the main pool. The SLOG will be a mirror of S3700 100 GB SATA SSDs.

    In terms of performance, which is better: striped raidz2 or striped mirrors?

    The server is built with:
    - The HGST JBOD chassis is connected to a storage server with an LSI 9300-8e HBA
    - The storage server uses a 40 Gb/s NIC (Chelsio T580-CR)
    - The server has 64 GB RAM, but I'm thinking of upgrading to 128 GB
    - The server has 2x Intel SSD DC S3700 100 GB that will be used as SLOG for the main pool
    - The server has 4x Intel SSD DC S3710 400 GB for a striped-mirror pool hosting Proxmox VMs (over iSCSI)
    - The server also has 8x Seagate 3 TB SAS HDDs (but I will try to resell them)
    - The server also has 8x HGST 2 TB SATA HDDs (but I will try to resell them)
  6. _alex

    _alex Active Member

    Jan 28, 2016
    Not sure I'd go with 5x raidz2 over 15x mirrors for the main pool.
    In terms of performance, the raidz2 layout should in theory end up with about 1/3 of the IOPS, as it's 5 vdevs vs. 15 vdevs.
    Sure, if you lose a whole mirror the pool is gone, but as you said, you can live with that and restore from backup, so it shouldn't matter.
    With hot spares, the chances of this ever happening should be low, but definitely not zero.
    [edit: saw you have cold spares, maybe put in 1 or 2 hot spares?]
    Also, resilvering is definitely faster on mirrors, and 2 TB drives should get it done quite quickly.
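
    To put a rough number on "low, but definitely not zero", here is a small sketch. It assumes failed disks are picked uniformly at random and independently, which is optimistic for used drives of similar age, so treat the results as illustrative only. The function names are made up for this example.

```python
from fractions import Fraction
from math import comb


def p_mirror_pool_lost(n_mirrors: int) -> Fraction:
    """Chance that 2 randomly failed disks are both halves of one 2-way mirror."""
    total_disks = 2 * n_mirrors
    # n_mirrors fatal pairs out of all possible pairs of disks.
    return Fraction(n_mirrors, comb(total_disks, 2))


def p_raidz2_pool_lost(n_vdevs: int, width: int) -> Fraction:
    """Chance that 3 randomly failed disks all land in the same raidz2 vdev."""
    total_disks = n_vdevs * width
    # A raidz2 vdev survives any 2 failures, so only 3-in-one-vdev is fatal.
    return Fraction(n_vdevs * comb(width, 3), comb(total_disks, 3))


mirrors = p_mirror_pool_lost(15)    # 15 x 2-way mirror, 2 disks down
raidz2 = p_raidz2_pool_lost(5, 6)   # 5 x 6-disk raidz2, 3 disks down

print(f"15 mirrors, 2 failed disks: {mirrors} = {float(mirrors):.1%}")
print(f"5 raidz2,   3 failed disks: {raidz2} = {float(raidz2):.1%}")
```

    Under these assumptions, two overlapping failures kill the mirror pool about as often as three overlapping failures kill the raidz2 pool, and a faster resilver shortens the window in which an overlap can happen.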

    For the controllers, why not put the first disk of each mirror on the first controller and the second disk on the other?
    That would balance the load for the main pool and protect against controller failure.

    Another option could be 3-way mirrors, 10 vdevs of 3 disks each. Unfortunately that leaves only 1/3 of raw capacity (20 TB usable), but it's still 10 vdevs vs. 5 for raidz2, and you could lose any 2 disks, no matter which vdev they belong to.
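
    The three candidate layouts for the 30-disk main pool can be put side by side with a short sketch. The "relative IOPS" column uses the rule of thumb from this post (random IOPS scale roughly with vdev count); it is an assumption, not a benchmark, and the class and field names are illustrative.

```python
from dataclasses import dataclass

DISK_TB = 2  # 2 TB drives


@dataclass
class Layout:
    name: str
    vdevs: int
    width: int        # disks per vdev
    redundancy: int   # disks per vdev that may fail without data loss

    @property
    def usable_tb(self) -> int:
        return self.vdevs * (self.width - self.redundancy) * DISK_TB


layouts = [
    Layout("15 x 2-way mirror", vdevs=15, width=2, redundancy=1),
    Layout("10 x 3-way mirror", vdevs=10, width=3, redundancy=2),
    Layout("5 x 6-disk raidz2", vdevs=5, width=6, redundancy=2),
]

baseline = layouts[0].vdevs  # compare everything against the 15-mirror pool
for lay in layouts:
    rel_iops = lay.vdevs / baseline
    print(f"{lay.name:18}  usable {lay.usable_tb:2d} TB  "
          f"any {lay.redundancy} failure(s) safe  "
          f"relative IOPS ~{rel_iops:.2f}")
```

    Mirrors can additionally serve reads from either side of a vdev, so the read gap versus raidz2 may be larger than the vdev-count ratio suggests.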
    Last edited: Jul 28, 2017
  7. nephri

    nephri Active Member

    Sep 23, 2015
    Hi Alex,

    They're not hot spares but cold spares. I will have to handle disk replacement manually.
    I don't really like the hot-spare feature for spinning disks.

    For the controllers, that's exactly what I'm thinking.
    But suppose you have this topology:

    Controller A    Controller B
    HD1a            HD1b
    HD2a            HD2b
    HD3a            HD3b
    HD4a            HD4b

    The pool is a stripe of 4 vdevs:
    - HD1a mirrored with HD1b (vdev1)
    - HD2a mirrored with HD2b (vdev2)
    - HD3a mirrored with HD3b (vdev3)
    - HD4a mirrored with HD4b (vdev4)

    If a read needs blocks from all 4 of these vdevs, is ZFS smart enough to, for example:
    - read the blocks of vdev1 and vdev2 through controller A
    - read the blocks of vdev3 and vdev4 through controller B

    in order to optimize controller throughput?
    For 4 vdevs it seems pointless, but for 15+ vdevs it's another story.
    I don't know enough about ZFS internals and behaviour to know what it does under the hood.

    The 3-way mirror is appealing but it's a bit costly in terms of capacity. 50% is already a costly trade-off, but 33% is a no-go for me.
    As it stands, a ZFS setup leaves me with about 1/8 of the raw storage capacity:
    - 50%: split between main and backup pools
    - 50%: pool redundancy
    - 50%: using only half of each pool's capacity for best performance/health (ZFS recommendations)
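
    The 1/8 figure is just the three halvings multiplied together. A minimal sketch of that arithmetic, assuming 60 bays of 2 TB and treating every factor as exactly 50% (a 6-disk raidz2 pool would actually keep 4/6 rather than 1/2 at the redundancy step):

```python
from fractions import Fraction

RAW_TB = 60 * 2  # 60 bays x 2 TB drives

# Each step keeps only this fraction of what the previous step left.
factors = {
    "main/backup split": Fraction(1, 2),
    "pool redundancy": Fraction(1, 2),
    "50% fill target": Fraction(1, 2),
}

remaining = Fraction(1)
for step, kept in factors.items():
    remaining *= kept
    print(f"after {step:18}: {remaining} of raw = {float(RAW_TB * remaining):.0f} TB")

effective_tb = RAW_TB * remaining
```

    So out of 120 TB raw, about 15 TB ends up as comfortably usable primary storage under these assumptions.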

    I will probably go for the striped mirrors for the main pool unless someone convinces me it's a big mistake.

    Last edited: Jul 29, 2017
  8. _alex

    _alex Active Member

    Jan 28, 2016
    Hi Séb,
    As for how ZFS would read, I'm not sure how this is handled.
    It doesn't really depend on which controllers the disks are attached to; the more general question is whether ZFS reads from both disks in a mirror.
    I'm quite sure mdadm does so in a RAID10 setup.

    Yes, the 3-way mirror is sort of a waste of disks.
    Since total capacity is lower, you could use more than 30 disks / 10 vdevs for the main pool and fewer disks in the backup pool. That would balance the usable capacity of the whole box a bit more. But I totally agree that this may be too costly.

    With 2-way mirrors I'd check the SMART data of the disks closely and, wherever possible, pair older with newer HDDs in each mirror,
    e.g. 6k running hours with 25k running hours. Or mix HDDs from different vendors. Just do everything possible to prevent the loss of a whole mirror.

    I'm also curious what others think; these are just my thoughts.
    In the end it all depends on your performance, capacity and fault-tolerance needs.
    Last edited: Jul 29, 2017
  9. nephri

    nephri Active Member

    Sep 23, 2015
    I just installed 36 disks in the 4U60.

    Dual porting is enabled and FreeNAS shows the disks under the "Storage / View Multipaths" menu.
    I looked for these disks under "Storage / View Disks" and was a bit confused when I didn't see them there.

    In case it helps someone else: I wanted to determine where each disk is located in the enclosure so that I can find it when a failure occurs.

    So, I found a way to do it:

    First, I list all disks detected in the enclosure using sas3ircu:
    • sas3ircu 1 display
    which gives, for example, the following for each disk:

    Device is a Hard disk
    Enclosure # : 3
    Slot # : 58
    SAS Address : 5000cca-0-1b3d-c29e
    State : Available (AVL)
    Manufacturer : HITACHI
    Model Number : HUS72302CLAR2000
    Firmware Revision : C442
    Serial No : YFH2YY8D
    GUID : N/A
    Protocol : SAS
    Drive Type : SAS_HDD

    The "Slot #" field gives the slot number in the 4U60 enclosure.
    The "Serial No" field gives the serial number of the disk in that slot.

    Next, we have to determine which device is bound to each disk.
    In "Storage / View Multipaths", each disk appears like this:

    [-] multipath/disk10
    da36 PASSIVE
    da35 ACTIVE

    So, I look up the info on the active device (the passive one reports the same):

    smartctl -a /dev/da35

    This gives the serial number of the disk bound to this device, which can be matched against the serial numbers in the sas3ircu output.
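
    The matching can be scripted. Below is a minimal sketch that parses text in the sas3ircu format shown in this post and pulls the serial out of smartctl output. The smartctl sample is an assumption (a "Serial number:" line, as typically printed for SAS disks), the function names are made up, and the sketch works on captured text rather than invoking the tools itself.

```python
import re
from typing import Dict, Optional

# One device block as printed by "sas3ircu 1 display" (copied from this post).
SAMPLE_SAS3IRCU = """\
Device is a Hard disk
Enclosure # : 3
Slot # : 58
SAS Address : 5000cca-0-1b3d-c29e
State : Available (AVL)
Manufacturer : HITACHI
Model Number : HUS72302CLAR2000
Firmware Revision : C442
Serial No : YFH2YY8D
"""

# Assumed excerpt of "smartctl -a /dev/da35"; the real output has more fields.
SAMPLE_SMARTCTL = """\
Vendor:               HITACHI
Serial number:        YFH2YY8D
"""


def slots_by_serial(display_output: str) -> Dict[str, int]:
    """Map each disk serial number to its enclosure slot."""
    mapping: Dict[str, int] = {}
    slot: Optional[int] = None
    for line in display_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key : value" line
        key, value = key.strip(), value.strip()
        if key == "Slot #":
            slot = int(value)
        elif key == "Serial No" and slot is not None:
            mapping[value] = slot
            slot = None  # wait for the next device block
    return mapping


def serial_from_smartctl(smart_output: str) -> Optional[str]:
    """Extract the serial number from smartctl -a output, if present."""
    match = re.search(r"^Serial number:\s*(\S+)", smart_output,
                      re.IGNORECASE | re.MULTILINE)
    return match.group(1) if match else None


slots = slots_by_serial(SAMPLE_SAS3IRCU)
serial = serial_from_smartctl(SAMPLE_SMARTCTL)
print(f"da35 -> serial {serial} -> slot {slots.get(serial)}")
```

    On the real system the two sample strings would be replaced by the captured output of the two commands, for instance via subprocess.run, looping over the active device of each multipath entry.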

    Now I'm running smartctl and badblocks on them...