
Decision to make: SAN Replacement with Direct Attached Storage

Discussion in 'DIY Server and Workstation Builds' started by Benten93, Jul 15, 2017.

  1. Benten93

    Hi guys,

    right now I am looking into converting my SAN setup (see below) to a DAS setup for my 2-host VMware cluster.

    The Specs of the SAN right now:

    Chassis: Intertech 4U 20-bay
    CPU: Intel Xeon E5-1630 V3 (4x 3.7 GHz + HT)
    Mainboard: Supermicro X10SRH-CLN4F
    RAM: 2x 32GB Samsung DDR4-2133 (64GB)
    Networking: 2x Intel X520-DA2
    Software: FreeNAS 11

    Disks:
    8x 5TB WD RED in Mirrored vDevs (aka RAID10)
    6x 480GB Samsung PM863a in mirrored vDevs (aka RAID10)
    1x SSD internal for OS

    VMware Cluster:

    2x HP DL380p G8 each with:

    CPU: 2x E5-2690 v2
    RAM: 256GB DDR3
    Networking: 10Gbit 2-Port SFP+
    no hard drives, enough room for HBAs

    The setup is running fine right now, but I notice weak I/O performance of my VMs on the iSCSI volumes, even when the SSDs are only 10-20% busy.
    The reason why I'm not going with local storage in the VMware hosts and vSAN is the licensing cost.
    I am running about 40 VMs on the cluster, from AD, Exchange and minor MySQL and MSSQL databases to some game and testing ones.
    Due to network latency the maximum IOPS should be about 4000 (1 s / 0.25 ms) at QD1, but I can't tell if that's the bottleneck that pulls the VMs down that much.
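
    As a quick back-of-the-envelope check of that figure (assuming every QD1 request has to wait one full network round trip and nothing else):

    # 0.25 ms round trip per I/O, one outstanding I/O at a time
    $ awk 'BEGIN { rtt_ms = 0.25; print 1000 / rtt_ms " IOPS upper bound at QD1" }'
    4000 IOPS upper bound at QD1
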
    All in all, I'm reaching out to you now, hoping to get some hints on whether it's a good idea to replace my SAN with a DAS solution like an HP D2600 or similar.

    If you need any further information... no problem!
     
    #1
  2. wildchild

    Running 10G...?
    Latency shouldn't be a problem on your SAN config....
     
    #2
  3. Benten93

    Latency is 0.25ms (ping time), so the maximum IOPS with QD1 is like mentioned above, isn't it?
    At higher QD it's definitely better, but I don't know why my VMs get bad responsiveness...
    So my guess was the SAN "overhead".

    EDIT: To add, each host has one dedicated link directly to the SAN and another connected via a 10G switch:

    SAN_L1 -- Host1_L1
    SAN_L2 -- Host2_L1
    SAN_L3 -- SWITCH -- Host1_L2
    SAN_L4 -- SWITCH -- Host2_L2

    The interfaces in VMware are configured as round robin with an IOPS change frequency of 1!
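
    For reference, that path policy is set per LUN from the ESXi shell roughly like this (the naa.* identifier is just a placeholder; double-check the options against your ESXi version):

    # Show the current path selection policy and round robin settings for a LUN
    # (naa.xxxxxxxxxxxxxxxx is a placeholder for the actual LUN identifier)
    esxcli storage nmp device list -d naa.xxxxxxxxxxxxxxxx

    # Use round robin and switch paths after every single I/O (iops=1)
    esxcli storage nmp device set -d naa.xxxxxxxxxxxxxxxx -P VMW_PSP_RR
    esxcli storage nmp psp roundrobin deviceconfig set -d naa.xxxxxxxxxxxxxxxx --type=iops --iops=1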
     
    #3
  4. wildchild

    Pretty much running the same config as you are; the difference is my HDD pool size (12x 4TB HGST SATA disks), 2x Intel 3700 as SLOG, and a Samsung 840 Pro 256GB over-provisioned to 70%.
    Memory 128GB and Intel 10G.

    SAN OS is OmniOS.
    VMware 6, 2x HP DL360 G6, 96GB.

    Not seeing any of your latencies, so either you're having FreeNAS iSCSI issues or VMware 10G driver errors.
     
    #4
  5. Benten93

    What ping time do you have, host to SAN?
     
    #5
  6. wildchild

    Why would you want to set I/O to 1 on 10G?
    That would be limiting your throughput.
    Even on 1Gb, 10 would be more appropriate.

    Experiment a bit with that.

    I have multipathing too, running through 2 separate 10G IP subnets, on 2 L2 switch fabrics (Ubiquiti UniFi 10G switches, running their alpha firmware).

    vmkping times vary between 0.201 and 0.280 ms under full load.
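
    (Measured from the ESXi shell along these lines; vmk1 and the target address are placeholders for your storage vmkernel port and the SAN's IP:)

    # Ping the SAN over a specific vmkernel interface, 100 packets
    # (vmk1 and 192.168.x.x are placeholders)
    vmkping -I vmk1 -c 100 192.168.x.x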
     
    #6
  7. Benten93

    I tested that some time ago; if I have the test results nearby I will post them.
    For best performance, would direct attached via SAS be better?
     
    #7
  8. wildchild

    Maybe, depends on what is needed; if you have a disk behaving badly, DAS wouldn't give you better performance either.
    Point being, I think there's something off with your config, or maybe with the FreeBSD you're running..

    Try an OpenIndiana or Oracle Solaris live CD and test once more.

    Occasionally even performing a low-level format on your SSDs may improve things a lot
     
    #8
  9. Benten93

    Thanks for your answer!
    Today I tried a CentOS 7 on that SAN and I don't know what could be wrong.. I got similar results..
    I am not sure what I should expect of my SAN performance.
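
    One way to get a baseline that takes iSCSI and the network out of the picture would be a local fio run on the SAN itself; a rough sketch (the file path is a placeholder):

    # Random 4K reads at queue depth 1, run locally on the SAN so iSCSI and the
    # network are out of the path. /mnt/tank/fio.test is a placeholder path;
    # on FreeBSD/FreeNAS use --ioengine=posixaio instead of libaio.
    fio --name=qd1-randread --filename=/mnt/tank/fio.test --size=4G \
        --rw=randread --bs=4k --iodepth=1 --ioengine=libaio \
        --runtime=60 --time_based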
     
    #9
  10. wildchild

    Now, as OmniOS/OpenIndiana/Solaris are the birthplace of Solaris, I would suggest using them (maybe combined with gea's excellent napp-it web GUI) to baseline your stuff and start checking back from there what could be off, because then you would have a "known to work" base OS.
    If ZFS is still off, a good look at your pool(s) and their config would be needed.

    Have you tried a full erase (check Thomas Krenn AG for a how-to) yet?
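
    (For SATA SSDs that usually means an ATA secure erase with hdparm from a Linux live system; a rough sketch, /dev/sdX is a placeholder and this destroys all data on the drive:)

    # The drive must not be in the "frozen" state (suspend/resume the box if it is)
    hdparm -I /dev/sdX | grep -i frozen

    # Set a temporary security password, then issue the secure erase (wipes everything)
    hdparm --user-master u --security-set-pass p /dev/sdX
    hdparm --user-master u --security-erase p /dev/sdX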
     
    #10
  11. wildchild

    Birthplace of ZFS is what I meant, of course :)
     
    #11
  12. niekbergboer

    The latency of a once-per-second ping is not indicative of the latency you'll see on a busy system; that one ping has to wait for all power-saving modes (hosts, VMs) to exit.

    Try pinging 100 times per second, and you'll see far lower times.

    Case in point (pinging a Proxmox VE peer from another peer):
    $ ping -c 10 -n 192.168.13.11
    PING 192.168.13.11 (192.168.13.11) 56(84) bytes of data.
    64 bytes from 192.168.13.11: icmp_seq=1 ttl=64 time=0.215 ms
    64 bytes from 192.168.13.11: icmp_seq=2 ttl=64 time=0.295 ms
    64 bytes from 192.168.13.11: icmp_seq=3 ttl=64 time=0.251 ms
    64 bytes from 192.168.13.11: icmp_seq=4 ttl=64 time=0.152 ms
    64 bytes from 192.168.13.11: icmp_seq=5 ttl=64 time=0.360 ms
    64 bytes from 192.168.13.11: icmp_seq=6 ttl=64 time=0.406 ms
    64 bytes from 192.168.13.11: icmp_seq=7 ttl=64 time=0.147 ms
    64 bytes from 192.168.13.11: icmp_seq=8 ttl=64 time=0.170 ms
    64 bytes from 192.168.13.11: icmp_seq=9 ttl=64 time=0.354 ms
    64 bytes from 192.168.13.11: icmp_seq=10 ttl=64 time=0.348 ms

    --- 192.168.13.11 ping statistics ---
    10 packets transmitted, 10 received, 0% packet loss, time 9206ms
    rtt min/avg/max/mdev = 0.147/0.269/0.406/0.093 ms


    and many times per second:

    $ sudo ping -A -c 100 -n 192.168.13.11
    PING 192.168.13.11 (192.168.13.11) 56(84) bytes of data.
    64 bytes from 192.168.13.11: icmp_seq=1 ttl=64 time=0.073 ms
    64 bytes from 192.168.13.11: icmp_seq=2 ttl=64 time=0.086 ms
    64 bytes from 192.168.13.11: icmp_seq=3 ttl=64 time=0.049 ms
    64 bytes from 192.168.13.11: icmp_seq=4 ttl=64 time=0.039 ms
    64 bytes from 192.168.13.11: icmp_seq=5 ttl=64 time=0.031 ms
    64 bytes from 192.168.13.11: icmp_seq=6 ttl=64 time=0.052 ms
    64 bytes from 192.168.13.11: icmp_seq=7 ttl=64 time=0.053 ms
    64 bytes from 192.168.13.11: icmp_seq=8 ttl=64 time=0.041 ms
    64 bytes from 192.168.13.11: icmp_seq=9 ttl=64 time=0.056 ms
    [...]
    64 bytes from 192.168.13.11: icmp_seq=97 ttl=64 time=0.024 ms
    64 bytes from 192.168.13.11: icmp_seq=98 ttl=64 time=0.024 ms
    64 bytes from 192.168.13.11: icmp_seq=99 ttl=64 time=0.023 ms
    64 bytes from 192.168.13.11: icmp_seq=100 ttl=64 time=0.025 ms

    --- 192.168.13.11 ping statistics ---
    100 packets transmitted, 100 received, 0% packet loss, time 3ms
    rtt min/avg/max/mdev = 0.022/0.027/0.086/0.010 ms, ipg/ewma 0.037/0.024 ms
     
    #12
    Last edited: Jul 17, 2017
  13. Benten93

    Thanks for your suggestions!

    I made two test benchmarks, one for each datastore. In comparison I noticed the nearly identical 4K QD1 results (2000-3000 IOPS read, 4000-5000 IOPS write).
    Regarding the 4K QD32 results on the HDD datastore (111MB/s read, 20MB/s write), one can clearly see the caching effect of the SAN's RAM.
    If I take a look at the 4K performance on the SSD datastore, I was expecting far more.. (50k read, 27k write)

    I am in the process of setting up an interim storage to tinker around with that SAN box a little more.

    Does anyone have a number or info on how many IOPS an iSCSI 10G connection can deliver (theoretical maximum)?
    Later I will try the same benchmarks again and watch the esxtop statistics. Maybe there is a hidden misconfiguration...
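
    (For reference, the latency counters I plan to watch in esxtop:)

    # Run esxtop in the ESXi shell, then press:
    #   d - storage adapter view, u - storage device (LUN) view, v - per-VM storage view
    # Watch DAVG/cmd (latency at the device/array), KAVG/cmd (time spent in the
    # VMkernel, should stay well below 1 ms) and GAVG/cmd (total latency the guest sees)
    esxtop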
     

    Attached Files:

    #13
    Last edited: Jul 17, 2017
  14. i386

    #14
    Benten93 likes this.
  15. Benten93

    Really nice article! Maybe I am testing with too few hosts?

    EDIT:

    I briefly tested the SSD datastore with the ATTO benchmark, see results as attachment.
    Maybe someone can explain to me why the reads are capped while the SSDs are doing nothing at all? Seems like a FreeNAS cache mechanism is blocking higher speeds..
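
    (One thing that might narrow it down is a raw network throughput test between host and SAN, to see whether the cap is the network or the iSCSI stack; sketch below, the IP is a placeholder and iperf3 would have to be available on the FreeNAS box:)

    # On the SAN:
    iperf3 -s

    # From a VM on the ESXi host, 4 parallel streams for 30 seconds
    # (192.168.x.x is a placeholder for the SAN's storage IP):
    iperf3 -c 192.168.x.x -P 4 -t 30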
     

    Attached Files:

    #15
    Last edited: Jul 17, 2017