
VMware vSAN Performance :-(

Discussion in 'VMware, VirtualBox, Citrix' started by Yves, Apr 4, 2018.

  1. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
I passed them through - that's why I only changed the driver in the Starwind VM
     
    #61
  2. Yves

    Yves Member

    Joined:
    Apr 4, 2017
    Messages:
    53
    Likes Received:
    13
So today I finally had some spare time to give vSAN a spin again... I installed MPIO on the Win2k16 DC vSAN client system; these are my first results:

[image: benchmark screenshot]

512, 1k, 2k, 4k are still pretty low... in the middle range there's no big change, and at the top I am getting pretty decent speeds, I think...

edit: I guess for the bigger block sizes 256 MB is too small, so I did a rerun with 4 GB, here you go:
[image: benchmark screenshot]

edit2: some more optimizations done
[image: benchmark screenshot]
     
    #62
    Last edited: Apr 12, 2018
  3. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
    What have you changed then?
     
    #63
  4. Yves

    Yves Member

    Joined:
    Apr 4, 2017
    Messages:
    53
    Likes Received:
    13
Oh, sorry... I thought I had written it down. I guess I corrected and changed some parts of my text and forgot the most important part...

I went from 1x 10Gbit link to 2x 10Gbit links with round robin... and installed MPIO for handling the two iSCSI connections.
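For reference, the Windows side of that MPIO setup can be sketched roughly like this (a sketch only - these are the standard MPIO-module cmdlets, but verify against your 2k16 build):

```shell
# PowerShell on the Windows Server 2016 iSCSI initiator.
# Sketch of the round-robin MPIO setup described above.

# 1. Install the MPIO feature (a reboot is required afterwards)
Install-WindowsFeature -Name Multipath-IO

# 2. Let the Microsoft DSM automatically claim iSCSI devices
Enable-MSDSMAutomaticClaim -BusType iSCSI

# 3. Set the default load-balance policy to round robin,
#    matching the 2x 10Gbit link setup above
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR
```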

Hopefully tomorrow I will get some feedback from the Starwind guys on why my 512b, 1k, 2k, 4k blocks are so slow...
     
    #64
  5. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
So I realized I already have abysmal performance on the NVMe drives natively on the Starwind box, so I'll have to fix that before re-running tests. Not sure what happened to them :/
     
    #65
  6. Yves

    Yves Member

    Joined:
    Apr 4, 2017
    Messages:
    53
    Likes Received:
    13
    Well I hope I also get some feedback tomorrow about the small blocksizes...

Also, I took another run at VMware vSAN... While studying the vSAN books some more and reading further, I found out something interesting... But I want to try it out first :) will keep you posted.


    Sent from my mobile
     
    #66
  7. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
    So what have you found out?

I think I got hit by the Spectre/Meltdown fix - big time :(
Will create a new post but ...
So: Starwind VM on a patched ESX (no BIOS patch, no Windows patch) on a P3700-based datastore
    upload_2018-4-14_20-54-3.png


    Same vm on a not patched ESX (another p3700/400gb based datastore)

    upload_2018-4-14_20-54-31.png
     

    Attached Files:

    #67
    Yves likes this.
  8. K D

    K D Well-Known Member

    Joined:
    Dec 24, 2016
    Messages:
    1,179
    Likes Received:
    243
    Ouch!
     
    #68
  9. Evan

    Evan Well-Known Member

    Joined:
    Jan 6, 2016
    Messages:
    1,922
    Likes Received:
    273
About what I expected, I guess. With all the context switching involved in I/O and the nature of the Spectre/Meltdown fixes, it's bound to affect I/O benchmarks the most.
     
    #69
  10. K D

    K D Well-Known Member

    Joined:
    Dec 24, 2016
    Messages:
    1,179
    Likes Received:
    243
Yeah, but I was not expecting an 85% hit like your results show.
     
    #70
  11. Yves

    Yves Member

    Joined:
    Apr 4, 2017
    Messages:
    53
    Likes Received:
    13
To enlighten everyone with the idea I spoke of two days ago (which actually did not work as well as I had it in my head)...

I did some reading up on technical deep dives into VMware vSAN and saw that in an All-Flash vSAN setup the caching tier drive will only be used for read acceleration, not for writing. As far as I understand, writes go directly to the capacity drives if they are flash... This is why the vSAN read speeds are very high and the writes are pretty bad... So I thought I'd be very clever and tag the flash drives as normal magnetic drives, so the Intel Optane would also be used for caching writes... But (at least for me) this did not work out: still very poor write speeds.

So I did some more digging into vSAN, set up my vSAN cluster again... single node, just for testing purposes... and changed the benchmarking tool to a benchmarking solution (the official HCIBench from VMware), and the results are heading in the right direction...

If I am not totally mistaken, this is not that bad after all...

    HCIbench easy run. (4k, 70/30% read/write 100% random)
    Code:
    Datastore:  vsanDatastore
    =============================
    Run Def:    RD=run1; I/O rate: Uncontrolled MAX; elapsed=3600 warmup=1800; For loops: None
    VMs         = 2
    IOPS        = 49323.74 IO/s
    THROUGHPUT  = 192.67 MB/s
    LATENCY     = 1.2795 ms
    R_LATENCY   = 1.3070 ms
    W_LATENCY   = 1.2160 ms
    95%tile_LAT = 2.2106 ms
    =============================
    Resource Usage:
    CPU USAGE  = 73.35%
    RAM USAGE  = 21.6%
    VSAN PCPU USAGE = 25.1843%
    =============================
    If you are interested in improving the IOPS/THROUGHPUT/LATENCY, please find the details in file performance_diag_result.html in directory vdb-8vmdk-100ws-4k-70rdpct-100randompct-4threads-1523779786 
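As a quick sanity check of that output, the throughput line follows directly from IOPS times block size, assuming HCIBench's "MB/s" is actually binary MiB/s:

```python
# Sanity check on the HCIBench numbers above: throughput = IOPS * block size.
# Assumption: the reported "MB/s" is MiB/s (2**20 bytes).
iops = 49323.74          # reported IOPS
block_bytes = 4 * 1024   # 4k blocks from the easy-run profile

throughput = iops * block_bytes / 2**20
print(round(throughput, 2))  # 192.67 - matches the reported THROUGHPUT
```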
     
    #71
  12. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
Well, benchmarks only get you so far... in the end the question is what your use case is and how well the system performs there.

For me it was quick vMotion (>500MB/s) and fast local client storage on random files, and that didn't work with my vSAN setup (despite an all-NVMe setup). Very sad, because I really like its simplicity.

    On a brighter note - I reran Starwind Benchmarking today from an unpatched host (Optane based File Datastore, no Cache, no Flash associated).

    Windows VM on Native Optane based ESX datastore (Based on latest ESX patch from 2017, no newer NVMe driver)
    upload_2018-4-15_16-47-13.png

    Same VM on other ESX (patched!) on iSCSI datastore from Starwind VM on unpatched host (40 GB Ethernet, MTU 9000)

    upload_2018-4-15_16-49-1.png

Now that's of course without HA, no optimization, and only a 280GB capacity drive - but that gives hope - even if the 4K numbers are horrible. Surprising they are so good on Optane anyway; the P3700's were around 36.

    Will setup a Raid based on S3700's next
     
    #72
  13. Yves

    Yves Member

    Joined:
    Apr 4, 2017
    Messages:
    53
    Likes Received:
    13
Very nice values :) the iSCSI datastore values are also nice. Did you use multiple connections or only one? I had better performance with multiple connections, and the Tech White Paper has some optimization settings starting on page 45.

About that, I just wanted to ask - maybe I need to start a new thread for this, or ask somewhere else - but isn't there a statistic where you can see which block sizes your VMs use most, and whether there is more linear reading/writing or more random? I thought there was a way to get that out of the system.

    P.s. I am also in contact with a guy from SmartX Halo who says their SDS performs better than Ceph, vSAN and Nutanix... Maybe I will give that a spin too.
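On the ESXi side there is in fact such a statistic: the built-in vscsiStats tool collects per-VM I/O histograms, including I/O length (block size) and seek distance (sequential vs. random). Roughly like this - flags from memory, so check `vscsiStats -h` on your ESXi version:

```shell
# Run in the ESXi host shell; <worldGroupID> is the VM's ID from the first command.
vscsiStats -l                                  # list VMs and their worldGroupIDs
vscsiStats -s -w <worldGroupID>                # start collecting for one VM
# ... let the normal workload run for a while ...
vscsiStats -p ioLength -w <worldGroupID>       # histogram of I/O block sizes
vscsiStats -p seekDistance -w <worldGroupID>   # near-zero distances => sequential I/O
vscsiStats -x -w <worldGroupID>                # stop collecting
```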
     
    #73
  14. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
Ah, I've wondered that myself (what to optimize for and what to benchmark with).
Compuverde has a nice breakdown in their GUI that shows some of the values (clustered, unfortunately), but that at least gives an idea.

And no, this is a single VM, single connection - since my use case is always a few users (2-3), it does not make sense for me to test with multiple connections/multipath.
But with Starwind I can go for an iSER setup, so network-wise there will be options; I'm just trying to get a feeling for the basic capabilities.

    Does SmartX have a free/Home version?;)
     
    #74
  15. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
    So another try, Raid10 (on M1015 in IR Mode) of 8 S3700's (400GB):
    Native on Starwind Box:
    upload_2018-4-16_0-24-52.png

    On VM via iSCSI
    upload_2018-4-16_0-28-3.png


    Still wonder why writes are better than reads...


As a comparison, my original 3-node vSAN setup boosted with Optane as cache (& 750's as capacity) (pre Meltdown/Spectre fixes, of course)
    upload_2018-4-16_0-30-16.png


    Will need to see how that works for real.
Just wondering whether Starwind HA is enough to forgo the Raid10 on each box and go Raid0 per box... maybe with a spare disk at hand and daily backups?... Dunno, sounds risky.
     
    #75
  16. Evan

    Evan Well-Known Member

    Joined:
    Jan 6, 2016
    Messages:
    1,922
    Likes Received:
    273
@Rand__ if you can, it would be better to run some RAID on each system, but for home use only, what you're thinking may well work in your situation.

(For me, the other replicated node is enough; SSDs have pretty low failure rates anyway)
     
    #76
  17. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,118
    Likes Received:
    253
Well, Raid 0 is also 'a' Raid ;)
But I have enough drives to run Raid 10 on 2 nodes; I was just looking for the speed increase ;)
Will need to evaluate the likelihood of errors here - some of my drives have been used a bit already, but on the other hand they're datacenter SSDs with still a ton of life left.

The questions probably would be: is my latest backup recent enough, how long is my recovery time (in case of catastrophic failure of both Raid 0's), and is that going to be ok/manageable (with my main ESX datastore, including all infra/client VMs (except tape & online backup), gone)?

It might work, but I'd need to set it up appropriately. Ah, I need to think about that.

Will also need to look into a 3-way Starwind setup (I know, not with the free version; might be able to get an NFR license) to see whether that would help.
A 3x7 Raid0 should beat a 2x10-disk Raid 10.
It would also go easier on the Raid cards needed (3x 8i vs 2x 16i or 4x 8i). Not even sure if I can create a hardware Raid spanning two controllers - guess not, but it's been ages since I last used hardware Raid.
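The back-of-the-envelope numbers for the two layouts look like this (a sketch assuming the 400GB S3700s mentioned earlier and ideal linear striping, ignoring controller and network limits):

```python
# Hypothetical comparison of the two layouts discussed above,
# assuming 400 GB drives and ideal striping (no controller bottleneck).
DRIVE_GB = 400

# 3 nodes x 7-disk Raid0: every disk contributes capacity and stripe width
raid0_capacity = 3 * 7 * DRIVE_GB            # 8400 GB across the cluster
raid0_stripe_per_node = 7                    # data spindles per node

# 2 nodes x 10-disk Raid10: half the disks are mirrors
raid10_capacity = 2 * (10 // 2) * DRIVE_GB   # 4000 GB across the cluster
raid10_stripe_per_node = 10 // 2             # 5 data spindles per node

print(raid0_capacity, raid10_capacity)                    # 8400 4000
print(raid0_stripe_per_node > raid10_stripe_per_node)     # True
```

The trade-off, of course, is that any single-disk failure takes down a whole Raid0 node, which is exactly the backup/recovery-time question above.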
     
    #77
  18. ecosse

    ecosse Active Member

    Joined:
    Jul 2, 2013
    Messages:
    246
    Likes Received:
    34
Guys - I'll repeat my question in the hope you can cover this: on Starwind, how easy is it to manage from the CLI? I'm not too bothered about speed; my use cases are largely bound by my internet speed. But management and resilience are key. I guess I could look at this myself, but I find the Starwind site somewhat confusing when it comes to a standard design (borne out by this thread!), and with limited time I don't really want to learn something without some inkling that it will be worthwhile :)
     
    #78