VMware vSAN Performance :-(

Discussion in 'VMware, VirtualBox, Citrix' started by Yves, Apr 4, 2018.

  1. Yves

    Yves Member

    Hi there,

    After successfully building my SDS cluster, I tinkered around for about three days configuring everything the way I wanted (JBOD/RAID0 mode, updating all the firmware and BIOSes, checking the VIBs for all the components, etc.). Today I finally had all 3 nodes up and running ESXi 6.5 U1 and made some initial baseline benchmarks on all of the systems.

    This is a Win2k16 DC with PVSCSI on a local datastore (VMFS6) on the Intel Optane 900p:
    [screenshot]

    This is the same Win2k16 DC with PVSCSI on the vSAN datastore spread across the 3 nodes, with 3x 900p as cache tier and 6x Intel DC S4600 as capacity:
    [screenshot]

    Write performance is HORRIBLE... I can't understand what is going wrong here; every check passes and everything is green in the vSAN menu. And yes, I know ATTO is not HCIBench, but sorry... 148 for writes at 512 B? Seriously? An old QNAP TS-2xx over iSCSI is faster...
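    If anyone wants to reproduce this with something scriptable instead of ATTO, a minimal diskspd run inside the guest would be along these lines (file name, path, and sizes are just examples, not exactly what I ran):

        rem 60 s of 4K random writes, 4 threads, QD32, caching disabled, 10 GiB test file
        diskspd.exe -b4K -d60 -t4 -o32 -r -w100 -Sh -c10G D:\iotest.dat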

    Any ideas?

    Thanks guys... I am a bit frustrated and tired from all the tinkering around, and from this end result...

    Ah, btw, this is the storage the vSAN cluster is supposed to replace (my good old TS-EC1279U-RP with 2x 10Gb SFP+ iSCSI uplinks, which seems to beat the shit out of the vSAN...):
    [screenshot]
     
    #1
  2. Evan

    Evan Well-Known Member

    I would never expect it to be that bad, and I am sure something can't be right, but I certainly wouldn't expect a 3-disk-group vSAN system to run amazingly even with Optane cache drives. It's just not built like that: it doesn't do data locality etc., it's designed for equal performance for all VMs. Having said that, an all-flash config is 'ok'.
     
    #2
  3. Dean

    Dean Member

    Poop... I am literally about to do the same: a 3-node setup, but with some 12Gb SAS drives and SSDs for cache. Not giving me the warm and fuzzies.

     
    #3
  4. hlhjedsfg

    hlhjedsfg New Member

    I recently built a 3-node SDS with Proxmox + Ceph (per node: 2x DC S3500 for OSDs + 1x DC S3700 for the journal, with 10 Gb/s direct attachment), and I got about 7,000 IOPS at 4K random in IOMeter; I didn't run ATTO. It was able to run 72 Windows 2016/10 VMs for my students, and everything was pretty smooth. Your numbers seem pretty low to me at 4K sequential (14 MB/s). I will give ATTO a try and then try it with the VMware software.
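    For reference, my IOPS numbers came from IOMeter inside the guests; on the Ceph side a quick sanity check is rados bench, where the pool name here is just an example:

        # 60 s of 4 KiB writes against a pool named "rbd"; keep the objects for the read pass
        rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup
        # random reads over the objects written above
        rados bench -p rbd 60 rand -t 16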
     
    #4
  5. Yves

    Yves Member

    @Evan Thanks for your reply. I am as surprised as you are...

    @Dean Yaaapp! That's exactly how I feel right now... after spending quite a bit on my homelab SDS...

    I know we are not talking dual P4800X cache tier with P4600 capacity drives and 40 Gb interconnects here, but... 148... and the rest... that can't be right. To be honest, that's exactly my biggest fear with VMware vSAN: not that it doesn't perform the way I want, but that I cannot explain *why* it isn't performing the way I want. Even after watching all the vSAN Network Architecture / vSAN Architecture Series YouTube videos, where Elver explains the complete vSAN stack (very well), you are still left with almost a black box holding all your data... and if you read threads like "Garbage vSAN performance from an all-flash HCL-approved rocketship" or "My All Flash VSAN Nightmare" on reddit, you get a weird feeling in your stomach...

    I will start posting some pictures of my vSAN software settings... maybe something sticks out...
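    In the meantime, here are the text-mode equivalents from one of the hosts; standard esxcli namespaces, nothing exotic (run from an ESXi shell):

        esxcli vsan cluster get        # cluster membership and node state
        esxcli vsan storage list       # disk groups: cache vs. capacity devices
        esxcli vsan network list       # vmk interface carrying vSAN traffic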
     
    #5
  6. Yves

    Yves Member

    This is the policy I created for benchmarking. It's not even RAID 1; it lives on only the one system. I know I could change "Number of disk stripes per object" to 2 to make it use both capacity disks (see the esxcli sketch below the screenshot)...
    [screenshot]
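    As a text-mode reference, the same stripe-width knob also exists in the host-side default policy. A minimal esxcli sketch; note this only changes the default for objects created without an SPBM policy, so it's illustration rather than a fix for the policy above:

        # show the current default vSAN policy per object class
        esxcli vsan policy getdefault
        # example: default new virtual disks to 2 stripes with FTT=1
        esxcli vsan policy setdefault -c vdisk -p "((\"stripeWidth\" i2) (\"hostFailuresToTolerate\" i1))"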

    According to the vSAN health checks, everything is in order:
    [screenshot]

    Another pretty common issue is the queue depth of the RAID controller (hba0), which in my case is good enough (see the commands below the screenshot if you want to check yours):
    [screenshot]
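    For anyone who wants to check theirs without screenshots, a quick look from the ESXi shell (esxtop's disk adapter view, key 'd', shows the same thing in the AQLEN column):

        esxcli storage core adapter list                    # map hbaN to the actual controller
        esxcli storage core device list | grep -i "Queue"   # per-device "Device Max Queue Depth"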

    This is the result I get (right now the drives are in RAID0 mode, as they should be according to the HCL; before, they were in JBOD mode):
    [screenshot]
     
    #6
  7. child of wonder

    VSAN is a steaming pile so these results don't really surprise me. Try turning on EC, dedupe, and checksums!
     
    #7
  8. Rand__

    Rand__ Well-Known Member

    I think he means turn it off ;) Checksums especially are known to have a significant impact.

    I have had somewhat better results with similar hardware (40GbE, Optane + Intel 750s as capacity); I also tried P3700s.
    It should definitely be better than what you have now, but don't expect miracles. I have similar complaint/plea-for-help threads in this forum.

    Lessons learned so far:
    - vSAN is designed to give consistent performance to many concurrent users; it is *not* good at providing great performance for a few users.
    - vSAN does not make great use of NVMe (but then, no system I have tested yet does).
    - vSAN speed is governed only by cache drives x disk groups. If you need more speed and you already have fast cache, you need to add more disk groups that are written to concurrently (see the RVC sketch after this list). For a time I ran 6 P3700s and 6 750s in 6 disk groups using 2 concurrent writes; better, but still far from good. I am still contemplating running 12 disk groups (1 S3700 cache, 1 S3700 capacity) to see what that gives me...
    - vSAN is great for ease of implementation: integration into ESXi is really simple, and it's fairly easy to set up and maintain, but it's not made for speed with small deployments.
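    For the disk-group point above: RVC on the vCenter appliance shows how IO actually spreads across the disk groups. A sketch, where the login and cluster path are examples to adjust to your own inventory:

        rvc administrator@vsphere.local@localhost
        > vsan.disks_stats /localhost/MyDC/computers/MyCluster
        > vsan.observer /localhost/MyDC/computers/MyCluster --run-webserver --force

    disks_stats shows per-disk usage and balance; observer serves live per-disk-group stats over a built-in web page (port 8010 by default).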
     
    #8
  9. child of wonder

    Look at all the things a person has to consider with VSAN:

    - Cache drives
    - Disk Groups
    - Drive types
    - Controller types
    - VSAN HCL
    - FTT
    - Erasure Coding
    - Dedupe
    - Compression
    - Encryption
    - Checksums
    - on and on and on

    How is this simple? Why not just get a simple shared storage array and skip all this complexity and wasted time for something that ultimately doesn't perform very well anyway?
     
    #9
  10. Rand__

    Rand__ Well-Known Member

    It totally depends on the use case ;)
    If you are happy running 2 extra storage nodes next to your compute cluster, that's perfectly fine.

    I'd totally love a well-performing hyperconverged 2- or 3-node cluster. Just haven't found one yet :p

    What's your take on a 2-node, fully synced, HA-capable shared storage system with low power utilization that doesn't cost an arm and a leg? I have not found one yet, unfortunately.
     
    #10
  11. Yves

    Yves Member

    @Rand__ Thanks a lot for your very detailed response. It's exactly what I feared most: that I have to abandon this idea of a vSAN cluster, since its performance is way off compared to the NAS this vSAN cluster is actually supposed to replace... The only strange thing is that I really, really think it should perform better. And I don't strictly need vSAN; I have a compute cluster on a BC C7000, but I thought, let's give vSAN a spin... I guess this is why you won't find benchmark results for vSAN.

    Does anyone have another high-performance solution? I have 3 good servers that can run anything you throw at them... Nutanix / RH Ceph cluster / etc.?
     
    #11
  12. Yves

    Yves Member

    Totally agree... And I really thought I had thought of everything... but it does not seem like it...


     
    #12
  13. Rand__

    Rand__ Well-Known Member

    I still think what you have to think about is not too bad; most of these things have a preconfigured setting and are just options. If you have a RAID box you have options too ;)

    And no, I don't know of a high-perf solution with 3 boxes yet. ScaleIO was the closest I found so far, but that's dead (the free version, anyway). I have not tried Nutanix due to their calling-home functionality. Compuverde has that too, but at least they tell you what they are sending.

    That's why I wondered what @child of wonder suggests as an option :)
     
    #13
  14. Yves

    Yves Member

    True, true... But I still think the numbers are way off. If I look at the single-node tests from Florian Grehl at virten.net or the VMworld Hackathon setup of William Lam... Still trying to find the bug...

    As for alternatives: what about StarWind vSAN, SmartX Halo, or a Red Hat Ceph cluster?
     
    #14
  15. Evan

    Evan Well-Known Member

    Ceph is the same as vSAN for small deployments.
    StarWind or other single-node-plus-replication solutions probably do best in a 2- or 3-node situation.

    MS Storage Replica is another option, except for the license costs, unless you already have enterprise licensing for the VMs anyway.

    I pretty much gave up on finding a workable and cheap/free solution in a 2/3-node config that I actually like to use and play with.
     
    #15
  16. Rand__

    Rand__ Well-Known Member

    StarWind was next on my list, but I didn't have RAID cards; I might go all-NVMe instead, although depending on the write strategy many drives might be of benefit (that's why ScaleIO was good: it distributed writes across a lot of drives).
    I wanted to wait for feedback on Compuverde before I went on and dismantled the test systems.

    MS was an option, but I read bad things about 2-node clusters with it, so I am not sure it's really an option.

    And yes, your numbers are off; check my other thread for comparison values (usually fio & CrystalDiskMark; a fio line like the sketch below).
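    Something along these lines in a Linux guest, as a sketch (file name, path, and size are examples, not my exact job file):

        fio --name=randwrite --filename=/mnt/test/fio.dat --size=10G \
            --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --direct=1 \
            --ioengine=libaio --runtime=60 --time_based --group_reporting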
     
    #16
  17. Evan

    Evan Well-Known Member

    MS has options... S2D (Storage Spaces Direct) and Storage Replica; Replica is more like a mirror used for DR.

    Storage Replica Overview

    Hyper-V Replica could also be an option; ESX has a similar function, but I have no idea how it scales etc.

    Set up Hyper-V Replica
     
    #17
  18. Yves

    Yves Member

    Didn't Dell/EMC "create" a new free program called Dell EMC ECS Community Edition? Or am I mistaken?
     
    #18
  19. Evan

    Evan Well-Known Member

    ECS is not ScaleIO though... at least not as far as I know, but I would need to read up on that.
     
    #19
  20. Rand__

    Rand__ Well-Known Member

    Yes, when I said MS I meant S2D; replication was only possible to a storage server, I think. On top of that I would have needed to add an iSCSI layer to get back to ESX, which sounded like quite a hassle.

    Have not heard about ECS; let me read up.

    Hm, the fancy stuff seems to be in CloudArray, which is not freely available :/
     
    #20