ESXi 6.5 with ZFS backed NFS Datastore - Optane Latency AIO benchmarks

Discussion in 'Solaris, Nexenta, OpenIndiana, and napp-it' started by J-san, Apr 5, 2019.

  1. J-san
    Doing some benchmarking on reducing latency in an all-in-one (AIO) setup - ESXi 6.5 to an OmniOS ZFS-backed NFS datastore.

    The hardware is:
    • Supermicro X10DRi-T
    • 2 x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
    • 128 GB DDR4 RAM (@ 1866 MHz - CPU-limited down from 2133 MHz)
    • 3 x 9211-8i in HBA IT mode with P20.0.0.7 firmware
    • 1 x AOC-SLG3-2E4 (non-R) HBA card to connect the NVMe Optane
    BIOS:
    • Updated to Version 3.1
    • Bifurcation enabled and set up on Slot 5 as 4x4
    • Slot 5 EFI OpROM -> Legacy
    • Power Settings
      • Custom -> C State Control
        • C6 (Retention) State, CPU C6 Report Enable, C1E Enable
      • Custom -> P State Control
        • EIST (P-States) Enable, Turbo Mode Enable, P-state Coord – HW_ALL, Boost perf mode Max performance
      • Custom -> T State Control
        • Enable
    • USB 3.0 Support -> Enabled (to support USB 3.0 key booting)
    I've assigned the following to the OmniOS VM:
    • 4 x vCPUs
    • 59392 MB RAM (all reserved)
    • 3 x 9211-8i in passthrough
    Disks:
    • 4 x Intel S4600 – 960GB
    • 4 x WD Gold 6TB
    • 1 x Intel S3700 - 200GB (slog)
    • 1 x Intel Optane P4800X - 375GB (slog)
    NFS Sharing AIO setup (see the esxcli sketch after this list):
    • OmniOS VM with VMXNET3 adapter attached to Port group "storagenet"
    • ESXi VMkernel NIC vmk2 for NFS storage attached to own Port Group "storagevmk"
    • Both portgroups "storagenet" and "storagevmk" attached to separate vSwitch2
      • Not attached to any Physical NIC
      • vSwitch2 set to 9000MTU
    • OmniOS VM VMXNET3 adapter set to 9000MTU
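
    Roughly, the same storage network and NFS mount from the ESXi shell - vSwitch and portgroup names as above, but the IP addresses, pool path and datastore name are placeholders rather than my actual values:
    Code:
    # internal-only vSwitch with jumbo frames, no physical uplink attached
    esxcli network vswitch standard add --vswitch-name=vSwitch2
    esxcli network vswitch standard set --vswitch-name=vSwitch2 --mtu=9000
    esxcli network vswitch standard portgroup add --portgroup-name=storagenet --vswitch-name=vSwitch2
    esxcli network vswitch standard portgroup add --portgroup-name=storagevmk --vswitch-name=vSwitch2

    # VMkernel NIC for NFS traffic on the storage portgroup
    esxcli network ip interface add --interface-name=vmk2 --portgroup-name=storagevmk --mtu=9000
    esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=10.10.10.1 --netmask=255.255.255.0 --type=static

    # mount the OmniOS NFS share as a datastore (NFSv3)
    esxcli storage nfs add --host=10.10.10.2 --share=/tank/nfsds --volume-name=nfs-tank
    esxcli storage nfs list
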
    OmniOS VM setup:
    • omnios-r151028
    • Intel Optane P4800X - 375GB (slog) - added as a 30 GB VMDK (Thick Provisioned Eager-Zeroed) on the local Optane-backed datastore.
    • NUMA affinity set to numa.nodeAffinity=1
      • To match the Optane in Slot 5 (CPU2) of the dual-CPU motherboard
      • Sequential results were about 300 MB/s slower if ESXi flipped the OmniOS VM onto NUMA node 0 (CPU1)
    • /etc/system modified:
    Code:
    * Thanks Gea!
    * napp-it_tuning_begin:                                                                                                          
    * enable sata hotplug                                                                                                            
    set sata:sata_auto_online=1                                                                                                      
    * set disk timeout 15s (default 60s=0x3c)                                                                                        
    set sd:sd_io_time=0xF
    * increase NFS number of threads                                                                                                  
    set nfs:nfs3_max_threads=64                                                                                                      
    set nfs:nfs4_max_threads=64                                                                                                      
    * increase NFS read ahead count                                                                                                  
    set nfs:nfs3_nra=32                                                                                                              
    set nfs:nfs4_nra=32                                                                                                              
    * increase NFS maximum transfer size                                                                                              
    set nfs3:max_transfer_size=1048576                                                                                                
    set nfs4:max_transfer_size=1048576                                                                                                
    * increase NFS logical block size                                                                                                
    set nfs:nfs3_bsize=1048576                                                                                                        
    set nfs:nfs4_bsize=1048576                                                                                                        
    * tuning_end:
    
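    A quick way to sanity-check that these tunables actually took after the reboot is mdb (the /D format prints the value in decimal; the nfs symbols only show up once the nfs module is loaded):
    Code:
    echo "sd_io_time/D" | mdb -k
    echo "nfs3_max_threads/D" | mdb -k
    echo "nfs3_bsize/D" | mdb -k
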
    • sd.conf modified:
    Code:
    # DISK tuning                                                                                                                    
    # Set correct physical-block-size and non-volatile settings for SSDs
    # S3500 - 480 GB                                                                                                                  
    # S3700 - 100 + 200 + 400 GB                                                                                                      
    # S4600 - 480 + 960 GB                                                                                                            
    # Set fake physical-block-size for WD RE drives to set pools to ashift12 (4k) so drives are replaceable by larger disks.          
    # WD RE4                                                                                                                          
    # WD RE gold                                                                                                                      
    sd-config-list=                                                                                                                  
    "ATA     INTEL SSDSC2BB48", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     INTEL SSDSC2BA10", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     INTEL SSDSC2BA20", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     INTEL SSDSC2BA40", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     INTEL SSDSC2KG48", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     INTEL SSDSC2KG96", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     INTEL SSDPE21K37", "physical-block-size:4096, cache-nonvolatile:true, throttle-max:32, disksort:false",                  
    "ATA     WDC WD2000FYYZ-0", "physical-block-size:4096",                                                                          
    "ATA     WDC WD2005FBYZ-0", "physical-block-size:4096";
    
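    The sd.conf changes need the sd driver to reread its configuration - a reboot always does it, and update_drv usually works live. Whether the 4k override actually produced ashift=12 can be checked per pool (the pool name below is a placeholder):
    Code:
    # force sd to reread sd.conf (or just reboot)
    update_drv -vf sd

    # confirm the pool came up with ashift=12
    zdb -C tank | grep ashift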


    Now onto the benchmarks:

    Testing out latency reduction tips from the vSphere 6.5 Performance Best Practices guide:
    https://www.vmware.com/content/dam/...performance/Perf_Best_Practices_vSphere65.pdf


    p.43
    OmniOS VM ESXi advanced parameter ethernet2.coalescingScheme:
    (for the "storagenet" VMXNET3 adapter, ethernet2)
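
    These are plain per-VM advanced parameters, so they can also be dropped into the OmniOS VM's .vmx while it is powered off; the best-practices doc spells the value "disabled" (numa.nodeAffinity from the setup above shown for completeness):
    Code:
    ethernet2.coalescingScheme = "disabled"
    numa.nodeAffinity = "1"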

    Default setup (param not present - coalescing enabled by default)

    4 x WD Gold in two striped 2-way mirrors - with the S3700 slog:

    [Screenshot: CryDskMrk6-2x6tb-s3700-slog-lz4_recordsize_128k_CPUusage_X2APIC-OFF.PNG]
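
    For anyone recreating the pool: two striped 2-way mirrors of the WD Golds plus a dedicated slog, lz4 and 128k recordsize, would look roughly like the following (device ids and dataset name are placeholders, not my actual ones):
    Code:
    zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 log c2t0d0
    zfs create tank/nfsds
    zfs set compression=lz4 tank/nfsds
    zfs set recordsize=128k tank/nfsds
    zfs set sharenfs=on tank/nfsds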

    ethernet2.coalescingScheme : disable

    [Screenshot: CryDskMrk6-2x6tb-S3700slog-lz4-recsize_128k-ethcoalesc_disable.PNG]


    4 x WD Gold in two striped 2-way mirrors - with the P4800X Optane slog:

    ethernet2.coalescingScheme - not present

    [Screenshot: CryDskMrk6-2x6tb-P4800X-esxiDatastore-lz4_recordsize_128k.PNG]

    ethernet2.coalescingScheme : disable

    [Screenshot: CryDskMrk6-2x6tb-P4800X-esxiDatastore-lz4-recsize_128k-ethcoalesc_disable.PNG]



    OmniOS VM set to Latency Sensitivity: High (from Normal)
    - Side effect: all CPU cores assigned to the VM get exclusively reserved for it.
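
    If you prefer the .vmx over the Web Client, the setting maps (to the best of my knowledge) to this advanced parameter; High also wants the VM's memory fully reserved, which is already the case here:
    Code:
    sched.cpu.latencySensitivity = "high"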

    S3700 200GB Slog

    [Screenshot: CryDskMrk6-2x6tb-S3700slog-lz4-recsize_128k-ethcoalesc_disable_latencyHigh-4cores_try2.PNG]


    P4800X 375GB Optane Slog (as 30GB VMware VMDK - Thick Eager Zeroed)
    - 4 CPU cores

    [Screenshot: CryDskMrk6-2x6tb-P4800Xslog(vdsk)-lz4-recsize_128k-ethcoalesc_disable_latencyHigh-4cores.PNG]
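
    For reference, a 30GB thick eager-zeroed vmdk like this slog can be created from the ESXi shell roughly as follows (datastore path and file name are placeholders, not my actual ones), then attached to the OmniOS VM as an existing hard disk:
    Code:
    # create a 30GB thick eager-zeroed vmdk on the local Optane datastore
    vmkfstools -c 30G -d eagerzeroedthick /vmfs/volumes/optane-local/omnios/omnios-slog.vmdk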


    Testing CPU Cores assigned to OmniOS with Latency Sensitivity High:

    P4800X 375GB Optane Slog (as 30GB VMware VMDK - Thick Eager Zeroed)
    - 2 CPU cores


    [Screenshot: CryDskMrk6-2x6tb-P4800Xslog(vdsk)-lz4-recsize_128k-ethcoalesc_disable_latencyHigh-2cores.PNG]


    - 4 Cores:

    [Screenshot: CryDskMrk6-2x6tb-P4800Xslog(vdsk)-lz4-recsize_128k-ethcoalesc_disable_latencyHigh-4cores.PNG]



    - 5 Cores:

    [Screenshot: CryDskMrk6-2x6tb-P4800Xslog(vdsk)-lz4-recsize_128k-ethcoalesc_disable_latencyHigh-5cores.PNG]

    - 6 CPU cores

    [Screenshot: CryDskMrk6-2x6tb-P4800Xslog(vdsk)-lz4-recsize_128k-ethcoalesc_disable_latencyHigh-6cores.PNG]


    Finally - Optane VMDK to native ESXi6 Datastore:

    P4800X 375GB Optane Slog (30GB VMware VMDK - to NON-NFS local Optane ESXi6 Datastore)

    [Screenshot: CryDskMrk6-P4800X-esxiDatastore-direct-vmdk.PNG]


    Profit?
     
    #1
    Last edited: Apr 9, 2019
  2. J-san
    If I get time I may do some Optane benchmarks with it directly passed through to OmniOS instead of as a VMDK hard disk added to the OmniOS VM.

    Tips on how to directly pass it through to OmniOS:
    Bug #26508: Intel Optane 900p will not work in ESX passthrough - FreeNAS - iXsystems & FreeNAS Redmine

    1. ESXi modification:
    - ssh to the ESXi host
    - edit /etc/vmware/passthru.map
    - add the following lines at the end of the file:
    # Intel Optane P4800X - use the d3d0 reset method, not shareable
    8086 2701 d3d0 false
    - restart the hypervisor
    2. Toggle passthrough for the Optane in ESXi, then reboot
    3. Add it to the OmniOS VM as a PCI device
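
    A minimal shell version of step 1, assuming the P4800X really shows up as PCI id 8086:2701 on your host (check with lspci on the ESXi shell first):
    Code:
    # on the ESXi host - append the override, then reboot
    cat >> /etc/vmware/passthru.map << 'EOF'
    # Intel Optane P4800X - d3d0 reset method, not shareable
    8086  2701  d3d0     false
    EOF
    reboot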
     
    #2
  3. gea
    For a performance-sensitive vmdk you should not use Thin but Thick Provisioned Eager-Zeroed.
     
    #3
  4. J-san
    Ah, looking at my config I actually did use Thick Provisioned Eager-Zeroed for the Optane SLOG VMDK... I added a 200GB thin-provisioned one to test later as L2ARC. I'll update the OP to note this.

    Cheers!
     
    #4