OmniOS + napp-it 10GbE performance: iperf fast, zfs send slow

Discussion in 'Solaris, Nexenta, OpenIndiana, and napp-it' started by chune, Jun 17, 2019.

  1. chune

    chune Member

    Joined:
    Oct 28, 2013
    Messages:
    107
    Likes Received:
    22
    I finally made the jump to 10GbE on my napp-it AIO boxes and thought I had followed all of the recommendations, but I am still getting gigabit-class speeds on ZFS sends over nc (netcat). The weird thing is that iperf gives me 8.9 Gbits/sec of throughput, so I'm not sure any of the tunables will help me here. The pool is a stripe of 8 mirrors of 4TB HGST Ultrastars.

    This is going from one physical AIO box to another AIO box over the 10GbE link. Everything along the way is configured with a 9000-byte MTU, and the switch is a Cisco SG500XG-8F-8T.

    Code:
    Iperf:
    [ ID] Interval           Transfer     Bandwidth
    [  4]   0.00-1.00   sec   876 MBytes  7.34 Gbits/sec
    [  4]   1.00-2.00   sec  1.03 GBytes  8.87 Gbits/sec
    [  4]   2.00-3.00   sec  1.03 GBytes  8.85 Gbits/sec
    [  4]   3.00-4.00   sec  1.04 GBytes  8.94 Gbits/sec
    [  4]   4.00-5.00   sec  1.05 GBytes  8.97 Gbits/sec
    [  4]   5.00-6.00   sec  1.04 GBytes  8.95 Gbits/sec
    [  4]   6.00-7.00   sec  1.04 GBytes  8.93 Gbits/sec
    [  4]   7.00-8.00   sec  1.04 GBytes  8.92 Gbits/sec
    [  4]   8.00-9.00   sec  1.04 GBytes  8.93 Gbits/sec
    [  4]   9.00-10.00  sec  1.04 GBytes  8.92 Gbits/sec
    - - - - - - - - - - - - - - - - - - - - - - - - -
    [ ID] Interval           Transfer     Bandwidth
    [  4]   0.00-10.00  sec  10.2 GBytes  8.76 Gbits/sec                  sender
    [  4]   0.00-10.00  sec  10.2 GBytes  8.76 Gbits/sec                  receiver
    
    Zfs send:
     zfs send pool01-esx/datastore09a-hdd-backup@weekly-1517848527_2018.09.14.23.00.14 | pv | nc -w 20 10.10.10.10 9090
    10.2GiB 0:02:21 [74.0MiB/s]
    Additional hardware info below:

    Intel X520-DA2 NICs
    Intel/Finisar SFP+ modules
    2x E5-2690 v2
    768 GB ECC RAM
    napp-it VM has 16 cores and 64 GB RAM
    LSI 2116 HBA

    Let me know if you have any suggestions for what to try. My iozone 1GB benchmarks are looking decent too, so I don't think it's pool speed:
    [attached screenshot: iozone 1GB benchmark results]
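
    For reference, the receiving end of that nc pipeline looks roughly like this (the target dataset name is a placeholder, and the listen syntax differs between nc variants):
    Code:
    # listen on port 9090 and feed the incoming stream into zfs receive
    # (some nc variants want "nc -l -p 9090" instead of "nc -l 9090")
    nc -l 9090 | pv | zfs receive -F pool01-hdd/datastore09a-hdd-backup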
     
    #1
    Last edited: Jun 19, 2019
  2. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    3,592
    Likes Received:
    544
    What does single-core load tell you?

    IIRC that is the limiting factor here.
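
    A quick way to check that on OmniOS would be something like this (plain illumos tools, nothing napp-it specific assumed):
    Code:
    # per-CPU utilisation every 2 seconds; watch for a single core pegged near 100%
    mpstat 2
    # per-thread microstate accounting; shows whether one zfs/nc thread is CPU bound
    prstat -mL 2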
     
    #2
  3. gea

    gea Well-Known Member

    Joined:
    Dec 31, 2010
    Messages:
    2,273
    Likes Received:
    752
    Is the parent of the target filesystem set to sync enabled?
    What is the output of Pools > Benchmark (some benchmarks with sync enabled and disabled)?
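
    From the shell the setting can also be checked recursively; a minimal sketch, assuming the receiving pool is named pool01-hdd (adjust to the actual target):
    Code:
    # show the sync property for the pool and every child filesystem
    zfs get -r sync pool01-hdd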
     
    #3
  4. chune

    chune Member

    Joined:
    Oct 28, 2013
    Messages:
    107
    Likes Received:
    22
    Code:
    begin tests ..
    Benchmark filesystem: /pool01-hdd/_Pool_Benchmark
    Read: filebench, Write: filebench_sequential, date: 06.18.2019
    
    begin test 4 ..singlestreamwrite.f ..
    begin test 4sync ..singlestreamwrite.f ..
    set sync=disabled
    begin test 7 randomread.f ..
    begin test 8 randomrw.f ..
    begin test 9 singlestreamread.f ..
    pool: pool01-hdd
    
       NAME                       STATE     READ WRITE CKSUM
       pool01-hdd                 ONLINE       0     0     0
         mirror-0                 ONLINE       0     0     0
           c0t5000CCA253C19709d0  ONLINE       0     0     0
           c0t5000CCA253C197FAd0  ONLINE       0     0     0
           c0t5000CCA253C19856d0  ONLINE       0     0     0
         mirror-1                 ONLINE       0     0     0
           c0t5000CCA253C1D3F9d0  ONLINE       0     0     0
           c0t5000CCA253C1D439d0  ONLINE       0     0     0
           c0t5000CCA253C1D43Ad0  ONLINE       0     0     0
         mirror-2                 ONLINE       0     0     0
           c0t5000CCA253C1D443d0  ONLINE       0     0     0
           c0t5000CCA253C1D54Ad0  ONLINE       0     0     0
           c0t5000CCA253C1DF0Dd0  ONLINE       0     0     0
         mirror-3                 ONLINE       0     0     0
           c0t5000CCA253C1E048d0  ONLINE       0     0     0
           c0t5000CCA253C1E4C0d0  ONLINE       0     0     0
           c0t5000CCA253C1E4C8d0  ONLINE       0     0     0
         mirror-4                 ONLINE       0     0     0
           c0t5000CCA253C1E57Cd0  ONLINE       0     0     0
           c0t5000CCA253C1E57Ed0  ONLINE       0     0     0
           c0t5000CCA253C268DEd0  ONLINE       0     0     0
    
    
    hostname                        san03  Memory size: 65536 Megabytes
    pool                            pool01-hdd (recsize=128k, compr=off, readcache=all)
    slog                            -
    remark                       
    
    
    Fb3                             sync=always                     sync=disabled                 
    
    Fb4 singlestreamwrite.f         sync=always                     sync=disabled                 
                                   197 ops                         9275 ops
                                   26.663 ops/s                    1122.947 ops/s
                                   9687us cpu/op                   3513us cpu/op
                                   25.7ms latency                  0.8ms latency
                                    26.5 MB/s                       1122.8 MB/s
    ________________________________________________________________________________________
     
    read fb 7-9 + dd (opt)          randomread.f     randomrw.f     singlestreamr
    pri/sec cache=all               95.0 MB/s        149.1 MB/s     1.2 GB/s
     
    #4
    Last edited: Jun 19, 2019
  5. gea

    gea Well-Known Member

    Joined:
    Dec 31, 2010
    Messages:
    2,273
    Likes Received:
    752
    You should insert the output as code (the [+] button in the editor menu) to make it readable.

    If you write to the target filesystem with sync enabled, this would explain your slow results. Your pool offers a sequential sync write performance of 26.5 MB/s vs 1122.8 MB/s with sync disabled. Even with a raid-10 setup this is expected without a fast Slog (e.g. Intel Optane NVMe, WD SS Ultrastar SAS).

    Other explanations for a slow transfer would be a nearly full pool, or Jumbo frames on some setups.
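
    If sync turns out to be the issue, adding an Slog is a one-liner; a minimal sketch, assuming a suitable low-latency SSD shows up as c2t0d0 (device name is hypothetical):
    Code:
    # attach a dedicated log device to the pool; only committed sync writes go through it
    zpool add pool01-hdd log c2t0d0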
     
    #5
  6. chune

    chune Member

    Joined:
    Oct 28, 2013
    Messages:
    107
    Likes Received:
    22
    I have 64 GB of RAM allocated to the OmniOS VM; I read that more RAM is preferred over an Slog, but maybe that is no longer the case. I do have sync disabled for my pools, but I still get the slow speed on ZFS send. My target pool is empty and the sending pool is 50% full. I understand that my random I/O performance will not be great, but I thought ZFS send was doing sync writes, and with sync disabled this should be quite fast.
     
    #6
  7. thulle

    thulle New Member

    Joined:
    Apr 11, 2019
    Messages:
    14
    Likes Received:
    7
    I'd try adding some buffering in pv on both ends to see if it does anything, i.e.:
    zfs send pool01-esx/datastore09a-hdd-backup@weekly-1517848527_2018.09.14.23.00.14 | pv -B 1G | nc -w 20 10.10.10.10 9090
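
    mbuffer is another commonly suggested option for buffered zfs send over the network; a rough sketch, assuming it is installed on both boxes and using a placeholder target dataset:
    Code:
    # receiver: buffer up to 1 GB in RAM, then feed zfs receive
    mbuffer -s 128k -m 1G -I 9090 | zfs receive -F pool01-hdd/backup
    # sender: same block size and buffer, stream to the receiver
    zfs send pool01-esx/datastore09a-hdd-backup@weekly-1517848527_2018.09.14.23.00.14 | mbuffer -s 128k -m 1G -O 10.10.10.10:9090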
     
    #7
    Last edited: Jun 19, 2019
  8. gea

    gea Well-Known Member

    Joined:
    Dec 31, 2010
    Messages:
    2,273
    Likes Received:
    752
    RAM is performance relevant as it is used as a write cache (default max 4 GB / 10% of RAM) and as read cache. With sync disabled, the content of the RAM write cache is lost on a crash. When you enable sync, every committed write is logged to a ZIL or Slog device so it can be redone on the next bootup after a crash.

    The ZIL is on-pool, while an Slog is an additional device that can be much faster than the pool itself. Sync write to your pool is very slow, so this would explain bad write values if sync were enabled. A nearly full pool would be another explanation, as would a single weak disk.

    Jumbo frames can also be a problem. I would retry with Jumbo disabled and check in iostat whether disk wait and busy are roughly equal across all disks.
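
    On illumos those wait/busy figures can be watched with something like:
    Code:
    # extended per-device statistics every 5 seconds; %w is wait, %b is busy
    iostat -xn 5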

    btw: the built-in napp-it replication does buffered nc transfers automatically.
     
    #8
  9. chune

    chune Member

    Joined:
    Oct 28, 2013
    Messages:
    107
    Likes Received:
    22
    The buffer did not appear to help. The weird thing is that if I vMotion a VM from one AIO box to another, I get the full 10Gb speed. Any other suggestions?
     
    #9
  10. gea

    gea Well-Known Member

    Joined:
    Dec 31, 2010
    Messages:
    2,273
    Likes Received:
    752
    Have you disabled Jumbo frames?
    Is the performance OK on transfers between two local filesystems?
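
    Two quick local tests that take the network out of the picture (snapshot name copied from your earlier command, scratch dataset name is a placeholder):
    Code:
    # pure read path: how fast can the source pool stream the snapshot?
    zfs send pool01-esx/datastore09a-hdd-backup@weekly-1517848527_2018.09.14.23.00.14 | pv > /dev/null
    # read + write on one box: receive into a scratch dataset on another local pool
    zfs send pool01-esx/datastore09a-hdd-backup@weekly-1517848527_2018.09.14.23.00.14 | pv | zfs receive -F pool01-hdd/sendtest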
     
    #10