VMware 6.5 and A2SDi-16C-HLN4F (cluster 2-node)

Discussion in 'VMware, VirtualBox, Citrix' started by Marco Neri, Feb 27, 2018.

  1. Scoped

    Scoped New Member

    Joined:
    Apr 14, 2018
    Messages:
    2
    Likes Received:
    0
    I tested with CIFS traffic and got 110 MB/s on a 1 GbE link, which is about what you would expect considering all other vmkernel traffic was going over the same NIC.

    using version 6
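    For a sense of scale, the figure above can be compared with the raw line rate of the link. The snippet below is only a rough sketch: it assumes decimal units (1 Gbit/s = 125 MB/s) and ignores Ethernet, IP, and SMB protocol overhead, so the real ceiling is somewhat lower than 125 MB/s. The function name is made up for illustration.

```python
def link_utilization(observed_mb_per_s: float, link_gbit_per_s: float = 1.0) -> float:
    """Return observed throughput as a fraction of the raw line rate.

    Assumes decimal units: 1 Gbit/s = 1000 Mbit/s = 125 MB/s.
    Protocol overhead is ignored, so this slightly understates
    how close the transfer is to the practical maximum.
    """
    line_rate_mb_per_s = link_gbit_per_s * 1000 / 8
    return observed_mb_per_s / line_rate_mb_per_s


# 110 MB/s on a 1 GbE link
print(f"{link_utilization(110):.0%}")  # prints 88%
```

    In other words, the transfer was already running at close to what a single gigabit port can deliver.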
     
    #41
  2. omega

    omega New Member

    Joined:
    Apr 4, 2018
    Messages:
    2
    Likes Received:
    0
    Craig, thank you so much! It works great for me. I have a Supermicro E200-9A, and after injecting the drivers into ESXi 6.7 I was able to get it installed. It's been working for almost 2 weeks now. Speeds seem very good, close to a gigabit when transferring data.
     
    #42
  3. Kev

    Kev Active Member

    Joined:
    Feb 16, 2015
    Messages:
    270
    Likes Received:
    41
    Isn't the x553 a 10 GbE-capable MAC? Can anyone run performance tests with this driver on ESXi 6.5 or 6.7?
     
    #43
  4. JJ Duru

    JJ Duru New Member

    Joined:
    Sep 15, 2018
    Messages:
    4
    Likes Received:
    2
    Hi Craig,

    I started using this driver you compiled, in the following setup:
    - mobo: Supermicro A2SDi-2C-HLN4F (same ixgbe NICs as on your motherboard)
    - ESXi 6.7: customized to the latest patch set (ESXi670-201808001, release date 14 Aug 2018, build 9484548)
    - the ixgbe driver you attached to this thread
    - 3 VMs:
      - 2 x CentOS 7 x64, for DNS/DHCP and authoritative DNS
      - 1 x pfSense x64, internet gateway

    The load average on each individual VM is negligible: each of the above machines runs with a load average between 0.09 and 0.45.
    However, when I monitor the ESXi host from the native HTML5 client, the CPU usage hovers between 20% and 55%, which is really high.
    By comparison, I have another ESXi 6.7 host running 4 virtual machines on lower-frequency CPUs, and its overall CPU usage peaks at 30%.

    So something in the networking area of this combination (x553 NICs, the driver attached to this thread, and ESXi 6.7) is not functioning correctly.
    I did not notice low throughput: when testing internet connectivity I get the advertised speed of 120 Mbit/s, a sign that the x553 NIC is doing its job. How it is doing its job may be another matter.

    Do you experience high CPU usage on the ESXi host? Have you started using a different driver since your last post?
    Any help is appreciated. Thank you.
     
    #44
    Last edited: Sep 16, 2018
  5. Craig Thomson

    Craig Thomson New Member

    Joined:
    Mar 5, 2018
    Messages:
    18
    Likes Received:
    12
    Hi JJ,

    I'm sorry to hear you're experiencing problems. I'm still using build 6 of my driver (attached to post #37 of this thread). I am not experiencing, and have never experienced, any CPU load problems. In fact, I've not had problems of any kind with this driver.

    My setup:
    • Supermicro A2SDi-16C-HLN4F
    • Driver net-ixgbe_x553_7-4.5.3-6.x86_64
    • ESXi 6.7 customized to ESXi-6.7.0-20180604001-standard (release date 25 June 2018)
    • 6 VMs (2 x Solaris 11, 2 x CentOS 7 x64, 1 x CentOS 7 x32, 1 x Ubuntu 16.04 x64)
    The load average on each individual VM is, most of the time, negligible (same as you). When I monitor the ESXi host itself (via the HTML5 embedded host client) the CPU usage is 0.31% (min 0.3%, max 7.7%, avg 0.59%).

    Even when my VMs are working hard (1 is a web/db server, 2 are compilers, 1 does video encoding) my load average never gets above 25% (but remember, I'm working with 16 cores).

    Can I ask, what makes you suspect the load is caused by the network and/or the driver?

    The only time I've experienced ESXi load that did not appear to be attributable to a guest VM was with Solaris 11 (11/11). The cause was an interrupt storm due to an incompatibility between ESXi and the Solaris interrupt timing mode. It was a problem unique to that specific version of Solaris and did not show up as load in the Solaris VM (i.e. ESXi load 100%, Solaris VM load 0%). Adding the following line to /etc/system on the Solaris VM solved the problem.
    Code:
    set pcplusmp:apic_timer_preferred_mode = 0x0
    I mention this only to show that finding the cause of ESXi load can sometimes be tricky. I suspect you'll have to do much more investigation to get to the root cause.

    As a starting point, what does the output of esxtop show?
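    If the interactive esxtop view is hard to read, one option is to capture it in batch mode (for example `esxtop -b -n 10 > capture.csv`) and summarize the CSV offline. The snippet below is only a sketch of that idea: the helper name, the `% Used` keyword, and the column names in the sample are assumptions for illustration, and a real capture has far more columns with longer names, so the keyword would need adjusting to match it.

```python
import csv
import io
from statistics import mean


def busiest_columns(csv_text, keyword="% Used", top=5):
    """Average every numeric column whose header contains `keyword`
    and return the `top` highest averages as (header, average) pairs."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = list(reader)

    results = []
    for index, name in enumerate(header):
        if keyword not in name:
            continue
        values = []
        for row in rows:
            try:
                values.append(float(row[index]))
            except (ValueError, IndexError):
                # Skip blank or non-numeric samples.
                continue
        if values:
            results.append((name, mean(values)))

    results.sort(key=lambda item: item[1], reverse=True)
    return results[:top]


# Tiny made-up sample in the same general shape as a batch capture:
# one header row, then one row per sampling interval.
sample = (
    '"Time","Group Cpu(1:vmA)\\% Used","Group Cpu(2:vmB)\\% Used"\n'
    '"t1","10.0","40.0"\n'
    '"t2","20.0","60.0"\n'
)

for name, average in busiest_columns(sample):
    print(f"{average:6.2f}  {name}")
```

    Sorting by average makes it easier to see whether the load belongs to a particular VM or to something on the host side.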
     
    #45
    JJ Duru likes this.
  6. JJ Duru

    JJ Duru New Member

    Joined:
    Sep 15, 2018
    Messages:
    4
    Likes Received:
    2

    Darn it, now I realize what happened: I used your driver from post #32, not the one from post #37.

    Back to the drawing board. I'll do the reinstalls and provide updates.
     
    #46
  7. Craig Thomson

    Craig Thomson New Member

    Joined:
    Mar 5, 2018
    Messages:
    18
    Likes Received:
    12
    Before you go and reinstall everything, I should tell you that I don't believe build 6 of the driver will make any difference to your problem.

    Both build 5 and build 6 are very similar. Build 6 differs only in some performance tweaks relating to throughput. I can't see that the differences between build 5 and build 6 would solve this issue. I suspect you'll encounter the same result.

    Also, I know some people are still using build 5 and are not experiencing your load problem.

    Rather than reinstall everything, I would focus on investigating the problem. From what I gather, the issue is not impacting services, it's just overworking your CPU, so you have time on your side.
     
    #47
    JJ Duru likes this.
  8. JJ Duru

    JJ Duru New Member

    Joined:
    Sep 15, 2018
    Messages:
    4
    Likes Received:
    2
    Craig,

    That's the thing: time is not on my side. With a needy family that consumes internet with bread, I have to keep the interwebz available as much as possible (hence the highly available internal network architecture).

    I performed the reinstall and now get lower CPU usage overall. With all the HA-relevant services moved over to the A2SDi-2C-HLN4F host, and Netflix streaming to one of the machines, the CPU graph shows usage between 10% and 18%, with spikes up to 26-30% at times.

    When running the speedtest.net test and reaching 120 Mbit/s download and 11-12 Mbit/s upload, the CPU usage jumps as follows:
    - one core jumps to 63.38%
    - one core jumps to 29.59%
    The spike happens on the download phase.
    I expected the two cores not to be equally loaded because, as far as I know, the PF filtering system inside pfSense runs on one CPU only.
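    To put the per-core numbers next to the host-wide graph, they can be averaged over the core count. This is only a back-of-the-envelope sketch: the function names are made up, and the 16-core comparison assumes every core is equally fast and that the same work would simply be spread across more cores, which real hardware will not match exactly.

```python
def host_cpu_percent(per_core_percent):
    """Host-wide CPU usage as the plain average of per-core usage."""
    return sum(per_core_percent) / len(per_core_percent)


def same_work_on(per_core_percent, core_count):
    """Host-wide usage if the same total work ran on `core_count`
    identical cores (an idealized assumption, not a measurement)."""
    return sum(per_core_percent) / core_count


spike = [63.38, 29.59]  # per-core readings during the download phase

print(f"dual-core host: {host_cpu_percent(spike):.1f}%")
print(f"16 cores, same work: {same_work_on(spike, 16):.1f}%")
```

    On a dual-core board a single busy VM shows up as a large share of the host graph, while the same work would barely register on a 16-core part.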

    Given the fact that the overall CPU usage is down, and the box is not performing any internal VLAN routing (therefore I do not get high latencies for the inter VLAN routing), I declare the operation a stunning success.

    And all I can say is a big THANK YOU for creating this driver.

    P.S. I suspect that whatever buffer numbers you changed, it worked. I declare myself disappointed that I chose a dual core mobo - never again.
     
    #48
    Craig Thomson likes this.
  9. Craig Thomson

    Craig Thomson New Member

    Joined:
    Mar 5, 2018
    Messages:
    18
    Likes Received:
    12
    Hi JJ,

    I'm really glad performance has improved for you, but I have to say I'm also quite surprised. You've made me really curious now. Over the next few weeks I'll do some tests of my own.

    My previous testing was all done using a private internal LAN and simple file transfers. I never tested any kind of networking software (like pfSense) nor did I test any kind of Internet traffic. I'm now wondering - could a different traffic profile, or a different type of load, yield different results?

    I'll report back here with my findings.

    You're welcome. :)
     
    #49
  10. Kev

    Kev Active Member

    Joined:
    Feb 16, 2015
    Messages:
    270
    Likes Received:
    41
    Can you share your slipstream esxi install?
     
    #50
  11. JJ Duru

    JJ Duru New Member

    Joined:
    Sep 15, 2018
    Messages:
    4
    Likes Received:
    2
    #51
    SlinkingAnt likes this.
  12. IT33513

    IT33513 New Member

    Joined:
    Mar 14, 2018
    Messages:
    6
    Likes Received:
    0
    May I ask what you need a 2-node HA cluster for?
     
    #52
  13. Stril

    Stril Member

    Joined:
    Sep 26, 2017
    Messages:
    155
    Likes Received:
    8
    Hi!

    Is there any news about x553 support?

    I want to buy a board with two x553 NICs and 10GBASE-T.

    Does it work with vSphere 6.7 U1?

    Best wishes
     
    #53