ESXi 5.0.0 network performance issues

Metaluna

Member
Dec 30, 2010
64
0
6
I just upgraded my home server from ESXi 4.1 to 5.0.0.

Prior to the upgrade I had been running mainly a ZFS server based on SE11, which I recently moved to OpenIndiana151a. I also have a Debian 6 VM and a few other miscellaneous VMs.

Anyway, when I first migrated, I installed ESXi 5 on a freshly secure-erased 80GB SSD (Intel X25-M Gen1). Other than that, there were no server hardware or network topology changes. I then migrated my OI VM over and updated to the new tools. This is where the problems started.

After installing the tools and the new VMXNET3 driver, I began to get messages spammed to the console every few seconds that look like this:

Nov 14 17:21:29 (hostname) vmxnet3s: [ID 654879 kern.notice] vmxnet3s:0: getcapab(0x200000) -> no

In addition to that, I noticed a significant drop in network bandwidth. I used to be able to copy large files from Win7 clients to the server over GigE at almost full saturation (over 90MB/s). Now it's closer to 50-60MB/s. iperf gives similar results.

So, thinking it was a problem migrating the OI VM, I tried doing a clean install of OI on the SSD datastore. No change. Still the same performance, and same error messages. I then tried the new Solaris 11 release (since ESXi now has an option specifically for Solaris 11 in the new VM wizard). It also gives the same error message and about the same results.


Doing iperf tests between Debian and OI VMs on the virtual switch gives about 670MB/s from OI->Debian, and around 500MB/s from Debian->OI. While this is faster than my physical gigabit network, notice that the ratio of theoretical maximum to actual is about the same: 50-70% of max. I don't know if that's significant or not.
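The ratio comparison above can be checked with some quick arithmetic. This is only a rough sketch: it assumes 1 MB = 10^6 bytes, treats the virtual vmxnet3 link as a nominal 10 Gbit/s, and ignores all protocol overhead. The labels and the `utilization` helper are made up for illustration.

```python
# Rough line-rate arithmetic for the figures quoted in this post.
# Assumptions: 1 MB = 10**6 bytes, vmxnet3 link is nominally 10 Gbit/s,
# and protocol overhead (Ethernet/IP/TCP headers) is ignored.
LINK_MAX_MB_S = {
    "gige": 1_000_000_000 / 8 / 1e6,          # 125 MB/s
    "vmxnet3_10g": 10_000_000_000 / 8 / 1e6,  # 1250 MB/s
}


def utilization(measured_mb_s, link):
    """Return measured throughput as a fraction of the nominal link rate."""
    return measured_mb_s / LINK_MAX_MB_S[link]


samples = [
    ("Win7 -> OI over GigE (low end)", 50, "gige"),
    ("Win7 -> OI over GigE (high end)", 60, "gige"),
    ("OI -> Debian on the vSwitch", 670, "vmxnet3_10g"),
    ("Debian -> OI on the vSwitch", 500, "vmxnet3_10g"),
]

for label, rate, link in samples:
    print(f"{label}: {utilization(rate, link):.0%} of nominal line rate")
```

Under those assumptions the physical and virtual numbers both land at roughly 40% to 54% of nominal, a little under the 50-70% eyeballed above but still about the same in both cases.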

Another test I tried was using iperf between two nearly identical Debian VMs. This was up in the 900MB/s range.

Finally, using iperf with multiple parallel streams seems to come close to the old ESXi 4.1 performance. I need to use -P 4 or -P 8 to get there, though.
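For anyone who wants to repeat the parallel-stream comparison, here is a minimal sketch that just builds the iperf client command lines for 1, 4 and 8 streams. It assumes the classic iperf (version 2) client with `iperf -s` already running on the other VM; the hostname `oi-vm.example` and the 10-second duration are placeholders, not values from my setup.

```python
import shlex


def iperf_client_cmd(server, streams, seconds=10):
    """Build an iperf (v2) client command line.

    -c  connect to the given server
    -P  number of parallel client streams
    -t  test duration in seconds
    -f M  report throughput in MBytes/sec
    """
    return ["iperf", "-c", server, "-P", str(streams),
            "-t", str(seconds), "-f", "M"]


# "oi-vm.example" is a hypothetical hostname; substitute the iperf server's address.
for n in (1, 4, 8):
    print(shlex.join(iperf_client_cmd("oi-vm.example", n)))
```

The printed lines can be pasted into a shell, or passed to `subprocess.run` as the argument list, to compare single-stream and multi-stream throughput.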

Has anyone else experienced issues with ESXi 5.0 network performance?
 

PigLover

Moderator
Jan 26, 2011
2,967
1,280
113
The VMXNET3 driver is defective. It was bad in ESXi 4.0 and is worse in 5.0. Go back to the standard driver. Your performance will increase and the error messages will stop.
 

albacona119

New Member
Jan 11, 2012
3
0
0
Were you able to resolve this issue? I'm having the exact same problem (see my thread at http://communities.vmware.com/message/1892255#1892255).

Network speeds have not really changed for me (I never used to get more than 50-60 MB/s via NFS anyway), but this is still bugging me.

I hear the VMXNET3 NIC is more efficient than the (outdated) e1000(e) - i.e. less CPU overhead and more functionality.
 

Metaluna

Member
Dec 30, 2010
64
0
6
I have not resolved it. The network speed issue may not have been related, because a day or two after posting that the performance seemed to have gone back to normal (90-100MB/s or so). But I'm still getting the error messages spammed to my console (but not to the system log apparently, only the boot logs). Since I rarely look at the console, it hasn't been bothering me too much.
 

albacona119

New Member
Jan 11, 2012
3
0
0
Assuming a bug in Solaris Express b151, I upgraded to Solaris 11 b173 yesterday, but that did not change a thing.

Are you sure it's just during boot? Because if I check /var/adm/messages it's all over the place at any time of the day.

Code:
Jan 13 10:40:18 linda vmxnet3s: [ID 654879 kern.notice] vmxnet3s:0: getcapab(0x200000) -> no
Jan 13 10:46:48 linda last message repeated 27 times
Everything seems to work fine though. Nevertheless I am curious what this is about.
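To get a feel for how often the message really fires, the log lines can be tallied, including the syslog "last message repeated N times" lines. A minimal sketch follows, assuming the line format matches the two sample lines above; the function name and regular expressions are my own and not part of any Solaris tool.

```python
import re

# Patterns based only on the two sample log lines shown above (an assumption).
GETCAPAB = re.compile(r"vmxnet3s:\d+: getcapab\((0x[0-9a-fA-F]+)\) -> (\w+)")
REPEATED = re.compile(r"last message repeated (\d+) times?")


def count_getcapab(lines):
    """Count getcapab messages per (capability code, answer) pair.

    A "last message repeated N times" line is credited to the getcapab
    message immediately preceding it, if there is one.
    """
    counts = {}
    last_key = None
    for line in lines:
        match = GETCAPAB.search(line)
        if match:
            last_key = (match.group(1), match.group(2))
            counts[last_key] = counts.get(last_key, 0) + 1
            continue
        repeat = REPEATED.search(line)
        if repeat and last_key is not None:
            counts[last_key] += int(repeat.group(1))
        else:
            # Some unrelated message: later repeats no longer refer to getcapab.
            last_key = None
    return counts


sample = [
    "Jan 13 10:40:18 linda vmxnet3s: [ID 654879 kern.notice] vmxnet3s:0: getcapab(0x200000) -> no",
    "Jan 13 10:46:48 linda last message repeated 27 times",
]
print(count_getcapab(sample))
```

For a real run, pass an open file instead of `sample`, for example `count_getcapab(open("/var/adm/messages"))`.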

I'm beginning to think it has something to do with the new features introduced with VMXNET3 that either Solaris or the underlying hardware NIC does not support. (Maybe it's the MTU 9000, since VMXNET3 is a 10 GbE NIC.)

I'm using a Supermicro X8SIL-F motherboard with Dual Intel 82574L Gigabit Ethernet.
 

Metaluna

Member
Dec 30, 2010
64
0
6
You're right, I'm getting them in /var/adm/messages as well. I just wasn't looking in the right file before. Very annoying as it pretty much swamps out any important messages that might be in there.

Now that you mention it, I remember running across a list of what those getcapab codes meant (I think I Googled for it and found a link to a header file from some source code repository). I vaguely recall that the one it's complaining about was some kind of checksum offloading function.
 

acesea

New Member
Oct 7, 2011
8
1
3
I'm seeing the same getcapab log messages. My pseudo-scientific testing shows the vmxnet adapter as being a little more CPU efficient, but maybe slower than e1000 in some repeat runs.

albacona119, did you perform an in-place upgrade from Solaris Express b151 to Solaris 11 b173? Or was it a clean install? 11-11-11 won't successfully install VMware Tools. Any benefits with 11 over Express?
 

Metaluna

Member
Dec 30, 2010
64
0
6
acesea said: "Seeing same getcapab log messages. My pseudo-scientific testing shows the vmxnet adapter as being a little more CPU efficient but maybe slower in some repeat runs than e1000."
My motivation for using the vmxnet driver is that it allows 10Gbps to the virtual switch. With that kind of bandwidth, you can create a VMware datastore on your ZFS pool and export it via NFS back to ESXi without losing too much performance. That way, I can keep all my VMs (except the Solaris one itself) on the ZFS pool.
 

albacona119

New Member
Jan 11, 2012
3
0
0
acesea said: "albacona119, did you perform an in-place upgrade from Solaris Express b151 to Solaris 11 b173? Or was it a clean install? 11-11-11 won't successfully install VMware Tools. Any benefits with 11 over Express?"
Sorry for the late reply. I performed an upgrade via pkg image-update. No issues at all. Regarding VMware Tools, you have to perform the installation manually via the console, because Solaris won't mount the *.iso image automatically. I suggest you get the *.iso image from the ESXi server using scp directly from the ESXi console. They're stored in /vmimages.
 