Mellanox ConnectX-2 and ESXi 6.0 - Barely Working - Terrible Performance

humbleThC

Member
Nov 7, 2016
99
9
8
44
I've been messing around for quite some time, trying to figure out the right combination of firmware and drivers for these ConnectX-2 dual-port QDR cards (MT26428) on ESXi 6.0 to get the best results.

Currently running 2.10.720 firmware on all adapters (also tried 2.9.1200 and 2.9.1000).
Currently testing the 1.8.2.5 OFED drivers for ESXi (tried 1.8.2.4 a lot, just moved on to 1.8.2.5).

List of drivers installed on ESXi:

net-ib-cm 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-ib-core 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-ib-ipoib 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-ib-mad 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-ib-sa 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-ib-umad 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-memtrack 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-mlx4-core 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
net-mlx4-ib 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05
scsi-ib-srp 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-11-05

I've got the links up and stable (via a 4036E switch running the subnet manager).

My Windows 2012 R2 Server is working perfectly (NAS)
My Windows 10 Desktop is working perfectly, and achieving amazing performance from the NAS
- (Yeah I threw a QDR IB in my desktop, for faster NAS access)

My ESXi 6.0 U2 server, however, is having massive performance issues.
I originally tried an NFS mount from Windows 2012 R2 to the ESX server.
- I get about 3 MBytes/s (24 Mbit/s) of bandwidth
- vMotions, OVF deployments, etc. all time out

The storage is basically unusable, thus my entire ESXi lab is unusable. Any advice would be greatly appreciated.
 

humbleThC

"First start with direct-connect iperf tests, then go from there."
If I go direct-connect, I'd need to install and configure a subnet manager on the hosts.
I'm 99.9% sure it's not the switch that's causing the issue.

But here are the iperf2 results from Win2012 R2 to ESXi 6.0

TCP Results

1x Thread
[root@esx01:/opt/iperf/bin] ./iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 61032
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 9.13 GBytes 7.83 Gbits/sec

2x Threads
[ 12] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 61051
[ 4] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 61050
[ 12] 0.0-60.0 sec 45.7 GBytes 6.54 Gbits/sec
[ 4] 0.0-60.0 sec 46.4 GBytes 6.65 Gbits/sec
[SUM] 0.0-60.0 sec 92.1 GBytes 13.2 Gbits/sec

It appears my raw TCP performance between Windows & ESX is on par with my performance between Windows & Windows.

However, when I mount an NFS export across that same pipe, I'm getting about 3 MBytes/s (24 Mbit/s).

I just tested iperf with UDP only. I think the problem might be here:

[root@esx01:/opt/iperf/bin] ./iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size: 41.1 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 62633
[ 4] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 62634
[ 5] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 62635
[ 6] local 10.0.0.10 port 5001 connected with 10.0.0.4 port 62636
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-60.0 sec 7.50 MBytes 1.05 Mbits/sec 0.764 ms 0/ 5350 (0%)
[ 4] 0.0-60.0 sec 7.50 MBytes 1.05 Mbits/sec 0.089 ms 0/ 5350 (0%)
[ 4] 0.0-60.0 sec 1 datagrams received out-of-order
[ 5] 0.0-60.0 sec 7.50 MBytes 1.05 Mbits/sec 0.109 ms 1/ 5350 (0.019%)
[ 6] 0.0-60.0 sec 7.50 MBytes 1.05 Mbits/sec 0.064 ms 3/ 5350 (0.056%)
[SUM] 0.0-60.0 sec 30.0 MBytes 4.19 Mbits/sec
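
One caveat on the UDP numbers above: iperf2 does not send UDP as fast as it can. Unless the client is given a target rate with -b, each UDP stream is paced at roughly 1 Mbit/s, so about 1.05 Mbits/sec per stream with almost no loss is what a healthy link would report as well. A minimal Python sketch of the arithmetic, using only the figures from the output above and assuming the client was run without -b (the helper name is made up for illustration):

```python
# Sanity check of the iperf UDP report above, assuming the client ran without -b
# (so each stream was paced at iperf2's default UDP target rate of about 1 Mbit/s).

DATAGRAM_BYTES = 1470        # "Receiving 1470 byte datagrams"
DATAGRAMS_PER_STREAM = 5350  # total datagrams reported per stream
DURATION_S = 60.0            # the 0.0-60.0 sec interval
STREAMS = 4                  # streams 3, 4, 5 and 6


def udp_rate_mbit(datagrams: int, datagram_bytes: int, seconds: float) -> float:
    """Return the payload rate in Mbit/s (10^6 bits per second)."""
    return datagrams * datagram_bytes * 8 / seconds / 1e6


per_stream = udp_rate_mbit(DATAGRAMS_PER_STREAM, DATAGRAM_BYTES, DURATION_S)
total = per_stream * STREAMS

print(f"per stream:  {per_stream:.2f} Mbit/s")
print(f"all streams: {total:.2f} Mbit/s")
```

The result lines up with the 1.05 and 4.19 Mbits/sec in the report, which suggests the low UDP figure reflects the pacing rather than the link. To actually load the link over UDP, the client would need an explicit target rate via -b.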
 

wildchild

Active Member
Feb 4, 2014
394
57
28
You're running InfiniBand.
I'm sorry, I thought you were running 10GbE, which runs fine on my ConnectX ESXi 6 setup :)

I do see a lot of variation in bandwidth, which usually points to buffers.
I know there is some tuning required to get NFS working well on Windows.
Maybe give iSCSI a try?
 

humbleThC

Just confirmed that NFS in ESX uses TCP, not UDP, so I'm still not sure what's going on here.

Is there some trick to getting ESXi 6 to recognize the MT26428 as iSCSI capable?

[Attached screenshot: upload_2016-11-7_12-18-56.png]

I also confirmed that my Windows 2012 R2 NFS settings are set to use TCP only.
 

wildchild

Well, you have the ConnectX-2 VPI cards, correct?
Those run either IB or 10GbE, just like the ConnectX-3s.
 

rpross3

New Member
Feb 16, 2016
12
3
3
50
I have a similar setup and also find the NFS performance sucks wind. I've exhausted my Google-fu. Very interested in what you come up with.

 

Marsh

Moderator
May 12, 2013
2,207
1,044
113
What command do you use to test the NFS performance?

For the past 18-24 months, I've been using the old Mellanox CX-2 EN cards in my ESXi 6.0u2 and Xpenology hosts.
The Xpenology server has been hosting many VMs via its NFS service without any issue.
 

humbleThC

Just using an NFS3 export for testing.
Inside the VMs, I've tested with things like IOmeter, diskspd, etc.
But it's vMotion, OVF deployment, and any other read/write-intensive operation against the NFS datastore that suffers.

Which is weird, because I've validated that my NFS export is only being shared via TCP (not UDP), and my raw iperf performance for TCP is amazing.
 

humbleThC

Minor update:

So I haven't fixed NFS yet, but I just got my first decent result out of iSCSI. I added a software iSCSI adapter, bound it to the VMkernel port of one of the IB adapters, and got 1.2 GB/s of sustained bandwidth over iSCSI (about the max I would expect from the underlying disks behind it right now). I'm going to have to add a few more SSDs to the Windows pool to see if I can get above that.

I'm still very much interested in getting NFS to perform, however, as it would let me share the same space for ESX/VMs that I use for CIFS/NFS, versus carving out and dedicating specific space for iSCSI.
 

Marsh

I suspect that if you are using Windows NFS, you will be very disappointed with the performance.
Microsoft never really put any serious effort into their NFS server; it is just an afterthought.
Also, Microsoft "got off" the iSCSI train in the past few years.
 

DaSaint

Active Member
Oct 3, 2015
245
55
28
Colorado
On the CX2 I would try FW 2.9.1530 with OFED 2.3.3.1 to see if you can get RDMA and SR-IOV to play nice. That's the build I heard worked well.

IIRC there was also a Dell firmware that could go beyond that, and IIRC FW 2.9.8xxx was needed to get RDMA over SMB to play nice.

IIRC the Dell FW is 2.10.720.
See the thread "Custom Firmware for Mellanox OEM Infiniband Cards - WS2012 RDMA".

You will also be limited by your PCIe bus, so keep that in mind when you want more bandwidth: without a fast enough PCIe slot you won't get high throughput.
 

humbleThC

Yeah, I've got them in PCIe Gen2 x8 slots, which is the full spec of the adapters themselves.
I am running FW 2.10.720 on all my adapters at the moment.
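
For reference, a quick sketch of why a Gen2 x8 slot should not be the bottleneck here. It assumes the standard figures of 5 GT/s per PCIe Gen2 lane and 10 Gbit/s per QDR lane on a 4x link, both with 8b/10b encoding, and it ignores protocol overhead on either side:

```python
# Rough ceilings for the PCIe slot and one QDR port, before protocol overhead.
# Assumed standard figures: PCIe Gen2 = 5 GT/s per lane, QDR = 10 Gbit/s per lane
# on a 4x link, both using 8b/10b encoding (8 data bits per 10 bits on the wire).

ENCODING_EFFICIENCY = 8 / 10


def usable_gbit(rate_per_lane_gbit: float, lanes: int) -> float:
    """Return the data rate in Gbit/s after 8b/10b encoding."""
    return rate_per_lane_gbit * lanes * ENCODING_EFFICIENCY


pcie_gen2_x8 = usable_gbit(5.0, 8)  # the slot the cards sit in
ib_qdr_4x = usable_gbit(10.0, 4)    # a single QDR port

print(f"PCIe Gen2 x8: {pcie_gen2_x8:.0f} Gbit/s ({pcie_gen2_x8 / 8:.0f} GB/s)")
print(f"IB QDR 4x:    {ib_qdr_4x:.0f} Gbit/s ({ib_qdr_4x / 8:.0f} GB/s)")
```

Both ceilings come out at about 32 Gbit/s, far above both the 13.2 Gbits/sec iperf result and the 3 MBytes/s seen over NFS.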

My RDMA over SMB3 is working amazingly well right now, from Windows 10 to Windows 2012 R2, maxing out the underlying disks on all benchmarks and tests.

And as of yesterday, I bound a software iSCSI initiator to separate VMNics on separate subnets, and got multi-pathing working well.

The only real issue is with straight up NFS performance.

I'm still very confused about why NFS is hundreds of times slower than raw iperf TCP, iSCSI, and CIFS.
And I'm not willing to give up on it just yet on the assumption that "oh, that's just Microsoft's NFS implementation".
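
To put rough numbers on that gap, here is a small sketch using only figures quoted in this thread: 3 MBytes/s over NFS, 7.83 and 13.2 Gbits/sec from iperf, and 1.2 GB/s over iSCSI and SMB3. It treats 1 MByte as 10^6 bytes for simplicity, so the ratios are ballpark values only:

```python
# Ballpark comparison of the NFS write rate against the other results in this thread.
# All inputs are figures quoted in the posts; 1 MByte is taken as 10^6 bytes.

NFS_MBYTES_S = 3.0


def gbit_to_mbytes(gbit_per_s: float) -> float:
    """Convert Gbit/s to MBytes/s using decimal units."""
    return gbit_per_s * 1000 / 8


measurements = {
    "iperf TCP, 1 thread": gbit_to_mbytes(7.83),
    "iperf TCP, 2 threads": gbit_to_mbytes(13.2),
    "iSCSI / SMB3": 1200.0,
}

ratios = {name: rate / NFS_MBYTES_S for name, rate in measurements.items()}

for name, rate in measurements.items():
    print(f"{name}: {rate:.0f} MBytes/s, about {ratios[name]:.0f}x NFS")
```

By these figures NFS is somewhere between 300 and 600 times slower than everything else crossing the same link.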
 

wildchild

Glad to hear iSCSI is performing on par. I'm not sure whether MS iSCSI supports VAAI; it would be interesting to know, though.
 

humbleThC

Read and write performance over iSCSI is working "well enough", though still not ideal.

All I did was add an iSCSI software adapter and bind it to both of my VMkernel ports for the IB networks. It discovered both targets, and voilà, MPIO is working between ESX and Win2012 R2.

That being said, when I hit it hard, I'm seeing about 3-5 Gbit/s of network bandwidth, equating to about 300-550 MBytes/s of sustained read/write.

It's not quite the 1.2 GBytes/s sustained read/write I can get from my Windows box using SMB3/RDMA, but it's amazingly better than the 3 MBytes/s write I was getting over NFS in the same setup.

I suspect my max performance is disk-bound at the moment, as I only have a pair of Samsung 850 Evos caching my Hitachi HDD array. I'm getting close to springing for another pair of SSDs plus 5 more HDDs to double the NAS, which should yield 2 GB/s+ of sustained bandwidth, pushing the InfiniBand link closer to 16 Gbit/s.
 

humbleThC

"No, I meant what part of your NFS was sucking: reads and writes, or just writes?"
Actually, it was mostly just writes. I think the read performance was probably OK, because VMs booted up fine and for the most part operated OK. But any storage operation like cloning a VM, vMotion, or deploying a VM via OVF would fail or time out due to poor write performance.
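
Since the symptom is limited to write-heavy operations, one thing worth checking is how the NAS behaves when every write has to be committed before it is acknowledged. This is an assumption, not something confirmed in this thread: ESXi's NFS client is generally understood to request synchronous (stable) writes, which can be far slower than buffered writes on a server without a fast write cache. A minimal Python sketch that compares buffered and fsync-per-block write throughput on any path, for example a folder on the exported volume (the function name, file name, and defaults are made up for illustration):

```python
import os
import time


def write_throughput(path: str, total_mb: int = 16, block_kb: int = 64,
                     sync_each: bool = False) -> float:
    """Write total_mb of data to path and return the throughput in MBytes/s.

    With sync_each=True, every block is pushed to stable storage with
    os.fsync() before the next one is written, which approximates a client
    that demands synchronous writes.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())
        # Flush once at the end so the buffered run also reaches the disk.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


if __name__ == "__main__":
    target = "nfs_write_test.bin"  # hypothetical file name; place it on the volume under test
    try:
        print(f"buffered:        {write_throughput(target):.1f} MBytes/s")
        print(f"fsync per block: {write_throughput(target, sync_each=True):.1f} MBytes/s")
    finally:
        if os.path.exists(target):
            os.remove(target)
```

A large drop between the two runs would point at the storage pool's handling of synchronous writes rather than at the InfiniBand link. It would not prove that this is what ESXi is running into, only that it is worth a closer look.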