Why oh why do I suck at the IB


mpogr

Active Member
Jul 14, 2016
@T_Minus not sure why you would bother with bridging unless you have a mix of old (X2) and newer (X3/X4) cards. That's the only case where the difference between IB and ETH speeds is big enough to justify a bridge. Otherwise, I'd suggest sticking to either IB or ETH (depending on which switch you've got and on OS support, which, as we know, is challenging when ESXi is involved) and avoiding unnecessary complications.
 

epicurean

Active Member
Sep 29, 2014
Sorry to bother you again, @mpogr, but I got a dependency error while trying to remove the 2.4.0 drivers in ESXi. How do I go about this?
 

mpogr

Active Member
Jul 14, 2016
Just for everyone's reference, this is the list of Mellanox modules I've got on my ESXi 6.0U2 hosts (this is including MFT, which is only needed for firmware flashing):

net-ib-cm 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-ib-core 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-ib-ipoib 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-ib-mad 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-ib-sa 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-ib-umad 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-memtrack 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-mlx4-core 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-mlx4-ib 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
net-mst 4.5.0.31-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-12-06
scsi-ib-srp 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
mft 4.5.0.31-0 Mellanox PartnerSupported 2016-12-06
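
For anyone reproducing a list like the one above, here is a minimal sketch of a filter that keeps only the Mellanox packages from `esxcli software vib list` output. It assumes the vendor is the third whitespace-separated column and reads `MEL` or `Mellanox`, as in the listing above. `mlx_vibs` is a made-up helper name, and the `esx-base` sample line is a placeholder, not taken from any host in this thread.

```shell
#!/bin/sh
# Keep only Mellanox VIBs from a VIB listing read on stdin.
# Assumption: column 3 is the vendor ("MEL" or "Mellanox").
mlx_vibs() {
    awk '$3 == "MEL" || $3 == "Mellanox" { print $1, $2 }'
}

# Demo on three sample lines; the esx-base line is a placeholder.
printf '%s\n' \
    'net-mst 4.5.0.31-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-12-06' \
    'mft 4.5.0.31-0 Mellanox PartnerSupported 2016-12-06' \
    'esx-base 0.0.0-0 VMware VMwareCertified 2016-01-01' | mlx_vibs
```

On a host this would be `esxcli software vib list | mlx_vibs`. If removing one of these packages fails with a dependency error, passing all the interdependent names to a single `esxcli software vib remove` call (the `-n` flag can be repeated) is a commonly suggested workaround, though it is not confirmed here for the 2.4.0 driver set.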


Also, and this is important: if you want to use 4K MTU, you need to set the following module parameter on each ESXi host (a reboot is required for it to take effect):
esxcli system module parameters set -m mlx4_core -p='mtu_4k=1'

Then, any virtual switch and VMkernel adapter associated with Mellanox NICs can have MTU set to 4092. Same applies to virtual NICs of VMs connected to this switch.

If you don't do this and have 4K MTU enabled in your SM config, your ESXi hosts won't connect!
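
The 4092 figure is the 4096-byte IB MTU minus the 4-byte IPoIB encapsulation header. Below is a rough sketch of the follow-up steps after the reboot, assuming a standard vSwitch. The names `vSwitch1` and `vmk1` are placeholders, and `run`, `ipoib_mtu` and `set_ipoib_mtu` are hypothetical helpers. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: vSwitch1 and vmk1 are placeholder names, not from this thread.
DRY_RUN=${DRY_RUN:-1}   # 1 = just print the commands, 0 = execute them

run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

# IPoIB MTU = IB MTU minus the 4-byte IPoIB encapsulation header.
ipoib_mtu() {
    echo $(( $1 - 4 ))
}

# Args: vSwitch name, VMkernel interface, IB MTU in bytes.
set_ipoib_mtu() {
    mtu=$(ipoib_mtu "$3")
    # Check that the mtu_4k parameter is really set on the module.
    run esxcli system module parameters list -m mlx4_core
    # Raise the MTU on the vSwitch, then on the VMkernel adapter.
    run esxcli network vswitch standard set -v "$1" -m "$mtu"
    run esxcli network ip interface set -i "$2" -m "$mtu"
}

set_ipoib_mtu vSwitch1 vmk1 4096
```

Printing first makes it easy to review the exact commands before touching a production host. A distributed vSwitch would need its MTU changed from vCenter instead.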
 

mpogr

Active Member
Jul 14, 2016
@mpogr ...I'm not sure how it works for bridging and SM and/or if I need to bridge them or just run separate networks...
There is another important piece of information related to bridging that can save some frustration. I have a newer SX6012 switch that has been hacked from its original EMC state to run MLNX-OS. While I was at it, I also hacked it to enable all the features, including the gateway.
What I discovered is that you can't run both the SM and the Gateway on the switch at the same time. This means that if you want to use the built-in Gateway functionality and route traffic between IB and ETH ports, you need to disable the built-in SM and run the SM on another computer or switch connected to one of its IB ports. Annoying as it is, that's the reality.
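
Moving the SM off the switch could look roughly like this on a Linux machine cabled to one of the IB ports. This is a sketch under assumptions: the host has the `opensm` and `infiniband-diags` packages installed, the switch's own SM has already been disabled, and `start_external_sm` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch for a Linux host attached to one of the switch's IB ports.
start_external_sm() {
    if ! command -v opensm >/dev/null 2>&1; then
        echo "opensm not found; install it first" >&2
        return 1
    fi
    opensm -B    # -B: run the subnet manager in the background as a daemon
    sleep 5      # give it a moment to sweep the fabric
    sminfo       # query the fabric for the current master SM
}

# start_external_sm   # uncomment to run this on the SM host
```

If `sminfo` still reports an SM other than the one just started, the switch's SM is probably still enabled or another host is winning the master election.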
 

epicurean

Active Member
Sep 29, 2014
@mpogr,
How do I determine whether the firmware on my cards is suitable for the IPoIB and RDMA I intend to use, and whether an update to a particular firmware version is needed?
What MFT version would you advise for flashing that firmware?
 

mpogr

Active Member
Jul 14, 2016
Not sure about ConnectX2, as I switched to X3 a while ago. I think the latest official firmware should be OK for ESXi. I read somewhere that some Dell firmware enables RDMA under Windows Server (which doesn't work with the official firmware), but I doubt it makes any difference on the ESXi side. I did try flashing it and it worked OK under ESXi, so at least it doesn't cause any harm...
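
To see what a card is currently running before deciding on an update, the MFT `flint` tool can query it. This is a minimal sketch that reduces `flint ... query` output to the two fields that matter most here: the firmware version and the PSID (which identifies the board variant, since firmware images are PSID-specific). `fw_summary` is a hypothetical helper, and the sample values are placeholders, not real readings.

```shell
#!/bin/sh
# Extract firmware version and PSID from `flint -d <device> query` output on stdin.
fw_summary() {
    awk -F': *' '/^FW Version:|^PSID:/ { print $1 "=" $2 }'
}

# Demo on placeholder text shaped like flint's query output.
printf '%s\n' \
    'Image type:      FS2' \
    'FW Version:      2.0.0000' \
    'PSID:            MT_0000000000' | fw_summary
```

On an ESXi host with MFT installed, the tools typically live under `/opt/mellanox/bin`, so the real call would be along the lines of `/opt/mellanox/bin/flint -d <device> query | fw_summary`, with the device name taken from `/opt/mellanox/bin/mst status`.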
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
Thanks. I have about a dozen NIB X2s but have slowly started acquiring X3 VPI cards (maybe some EN too, I think), so it sounds like I'll sell off the X2s and keep the X3s for simplicity all around.
 

mpogr

Active Member
Jul 14, 2016
what's wrong with the X2? :)
X2 has already been officially phased out by Mellanox. No current official drivers for ESXi or Windows support it. Every new build of ESXi 6.0 has the potential to break the unofficial workarounds described earlier in this thread.
That said, it's completely unclear whether Mellanox is going to continue supporting X3 in ESXi 6.5. Right now there are no drivers at all for 6.5 that support either SRP or iSER, and this is really bad...
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
@epicurean for sale cheap :p LOL!! NIB!! I think a lot of people are still using them for 10G. Or maybe, @mpogr, it's time for me to just dump all this gear and go 40GigE. Screw it, 100GigE :eek: if only I had the budget!