Why oh why do I suck at the IB

Discussion in 'Networking' started by whitey, Nov 29, 2016.

  1. mpogr

    mpogr Member

    Joined:
    Jul 14, 2016
    Messages:
    78
    Likes Received:
    34
    @T_Minus not sure why you would bother with bridging unless you have a mix of old (X2) and newer (X3/X4) cards. That's the only case where the difference between IB and ETH speeds is big enough to justify a bridge. Otherwise, I'd suggest sticking to either IB or ETH (depending on what switch you've got and on OS support, which, as we know, is challenging when ESXi is involved) and avoiding unnecessary complications.
     
    #41
  2. epicurean

    epicurean Member

    Joined:
    Sep 29, 2014
    Messages:
    305
    Likes Received:
    7
    Sorry to bother you again @mpogr, but I got a dependency error while trying to remove the 2.4.0 drivers in ESXi. How do I get around this?
     
    #42
  3. mpogr

    mpogr Member

    Joined:
    Jul 14, 2016
    Messages:
    78
    Likes Received:
    34
    Add the -f flag: esxcli software vib remove -f -n... -n...
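
    A minimal sketch of how that command line is put together, assuming a POSIX shell such as the one on the ESXi host. The helper name build_vib_remove_cmd is hypothetical, and the two VIB names in the usage line are taken from the 1.8.2.5 module list elsewhere in this thread purely as an illustration; they are not necessarily the names of the 2.4.0 driver VIBs. The function only prints the command, it does not run it.

```shell
# Build (but do not execute) an "esxcli software vib remove" command
# with one -n option per VIB name passed as an argument.
# -f is the force flag, which skips the dependency checks.
build_vib_remove_cmd() {
    cmd="esxcli software vib remove -f"
    for vib in "$@"; do
        cmd="$cmd -n $vib"
    done
    echo "$cmd"
}

# Illustrative VIB names only; substitute the ones reported by
# "esxcli software vib list" on your own host.
build_vib_remove_cmd net-ib-core net-mlx4-core
```

    Printing the command first gives you a chance to review the list before forcing the removal. esxcli software vib remove also accepts --dry-run, which reports what would be removed without changing the system.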
     
    #43
    T_Minus and epicurean like this.
  4. mpogr

    mpogr Member

    Joined:
    Jul 14, 2016
    Messages:
    78
    Likes Received:
    34
    Just for everyone's reference, this is the list of Mellanox modules I've got on my ESXi 6.0U2 hosts (this is including MFT, which is only needed for firmware flashing):

    net-ib-cm 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-ib-core 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-ib-ipoib 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-ib-mad 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-ib-sa 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-ib-umad 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-memtrack 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-mlx4-core 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-mlx4-ib 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    net-mst 4.5.0.31-1OEM.600.0.0.2494585 MEL PartnerSupported 2016-12-06
    scsi-ib-srp 1.8.2.5-1OEM.600.0.0.2494585 MEL PartnerSupported 2017-01-07
    mft 4.5.0.31-0 Mellanox PartnerSupported 2016-12-06


    Also, and this is important, if you want to use 4K MTU, you need to set the following module parameter on the ESXi hosts (a reboot is required):
    esxcli system module parameters set -m mlx4_core -p='mtu_4k=1'

    Then, any virtual switch and VMkernel adapter associated with Mellanox NICs can have MTU set to 4092. Same applies to virtual NICs of VMs connected to this switch.

    If you don't do this and have MTU4K enabled in your SM config, your ESXi hosts won't connect!
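
    For reference, a rough sketch of where the 4092 comes from and of the follow-up MTU commands. IPoIB puts a 4-byte encapsulation header in front of each packet, so the usable IP MTU is the IB MTU minus 4 (4096 gives 4092, and a 2K IB MTU gives 2044). The function names, vSwitch1 and vmk1 below are placeholders assumed for illustration, not values from this thread. The script only prints the esxcli commands, it does not run them.

```shell
# Usable IPoIB MTU = InfiniBand MTU minus the 4-byte IPoIB header.
ipoib_mtu() {
    echo $(( $1 - 4 ))
}

# Print (not run) the esxcli commands that set the MTU on one standard
# vSwitch and one VMkernel interface.
# $1 = vSwitch name, $2 = VMkernel interface, $3 = InfiniBand MTU
print_mtu_cmds() {
    mtu=$(ipoib_mtu "$3")
    echo "esxcli network vswitch standard set -v $1 -m $mtu"
    echo "esxcli network ip interface set -i $2 -m $mtu"
}

# Placeholder names; use the vSwitch and vmk attached to the Mellanox NICs.
print_mtu_cmds vSwitch1 vmk1 4096
```

    After the reboot, esxcli system module parameters list -m mlx4_core should show whether mtu_4k is actually set before you raise the MTU on the vSwitch and VMkernel adapter.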
     
    #44
    Last edited: Jan 10, 2017
  5. mpogr

    mpogr Member

    Joined:
    Jul 14, 2016
    Messages:
    78
    Likes Received:
    34
    There is another important piece of information related to bridging that can save some frustration. I have a newer SX6012 switch that has been hacked from its original EMC state to run MLNX-OS. Along the way, I also hacked it to have all the features enabled, including the gateway.
    What I discovered is that you can't run both the SM and the Gateway on the switch at the same time. This means that if you want to use its built-in Gateway functionality and route traffic between IB and ETH ports, you need to disable the built-in SM and run one on another computer/switch connected to one of its IB ports. Annoying as it is, that's the reality.
     
    #45
    Rand__, epicurean and T_Minus like this.
  6. epicurean

    epicurean Member

    Joined:
    Sep 29, 2014
    Messages:
    305
    Likes Received:
    7
    @mpogr,
    How do I determine whether the firmware on my cards is suitable for the IPoIB and RDMA that I intend to use, and whether an update to a particular firmware version is needed?
    What MFT version would you advise for flashing that firmware?
     
    #46
  7. mpogr

    mpogr Member

    Joined:
    Jul 14, 2016
    Messages:
    78
    Likes Received:
    34
    Not sure about ConnectX-2, as I switched to X3 a while ago. I think, for ESXi, the latest official firmware should be OK. I read somewhere that some Dell firmware enabled RDMA under Windows Server (which wasn't working with the official firmware), but I doubt it would make any difference on the ESXi side. I did try flashing it and it worked OK under ESXi though, so at least it doesn't cause any harm...
     
    #47
    epicurean likes this.
  8. T_Minus

    T_Minus Moderator

    Joined:
    Feb 15, 2015
    Messages:
    5,109
    Likes Received:
    927
    Thanks, I have about a dozen NIB X2s but slowly started acquiring x3 VPI (maybe some EN too I think) so it sounds like I'll sell off the x2s and keep x3s for simplicity all around.
     
    #48
  9. epicurean

    epicurean Member

    Joined:
    Sep 29, 2014
    Messages:
    305
    Likes Received:
    7
    what's wrong with the X2? :)
     
    #49
  10. mpogr

    mpogr Member

    Joined:
    Jul 14, 2016
    Messages:
    78
    Likes Received:
    34
    X2 has already been officially phased out by Mellanox. No official current drivers for ESXi and Windows support it. Every new build of ESXi 6.0 has a potential of breaking the unofficial workarounds described above in this thread.
    That said, it's completely unclear if Mellanox are going to continue supporting X3 in ESXi 6.5. Right now there are no drivers at all for 6.5 supporting either SRP or iSER, and this is really bad...
     
    #50
  11. T_Minus

    T_Minus Moderator

    Joined:
    Feb 15, 2015
    Messages:
    5,109
    Likes Received:
    927
    @epicurean for sale cheap :p LOL!! NIB!! I think a lot of people are still using them for 10G. Or maybe, @mpogr, it's time for me to just dump all this gear and go 40GigE. Screw it, 100GigE :eek: if only I had the budget!
     
    #51