Infiniband PCIe card preventing boot in one server but not another

Discussion in 'Networking' started by alltheasimov, May 12, 2018.

  1. alltheasimov

    alltheasimov Member

    Joined:
    Feb 17, 2018
    Messages:
    48
    Likes Received:
    7
    Hi,

    I have a weird problem. I purchased 4x used identical Sun QDR X4242A Infiniband cards (Mellanox MHQH29B rebrands) for my 4-node Supermicro 6027TR-HTR (X9DRT-HF) server. Three of the four cause boot to hang at POST code 91, which is when the PCI stuff is loaded. The fourth one lets the node boot fine. Switching nodes/PCI slots doesn't help, and I know all the PCI slots work fine. "OK, so you have 3 dead cards."...except not. Here's the weird part: they all boot fine in my workstation (i7-5960X, X99-SLI motherboard) and are recognized by lspci and ibstat.

    I did some more troubleshooting. The only difference I could find was the card firmware version. The one that works in the supermicro server has 2010 firmware while all of the others have 2012 firmware. That's kind of odd to me...you'd think the older firmware would have problems, not the newer.

    My Supermicro server nodes had a 2013 BIOS, which I updated (after updating IPMI) to the latest (2015) BIOS. Same problem. I tried various BIOS settings; none helped. I tried the SMBus pin covering trick, but that didn't help either. The older firmware card still works with the system, but the newer firmware cards all prevent boot from completing.

    Any ideas?

    I don't have a Sun/Oracle support contract, so I can't access any of the firmware updates for their cards. I also don't have access to the older firmware version, so I can't roll back my 2012 cards. I could try to reflash with the latest ConnectX-2 Mellanox firmware for the MHQH29B using my desktop, but I'd like to exhaust all other options because that looks difficult to do.

    Thanks
     
    #1
  2. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    2,360
    Likes Received:
    293
    #2
    alltheasimov likes this.
  3. alltheasimov

    alltheasimov Member

    Nice. I didn't know that extracting firmware was possible.

    I think I'll try two things: 1. Flash one to Mellanox firmware and see if that works. 2. If it does not, extract the firmware from the older Sun firmware card and flash all of the cards with it.

    Do you know where I can find the latest Mellanox ConnectX-2 firmware for the MHQH29B? I think I read somewhere that what they have on their website isn't actually the latest.

    Thanks
     
    #3
  4. Rand__

    Rand__ Well-Known Member

    Latest is not always best with MLX... but I'd get the one from the MLX page, not sure where else (HP?) you would get it
     
    #4
  5. alltheasimov

    alltheasimov Member

    I went back through all my bookmarks. There are two versions that people seem to use:
    • 2.9.1000, which is the only firmware available for these cards on Mellanox's site now that they've taken down the custom firmware table.
    • 2.10.720 (link 1 2), which apparently allows Windows machines to use RDMA. One person in the first link is still hosting the files for the MHQH19 and MHQH29. The original files were in the custom firmware table, which, again, is now gone.
    Interestingly, all of these are lower numbers than the firmwares in my Sun cards. I assumed the Sun numbers were dates, but perhaps they are actually firmware revisions. Does anyone know if the Sun fw revisions are numbered consistently with the Mellanox ones? If so, both of my cards have firmware newer than these versions.
     
    #5
  6. alltheasimov

    alltheasimov Member

    The firmware on my Sun cards was 2.11.2012, which doesn't allow boot, and 2.11.2010, which does allow boot in the SM server. I copied the 2.11.2010 firmware off the working card, then flashed it to one of the 2.11.2012 cards using my desktop (both versions work in it). The flash went fine. Then I put the newly flashed card in my SM server. Success! It boots and ibstat shows link up at rate 40 (QDR).

    That means that there is something in the 2.11.2012 firmware that prevents this SM server from booting. Heck knows what. I'll post a complete guide for firmware extraction and burning, along with all of the files I've dug up for these cards, soon.
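The copy-and-reflash step described above can be sketched roughly as follows. This assumes Mellanox's flint tool from the MFT package, which the post does not name, and a made-up device path; the real one is listed by `mst status`. The script only builds and prints the command lines and does not run them.

```python
import shlex

# Hypothetical MST device path for a ConnectX-2 card; check `mst status` for the real one.
EXAMPLE_DEVICE = "/dev/mst/mt26428_pci_cr0"


def flint_cmd(device, *args):
    """Build a flint command line as an argument list (nothing is executed)."""
    return ["flint", "-d", device, *args]


def backup_and_flash_plan(source_dev, target_dev, image="sun_fw_backup.bin"):
    """Return, in order, the commands for copying firmware from one card to another."""
    return [
        flint_cmd(source_dev, "query"),              # note firmware version and PSID
        flint_cmd(source_dev, "ri", image),          # read the image off the working card
        flint_cmd(target_dev, "query"),              # confirm the target reports the same PSID
        flint_cmd(target_dev, "-i", image, "burn"),  # write the saved image to the target
    ]


if __name__ == "__main__":
    # Same path twice here because the cards were swapped in and out of one desktop.
    for cmd in backup_and_flash_plan(EXAMPLE_DEVICE, EXAMPLE_DEVICE):
        print(shlex.join(cmd))
```

Printing instead of executing is deliberate: a wrong device or image can brick a card, so it is worth reading each line before running it by hand.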
     
    #6
  7. Rand__

    Rand__ Well-Known Member

    Glad to hear it works - and told you newer is not always better;)
     
    #7
  8. alltheasimov

    alltheasimov Member

    Do you know if the Sun fw revisions are numbered consistently with the mellanox ones?
     
    #8
  9. Rand__

    Rand__ Well-Known Member

    No idea, sorry. Can you compare change histories?
     
    #9
  10. i386

    i386 Well-Known Member

    Joined:
    Mar 18, 2016
    Messages:
    1,324
    Likes Received:
    302
    If he has access to the Oracle download/update/support portal, he could check the release notes.
    (I think I read in the OP that he had no access to Oracle downloads, so it's unlikely.)
     
    #10
  11. Rand__

    Rand__ Well-Known Member

    ah good point :/
     
    #11
  12. rippiedoos

    rippiedoos New Member

    Joined:
    Mar 7, 2018
    Messages:
    8
    Likes Received:
    1
    The release notes for the firmware say that the BAR space has been increased from 8MB to 128MB. Result: no boot in some Sun machines.

    The accompanying article says the issue can be avoided by changing BIOS settings as described below (if you can):
    1. Access the BIOS Setup Utility.
    2. Navigate to RC Settings > QPI and change
      * MMIOH Size per IOH from 2Gb to 4Gb (the default is 2Gb).
    3. Navigate to Chipset > North Bridge Configuration and change
      * PCI MMIO 64 Bits Support to Enabled (the default is Disabled)
    4. Save your changes and exit the BIOS Setup Utility.
    5. Reboot the server.
    Furthermore, the 2012 firmware is equal to the generic firmware version 2.11.2010, but with increased BAR space (from 8 MBytes to 128 MBytes per function).
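One way to see which BAR size a given card is actually requesting is to read the PCI device's `resource` file in Linux sysfs, where each line holds a start address, end address and flags in hex. A minimal sketch, assuming a Linux host; the PCI address and the sample table below are made up for illustration and are not taken from these cards.

```python
from pathlib import Path

MIB = 1024 * 1024


def parse_resource_table(text):
    """Return (index, start, size_bytes) for each populated line of a sysfs `resource` file."""
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused region
        regions.append((index, start, end - start + 1))
    return regions


def read_bars(bdf):
    """Read the regions of one device, e.g. read_bars("0000:03:00.0") (hypothetical address)."""
    path = Path("/sys/bus/pci/devices") / bdf / "resource"
    return parse_resource_table(path.read_text())


# Made-up sample: a 1 MiB region and a 128 MiB region.
SAMPLE = """\
0x00000000fbd00000 0x00000000fbdfffff 0x0000000000140204
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000f0000000 0x00000000f7ffffff 0x000000000014220c
"""

if __name__ == "__main__":
    for index, start, size in parse_resource_table(SAMPLE):
        print(f"region {index}: start=0x{start:x} size={size // MIB} MiB")
```

Comparing the output for a 2.11.2010 card and a 2.11.2012 card in a machine where both boot should show whether the size difference matches the release notes.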
     
    #12
    alltheasimov likes this.
  13. alltheasimov

    alltheasimov Member

    Thank you for the good responses. Most of this will go in the guide I'm writing up.

    I do not have a support contract, so I do not have access.

    I do not have those BIOS settings in my SM server. However, I thought they might be similar to some of SM's settings that I hadn't tried yet, so I messed around in the BIOS some more. I know from working with Xeon Phis that larger BAR spaces are called different things in different servers, e.g. "above 4G decoding", "large PCI MMIO", "large BAR support". Specifically for my server, under the PCIe/PCI/PNP Configuration tab, if I enable "Above 4G Decoding", then boot works! Furthermore, ibstat shows link up at rate 40 (QDR). So I believe this is a total success.
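To check where the firmware ended up placing a card's memory regions after toggling that setting, the "Memory at ..." lines from `lspci -v` can be classified by address. A small sketch; the sample lines are invented to show the format, and the regex only covers the common form of the line (it skips unassigned or disabled regions).

```python
import re

FOUR_GIB = 1 << 32

# Matches lines such as: Memory at fbd00000 (64-bit, non-prefetchable) [size=1M]
MEM_LINE = re.compile(
    r"Memory at ([0-9a-f]+) \((\d+)-bit, (?:non-)?prefetchable\) \[size=(\d+[KMG]?)\]"
)


def classify_regions(lspci_text):
    """Return (address, size_label, mapped_above_4g) for each memory region found."""
    results = []
    for addr_hex, _bits, size_label in MEM_LINE.findall(lspci_text):
        address = int(addr_hex, 16)
        results.append((address, size_label, address >= FOUR_GIB))
    return results


# Invented sample in the style of `lspci -v` output.
SAMPLE = """\
\tMemory at fbd00000 (64-bit, non-prefetchable) [size=1M]
\tMemory at 3800000000 (64-bit, prefetchable) [size=128M]
"""

if __name__ == "__main__":
    for address, size_label, above in classify_regions(SAMPLE):
        where = "above 4G" if above else "below 4G"
        print(f"0x{address:x} size={size_label} mapped {where}")
```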

    I bet that the no-boot problem isn't limited to Supermicro servers. It might be a fairly universal problem with any firmware version (probably all newer ones) that requires larger BAR-space.

    Does that mean the 2.11.2010 follows the Mellanox firmware numbering scheme, and therefore is newer than Mellanox 2.9.1000? Or does Sun use a different numbering scheme for firmware?

    Andreas: Since you have access to the Oracle downloads, can you tell me a bit about what is available? I don't think I need anything since I've found two solutions, but I am putting together a big guide for future people. Are there any newer versions of the firmware for these HCAs? Do they have multiple firmware versions for every ConnectX-2/ConnectX-3 card, or just one like Mellanox's website? Do they have their own flashing tool, or do they use Mellanox's? Thanks!
     
    #13
  14. rippiedoos

    rippiedoos New Member

    Firmware 2.11.2012 is listed as the last available firmware for an Oracle Engineered System (Exadata), and 2.11.2010 as the last for 'normal' systems. I see one other device with the CX2 chip and one with the CX3 chip. The CX3 card has a newer firmware, but both CX2 cards have these versions listed as the latest.

    The notes for all 3 devices state that the non-default version numbers have BAR-space changes; the others seem to be the same as the stock Mellanox firmware.

    And in regards to firmware-versions, it looks like the version numbers are the same or similar enough to be on par with the stock firmware with some Oracle/Sun-specific changes added.
     
    #14
    Last edited: May 14, 2018
  15. alltheasimov

    alltheasimov Member

    Excellent information. Thank you. So we've determined that these firmware versions are the latest. It sounds like they created the 2.11.2012 version specifically for motherboards that allow for BAR-space increases, and that 2.11.2010 should be used for all normal motherboards. If we ignore the special large BAR-space firmwares, then Sun, like Mellanox, does not provide more than 1 firmware version per card type. It's also likely that the Sun numbering scheme is the same as Mellanox's, meaning that the firmware they provide is much newer than the version Mellanox does (2.9.1000). Also interesting is that the newer MHQH29C has the same Mellanox firmware listed (2.9.1000). The latest one HP lists for their equivalent to the MHQH29B/C cards is 2.9.1530. The latest Mellanox firmware used commonly here to enable RDMA for Windows on these cards is 2.10.720. So Sun/Oracle's seems to be the most recent for the MHQH29B. Interesting.
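If the Sun builds really do follow Mellanox's dotted numbering, which this thread suggests but does not confirm, the versions mentioned so far can be ordered by comparing each field as an integer. A plain string sort would get this wrong, since it puts 2.10.720 before 2.9.1000. A quick sketch:

```python
def parse_fw_version(text):
    """Split a dotted version such as '2.11.2010' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Versions mentioned in this thread (Mellanox, HP and Sun/Oracle builds).
SEEN_IN_THREAD = ["2.9.1000", "2.11.2012", "2.10.720", "2.9.1530", "2.11.2010"]


def oldest_to_newest(versions):
    """Sort version strings numerically, field by field."""
    return sorted(versions, key=parse_fw_version)


if __name__ == "__main__":
    print("numeric order:", oldest_to_newest(SEEN_IN_THREAD))
    print("string order: ", sorted(SEEN_IN_THREAD))
```

This only orders the numbers; it says nothing about whether a Sun 2.11.x build contains everything in a Mellanox 2.10.x one.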
     
    #15
  16. alltheasimov

    alltheasimov Member

    Ooo, another interesting question: What do you think the performance impacts of enabling the larger BAR-space are? I assume it must improve performance or they wouldn't do it. Anyone know?

    UPDATE: here
     
    #16
    Last edited: May 15, 2018
  17. rippiedoos

    rippiedoos New Member

    Sun isn't the first case where later versions exist, frequently made by the ODM itself, that are only released to OEMs. I've seen this with IBM, HP and Dell. The latest ConnectX-2 firmware from HPE, for instance, is 2.9.1530 (which also increases the BAR size, from 8MB to 32MB).

    And with Broadcom/Emulex it's even worse. Newer firmware is still being pushed to OEMs, while the versions on the Broadcom/Emulex site itself are ancient by comparison.
     
    #17
  18. alltheasimov

    alltheasimov Member

    So here the ODM is Mellanox, and the OEMs are Sun/Oracle, IBM, HP, Dell, etc?

    Nice catch with HP increasing the BAR size. I wonder if 32MB would work without turning on above 4G decoding. I bet most motherboards have a default upper limit on BAR size.

    So the versions on the Broadcom/Emulex site are ancient but the OEMs have access to newer versions? *sigh* ...this is all rather silly
     
    #18
  19. alltheasimov

    alltheasimov Member

    Here is my guide for flashing firmware to these cards. It pulls from various sources and covers flashing firmware with and without PSID changes. It also summarizes everything I've done about this specific issue.
     
    #19
    JustinClift, gzorn and rippiedoos like this.
  20. gzorn

    gzorn Member

    Joined:
    Jan 10, 2017
    Messages:
    67
    Likes Received:
    12
    Just saw this thread. I have a similar(-ish) problem with a Dell-branded ConnectX-3 (312 dual port). It works in a Supermicro X11SSi-ln4f but not in a Gigabyte G87 UD3H. Machine wouldn't even boot. Crossflashing to the latest Mellanox firmware screwed up the LEDs but alas no improvement in booting on the Supermicro. I wasn't aware of the SMBus pin masking. Maybe I'll give that a shot.
     
    #20
