Infiniband PCIe card preventing boot in one server but not another

alltheasimov

Member
Feb 17, 2018
59
12
8
33
Hi,

I have a weird problem. I purchased four identical used Sun QDR X4242A InfiniBand cards (Mellanox MHQH29B rebrands) for my 4-node Supermicro 6027TR-HTR (X9DRT-HF) server. Three of the four cause boot to hang at POST code 91, which is when the PCI devices get initialized. The fourth one lets the node boot fine. Switching nodes/PCIe slots doesn't help, and I know all the PCIe slots work fine. "OK, so you have 3 dead cards."...except not. Here's the weird part: all four boot fine in my workstation (i7-5960X, X99-SLI motherboard) and are recognized by lspci and ibstat.

I did some more troubleshooting. The only difference I could find was the card firmware version. The one that works in the Supermicro server has 2010 firmware, while all of the others have 2012 firmware. That's kind of odd to me...you'd think the older firmware would have problems, not the newer.

My Supermicro server nodes had a 2013 BIOS, which I updated (after updating IPMI) to the latest (2015) BIOS. Same problem. I tried various BIOS settings; none helped. I tried the SMBus pin covering trick, but that didn't help either. The older firmware card still works with the system, but the newer firmware cards all prevent boot from completing.

Any ideas?

I don't have a Sun/Oracle support contract, so I can't access any of the firmware updates for their cards. I also don't have access to the older firmware version, so I can't roll back my 2012 cards. I could try to reflash with the latest ConnectX-2 Mellanox firmware for the MHQH29B using my desktop, but I'd like to exhaust all other options because that looks difficult to do.

Thanks
 

alltheasimov

Nice. I didn't know that extracting firmware was possible.

I think I'll try two things: 1. Flash one to Mellanox firmware and see if that works. 2. If it does not, extract the firmware from the older Sun firmware card and flash all of the cards with it.

Do you know where I can find the latest Mellanox ConnectX-2 firmware for the MHQH29B? I think I read somewhere that what they have on their website isn't actually the latest.

Thanks
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
Latest is not always best with MLX... but I'd get the one from the MLX page, not sure where else (HP?) you would get it
 

alltheasimov

I went back through all my bookmarks. There are two versions that people seem to use:
  • 2.9.1000, which is the only firmware available for these cards on Mellanox's site now that they've taken down the custom firmware table.
  • 2.10.720 (link 1 2), which apparently allows Windows machines to use RDMA. One person in the first link is still hosting the files for the MHQH19 and MHQH29. The original files were in the custom firmware table, which, again, is now gone.
Interestingly, all of these are lower numbers than the firmware versions on my Sun cards. I assumed those were dates, but perhaps they are actually firmware revisions. Does anyone know if the Sun fw revisions are numbered consistently with the Mellanox ones? If so, both firmware versions on my cards are newer than these.
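
For anyone comparing these numbers by eye: if the Sun versions do follow the same major.minor.subminor scheme as Mellanox (an assumption, nothing in this thread confirms it outright), they should be compared field by field as integers, not as plain strings. A plain string sort would put 2.10.720 before 2.9.1000. A minimal Python sketch using only the version numbers mentioned in this thread; `parse_fw_version` is just an illustrative helper name:

```python
def parse_fw_version(text):
    """Turn '2.10.720' into (2, 10, 720) so each field compares numerically."""
    return tuple(int(part) for part in text.strip().split("."))

# Version strings that come up in this thread (Mellanox, HP and Sun/Oracle).
versions = ["2.9.1000", "2.11.2012", "2.10.720", "2.9.1530", "2.11.2010"]

# Sort oldest to newest under the assumed major.minor.subminor scheme.
ordered = sorted(versions, key=parse_fw_version)
print(ordered)
```

Under that assumption the two Sun versions sort last, i.e. they would be the newest of the bunch.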
 

alltheasimov

The firmware on my Sun cards was 2.11.2012, which doesn't allow boot, and 2.11.2010, which does allow boot in the SM server. I copied the 2.11.2010 firmware off the working card, then flashed it to one of the 2.11.2012 cards using my desktop (where both versions work). The flash went fine. Then I put the newly flashed card in my SM server. Success! It boots and ibstat shows link up at rate 40 (QDR).
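
If you have several cards to check after flashing, the ibstat check can be scripted. Below is a rough Python sketch that pulls the firmware version and per-port state/rate out of ibstat-style text. The `SAMPLE` text is made up to resemble the general shape of ibstat output for one HCA (it is not a capture from these cards), and `parse_ibstat` is a hypothetical helper name.

```python
import re

# Made-up sample in the general shape of `ibstat` output for a single HCA.
SAMPLE = """\
CA 'mlx4_0'
    CA type: MT26428
    Number of ports: 2
    Firmware version: 2.11.2010
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 10
"""

def parse_ibstat(text):
    """Return (firmware_version, {port_number: {field: value}})."""
    firmware = None
    ports = {}
    current = None  # dict of the port block currently being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        port = re.fullmatch(r"Port (\d+):", line)
        if port:
            current = ports.setdefault(int(port.group(1)), {})
        elif line.startswith("CA '"):
            current = None  # a new adapter block starts; leave the port block
        elif line.startswith("Firmware version:"):
            firmware = line.split(":", 1)[1].strip()
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return firmware, ports

firmware, ports = parse_ibstat(SAMPLE)
for number, fields in sorted(ports.items()):
    print(number, fields["State"], fields["Rate"])
```

To run it against a real machine you would feed it the text printed by ibstat instead of `SAMPLE`. With more than one HCA installed, this sketch keeps only the last firmware version it sees and merges ports that share a number.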

That means that there is something in the 2.11.2012 firmware that prevents this SM server from booting. Heck knows what. I'll post a complete guide for firmware extraction and burning, along with all of the files I've dug up for these cards, soon.
 

i386

Well-Known Member
Mar 18, 2016
4,221
1,540
113
34
Germany
If he has access to the Oracle download/update/support portal, he could check the release notes.
(I think I read in the OP that he had no access to Oracle downloads, so it's unlikely.)
 

rippiedoos

New Member
Mar 7, 2018
26
8
3
In the release notes of the firmware, it says that the BAR space has been increased from 8MB to 128MB. Result: no boot in some Sun machines.

In the accompanying article it says the issue can be avoided by changing BIOS settings as described below (if you can):
  1. Access the BIOS Setup Utility.
  2. Navigate to RC Settings > QPI and change
    * MMIOH Size per IOH from 2Gb to 4Gb (the default is 2Gb).
  3. Navigate to Chipset > North Bridge Configuration and change
    * PCI MMIO 64 Bits Support to Enabled (the default is Disabled)
  4. Save your changes and exit the BIOS Setup Utility.
  5. Reboot the server.
Furthermore, the 2012 firmware is equal to the generic firmware version 2.11.2010, but has increased BAR space (from 8 MBytes to 128 MBytes per function).
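
One way to see which variant a card is running, on a machine where it does boot, is to look at the memory region sizes that `lspci -vv` reports for the device. Here is a small Python sketch that extracts those sizes from lspci-style "Region" lines. The sample lines are invented to illustrate an 8M and a 128M case (addresses and region numbers are made up, and the release notes quoted above don't say which region grows), and `bar_sizes` is a hypothetical helper name.

```python
import re

# Invented lines in the shape `lspci -vv` uses for memory BARs.
SAMPLE_SMALL_BAR = """\
Region 0: Memory at fbd00000 (64-bit, non-prefetchable) [size=1M]
Region 2: Memory at f8000000 (64-bit, prefetchable) [size=8M]
"""
SAMPLE_LARGE_BAR = """\
Region 0: Memory at fbd00000 (64-bit, non-prefetchable) [size=1M]
Region 2: Memory at e0000000 (64-bit, prefetchable) [size=128M]
"""

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
REGION = re.compile(
    r"Region (\d+): Memory at \S+ \([^)]*\) \[size=(\d+)([KMG]?)\]"
)

def bar_sizes(text):
    """Map region index -> size in bytes for every memory region line found."""
    return {
        int(m.group(1)): int(m.group(2)) * UNITS[m.group(3)]
        for m in REGION.finditer(text)
    }

for name, sample in (("small", SAMPLE_SMALL_BAR), ("large", SAMPLE_LARGE_BAR)):
    sizes = bar_sizes(sample)
    print(name, {idx: size // 1024 ** 2 for idx, size in sizes.items()}, "MiB")
```

A region marked 64-bit can in principle be mapped above 4 GB, but only if the BIOS is set up to place devices there, which would fit with the BIOS settings listed above.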
 

alltheasimov

Thank you for the good responses. Most of this will go in the guide I'm writing up.

If he has access to oracle download/update/support portal he could check the release notes.
I do not have a support contract, so I do not have access.

I do not have those BIOS settings in my SM server. However, I thought they might be similar to some of SM's settings that I hadn't tried yet, so I messed around in the BIOS some more. I know from working with Xeon Phis that larger BAR spaces are called different things in different servers, e.g. "above 4G decoding", "large PCI MMIO", "large BAR support". Specifically for my server, under the PCIe/PCI/PNP Configuration tab, if I enable "Above 4G Decoding", then boot works! Furthermore, ibstat shows link up at rate 40 (QDR). So I believe this is a total success.

I bet that the no-boot problem isn't limited to Supermicro servers. It might be a fairly universal problem with any firmware version (probably all newer ones) that requires larger BAR-space.

the 2012-firmware is equal to the generic firmware version 2.11.2010
Does that mean the 2.11.2010 follows the Mellanox firmware numbering scheme, and therefore is newer than Mellanox 2.9.1000? Or does Sun use a different numbering scheme for firmware?

Andreas: Since you have access to the Oracle downloads, can you tell me a bit about what is available? I don't think I need anything since I've found two solutions, but I am putting together a big guide for future people. Are there any newer versions of the firmware for these HCAs? Do they have multiple firmware versions for every ConnectX-2/ConnectX-3 card, or just one like Mellanox's website? Do they have their own flashing tool, or do they use Mellanox's? Thanks!
 

rippiedoos

The firmware 2.11.2012 is listed as the last available firmware for an Oracle Engineered System (Exadata), and 2.11.2010 as the last for 'normal' systems. I see one other device with the CX2 chip and one with the CX3 chip. The CX3 card has a newer firmware, but both CX2 cards have these versions stated as the last.

All 3 devices state in their notes that the non-default version numbers have BAR-space changes; the others seem to be the same as the stock Mellanox firmware.

And in regard to firmware versions, it looks like the version numbers are the same as, or similar enough to be on par with, the stock firmware, with some Oracle/Sun-specific changes added.
 

alltheasimov

Excellent information. Thank you. So we've determined that these firmware versions are the latest. It sounds like they created the 2.11.2012 version specifically for motherboards that allow for BAR-space increases, and that 2.11.2010 should be used for all normal motherboards. If we ignore the special large-BAR-space firmware, then Sun, like Mellanox, does not provide more than one firmware version per card type.

It's also likely that the Sun numbering scheme is the same as Mellanox's, meaning that the firmware Sun provides is much newer than the version Mellanox provides (2.9.1000). Also interesting is that the newer MHQH29C has the same Mellanox firmware listed (2.9.1000). The latest one HP lists for their equivalent of the MHQH29B/C cards is 2.9.1530. The latest Mellanox firmware commonly used here to enable RDMA for Windows on these cards is 2.10.720. So Sun/Oracle's seems to be the most recent for the MHQH29B. Interesting.
 

alltheasimov

Ooo, another interesting question: What do you think the performance impacts of enabling the larger BAR-space are? I assume it must improve performance or they wouldn't do it. Anyone know?

UPDATE: here
 

rippiedoos

Sun isn't the first case where there are later versions, frequently made by the ODM themselves, that are only released to OEMs. I've seen this with IBM, HP and Dell. The latest ConnectX-2 firmware from HPE, for instance, is 2.9.1530 (which also increases the BAR size, from 8MB to 32MB, for that matter).

And with Broadcom/Emulex it's even worse. Changes are still being pushed to the OEMs, while the firmware versions on the Broadcom/Emulex site itself are ancient by comparison.
 

alltheasimov

So here the ODM is Mellanox, and the OEMs are Sun/Oracle, IBM, HP, Dell, etc?

Nice catch with HP increasing the BAR size. I wonder if 32MB would work without turning on above 4G decoding. I bet most motherboards have a default upper limit on BAR size.

So the versions on the Broadcom/Emulex site are ancient but the OEMs have access to newer versions? *sigh*...this is all rather silly
 

gzorn

Member
Jan 10, 2017
76
14
8
Just saw this thread. I have a similar(-ish) problem with a Dell-branded ConnectX-3 (312 dual port). It works in a Supermicro X11SSi-ln4f but not in a Gigabyte G87 UD3H. Machine wouldn't even boot. Crossflashing to the latest Mellanox firmware screwed up the LEDs but alas no improvement in booting on the Supermicro. I wasn't aware of the SMBus pin masking. Maybe I'll give that a shot.