Mellanox ConnectX-3 - Unsupported Cable but replugging many times eventually works? ...and how to update firmware?


sofakng

Member
Apr 27, 2011
36
0
6
I have a Mellanox ConnectX-3 with the following setup:

Debian 12.6 (server)
Mellanox ConnectX-3
QSFP-to-SFP adapter (HP #655874-B21)
SFP+ (FS.COM #SFP-10GSR-85)
1.5M LC UPC OM3 fiber (FS.COM #OM3LCDX)
SFP+ (Ubiquiti UACC-OM-MM-10G-D)
Ubiquiti US-48-750W Switch

When I boot (or reboot) the server, the link will randomly fail to come up (i.e., no link lights on the switch or the card). The Linux kernel log shows the following message:

Code:
[39797.228374] mlx4_core 0000:06:00.0: Unsupported cable detected
If I unplug/replug the cable (or transceiver) enough times it will eventually work.

Does anybody have any ideas on how to troubleshoot this? I have a desktop machine with the same model card and QSFP adapter, and it showed similar problems, but it uses a much longer fiber. However, that system is no longer connected, so I can't verify whether it's still happening.
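A first troubleshooting step could be to count how often the driver logs the message across boots, and to dump the module EEPROM as the driver sees it. This is only a sketch: the helper name count_unsupported_cable and the interface name enp6s0 are assumptions, and whether ethtool can read the SFP+ module through the QSFP-to-SFP+ adapter has not been confirmed here.

```shell
#!/bin/sh
# Count "Unsupported cable detected" messages from mlx4_core in a kernel
# log read from stdin. grep -c prints 0 (and exits non-zero) when nothing
# matches, so "|| true" keeps the function's exit status clean.
count_unsupported_cable() {
    grep -c 'mlx4_core.*Unsupported cable detected' || true
}

# Usage on the server (enp6s0 is a hypothetical interface name):
#   dmesg | count_unsupported_cable
#   ethtool -m enp6s0    # dump the transceiver EEPROM, if the driver exposes it
```

Comparing the count after a good boot and a bad boot shows whether the message lines up with the missing link or appears either way.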

Can I try to update the firmware on the card?

Here is the mstflint output:
Code:
Image type:            FS2
FW Version:            2.42.5000
FW Release Date:       5.9.2017
Product Version:       02.42.50.00
Rom Info:              type=PXE version=3.4.752
Device ID:             4099
Description:           Node             Port1            Port2            Sys image
GUIDs:                 redacted         redacted         redacted         redacted
MACs:                                   redacted     redacted
VSD:                   
PSID:                  MT_1090120019
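
For the firmware question, the two fields that matter in the output above are the FW Version and the PSID, because a firmware image has to match the card's PSID. A minimal sketch for pulling them out of the query output follows; the helper name fw_field and the image file name are hypothetical, and 2.42.5000 may already be the newest ConnectX-3 release, so it is worth comparing against the vendor download page before flashing anything.

```shell
#!/bin/sh
# Print the value of one field (e.g. "PSID" or "FW Version") from
# `mstflint ... query` output read on stdin. Fields look like
# "Name:   value", so split on the colon plus any following spaces.
fw_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }'
}

# Usage (as root; 06:00.0 is the PCI address from the kernel log):
#   mstflint -d 06:00.0 query | fw_field PSID
#   mstflint -d 06:00.0 query | fw_field "FW Version"
# Burning an image that matches the PSID (file name is hypothetical):
#   mstflint -d 06:00.0 -i fw-ConnectX3-MT_1090120019.bin burn
```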
 

Markess

Well-Known Member
May 19, 2018
1,210
833
113
Northern California

If it helps your troubleshooting, my backup NAS running TrueNAS Scale (also Debian 12) has a ConnectX-3 card and the same HP-branded Mellanox QSFP-to-SFP+ adapter as yours. No issues after a few months of running time. Setup details:

- Mellanox CX354A-FCBT with (HP #655874-B21) QSFP-SFP+ Adapter installed
- Cheap as dirt AOI A7EL-SN85-ADMA transceivers on both ends
- Generic 5 meter Amazon Basics OM3 cable.
- Low end XikeStor SKS8300-8X SFP+ switch
- Totally vanilla default setup for the 10G connection

Here's a thought... I believe your PSID of MT_1090120019 means your card's firmware is set up for 56Gb InfiniBand and 40/56Gb Ethernet. From what I understand, it defaults to InfiniBand when connected and IN THEORY will automatically switch to Ethernet if it detects that the connection is Ethernet, and will then autonegotiate the speed down to 10G if that's what it senses.

But, in practice, I've heard that doesn't always happen and people sometimes have to unplug and replug cables till the correct protocol and speed are identified. Maybe that's what's happening to you?

You may want to try changing your port type to default to Ethernet and see if that helps: ESPCommunity
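
As a rough sketch of that change, assuming the mstconfig tool from the mstflint package and the card at PCI address 06:00.0: the LINK_TYPE_P1 and LINK_TYPE_P2 parameters take 2 for Ethernet. The helper name eth_link_type_cmd is hypothetical; it only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Build the mstconfig command that pins both ports to Ethernet (ETH = 2).
# $1 is the PCI address of the card, e.g. 06:00.0.
eth_link_type_cmd() {
    printf 'mstconfig -d %s set LINK_TYPE_P1=2 LINK_TYPE_P2=2\n' "$1"
}

# Usage: print the command, check it, then run it as root.
#   eth_link_type_cmd 06:00.0
#   eval "$(eth_link_type_cmd 06:00.0)"
# The new setting is a "next boot" value, so reboot afterwards.
```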
 

sofakng

Thanks for the replies. I did not try anything besides FS.COM transceivers so I'll purchase a different one and try it.

Is there a recommended transceiver I should try? It looks like 10gtek has one on Amazon... (I can't find the AOI one that you mentioned)

Also, according to mstconfig I think I'm already running in Ethernet mode but maybe it's not the default?

Code:
/home/sofakng # mstconfig -d 06:00.0 query

Device #1:
----------

Device type:    ConnectX3       
Device:         06:00.0         

Configurations:                              Next Boot
         SRIOV_EN                            False(0)       
         NUM_OF_VFS                          8               
         LINK_TYPE_P1                        ETH(2)         
         LINK_TYPE_P2                        ETH(2)         
         LOG_BAR_SIZE                        3               
         BOOT_PKEY_P1                        0               
         BOOT_PKEY_P2                        0               
         BOOT_OPTION_ROM_EN_P1               True(1)         
         BOOT_VLAN_EN_P1                     False(0)       
         BOOT_RETRY_CNT_P1                   0               
         LEGACY_BOOT_PROTOCOL_P1             PXE(1)         
         BOOT_VLAN_P1                        1               
         BOOT_OPTION_ROM_EN_P2               True(1)         
         BOOT_VLAN_EN_P2                     False(0)       
         BOOT_RETRY_CNT_P2                   0               
         LEGACY_BOOT_PROTOCOL_P2             PXE(1)         
         BOOT_VLAN_P2                        1               
         IP_VER_P1                           IPv4(0)         
         IP_VER_P2                           IPv4(0)         
         CQ_TIMESTAMP                        True(1)
 

Markess

Also, according to mstconfig I think I'm already running in Ethernet mode but maybe it's not the default?
Maybe wait till you boot/reboot and get a failure to connect, and then check what connection type it's showing at that point? If it's showing IB instead of Eth, then it's probably failing to auto-identify the connection properly.
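
One way to check this at the moment of failure is to read the port type the driver reports at runtime, rather than the "Next Boot" value from mstconfig. The sketch below assumes the mlx4 driver exposes mlx4_port1 and mlx4_port2 attributes under the card's PCI device directory in sysfs; the helper name mlx4_port_type is hypothetical.

```shell
#!/bin/sh
# Print the port type the mlx4 driver currently reports for one port.
# $1: PCI device directory, $2: port number (1 or 2).
mlx4_port_type() {
    cat "$1/mlx4_port$2"
}

# Usage right after a failed boot (PCI address taken from the kernel log):
#   mlx4_port_type /sys/bus/pci/devices/0000:06:00.0 1
#   mlx4_port_type /sys/bus/pci/devices/0000:06:00.0 2
```

If the value is not an Ethernet type while the link is down, that would point at protocol detection rather than the transceiver.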

As for a transceiver that works, I'm not sure. I've been lucky with the AOI ones and have never had them fail to work on anything, so they tend to be all I use. I recently bought another set of 20 on eBay for about $1 each. They are that cheap. But again, I'm not using any advanced 10Gb networking features and nothing that tends to be "proprietary picky", so YMMV.

I am using a couple of SFP/SFP+ to copper converters from 10GTek though, and haven't had any problems with them.
 

blunden

Well-Known Member
Nov 29, 2019
971
312
63
I've been lucky with the AOI ones and have never had them fail to work on anything, so they tend to be all I use. I recently bought another set of 20 on eBay for about $1 each. They are that cheap. But again, I'm not using any advanced 10Gb networking features and nothing that tends to be "proprietary picky", so YMMV.
I wonder why they are so cheap. What vendor coding do they have?

Regardless, they don't seem to be available for anywhere close to that price in Europe, unfortunately.