KAIAM XQX2502 + ConnectX-3 => NO CARRIER


human_capitalist

New Member
Dec 22, 2020
Hi,

I have a Debian 11 workstation and a FreeBSD 13 server that have been connected with ConnectX-3 cards and a DAC. In order to move the FreeBSD server far enough away that I can't hear it any more, I picked up some KAIAM XQX2502 40GBASE-LR4L modules from eBay and a 3m "9/125 Single Mode Fiber Patch Cable LC/UPC-LC/UPC Duplex" from FS.com.

My first attempt with the 3m cable gave only "no carrier" at both ends. Fearing it might be dust from the modules having been stored badly (and that the dust might have damaged the cable...), I flushed the ports with "Sticklers(tm) Fibre Optic Splice & Connector Cleaner", gently cleaned them with FS.com foam swabs wetted with the same fluid, and then tried again with a brand new 1m cable. No luck.

As per the listings below, it looks to me like both machines are recognizing the modules OK. Is this correct? Does anyone have an idea what might be causing the "no carrier"? I have carefully checked that the "T" port of each module is connected to the "R" port of the other module, and both cables were brand new from FS.com and handled carefully.

Code:
Linux machine, ip a
====================

8: ens4d1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq state DOWN group default qlen 1000
    link/ether 24:be:05:89:6c:31 brd ff:ff:ff:ff:ff:ff
    altname enp1s0d1
    inet6 fe80::26be:5ff:fe89:6c31/64 scope link
       valid_lft forever preferred_lft forever



   
   
Linux machine, mlxfwmanager
===========================

root@workstation:/dev/mst# mlxfwmanager --query
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX3
  Part Number:      649281-B21_Bx
  Description:      HP IB 4X FDR CX-3 PCI-e G3 Dual Port HCA
  PSID:             HP_0280210019
  PCI Device Name:  /dev/mst/mt4099_pci_cr0
  Port1 MAC:        24be05896c30
  Port2 MAC:        24be05896c31
  Versions:         Current        Available  
     FW             2.42.5000      N/A        
     CLP            8025           N/A        
     PXE            3.4.0752       N/A        

  Status:           No matching image found



Linux machine, ethtool
======================

root@workstation:/dev/mst# ethtool -m ens4d1
        Identifier                                : 0x0d (QSFP+)
        Extended identifier                       : 0x80
        Extended identifier description           : 2.5W max. Power consumption
        Extended identifier description           : No CDR in TX, No CDR in RX
        Extended identifier description           : High Power Class (> 3.5 W) not enabled
        Connector                                 : 0x07 (LC)
        Transceiver codes                         : 0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        Transceiver type                          : 40G Ethernet: 40G Base-LR4
        Encoding                                  : 0x00 (unspecified)
        BR, Nominal                               : 10300Mbps
        Rate identifier                           : 0x00
        Length (SMF,km)                           : 2km
        Length (OM3 50um)                         : 0m
        Length (OM2 50um)                         : 0m
        Length (OM1 62.5um)                       : 0m
        Length (Copper or Active cable)           : 0m
        Transmitter technology                    : 0x40 (1310 nm DFB)
        Laser wavelength                          : 1310.000nm
        Laser wavelength tolerance                : 6.500nm
        Vendor name                               : KAIAM CORP
        Vendor OUI                                : 14:ed:e4
        Vendor PN                                 : XQX2502
        Vendor rev                                : 1A
        Vendor SN                                 : KD60923294
        Date code                                 : 16092300
        Revision Compliance                       : SFF-8636 Rev 1.5
        Module temperature                        : 50.83 degrees C / 123.50 degrees F
        Module voltage                            : 3.3107 V
        Alarm/warning flags implemented           : No
        Laser tx bias current (Channel 1)         : 28.158 mA
        Laser tx bias current (Channel 2)         : 27.170 mA
        Laser tx bias current (Channel 3)         : 28.404 mA
        Laser tx bias current (Channel 4)         : 28.158 mA
        Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
        Rcvr signal avg optical power(Channel 1)  : 1.1000 mW / 0.41 dBm
        Rcvr signal avg optical power(Channel 2)  : 1.0066 mW / 0.03 dBm
        Rcvr signal avg optical power(Channel 3)  : 0.8614 mW / -0.65 dBm
        Rcvr signal avg optical power(Channel 4)  : 0.8625 mW / -0.64 dBm

     

     
     
FreeBSD machine, ifconfig
=========================
   
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=ed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:02:c9:3c:76:60
        inet 10.2.2.1 netmask 0xffff0000 broadcast 10.2.255.255
        inet 10.2.2.2 netmask 0xffff0000 broadcast 10.2.255.255
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


     
FreeBSD machine, mlxfwmanager
=============================
     
root@photoserver:~ # mlxfwmanager --query
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX3
  Part Number:      MCX353A-FCB_A2-A5
  Description:      ConnectX-3 VPI adapter card; single-port QSFP; FDR IB (56Gb/s) and 40GigE; PCIe3.0 x8 8GT/s; RoHS R6
  PSID:             MT_1100120019
  PCI Device Name:  pci0:17:0:0
  Port1 MAC:        0002c93c7660
  Port2 MAC:        0002c93c7661
  Versions:         Current        Available  
     FW             2.42.5000      N/A        
     PXE            3.4.0752       N/A        

  Status:           No matching image found


FreeBSD machine, ifconfig
=========================
     
root@photoserver:~ # ifconfig -v mlxen0
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=ed07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:02:c9:3c:76:60
        inet 10.2.2.1 netmask 0xffff0000 broadcast 10.2.255.255
        inet 10.2.2.2 netmask 0xffff0000 broadcast 10.2.255.255
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        plugged: QSFP+ 40GBASE-LR4 (LC)
        vendor: KAIAM CORP PN: XQX2502 SN: KD61021025 DATE: 2016-10-21
        module temperature: 37.62 C voltage: 3.28 Volts
        lane 1: RX power: 0.00 mW (-inf dBm) TX bias: 50.19 mA
        lane 2: RX power: 0.00 mW (-inf dBm) TX bias: 46.29 mA
        lane 3: RX power: 0.00 mW (-inf dBm) TX bias: 45.67 mA
        lane 4: RX power: 0.00 mW (-inf dBm) TX bias: 39.66 mA
 

Wolfcastle

Member
Jan 3, 2022
Looks like there's a problem with your Tx side: there's laser bias current, but the optical monitors aren't seeing any power.
 

human_capitalist

New Member
Dec 22, 2020
Wolfcastle said: Looks like there's a problem with your Tx side: there's laser bias current, but the optical monitors aren't seeing any power.
It's weird. Those figures are from the Linux machine. I tried powering off the FreeBSD machine and checking them again:

Code:
        Vendor name                               : KAIAM CORP
        Vendor OUI                                : 14:ed:e4
        Vendor PN                                 : XQX2502
        Vendor rev                                : 1A
        Vendor SN                                 : KD60923294
        Date code                                 : 16092300
        Revision Compliance                       : SFF-8636 Rev 1.5
        Module temperature                        : 35.11 degrees C / 95.19 degrees F
        Module voltage                            : 3.3119 V
        Alarm/warning flags implemented           : No
        Laser tx bias current (Channel 1)         : 28.652 mA
        Laser tx bias current (Channel 2)         : 25.934 mA
        Laser tx bias current (Channel 3)         : 28.652 mA
        Laser tx bias current (Channel 4)         : 28.898 mA
        Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
        Rcvr signal avg optical power(Channel 1)  : 0.0101 mW / -19.96 dBm
        Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
        Rcvr signal avg optical power(Channel 3)  : 0.0022 mW / -26.58 dBm
        Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm
...and you can see that the Rcvr signal has dropped to basically read noise, which means that the signal seen previously was coming from the FreeBSD machine, right?
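
To put a number on that drop, here is a minimal sketch that converts the channel 1 Rcvr readings from the two dumps above into dBm and takes the difference. The function name `mw_to_dbm` is made up for illustration; the conversion itself is just the standard 10 * log10(power in mW).

```python
import math

def mw_to_dbm(mw):
    """Convert optical power in milliwatts to dBm; 0 mW maps to -inf."""
    if mw <= 0:
        return float("-inf")
    return 10 * math.log10(mw)

# Channel 1 Rcvr readings copied from the two ethtool dumps in this thread.
peer_on_mw = 1.1000    # FreeBSD machine powered on
peer_off_mw = 0.0101   # FreeBSD machine powered off

# Difference between the two readings, in dB.
drop_db = mw_to_dbm(peer_on_mw) - mw_to_dbm(peer_off_mw)
print(f"{mw_to_dbm(peer_on_mw):.2f} dBm -> {mw_to_dbm(peer_off_mw):.2f} dBm, "
      f"drop of {drop_db:.1f} dB")
```

A drop of roughly 20 dB on that channel when the far end is switched off is consistent with the light having come from the other module rather than from the local one.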

So I switched the cable around, so that the module with serial number KD60923294 was in the FreeBSD machine this time:

Code:
        Vendor name                               : KAIAM CORP
        Vendor OUI                                : 14:ed:e4
        Vendor PN                                 : XQX2502
        Vendor rev                                : 1A
        Vendor SN                                 : KD61021025
        Date code                                 : 16102100
        Revision Compliance                       : SFF-8636 Rev 1.5
        Module temperature                        : 42.15 degrees C / 107.87 degrees F
        Module voltage                            : 3.3021 V
        Alarm/warning flags implemented           : No
        Laser tx bias current (Channel 1)         : 26.182 mA
        Laser tx bias current (Channel 2)         : 23.218 mA
        Laser tx bias current (Channel 3)         : 22.970 mA
        Laser tx bias current (Channel 4)         : 22.230 mA
        Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
        Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
        Rcvr signal avg optical power(Channel 1)  : 0.9413 mW / -0.26 dBm
        Rcvr signal avg optical power(Channel 2)  : 1.3412 mW / 1.27 dBm
        Rcvr signal avg optical power(Channel 3)  : 1.3135 mW / 1.18 dBm
        Rcvr signal avg optical power(Channel 4)  : 1.5677 mW / 1.95 dBm
SN KD61021025, which had previously been in the FreeBSD machine, is now in the Linux machine. We know it was sending a signal before, but here we can see its transmit optical power reads zero... and to add to the confusion, we can see that KD60923294, which previously showed a Tx power of zero, now seems to be sending quite a strong signal back to KD61021025.

I mean, it doesn't seem like the receive numbers are internally generated, since they drop to basically noise when the other end is powered down.
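
For anyone comparing dumps like these by hand, here is a rough sketch that pulls the per-channel bias, Tx and Rx figures out of pasted `ethtool -m` text and summarises each lane. The names `parse_dom`, `classify` and the `rx_floor_mw` threshold are hypothetical, and the regex only assumes the line layout shown in the listings in this thread. Since both modules here report 0 mW Tx even when the far end clearly receives light, the sketch reports the Tx readout but treats the far-end Rx figure as the more useful evidence.

```python
import re

# Matches per-channel lines in the layout pasted in this thread, e.g.
#   "Laser tx bias current (Channel 1)         : 28.158 mA"
#   "Rcvr signal avg optical power(Channel 1)  : 1.1000 mW / 0.41 dBm"
LINE = re.compile(
    r"^\s*(?P<name>Laser tx bias current|Transmit avg optical power|"
    r"Rcvr signal avg optical power)\s*\(Channel (?P<ch>\d)\)\s*:\s*"
    r"(?P<value>[0-9.]+)\s*(?P<unit>mA|mW)"
)

KEYS = {
    "Laser tx bias current": "bias_ma",
    "Transmit avg optical power": "tx_mw",
    "Rcvr signal avg optical power": "rx_mw",
}

def parse_dom(text):
    """Return {channel: {"bias_ma": ..., "tx_mw": ..., "rx_mw": ...}}."""
    lanes = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            lane = lanes.setdefault(int(m["ch"]), {})
            lane[KEYS[m["name"]]] = float(m["value"])
    return lanes

def classify(lane, rx_floor_mw=0.05):
    """Summarise one lane; rx_floor_mw is an arbitrary illustrative cutoff."""
    return {
        "laser_biased": lane.get("bias_ma", 0.0) > 0.0,
        "tx_power_reported": lane.get("tx_mw", 0.0) > 0.0,
        "rx_light_seen": lane.get("rx_mw", 0.0) > rx_floor_mw,
    }

# Channel 1 lines copied from the first Linux dump in this thread.
SAMPLE = """
        Laser tx bias current (Channel 1)         : 28.158 mA
        Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
        Rcvr signal avg optical power(Channel 1)  : 1.1000 mW / 0.41 dBm
"""

lanes = parse_dom(SAMPLE)
for ch, lane in sorted(lanes.items()):
    print(ch, classify(lane))
```

Running the same functions over the dump from each end makes it easier to line up one side's lanes against what the other side's Rcvr figures say is actually arriving.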

And it doesn't seem to be the card or its compatibility with Linux, since the link works fine with DAC or with SR4 modules...

I don't know what to make of it!
 

human_capitalist

New Member
Dec 22, 2020

klui

Well-Known Member
Feb 3, 2019
Thanks for posting your resolution. I never realized non-Mellanox FW has this restriction. My HP cards are not installed, only the Mellanox ones.
 