Mellanox/Nvidia ConnectX-7 fw update


Civiloid

Member
Jan 15, 2024
68
47
18
Switzerland
what are you using to test small frame sizes?
Cisco TRex (with some custom patches, but that is a long story) and, for now, just a loopback QSFP28 100G DAC (I should get QSFP112 later).

I have a bunch of Mellanox cards, and on CX6-DX I can easily get about 110 Mpps TX per port and 80 Mpps RX per port. On this one I get 120 Mpps TX per port, but total RX is 60 Mpps combined for both ports, and once it crosses that level one of the ports locks up and stops receiving or transmitting anything. According to the changelog, that was one of the issues fixed in newer firmware (at some point in 2023).
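
For reference, the theoretical ceiling for small frames is easy to work out. A minimal sketch, assuming 64-byte frames (the exact size is not stated in this post) and the standard 20 bytes of per-frame Ethernet overhead (preamble, start-of-frame delimiter and inter-frame gap); the function name `line_rate_mpps` is made up for illustration:

```shell
# Theoretical packet rate in Mpps for a given link speed and frame size.
#   $1 = link speed in Gbit/s
#   $2 = frame size in bytes, including the FCS
line_rate_mpps() {
    awk -v gbps="$1" -v frame="$2" 'BEGIN {
        # 20 extra bytes on the wire: 7 preamble + 1 SFD + 12 inter-frame gap
        wire_bits = (frame + 20) * 8
        printf "%.2f\n", gbps * 1000 / wire_bits
    }'
}

line_rate_mpps 100 64   # 100G port
line_rate_mpps 200 64   # 200G port
```

At 100G this gives roughly 148.81 Mpps per port per direction, so the 60 Mpps combined RX reported above is well under half of a single port's line rate.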

What are your system specs?
2x Xeon 8490H ES E0, 16x16GB RAM, Debian 12, OFED 24.01-0.3.3.1, Cisco TRex from the master git branch with a few patches (basically I have my own fork that bundles DPDK 23.11 plus some patches to make it work on higher-core-count machines with sub-NUMA clustering enabled).

I also have a backup machine: a slightly overclocked Xeon W5-3435X with 8x16GB RAM, the same OS, and the same software.

What does the output of lspci -vv look like for the pci id of the cx7 nic?
Code:
# lspci -vv -s 98:00.0
98:00.0 Ethernet controller: Mellanox Technologies MT2910 Family [ConnectX-7]
    Subsystem: Mellanox Technologies MT2910 Family [ConnectX-7]
    Physical Slot: 1
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 16
    NUMA node: 4
    IOMMU group: 74
    Region 0: Memory at 20bff4000000 (64-bit, prefetchable) [size=32M]
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75W
        DevCtl:    CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 512 bytes, MaxReadReq 4096 bytes
        DevSta:    CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 32GT/s, Width x16, ASPM not supported
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl:    ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 32GT/s, Width x16
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABC, TimeoutDis+ NROPrPrP- LTR-
             10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt- EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS- TPHComp- ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
             AtomicOpsCtl: ReqEn+
        LnkCap2: Supported Link Speeds: 2.5-32GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
        LnkCtl2: Target Link Speed: 32GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
             EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
             Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [48] Vital Product Data
        End
    Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
        Vector table: BAR=0 offset=00002000
        PBA: BAR=0 offset=00003000
    Capabilities: [c0] Vendor Specific Information: Len=18 <?>
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap:    First Error Pointer: 04, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap:    MFVC- ACS-, Next Function: 1
        ARICtl:    MFVC- ACS-, Function Group: 0
    Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV)
        IOVCap:    Migration- 10BitTagReq- Interrupt Message Number: 000
        IOVCtl:    Enable- Migration- Interrupt- MSE- ARIHierarchy+ 10BitTagReq-
        IOVSta:    Migration-
        Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
        VF offset: 2, stride: 1, Device ID: 101e
        Supported Page Size: 000007ff, System Page Size: 00000001
        Region 0: Memory at 000020bffa000000 (64-bit, prefetchable)
        VF Migration: offset: 00000000, BIR: 0
    Capabilities: [1c0 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [230 v1] Access Control Services
        ACSCap:    SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        ACSCtl:    SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
    Capabilities: [320 v1] Lane Margining at the Receiver <?>
    Capabilities: [370 v1] Physical Layer 16.0 GT/s <?>
    Capabilities: [3b0 v1] Extended Capability ID 0x2a
    Capabilities: [420 v1] Data Link Feature <?>
    Kernel driver in use: mlx5_core
    Kernel modules: mlx5_core
 

jpmomo

Active Member
Aug 12, 2018
547
197
43
The good news is that the link capability and, more importantly, the link status look correct: 32GT/s and x16.
Also, the QSFP112 CX7 is a lot easier to use than the OSFP version.
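
The same check can be scripted. A minimal sketch, assuming the `lspci -vv` layout shown earlier in the thread; `check_pcie_link` is a hypothetical helper name, and newer `lspci` builds may append annotations such as `(ok)` or `(downgraded)`, which the patterns below ignore:

```shell
# Reads `lspci -vv` output on stdin and compares what the link can do
# (LnkCap) with what it actually negotiated (LnkSta).
check_pcie_link() {
    awk '
        # Return the first substring of the current line matching re, or "".
        function grab(re) { return match($0, re) ? substr($0, RSTART, RLENGTH) : "" }
        /LnkCap:/ { cap = grab("Speed [0-9.]+GT/s") ", " grab("Width x[0-9]+") }
        /LnkSta:/ { sta = grab("Speed [0-9.]+GT/s") ", " grab("Width x[0-9]+") }
        END {
            if (cap == "" || sta == "") { print "no link info found"; exit 2 }
            if (cap == sta) { print "OK: " sta; exit 0 }
            print "DEGRADED: capable of " cap " but running at " sta
            exit 1
        }'
}
```

On the machine from this thread it would be called as `lspci -vv -s 98:00.0 | check_pcie_link`, run as root so that the capability list is visible.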

You should be using a QSFP56 DAC, as each port will then use its full capacity (200G).

What is your eventual use case for your setup?

Can you also let us know additional info on the changelog that describes the low performance with small pkt sizes?
 

Civiloid

the good news is that the link capacity and more importantly link status look correct: 32GTs and x16.
Yup, that I checked.

You should be using a qsfp56 dac as each port will use the full capacity (200G).
So far I'm way below even 100G port speed on RX. But I will eventually get a few DACs.


What is your eventual use case for your setup?
A dev machine for developing a packet generator, as I'm very unhappy with TRex. So I wanted to have different generations of different cards (I already have CX4, 5 and 6, BlueField2 and Intel 710), and eventually I want to do regression tests.

Can you also let us know additional info on the changelog that describes the low performance with small pkt sizes?
If you go to Nvidia's website, you'll see that all firmware versions below 28.39.1002 were retracted "due to critical issue". And the changelog for 28.39.1002 has A LOT of interesting stuff:
  • Description: Fixed an issue that led to packet drops on lossless fabric due to an Rx buffer overflow.
  • Description: Fixed a HW bug that resulted in transaction loss that when cache replacement transaction occurs in parallel to code transcoding.
  • Description: Fixed a statics issue that caused the i2c access to module to lock and stuck the switch. --- that is probably what I have.

And there were more fixes like that in between:
  • Description: When connecting a ConnectX-7 adapter card to ConnectX-7 adapter card and one side is configured to RM Loopback, and the port is toggled, link flap maybe experienced.
  • Description: Fixed inconsistent TCP performance when sending multiple streams

And so on. These seem rather important to me and might be related to my exact use case.



What is interesting to me is that the initial firmware release for CX7 is 28.33.2028, but what I got is 28.33.0751, which is obviously older than that.
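
The ordering of these versions can be checked mechanically. A minimal sketch, assuming the version string is three dot-separated numeric fields compared left to right (which is how the versions in this post read); `fw_older_than` is a made-up helper name:

```shell
# Succeeds (exit 0) if firmware version $1 is strictly older than $2.
fw_older_than() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            # "+ 0" forces a numeric comparison, so 0751 is treated as 751
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1   # versions are equal
    }'
}

# Versions mentioned above, checked against the 28.39.1002 threshold.
for v in 28.33.0751 28.33.2028 28.39.1002; do
    if fw_older_than "$v" 28.39.1002; then
        echo "$v: older than 28.39.1002"
    else
        echo "$v: 28.39.1002 or newer"
    fi
done
```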



Meanwhile, I've tried to force-flash the card with mstflint, with no luck.
My idea was that a commit from a few years ago removed the ability to flash encrypted firmware onto an unencrypted device, so I restored the reflashing code to its earlier state. However, the result was that I temporarily bricked my card: it was still treating the FW as unencrypted and refused to start, so I flashed the original one back. I needed to do
Code:
mstflint -d ${PCI_ID} -i ${FW} -nofs --ignore_dev_data burn
to make it work again.
 

Civiloid

CleanShot 2024-03-23 at 20.05.55.png

That is what I get when I try to run a TRex test. The behavior is extremely weird: I'm not getting proper performance in a single-port configuration either:
CleanShot 2024-03-23 at 20.08.49.png

Just for comparison, this is CX6 (queue_full events are unfortunately normal at line rate on CX6; it can't do more than ~220 Mpps TX and 160 Mpps RX):
CleanShot 2024-03-23 at 20.10.20.png

CleanShot 2024-03-23 at 20.11.58.png
 

Civiloid

please try with a qsfp56 dac and let us know the results.
I will probably have QSFP56 DACs in a few weeks, unless they're delayed for any reason. But honestly, I don't expect them to behave much better. Considering that DPDK might require certain firmware in the future, unless there is a way to make it work with encrypted firmware, I'll just resell it.

Btw, I saw your posts in the CX6 thread, where you had some success with mtusb-1. By any chance, have you tried the same approach with CX7?
 

Civiloid

Quick update:
I've flashed my card with
Code:
FW Version:            28.33.0800
FW Release Date:       27.3.2022
Part Number:           MCX753106ASHEA_DK_Ax
Description:           NVIDIA ConnectX-7 VPI adapter card; 200Gb/s; dual-port QSFP; single port InfiniBand and second port VPI (InfiniBand or Ethernet); PCIe 5.0 x16; secure boot; no crypto; for Nvidia DGX storage - IPN for QA
It is still not encrypted, with secure boot disabled, but that way, it can at least perform after I've switched both ports to Eth (it requires a full power cycle to apply, not just a reboot).
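
For anyone repeating this: the port type is a firmware configuration parameter. A minimal sketch, assuming the `mstconfig` tool from the mstflint package (with Nvidia's MFT the equivalent tool is `mlxconfig`) and the usual `LINK_TYPE_P1`/`LINK_TYPE_P2` parameters, where 2 selects Ethernet. The helper name and the `DRY_RUN` switch are made up for illustration, and the exact command used for this card is not stated in the post:

```shell
# Switch both ports of the given device to Ethernet.
#   $1 = PCI address of the card
# With DRY_RUN=1 the command is only printed, not executed.
set_ports_eth() {
    dev="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mstconfig -y -d $dev set LINK_TYPE_P1=2 LINK_TYPE_P2=2"
    else
        # LINK_TYPE values: 1 = InfiniBand, 2 = Ethernet
        mstconfig -y -d "$dev" set LINK_TYPE_P1=2 LINK_TYPE_P2=2
    fi
}

# Show what would be run for the card at 98:00.0 without touching it.
DRY_RUN=1 set_ports_eth 98:00.0
```

As noted above, the new setting only took effect on this card after a full power cycle.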

I still wonder how the card would perform with the latest firmware, but unless there is a breakthrough in fixing those cards, I doubt that will happen.
 

klui

༺༻
Feb 3, 2019
907
518
93
The new firmware zip package has two files, bin and cbor. The cbor has a public key address.
Here is the public key:

So I think the private key is held in another chip on the NIC, maybe an EEPROM or some other ROM.
No, the private key will not be on the NIC. A public key is all that's required to validate the authenticity of any signed resource. The private key would be kept at Nvidia in a secure location and used only to sign that resource. In this case the signer is Nvidia's internal certificate authority, meaning it issues the certificates. For a customer to confirm the authenticity of something that was signed, all they need to do is use OpenSSL to validate the integrity of the certificate chain stored in the NIC, then compare the root certificate against what is published on Nvidia's website.
Code:
> openssl x509 -in nvidia-corim-signer-cx7-id-1.pem -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            dd:8a:f0:a3:8b:9a:37:0a:95:40:25:85:1e:ed:7e:87
        Signature Algorithm: ecdsa-with-SHA384
        Issuer: O = NVIDIA, CN = NVIDIA CoRIM signing CX7 ICA
        Validity
            Not Before: Mar 16 15:46:11 2023 GMT
            Not After : Mar 15 15:46:11 2025 GMT
        Subject: O = NVIDIA, CN = NVIDIA CoRIM signer CX7 ID 1
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (384 bit)
                pub:
                    04:c3:de:e6:41:29:16:85:9a:cd:06:40:83:9f:d0:
                    df:03:56:22:d9:c7:6a:d4:df:1c:1a:71:ef:43:43:
                    1b:ef:10:6d:64:76:a0:a2:33:fb:6e:71:bb:96:2c:
                    d2:72:03:9a:04:62:07:3d:9e:68:4c:ff:10:e8:fa:
                    4d:40:9b:6a:12:06:25:30:ae:20:6f:df:58:f3:ce:
                    a9:2f:b2:66:ba:45:2e:12:fb:13:17:7d:5f:87:65:
                    f5:21:a2:b7:e6:60:89
                ASN1 OID: secp384r1
                NIST CURVE: P-384
        X509v3 extensions:
            X509v3 Subject Key Identifier:
                D8:56:9B:4B:BB:1E:E0:EE:B6:A8:E4:8E:2C:59:BD:24:54:9C:86:D8
            X509v3 Authority Key Identifier:
                CA:22:5A:D5:19:7C:39:49:4D:E2:B4:08:32:7D:7F:9D:AF:A4:62:12
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Key Usage: critical
                Digital Signature
            Authority Information Access:
                OCSP - URI:http://ocsp.ndis.nvidia.com
            X509v3 CRL Distribution Points:
                Full Name:
                  URI:http://crl.ndis.nvidia.com/crl-corim/l2-cx7
    Signature Algorithm: ecdsa-with-SHA384
    Signature Value:
        30:65:02:31:00:e6:0e:e9:41:97:95:8d:d4:74:cc:80:49:0a:
        52:aa:65:21:b8:e7:46:52:83:5a:0f:7c:fc:37:02:b9:6b:44:
        29:66:10:55:e0:1d:15:3b:6d:af:04:30:c7:86:fe:ba:6b:02:
        30:20:13:a5:d9:93:b9:05:be:5b:5d:56:3e:f5:cd:c5:6e:07:
        28:8a:36:70:9f:2b:83:40:ab:ab:51:3b:23:89:b1:f9:ed:48:
        2e:78:da:f1:89:7d:29:e4:41:c2:b7:85:3f
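
The point about only needing the public key can be tried locally. A minimal sketch using a throwaway P-384 key pair generated on the spot (not Nvidia's key, which stays with Nvidia as described above) and SHA-384 digests, to match the ecdsa-with-SHA384 algorithm in the certificate above. All file names and the function name `demo_sign_verify` are made up for illustration:

```shell
demo_sign_verify() {
    work=$(mktemp -d) || return 1

    # Throwaway P-384 key pair standing in for the vendor's signing key.
    openssl ecparam -name secp384r1 -genkey -noout -out "$work/private.pem"
    openssl pkey -in "$work/private.pem" -pubout -out "$work/public.pem"

    printf 'pretend firmware image\n' > "$work/fw.bin"

    # Signing requires the private key ...
    openssl dgst -sha384 -sign "$work/private.pem" -out "$work/fw.sig" "$work/fw.bin"

    # ... while verification works with the public key alone.
    openssl dgst -sha384 -verify "$work/public.pem" -signature "$work/fw.sig" "$work/fw.bin"

    # A modified image fails verification against the same signature.
    printf 'tampered firmware image\n' > "$work/fw_tampered.bin"
    if openssl dgst -sha384 -verify "$work/public.pem" -signature "$work/fw.sig" \
            "$work/fw_tampered.bin" >/dev/null 2>&1; then
        echo "tampered image accepted"
    else
        echo "tampered image rejected"
    fi

    rm -rf "$work"
}

demo_sign_verify
```

The verification step prints `Verified OK` for the untouched file. Note that SHA-384 is only the digest here; nothing is encrypted or decrypted in this flow, so having the public key does not help with producing a new valid signature.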
 

empire4th

New Member
Mar 30, 2024
2
0
1
Thanks.

SHA384: encryption needs a seed and a key, but decryption only needs the key.

Hmm, very very hard.