Help getting 56Gb on ConnectX-3 Pros


mellanvidia (New Member, joined Jun 14, 2023)

Hello,

Long time, first time...

I bought one of these 100G QSFP28 DAC cables (100GBASE-CR4, QSFP28 to QSFP28) to connect two Mellanox ConnectX-3 Pro EN cards (MCX314A-BCCT), hoping to get 56 Gbps out of them, but the link only negotiates at 10 Gbps.

According to the "Testing Results: Mellanox ConnectX-3 and achievable link speeds" thread, the MCP1600-C003 should negotiate at 56 Gbps in Ethernet mode. I also read somewhere on here that QSFP28 DACs are backward compatible with QSFP14 (FDR) DACs, so I figured the cable would just be plug-and-play.

Do I need to configure anything on the cards to get 56 Gbps, or is there anything I can try before returning this cable?

Thank you.

Code:
PS C:\Program Files\Mellanox\WinMFT> get-netadapter

Name                      InterfaceDescription                    ifIndex Status       MacAddress             LinkSpeed
----                      --------------------                    ------- ------       ----------             ---------
SMB-1                     Mellanox ConnectX-3 Pro Ethernet Ada...      14 Up           E4-1D-2D-12-8B-30        10 Gbps
SMB-2                     Mellanox ConnectX-3 Pro Ethernet A...#2       8 Up           E4-1D-2D-12-8B-31        40 Gbps
Code:
PS C:\Program Files\Mellanox\WinMFT> mlxfwmanager --query
Querying Mellanox devices firmware ...

Device #2:
----------

  Device Type:      ConnectX3Pro
  Part Number:      MCX314A-BCC_Ax
  Description:      ConnectX-3 Pro EN network interface card; 40GigE; dual-port QSFP; PCIe3.0 x8 8GT/s; RoHS R6
  PSID:             MT_1090111023
  PCI Device Name:  mt4103_pci_cr1
  Port1 MAC:        e41d2d128b30
  Port2 MAC:        e41d2d128b31
  Versions:         Current        Available
     FW             2.42.5000      N/A
     PXE            3.4.0752       N/A

  Status:           No matching image found
 

i386 (Well-Known Member, Germany, joined Mar 18, 2016)

56GbE should be plug and play; 10GbE is the fallback speed that every host/network device involved in that setup supports.

Is there a switch involved?
Do you have a known working cable (to rule out any cable-related problems)?
 

Renat (Member, joined Jun 8, 2016)

Only a Mellanox switch can set up 56Gb for you: an SX6xxx/SX1xxx or the next-generation SN2xxx.

You can't connect two cards directly and get 56Gb, only 40Gb.

But this record is strange: "E4-1D-2D-12-8B-30 10 Gbps".

Why does one port show a different speed than the other? Or is that your settings?
 

klui (Well-Known Member, joined Feb 3, 2019)

That's not true. You can change the speed using ethtool.

Code:
6: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether f4:52:14:15:4c:11 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.10/32 scope global enp5s0
       valid_lft forever preferred_lft forever
    inet6 fe80::f652:14ff:fe15:4c11/64 scope link
       valid_lft forever preferred_lft forever
7: enp5s0d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether f4:52:14:15:4c:12 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.20/32 scope global enp5s0d1
       valid_lft forever preferred_lft forever
    inet6 fe80::f652:14ff:fe15:4c12/64 scope link
       valid_lft forever preferred_lft forever
root@ubuntu:~# ethtool enp5s0
Settings for enp5s0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKX4/Full
                                10000baseKR/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                56000baseCR4/Full
                                56000baseSR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  56000baseCR4/Full
                                56000baseSR4/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 56000Mb/s
        Duplex: Full
        Port: None
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
        Link detected: yes
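For reference, the commands that force that 56G link are just ethtool speed/autoneg settings. A minimal sketch, assuming the interface names from my output above and Linux on both ends (run the equivalent on the other host too):
Code:
# force the Mellanox 56G rate; autoneg is turned off, matching the output above
ethtool -s enp5s0 speed 56000 autoneg off
ethtool -s enp5s0d1 speed 56000 autoneg off
# verify what the port actually linked at
ethtool enp5s0 | grep Speed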
 

Stephan (Well-Known Member, Germany, joined Apr 21, 2017)

I recently tested dual-port CX3 non-Pro cards looped back to themselves with a short FDR (56 Gbps capable) cable, using Linux network namespaces, no switch, and nothing but ethtool with the "speed 56000 autoneg off" setting. With the host talking to itself I got ~41 Gbps; between two hosts, ~45-48 Gbps. The link saturated at around four sender threads. So the reported link speed is no lie, it really does sync at 56 Gbps; otherwise it wouldn't be able to go above 40. But I could not reach 56, no matter the tool, cores used, sysctl or TCP tweaks, etc. Tried ntttcp and nuttcp on Arch.
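For anyone who wants to reproduce the loopback test, here is a rough sketch of the namespace setup. Interface names and addresses are placeholders, and I'm showing iperf3 for brevity even though I actually used ntttcp and nuttcp:
Code:
# both ports of the same dual-port card are cabled to each other with the FDR DAC
ip netns add left
ip netns add right
ip link set enp5s0 netns left
ip link set enp5s0d1 netns right
ip netns exec left  ethtool -s enp5s0 speed 56000 autoneg off
ip netns exec right ethtool -s enp5s0d1 speed 56000 autoneg off
ip netns exec left  ip addr add 10.0.0.1/24 dev enp5s0
ip netns exec right ip addr add 10.0.0.2/24 dev enp5s0d1
ip netns exec left  ip link set enp5s0 up
ip netns exec right ip link set enp5s0d1 up
# throughput topped out at roughly four parallel sender streams
ip netns exec right iperf3 -s -D
ip netns exec left  iperf3 -c 10.0.0.2 -P 4 -t 30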
 

mellanvidia (New Member, joined Jun 14, 2023)

No switch, just two servers with CX3 Pros directly connected to each other. The 40 Gbps port is connected with this 40G QSFP+ DAC and is working as expected.

Does anyone know of a Windows equivalent of 'ethtool'? From what I've read, the PowerShell cmdlet 'Set-NetAdapterAdvancedProperty' could be used. If I'm not mistaken, that maps to the Advanced tab of the network adapter in Device Manager, which would mean 56 Gb would need to show up as a selectable value under a Speed & Duplex setting... no?

Here's the list of Advanced-tab options, and I'm not seeing any Speed or Duplex related settings.
Code:
Name  DisplayName                       DisplayValue            RegistryKeyword                     RegistryValue
----  -----------                       ------------            ---------------                     -------------
SMB-1 Encapsulation Overhead            0                       *EncapOverhead                      {0}
SMB-1 Encapsulated Task Offload         Enabled                 *EncapsulatedPacketTaskOffload      {1}
SMB-1 NVGRE Encapsulated Task Offload   Enabled                 *EncapsulatedPacketTaskOffloadNvgre {1}
SMB-1 VXLAN Encapsulated Task Offload   Enabled                 *EncapsulatedPacketTaskOffloadVxlan {1}
SMB-1 Flow Control                      Disabled                *FlowControl                        {0}
SMB-1 Interrupt Moderation              Enabled                 *InterruptModeration                {1}
SMB-1 IPV4 Checksum Offload             Rx & Tx Enabled         *IPChecksumOffloadIPv4              {3}
SMB-1 Jumbo Packet                      9014                    *JumboPacket                        {9014}
SMB-1 Large Send Offload V2 (IPv4)      Enabled                 *LsoV2IPv4                          {1}
SMB-1 Large Send Offload V2 (IPv6)      Enabled                 *LsoV2IPv6                          {1}
SMB-1 Maximum number of RSS Processors  8                       *MaxRssProcessors                   {8}
SMB-1 NetworkDirect Functionality       Enabled                 *NetworkDirect                      {1}
SMB-1 Preferred NUMA node               Node 0                  *NumaNodeId                         {0}
SMB-1 Maximum Number of RSS Queues      8                       *NumRSSQueues                       {8}
SMB-1 PacketDirect Functionality        Enabled                 *PacketDirect                       {1}
SMB-1 Priority & Vlan Tag               Priority & VLAN Enabled *PriorityVLANTag                    {3}
SMB-1 Quality Of Service                Enabled                 *QOS                                {1}
SMB-1 Receive Buffers                   4096                    *ReceiveBuffers                     {4096}
SMB-1 Recv Segment Coalescing (IPv4)    Disabled                *RscIPv4                            {0}
SMB-1 Recv Segment Coalescing (IPv6)    Disabled                *RscIPv6                            {0}
SMB-1 Receive Side Scaling              Enabled                 *RSS                                {1}
SMB-1 RSS Base Processor Number         2                       *RssBaseProcNumber                  {2}
SMB-1 Virtual Switch RSS                Enabled                 *RssOnHostVPorts                    {1}
SMB-1 RSS load balancing Profile        ClosestProcessorStatic  *RSSProfile                         {2}
SMB-1 SR-IOV                            Enabled                 *Sriov                              {1}
SMB-1 TCP/UDP Checksum Offload (IPv4)   Rx & Tx Enabled         *TCPUDPChecksumOffloadIPv4          {3}
SMB-1 TCP/UDP Checksum Offload (IPv6)   Rx & Tx Enabled         *TCPUDPChecksumOffloadIPv6          {3}
SMB-1 Send Buffers                      2048                    *TransmitBuffers                    {2048}
SMB-1 Virtual Machine Queues            Enabled                 *VMQ                                {1}
SMB-1 VMQ VLAN Filtering                Enabled                 *VMQVlanFiltering                   {1}
SMB-1 VXLAN UDP destination port number 4789                    *VxlanUDPPortNumber                 {4789}
SMB-1 Ignore FCS errors                 Disabled                IgnoreFCS                           {0}
SMB-1 Locally Administered Address      --                      NetworkAddress                      {--}
SMB-1 Transmit Control Blocks           16                      NumTcb                              {16}
SMB-1 Receive Completion Method         Adaptive                RecvCompletionMethod                {1}
SMB-1 R/RoCE Max Frame Size             2048                    RoceMaxFrameSize                    {2048}
SMB-1 Rx Buffer Alignment               0                       RxBufferAlignment                   {0}
SMB-1 Rx Interrupt Moderation Type      Adaptive                RxIntModeration                     {2}
SMB-1 Rx Interrupt Moderation Profile   Moderate                RxIntModerationProfile              {1}
SMB-1 Number of Polls on Receive        10000                   ThreadPoll                          {10000}
SMB-1 Tx Throughput Port Arbiter        Best Effort (Default)   TxBwPrecedence                      {0}
SMB-1 Tx Interrupt Moderation Profile   Moderate                TxIntModerationProfile              {1}
SMB-1 VLAN ID                           245                     VlanID                              {245}
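For completeness, this is the Get/Set pattern I was referring to. The second command is purely hypothetical, since (as the table shows) the Mellanox driver doesn't expose any speed/duplex keyword to set:
Code:
PS C:\> Get-NetAdapterAdvancedProperty -Name "SMB-1" | Where-Object DisplayName -like "*Speed*"
# returns nothing on these cards
# If a "Speed & Duplex" property existed, it would be set like this (hypothetical values):
PS C:\> Set-NetAdapterAdvancedProperty -Name "SMB-1" -DisplayName "Speed & Duplex" -DisplayValue "56 Gbps Full Duplex"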
 

mellanvidia (New Member, joined Jun 14, 2023)

Unfortunately, mlxlink doesn't seem to be compatible with ConnectX-3: all I get back is "-E- Device is not supported", even after trying different device names. Rebooting the servers eventually brought the links up at 40 Gbps, just not 56. To validate, I ran ntttcp and iperf, and they still max out around 40.
Code:
PS C:\Program Files\Mellanox\WinMFT> mst status -v
MST devices:
------------
  mt4103_pci_cr1         bus:dev.fn=3e:00.0
  mt4103_pciconf1        bus:dev.fn=3e:00.0
PS C:\Program Files\Mellanox\WinMFT> mlxlink -d "3e:00.0"

-E- Device is not supported

PS C:\Program Files\Mellanox\WinMFT> mlxlink -d "3e:00.1"

-E- Failed to open device: "3e:00.1", No such device

PS C:\Program Files\Mellanox\WinMFT> mlxlink -d "04:00.0"

-E- Device is not supported

PS C:\Program Files\Mellanox\WinMFT> mlxlink -d "mt4103_pci_cr1"

-E- Device is not supported

PS C:\Program Files\Mellanox\WinMFT> mlxlink -d "mt4103_pciconf1"

-E- Device is not supported
ntttcp
Code:
PS C:\Program Files\Mellanox\WinMFT> D:\Apps\ntttcp.exe -s -m 8,*,10.1.246.2 -l 1048576 -n 100000 -w -a 16 -t 20
Copyright Version 5.39
Network activity progressing...


Thread  Time(s) Throughput(KB/s) Avg B / Compl
======  ======= ================ =============
     0   19.998       417321.732   1048576.000
     1   19.998       369086.509   1048576.000
     2   19.998       768076.808   1048576.000
     3   19.998       784001.600   1048576.000
     4   19.998       416963.296   1048576.000
     5   19.998       851080.308   1048576.000
     6   19.998       863881.588   1048576.000
     7   19.998       367140.714   1048576.000


#####  Totals:  #####


   Bytes(MEG)    realtime(s) Avg Frame Size Throughput(MB/s)
================ =========== ============== ================
    94474.000000      20.000       8953.293         4723.735


Throughput(Buffers/s) Cycles/Byte       Buffers
===================== =========== =============
             4723.735       0.678     94474.000


DPCs(count/s) Pkts(num/DPC)   Intr(count/s) Pkts(num/intr)
============= ============= =============== ==============
   272852.477         0.955      316849.804          0.822


Packets Sent Packets Received Retransmits Errors Avg. CPU %
============ ================ =========== ====== ==========
    11064440          5211263         245      0      3.783
iperf
Code:
PS C:\Program Files\Mellanox\WinMFT> D:\Apps\iperf-3.1.3-win64\iperf3.exe -s -B 10.1.246.1
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.1.246.2, port 55960
[  5] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55961
[  7] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55962
[  9] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55963
[ 11] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55964
[ 13] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55965
[ 15] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55966
[ 17] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55967
[ 19] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55968
[ 21] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55969
[ 23] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55970
[ 25] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55971
[ 27] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55972
[ 29] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55973
[ 31] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55974
[ 33] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55975
[ 35] local 10.1.246.1 port 5201 connected to 10.1.246.2 port 55976
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   276 MBytes  2.32 Gbits/sec
[  7]   0.00-1.00   sec   276 MBytes  2.32 Gbits/sec
[  9]   0.00-1.00   sec   269 MBytes  2.25 Gbits/sec
[ 11]   0.00-1.00   sec   276 MBytes  2.31 Gbits/sec
[ 13]   0.00-1.00   sec   275 MBytes  2.31 Gbits/sec
[ 15]   0.00-1.00   sec   275 MBytes  2.31 Gbits/sec
[ 17]   0.00-1.00   sec   275 MBytes  2.31 Gbits/sec
[ 19]   0.00-1.00   sec   275 MBytes  2.31 Gbits/sec
[ 21]   0.00-1.00   sec   275 MBytes  2.30 Gbits/sec
[ 23]   0.00-1.00   sec   274 MBytes  2.30 Gbits/sec
[ 25]   0.00-1.00   sec   275 MBytes  2.30 Gbits/sec
[ 27]   0.00-1.00   sec   274 MBytes  2.30 Gbits/sec
[ 29]   0.00-1.00   sec   274 MBytes  2.30 Gbits/sec
[ 31]   0.00-1.00   sec   275 MBytes  2.30 Gbits/sec
[ 33]   0.00-1.00   sec   274 MBytes  2.30 Gbits/sec
[ 35]   0.00-1.00   sec   275 MBytes  2.31 Gbits/sec
[SUM]   0.00-1.00   sec  4.29 GBytes  36.9 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.00-2.00   sec   274 MBytes  2.30 Gbits/sec
[  7]   1.00-2.00   sec   274 MBytes  2.30 Gbits/sec
[  9]   1.00-2.00   sec   274 MBytes  2.29 Gbits/sec
[ 11]   1.00-2.00   sec   274 MBytes  2.29 Gbits/sec
[ 13]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 15]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 17]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 19]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 21]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 23]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 25]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 27]   1.00-2.00   sec   273 MBytes  2.29 Gbits/sec
[ 29]   1.00-2.00   sec   272 MBytes  2.28 Gbits/sec
[ 31]   1.00-2.00   sec   272 MBytes  2.28 Gbits/sec
[ 33]   1.00-2.00   sec   271 MBytes  2.28 Gbits/sec
[ 35]   1.00-2.00   sec   272 MBytes  2.28 Gbits/sec
[SUM]   1.00-2.00   sec  4.26 GBytes  36.6 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec
[  7]   2.00-3.00   sec   270 MBytes  2.26 Gbits/sec
[  9]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec
[ 11]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec
[ 13]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec
[ 15]   2.00-3.00   sec   272 MBytes  2.28 Gbits/sec
[ 17]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec
[ 19]   2.00-3.00   sec   272 MBytes  2.28 Gbits/sec
[ 21]   2.00-3.00   sec   270 MBytes  2.26 Gbits/sec
[ 23]   2.00-3.00   sec   270 MBytes  2.26 Gbits/sec
[ 25]   2.00-3.00   sec   270 MBytes  2.26 Gbits/sec
[ 27]   2.00-3.00   sec   269 MBytes  2.26 Gbits/sec
[ 29]   2.00-3.00   sec   269 MBytes  2.26 Gbits/sec
[ 31]   2.00-3.00   sec   271 MBytes  2.27 Gbits/sec
[ 33]   2.00-3.00   sec   268 MBytes  2.25 Gbits/sec
[ 35]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec
[SUM]   2.00-3.00   sec  4.22 GBytes  36.3 Gbits/sec

These cards are also in the first PCIe 4.0 slot, and the hardware info below shows they link at 8.0 GT/s x8 (PCIe 3.0 x8, roughly 63 Gb/s usable), so the slot isn't what's limiting them below 56 Gbps.

Specs
Code:
PS C:\Program Files\Mellanox\WinMFT> get-netadapterhardwareinfo | ? Name -like 'SMB-*' | ft -a

Name  Segment Bus Device Function Slot NumaNode PcieLinkSpeed PcieLinkWidth Version
----  ------- --- ------ -------- ---- -------- ------------- ------------- -------
SMB-1       0  10      0        0                    8.0 GT/s             8 1.1
SMB-2       0  10      0        0                    8.0 GT/s             8 1.1

PS C:\Program Files\Mellanox\WinMFT> get-netadapterhardwareinfo | ? Name -like 'SMB-*' | ft -a

Name  Segment Bus Device Function Slot NumaNode PcieLinkSpeed PcieLinkWidth Version
----  ------- --- ------ -------- ---- -------- ------------- ------------- -------
SMB-1       0  62      0        0                    8.0 GT/s             8 1.1
SMB-2       0  62      0        0                    8.0 GT/s             8 1.1

PS C:\Program Files\Mellanox\WinMFT> vstat

        hca_idx=1
        uplink={BUS=PCI_E Gen3, SPEED=8.0 Gbps, WIDTH=x8, CAPS=8.0*x8}
        MSI-X={ENABLED=1, SUPPORTED=128, GRANTED=14, ALL_MASKED=N}
        vendor_id=0x02c9
        vendor_part_id=4103
        hw_ver=0x0
        fw_ver=2.42.5000
        PSID=MT_1090111023
        node_guid=e41d:2d03:0012:8b30
        num_phys_ports=2
                port=1
                port_guid=e61d:2dff:fe12:8b30
                port_state=PORT_ACTIVE (4)
                link_speed=NA
                link_width=NA
                rate=40.00 Gbps
                port_phys_state=LINK_UP (5)
                active_speed=40.00 Gbps
                sm_lid=0x0000
                port_lid=0x0000
                port_lmc=0x0
                transport=RoCE v2.0
                rroce_udp_port=0x12b7
                max_mtu=2048 (4)
                active_mtu=2048 (4)

                port=2
                port_guid=e61d:2dff:fe12:8b31
                port_state=PORT_ACTIVE (4)
                link_speed=NA
                link_width=NA
                rate=40.00 Gbps
                port_phys_state=LINK_UP (5)
                active_speed=40.00 Gbps
                sm_lid=0x0000
                port_lid=0x0000
                port_lmc=0x0
                transport=RoCE v2.0
                rroce_udp_port=0x12b7
                max_mtu=2048 (4)
                active_mtu=2048 (4)

On another note, I ordered the official Mellanox FDR passive copper VPI 56Gb/s cable, and that cable also does not link or transfer at 56 Gbps, only 40. I've tried everything I can think of, and it just doesn't seem possible on Windows without a way to set the speed and duplex. Usually that lives in the NIC's Advanced properties, but as the table above shows, it's not there.

(Attached: MellanoxFDR1.jpg, MellanoxFDR2.jpg)
 

CyrilB (New Member, joined Sep 28, 2023)

I was also looking for a way to connect two Ethernet ports at 56 Gb instead of 40 Gb on Windows Server and found this article:


It involves changing the HCA ini file and re-burning the firmware but sounds promising.
I haven't tried this myself yet but plan to do so in a week or two.
Good luck!
 

i386 (Well-Known Member, Germany, joined Mar 18, 2016)

CyrilB said:
I was also looking for a way to connect two Ethernet ports at 56 Gb instead of 40 Gb on Windows Server and found this article:


It involves changing the HCA ini file and re-burning the firmware but sounds promising.
I haven't tried this myself yet but plan to do so in a week or two.
Good luck!
If you already have the 40GbE firmware on the NIC, 56GbE should work "out of the box".
And about 56GbE not showing up in ethtool: 56GbE is an (end-of-life) Mellanox proprietary implementation, so other devices/software don't know about it :D
 

CyrilB (New Member, joined Sep 28, 2023)

i386 said:
If you already have the 40GbE firmware on the NIC, 56GbE should work "out of the box".
And about 56GbE not showing up in ethtool: 56GbE is an (end-of-life) Mellanox proprietary implementation, so other devices/software don't know about it :D
With a switch that has 56GbE enabled, yes. But not when two ports are directly connected back-to-back, which is what we are trying to accomplish here.
In this scenario, we obviously don't care much whether the solution is proprietary or not.
 

BeTeP (Well-Known Member, joined Mar 23, 2019)

CyrilB (New Member, joined Sep 28, 2023)

BeTeP said:
This guy has no idea what he is talking about. Renaming .bin into .mlx does not magically make it such.

STH has dozens of threads on editing ini sections of mellanox firmware.
That part struck me as strange when I read it too, but I can skip that step because I already burned the latest .bin to the adapter.

If I export the .mlx and .ini, edit the .ini, mlxburn both into a new .mlx, and then flint the new .mlx, I won't have to magically convert the .bin to an .mlx.
Or so I think... I'll try it and report my findings eventually.
 

BeTeP (Well-Known Member, joined Mar 23, 2019)
No. Flint does not deal with .mlx files; the firmware dump is still in .bin format even if the user names the file .mlx.
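For reference, the usual flow from those STH threads looks roughly like this (the device name comes from the mst status output earlier in the thread; file names are just examples). mlxburn consumes the released .mlx plus the edited .ini and writes a .bin, and flint only ever burns the .bin:
Code:
# dump the current ini from the card
flint -d mt4103_pci_cr1 dc > orig.ini
# edit orig.ini, then rebuild an image from Mellanox's released .mlx plus the edited ini
mlxburn -fw fw-ConnectX3Pro-rel.mlx -conf edited.ini -wrimage custom.bin
# burn the resulting .bin (add -allow_psid_change if the PSID was changed in the ini)
flint -d mt4103_pci_cr1 -i custom.bin burn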