I have two Mellanox ConnectX3 cards connected directly (no switch) with Mellanox MC2207130 (56Gb/40Gb FDR) DACs. Speed testing on Ubuntu 22.04 in 40Gb Ethernet mode using parallel iperf yields the expected results, with bandwidth around 32-38 Gb/s using 8 processors.
However, when testing on Ubuntu in InfiniBand mode with RDMA, I am getting results that I do not understand.
Speed tests with iperf3 show a maximum bandwidth below 10 Gb/s, even if I run multiple servers in parallel. However, a test with ib_send_bw shows an average bandwidth of about 47 Gb/s. Is there a speed problem, possibly associated with my configuration? Is this the expected speed for IPoIB? What am I missing?
Note below that when I run ibdiagnet I get a "Suboptimal rate for group" warning in the IPoIB Subnets Check. Also, the multiple iperf3 server tests look worse than the single server test.
Similar results for Arch Linux are shown here: InfiniBand - ArchWiki.
Any guidance and insights that may point me in the right direction would be appreciated (I am new to networking).
See below for test and system information:
ib_send_bw test:
ib_send_bw -s 65535 -i 1 -F --report_gbits
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : ibp68s0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
ibv_wr* API : OFF
RX depth : 512
CQ Moderation : 1
Mtu : 2048
Link type : IB
Max inline data : 0
rdma_cm QPs : OFF
Data ex. method : Ethernet
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
65535 1000 0.00 46.50 0.088700
---------------------------------------------------------------------------------------
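As a sanity check on the units in the table above, the reported message rate and the average bandwidth should agree for 65535-byte messages. A minimal Python sketch, with the values copied from the output above (the variable names are my own):

```python
# Values copied from the ib_send_bw output above.
msg_size_bytes = 65535
bw_avg_gbits = 46.50        # BW average[Gb/sec], run with --report_gbits
msg_rate_mpps = 0.088700    # MsgRate[Mpps]

# Bandwidth implied by the message rate: messages per second * bits per message.
implied_gbits = msg_rate_mpps * 1e6 * msg_size_bytes * 8 / 1e9
print(f"implied bandwidth: {implied_gbits:.2f} Gb/s")
```

The two numbers match, so as far as I can tell the ib_send_bw figure is in gigabits per second, not gigabytes per second.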
iperf3 tests using a single server:
[ 5] local 10.16.16.50 port 5101 connected to 10.16.16.51 port 56104
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 283 MBytes 2.37 Gbits/sec
[ 5] 1.00-2.00 sec 299 MBytes 2.51 Gbits/sec
[ 5] 2.00-3.00 sec 293 MBytes 2.45 Gbits/sec
[ 5] 3.00-4.00 sec 290 MBytes 2.43 Gbits/sec
[ 5] 4.00-5.00 sec 293 MBytes 2.46 Gbits/sec
[ 5] 5.00-6.00 sec 292 MBytes 2.45 Gbits/sec
[ 5] 6.00-7.00 sec 287 MBytes 2.41 Gbits/sec
[ 5] 7.00-8.00 sec 291 MBytes 2.44 Gbits/sec
[ 5] 8.00-9.00 sec 294 MBytes 2.47 Gbits/sec
[ 5] 9.00-10.00 sec 453 MBytes 3.80 Gbits/sec
[ 5] 10.00-10.04 sec 25.1 MBytes 4.98 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.04 sec 3.03 GBytes 2.59 Gbits/sec receiver
iperf3 test for 3 servers:
Accepted connection from 10.16.16.51, port 37608
[ 5] local 10.16.16.50 port 5101 connected to 10.16.16.51 port 37610
Accepted connection from 10.16.16.51, port 47732
[ 5] local 10.16.16.50 port 5102 connected to 10.16.16.51 port 47748
Accepted connection from 10.16.16.51, port 35500
[ 5] local 10.16.16.50 port 5103 connected to 10.16.16.51 port 35514
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 208 MBytes 1.75 Gbits/sec
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 108 MBytes 907 Mbits/sec
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 89.5 MBytes 750 Mbits/sec
[ 5] 1.00-2.00 sec 107 MBytes 894 Mbits/sec
[ 5] 1.00-2.00 sec 91.7 MBytes 769 Mbits/sec
[ 5] 1.00-2.00 sec 94.6 MBytes 794 Mbits/sec
[ 5] 2.00-3.00 sec 96.3 MBytes 808 Mbits/sec
[ 5] 2.00-3.00 sec 109 MBytes 918 Mbits/sec
[ 5] 2.00-3.00 sec 110 MBytes 926 Mbits/sec
[ 5] 3.00-4.00 sec 89.5 MBytes 751 Mbits/sec
[ 5] 3.00-4.00 sec 100 MBytes 840 Mbits/sec
[ 5] 3.00-4.00 sec 110 MBytes 924 Mbits/sec
[ 5] 4.00-5.00 sec 92.6 MBytes 777 Mbits/sec
[ 5] 4.00-5.00 sec 89.6 MBytes 751 Mbits/sec
[ 5] 4.00-5.00 sec 116 MBytes 977 Mbits/sec
[ 5] 5.00-6.00 sec 106 MBytes 890 Mbits/sec
[ 5] 5.00-6.00 sec 94.0 MBytes 789 Mbits/sec
[ 5] 5.00-6.00 sec 123 MBytes 1.03 Gbits/sec
[ 5] 6.00-7.00 sec 120 MBytes 1.01 Gbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 938 Mbits/sec
[ 5] 6.00-7.00 sec 114 MBytes 956 Mbits/sec
[ 5] 7.00-8.00 sec 128 MBytes 1.07 Gbits/sec
[ 5] 7.00-8.00 sec 113 MBytes 948 Mbits/sec
[ 5] 7.00-8.00 sec 122 MBytes 1.02 Gbits/sec
[ 5] 8.00-9.00 sec 121 MBytes 1.01 Gbits/sec
[ 5] 8.00-9.00 sec 109 MBytes 916 Mbits/sec
[ 5] 8.00-9.00 sec 118 MBytes 993 Mbits/sec
[ 5] 9.00-10.00 sec 117 MBytes 978 Mbits/sec
[ 5] 10.00-10.04 sec 4.74 MBytes 949 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.04 sec 1.16 GBytes 993 Mbits/sec receiver
-----------------------------------------------------------
Server listening on 5101
-----------------------------------------------------------
[ 5] 9.00-10.00 sec 141 MBytes 1.18 Gbits/sec
[ 5] 10.00-10.04 sec 7.36 MBytes 1.54 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.04 sec 1.05 GBytes 898 Mbits/sec receiver
-----------------------------------------------------------
Server listening on 5102
-----------------------------------------------------------
[ 5] 9.00-10.00 sec 239 MBytes 2.01 Gbits/sec
[ 5] 10.00-10.04 sec 13.5 MBytes 2.52 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.04 sec 1.22 GBytes 1.04 Gbits/sec receiver
-----------------------------------------------------------
Server listening on 5103
-----------------------------------------------------------
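To compare the two iperf3 runs on equal terms, the three receiver totals can be added up. A quick Python sketch using only the summary lines from the output above (the variable names are my own):

```python
# Receiver totals copied from the iperf3 summaries above, in Gbits/sec.
single_server = 2.59
three_servers = [0.993, 0.898, 1.04]

aggregate = sum(three_servers)
print(f"aggregate of 3 servers: {aggregate:.2f} Gbits/sec")
print(f"single server:          {single_server:.2f} Gbits/sec")
```

So each of the three streams runs at roughly 1 Gbit/sec, while the combined total stays close to 3 Gbits/sec, about the same as the single server.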
# sudo mstconfig -d 40:00.0 q
Device #1:
----------
Device type: ConnectX3
Device: 44:00.0
Configurations: Next Boot
SRIOV_EN False(0)
NUM_OF_VFS 8
LINK_TYPE_P1 IB(1)
LINK_TYPE_P2 IB(1)
LOG_BAR_SIZE 3
BOOT_PKEY_P1 0
BOOT_PKEY_P2 0
BOOT_OPTION_ROM_EN_P1 True(1)
BOOT_VLAN_EN_P1 False(0)
BOOT_RETRY_CNT_P1 0
LEGACY_BOOT_PROTOCOL_P1 PXE(1)
BOOT_VLAN_P1 1
BOOT_OPTION_ROM_EN_P2 True(1)
BOOT_VLAN_EN_P2 False(0)
BOOT_RETRY_CNT_P2 0
LEGACY_BOOT_PROTOCOL_P2 PXE(1)
BOOT_VLAN_P2 1
IP_VER_P1 IPv4(0)
IP_VER_P2 IPv4(0)
CQ_TIMESTAMP True(1)
# ibstatus
Infiniband device 'ibp68s0' port 1 status:
default gid:
base lid: 0x1
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 56 Gb/sec (4X FDR)
link_layer: InfiniBand
Infiniband device 'ibp68s0' port 2 status:
default gid:
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X SDR)
link_layer: InfiniBand
# ibdiagnet
Loading IBDIAGNET from: /usr/lib/x86_64-linux-gnu/ibdiagnet1.5.7
-W- Topology file is not specified.
Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/x86_64-linux-gnu/ibdm1.5.7
-I- Using port 1 as the local port.
-I- Discovering ... 2 nodes (0 Switches & 2 CA-s) discovered.
-I---------------------------------------------------
-I- Bad Guids/LIDs Info
-I---------------------------------------------------
-I- No bad Guids were found
-I---------------------------------------------------
-I- Links With Logical State = INIT
-I---------------------------------------------------
-I- No bad Links (with logical state = INIT) were found
-I---------------------------------------------------
-I- General Device Info
-I---------------------------------------------------
-I---------------------------------------------------
-I- PM Counters Info
-I---------------------------------------------------
-I- No illegal PM counters values were found
-I---------------------------------------------------
-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)
-I---------------------------------------------------
-I- PKey:0x7fff Hosts:2 full:2 limited:0
-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey: MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps
-I---------------------------------------------------
-I- Bad Links Info
-I- No bad link were found
-I---------------------------------------------------
----------------------------------------------------------------
-I- Stages Status Report:
STAGE Errors Warnings
Bad GUIDs/LIDs Check 0 0
Link State Active Check 0 0
General Devices Info Report 0 0
Performance Counters Report 0 0
Partitions Check 0 0
IPoIB Subnets Check 0 1
Please see /var/cache/ibutils/ibdiagnet.log for complete log
----------------------------------------------------------------
-I- Done. Run time was 0 seconds.