Mellanox ConnectX-3 40gb running at half bandwidth


Philip Brink

New Member
Sep 14, 2019
15
4
3
So, I got bit by the 40Gb bug while looking for more bandwidth between my Proxmox server and desktop. The nice thing is the cards and the DAC are working fine, and the connection has been fun to play with, but bandwidth is limited (currently stuck at about 25.5Gb/s).

Server: Proxmox 6.x, AMD 2400G, 32GB RAM, ASRock X570 Pro4
Desktop: Windows 10, AMD 3900X, 32GB RAM, MSI X470 Carbon Gaming

Using Ethernet mode in the Windows drivers and a static IP on both ends
Current Windows driver
Default drivers in Proxmox

Started at about 11Gb/s with iperf3 and iperf going from server to desktop (not bidirectional).
- Increased to jumbo frames (MTU 9000)
- Changed the driver to single-port optimized
Now runs iperf3 at about 25.5Gb/s from server to desktop (not bidirectional).
- Checked lspci on Proxmox: shows x8 lanes at PCIe 3.0 (using the primary x16 slot)
- Checked HWiNFO64 on Windows: shows x8 lanes at PCIe 3.0 (using PCIe slot 3 with the GPU in slot 1, both at x8, NVMe at x4)
- Windows 10 VM in Proxmox iperf3 to desktop was around 7Gb/s at the default frame size in its drivers, 13Gb/s with jumbo frames (uses the Red Hat VirtIO driver)

So questions:
1: Is Ethernet mode slowing it to these speeds?
2: Is there something obvious I missed in the Windows-side configuration?
3: Is the server hardware/Proxmox limiting its potential in the shell (not a VM)?

Been reading for a couple of days now, but I can't seem to find a consistent answer to those questions.

Thanks in advance,
Phil
 

NablaSquaredG

Layer 1 Magician
Aug 17, 2020
1,345
820
113
iperf3 is only single threaded / single connection if I remember correctly.

Try running multiple tests at once
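One way to do that, as a rough sketch: since iperf3's -P streams all run in a single process (and effectively one thread in the classic versions), you can start several independent server/client pairs on different ports to spread the load across CPU cores. The IP and port numbers here are just placeholders matching this thread.

```shell
# On the receiving side, start several iperf3 listeners,
# one per port (ports 5201-5204 are arbitrary choices):
for port in 5201 5202 5203 5204; do
    iperf3 -s -p "$port" -D
done

# On the sending side, launch one iperf3 process per port so
# each test gets its own CPU core, then sum the reported rates:
for port in 5201 5202 5203 5204; do
    iperf3 -c 192.168.2.11 -p "$port" -t 10 &
done
wait
```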
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
The Windows iperf3 binaries are usually an old version of iperf3 compiled against a buggy version of Cygwin.
If you want to use iperf, try a Linux live system on the Windows machine.
If you want to stay on Windows, try another tool like ntttcp from Microsoft.
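For reference, a minimal ntttcp run looks roughly like this (the -m mapping takes threads, CPU affinity, and the receiver's IP; 192.168.2.11 is assumed here as the receiver's address, matching the IPs used later in this thread):

```shell
# Receiver side (start this first): 8 threads, no CPU pinning (*),
# listening on the receiver's own 40GbE address:
ntttcp.exe -r -m 8,*,192.168.2.11 -t 20

# Sender side: same mapping, pointed at the receiver's address:
ntttcp.exe -s -m 8,*,192.168.2.11 -t 20
```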
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
877
485
63
49
r00t.dk
It's definitely not Ethernet slowing it down - I have easily run 40Gb/s on ConnectX-3s. My guess is, as @NablaSquaredG writes, that you need to use more threads - a single thread will probably not get much above 10Gb/s, so I would start 8 iperf threads in parallel.
 

Philip Brink

New Member
Sep 14, 2019
15
4
3
Tried to get ntttcp-for-linux working on Proxmox. Currently I can't get a connection between Windows and Proxmox, but I can create a connection between two PowerShells on Windows and it hits 51Gb/s.

Windows version 5.39
Linux version 1.4
Windows to Windows VM gets about 11Gb/s, so in line with the iperf3 results
Windows to Proxmox: no connection
Proxmox to Windows: no connection

proxmox shell shows:
root@VMStore:~# ntttcp -r -m 8,*,192.168.2.8
NTTTCP for Linux 1.4.0
---------------------------------------------------------
16:38:51 INFO: 9 threads created

and then when I press Ctrl-C to cancel on the Windows side, the Proxmox shell shows:
socket read error: 104
socket read error: 104
socket read error: 104
socket read error: 104
socket read error: 104
socket read error: 104
socket read error: 104
socket read error: 104

So it seems the two versions are speaking different protocols.
 

Philip Brink

New Member
Sep 14, 2019
15
4
3
iperf results with 6 threads:
root@VMStore:~# iperf -c 192.168.2.11 -P 6
------------------------------------------------------------
Client connecting to 192.168.2.11, TCP port 5001
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.2.8 port 52444 connected with 192.168.2.11 port 5001
[ 11] local 192.168.2.8 port 52450 connected with 192.168.2.11 port 5001
[ 5] local 192.168.2.8 port 52442 connected with 192.168.2.11 port 5001
[ 3] local 192.168.2.8 port 52440 connected with 192.168.2.11 port 5001
[ 8] local 192.168.2.8 port 52448 connected with 192.168.2.11 port 5001
[ 6] local 192.168.2.8 port 52446 connected with 192.168.2.11 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 4.25 GBytes 3.65 Gbits/sec
[ 11] 0.0-10.0 sec 3.96 GBytes 3.40 Gbits/sec
[ 5] 0.0-10.0 sec 4.85 GBytes 4.17 Gbits/sec
[ 3] 0.0-10.0 sec 3.89 GBytes 3.34 Gbits/sec
[ 8] 0.0-10.0 sec 4.98 GBytes 4.27 Gbits/sec
[ 6] 0.0-10.0 sec 4.00 GBytes 3.43 Gbits/sec
[SUM] 0.0-10.0 sec 25.9 GBytes 22.3 Gbits/sec
 

Philip Brink

New Member
Sep 14, 2019
15
4
3
Moved the card from the server to a second Windows desktop, ran ntttcp, and got roughly 25.6Gb/s.
PS C:\Users\Phil\Desktop> .\ntttcp.exe -r -m 8,*,192.168.2.11 -t 20
Copyright Version 5.39
Network activity progressing...


Thread Time(s) Throughput(KB/s) Avg B / Compl
====== ======= ================ =============
0 20.006 451375.035 34225.974
1 20.006 405513.931 33717.355
2 20.007 410262.907 34193.863
3 20.007 427396.314 34749.759
4 20.019 295277.221 33203.634
5 20.006 455231.224 32507.790
6 20.006 428839.888 34330.223
7 20.006 476849.741 32983.133


##### Totals: #####


Bytes(MEG) realtime(s) Avg Frame Size Throughput(MB/s)
================ =========== ============== ================
65468.462849 20.006 8925.547 3272.441


Throughput(Buffers/s) Cycles/Byte Buffers
===================== =========== =============
52359.049 2.220 1047495.406


DPCs(count/s) Pkts(num/DPC) Intr(count/s) Pkts(num/intr)
============= ============= =============== ==============
169920.630 2.263 244103.607 1.575


Packets Sent Packets Received Retransmits Errors Avg. CPU %
============ ================ =========== ====== ==========
1524923 7691255 3 0 7.556
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
877
485
63
49
r00t.dk
Just a stupid question - is it a direct connection between the two machines, or are you going through a switch?

If through a switch - could you try a direct connection, just to rule out the switch?

You could also try testing via UDP - I seem to remember that UDP gave me higher throughput - and remember there is also a --length parameter to control the size of the packets you send. The bigger the packet, the easier it is to reach the max bandwidth.
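As a sketch of such a UDP run with iperf3 (the IP is a placeholder matching this thread; -b 0 removes iperf3's default UDP rate cap, and the -l payload size is chosen to fit inside an MTU-9000 jumbo frame without fragmenting):

```shell
# UDP test: -u selects UDP, -b 0 means unlimited target bandwidth,
# -l sets the datagram payload size (bigger payloads cut per-packet
# overhead, but need jumbo frames to avoid IP fragmentation):
iperf3 -c 192.168.2.11 -u -b 0 -l 8900 -t 10
```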
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
877
485
63
49
r00t.dk
I would try to boot a Linux live CD on both ends and then test with iperf from Linux to Linux - then you have a baseline - you should be able to get 40Gb/s.

If you cannot get 40Gb/s between two Linux installs, I think something is off - it could be the cable or the NICs.

Also, on Linux/FreeBSD etc. you may need to tune socket options for the higher speeds, so buffers do not run out while you are testing (and also if you max out the connection consistently).
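On the Linux side, that tuning usually means raising the kernel's socket buffer limits via sysctl. The values below are illustrative starting points for a 40Gb link, not tuned recommendations:

```shell
# Raise the hard ceilings for socket receive/send buffers (256MB):
sysctl -w net.core.rmem_max=268435456
sysctl -w net.core.wmem_max=268435456

# TCP autotuning limits: min, default, max buffer sizes in bytes:
sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456"
sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456"
```

Add the same settings to /etc/sysctl.conf to make them persist across reboots.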
 

prdtabim

Active Member
Jan 29, 2022
173
67
28
@Bjorn Smith, it is a direct connection. I don't have a 40Gb switch to play with, unfortunately.
Yesterday I tested two Linux boxes with ConnectX-3 Pro cards and a 56Gb/s QSFP connection between them (two 56Gb/s Mellanox transceivers and one OM4 MPO fiber cable).
Using NTTTCP for Linux results in 52Gb/s in one direction and 41Gb/s in the other. I'm using MTU 9000 and txqueuelen 40000.
The Linux boxes aren't the same config, but both are very powerful (5950X in an ASRock X570 Creator and 3950X in an ASRock X370 PG).
 

Philip Brink

New Member
Sep 14, 2019
15
4
3
Perhaps I should up the queue length values? I still have them at the defaults: tx = 2048 and rx = 4096.

Will have to play with that and Linux over the weekend.
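For the record, the two knobs involved are the interface transmit queue length and the NIC's hardware ring buffers. A sketch, assuming a Linux box with the Mellanox port named enp1s0 (a placeholder; check `ip link` for the real name), and with ring sizes that should be checked against what `ethtool -g` reports as the hardware maximums:

```shell
# Raise the kernel transmit queue length on the 40GbE interface:
ip link set dev enp1s0 txqueuelen 10000

# Show current and maximum NIC ring buffer sizes:
ethtool -g enp1s0

# Raise the rx/tx rings (8192 is a common mlx4 maximum, but verify
# against the "Pre-set maximums" printed by the command above):
ethtool -G enp1s0 rx 8192 tx 8192
```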
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
877
485
63
49
r00t.dk
Perhaps I should up the queue length values? I still have them at the defaults: tx = 2048 and rx = 4096.

Will have to play with that and Linux over the weekend.
I would definitely bump all the buffers you can find - on FreeBSD I am running some buffers in the megabyte range.
FreeBSD Network Performance Tuning @ Calomel.org - some of these are probably applicable to all OSes, just possibly with different names.
 

prdtabim

Active Member
Jan 29, 2022
173
67
28
Code:
 ./ntttcp -s -m 32,*,192.168.98.100 -n 16 -R -t 20
NTTTCP for Linux 1.4.0
---------------------------------------------------------
18:08:36 INFO: Test cycle time negotiated is: 60 seconds
18:08:36 INFO: 512 threads created
18:08:36 INFO: 512 connections created in 63786 microseconds
18:08:36 INFO: Network activity progressing...
18:08:56 INFO: Test run completed.
18:08:56 INFO: Test cooldown is in progress...
18:09:36 INFO: Test cycle finished.
18:09:36 INFO: receiver exited from current test
18:09:36 INFO: 512 connections tested
18:09:36 INFO: #####  Totals:  #####
18:09:36 INFO: test duration    :20.11 seconds
18:09:36 INFO: total bytes      :131682926592
18:09:36 INFO:   throughput     :52.38Gbps
18:09:36 INFO: tcp retransmit:
18:09:36 INFO:   retrans_segments/sec   :0.00
18:09:36 INFO:   lost_retrans/sec       :0.00
18:09:36 INFO:   syn_retrans/sec        :0.00
18:09:36 INFO:   fast_retrans/sec       :0.00
18:09:36 INFO:   forward_retrans/sec    :0.00
18:09:36 INFO:   slowStart_retrans/sec  :0.00
18:09:36 INFO:   retrans_fail/sec       :0.00
18:09:36 INFO: cpu cores        :32
18:09:36 INFO:   cpu speed      :2200.000MHz
18:09:36 INFO:   user           :0.06%
18:09:36 INFO:   system         :4.17%
18:09:36 INFO:   idle           :91.45%
18:09:36 INFO:   iowait         :0.02%
18:09:36 INFO:   softirq        :4.24%
18:09:36 INFO:   cycles/byte    :0.92
18:09:36 INFO: cpu busy (all)   :151.13%
---------------------------------------------------------
Code:
./ntttcp -s -m 32,*,192.168.98.191 -n 16 -R -t 20           
NTTTCP for Linux 1.4.0
---------------------------------------------------------
18:12:09 INFO: Test cycle time negotiated is: 60 seconds
18:12:09 INFO: 512 threads created
18:12:09 INFO: 512 connections created in 51261 microseconds
18:12:09 INFO: Network activity progressing...
18:12:29 INFO: Test run completed.
18:12:29 INFO: Test cooldown is in progress...
18:13:09 INFO: Test cycle finished.
18:13:09 INFO: receiver exited from current test
18:13:09 INFO: 512 connections tested
18:13:09 INFO: #####  Totals:  #####
18:13:09 INFO: test duration    :20.12 seconds
18:13:09 INFO: total bytes      :99066970112
18:13:09 INFO:   throughput     :39.39Gbps
18:13:09 INFO: tcp retransmit:
18:13:09 INFO:   retrans_segments/sec   :0.00
18:13:09 INFO:   lost_retrans/sec       :0.00
18:13:09 INFO:   syn_retrans/sec        :0.00
18:13:09 INFO:   fast_retrans/sec       :0.00
18:13:09 INFO:   forward_retrans/sec    :0.00
18:13:09 INFO:   slowStart_retrans/sec  :0.00
18:13:09 INFO:   retrans_fail/sec       :0.00
18:13:09 INFO: cpu cores        :32
18:13:09 INFO:   cpu speed      :2200.000MHz
18:13:09 INFO:   user           :0.08%
18:13:09 INFO:   system         :5.13%
18:13:09 INFO:   idle           :91.25%
18:13:09 INFO:   iowait         :0.00%
18:13:09 INFO:   softirq        :3.53%
18:13:09 INFO:   cycles/byte    :1.25
18:13:09 INFO: cpu busy (all)   :180.32%
---------------------------------------------------------
 
