100gbe mellanox with 60% loss - how to diagnose?


eskiuzmi

New Member
Dec 10, 2022
I finally got these two machines to talk, and it's fast, but so far it's only 15GbE fast.

Both machines run Windows 10, both with a Mellanox 455A. On the server the NIC sits in an x16 Gen3 slot; on the other machine the 455A is in an x8 slot (Gen5 going to waste) because that's the best Z790 can offer. In any case, 60GbE is better than 2.5GbE, so I won't complain. Except that throughput shows a 60% loss.

I lowered the jumbo frame size from 9000 to 1500 just in case. Since the twinax DAC connecting them directly is rated for 100GbE, I don't know what could be wrong.
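(For reference, I changed it from PowerShell; "Ethernet 2" is just whatever your adapter happens to be called, and the value encoding, 1514 vs. 9014 and so on, depends on the driver:)

Get-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*JumboPacket"
Set-NetAdapterAdvancedProperty -Name "Ethernet 2" -RegistryKeyword "*JumboPacket" -RegistryValue 1514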

Do you know what I should look into to diagnose this?

[Screenshot: throughput graph]
[Screenshot: nice bumps to 50GbE]
 

Stephan

Well-Known Member
Apr 21, 2017
Germany
As a first measure, try NTttcp on the command line. At 100 Gbps you need to make sure any performance test uses all cores and multiple TCP streams between the machines.
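Something like this (the address is just an example for your point-to-point link; check ntttcp.exe -h for the exact flags of your build):

rem receiver side, start first: 8 threads spread across all cores, 30 s run
ntttcp.exe -r -m 8,*,192.168.100.1 -t 30

rem sender side, same thread mapping, pointing at the receiver's IP
ntttcp.exe -s -m 8,*,192.168.100.1 -t 30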

To rule out Windows, boot a recent Linux from a USB stick (using Ventoy and e.g. an Ubuntu ISO with persistence) and run GitHub - microsoft/ntttcp-for-linux: A Linux network throughput multiple-thread benchmark tool.
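A minimal sketch of a run, assuming the syntax from that repo's README (verify with ntttcp -h, the flags have changed between versions):

./ntttcp -r                      # on the receiver
./ntttcp -s192.168.100.1 -t 30   # on the sender, pointing at the receiver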

Also on Linux you can use ethtool -S [interface] to get detailed statistics:

ethtool -S eno1 | grep -E "(drop|error|fail|coll)"

rx_errors: 0
tx_errors: 0
tx_dropped: 0
collisions: 0
rx_length_errors: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_failed: 0
rx_csum_offload_errors: 0
alloc_rx_buff_failed: 0
dropped_smbus: 0
rx_dma_failed: 0
tx_dma_failed: 0
uncorr_ecc_errors: 0
corr_ecc_errors: 0
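(That sample output is from an Intel onboard NIC; on a ConnectX-4 the mlx5 driver exposes a different and much longer counter set, including physical-layer counters, so cast a wider net. The interface name and exact counter names depend on your system and driver version:)

ethtool -S enp1s0f0 | grep -iE "phy|discard|drop|err"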
 

eskiuzmi

New Member
Dec 10, 2022
The other DAC arrived. Loss is down by 15%. Bending the new cable a certain way increases loss further; the first one was immune to bending.
So the cable explains a bit of the loss, but not the asymmetry between up and down, and since both machines run the same OS, it could be hardware related. One machine runs at x16 and the other at x8; maybe that explains some of it, but not why download is unaffected... unless these NICs allocate the up and down circuits by PCIe lane? Then the x8 machine would have all the lanes it needs for downstream but only a few for upstream? I don't know how this symmetrical 100GbE works at the hardware level.
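For reference, a back-of-the-envelope check of the PCIe ceilings (assuming Gen3 with its 128b/130b encoding; as far as I understand, each PCIe lane is full duplex, so an x8 link should cap both directions equally):

# ~0.985 GB/s usable per Gen3 lane after 128b/130b encoding
echo "x8  Gen3, per direction: $(echo "8 * 0.985 * 8" | bc) Gbit/s"    # ~63 Gbit/s
echo "x16 Gen3, per direction: $(echo "16 * 0.985 * 8" | bc) Gbit/s"   # ~126 Gbit/s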

I didn't read all the words, but the graphs look interesting; it's like multi-threaded programming, which I do some of, so I can understand some of it. Is it the same for 100GbE? It seems that bottlenecks move as we go faster. Someone mentioned increasing the "cache/buffer size on the OS".
 

CyklonDX

Well-Known Member
Nov 8, 2022
857
283
63
Is it the same for 100GbE? It seems that bottlenecks move as we go faster. Someone mentioned increasing the "cache/buffer size on the OS".
It's the same process, but don't follow the exact same numbers they used; keep your own hardware in mind, and test, test, and test again.
(Best to keep an Excel/Google Sheets log of your results.) Change only a single thing at a time so you know exactly what does what.
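A minimal sketch of that workflow, assuming the Linux ntttcp from above (the -n thread flag and the output parsing are assumptions; adjust them to what your version actually prints):

# sweep a single variable (sender thread count), log results to CSV
for n in 1 2 4 8 16; do
    ./ntttcp -s192.168.100.1 -n $n -t 30 > run_n$n.log
    echo "$n,$(grep -i throughput run_n$n.log | head -1)" >> results.csv
done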