Mellanox ConnectX-3 Low File Transfer Speed


Siren

New Member
Sep 9, 2019
Hi All!

I'm new to the STH forums and wanted to get some guidance/help as to my issue.

I have two systems, each with one of these cards. Specs:

- Dell PowerEdge R620 (2x E5-2690 v2, 192GB of RAM, all 2.5" Samsung 850/860 EVO SSDs)
- Custom-built ITX server (i7-8700K, 1x 32GB DIMM, 1x Samsung 960 EVO NVMe SSD, 1x 500GB Samsung 860 EVO)

Both systems are using the exact same card model (MCX353A-FCBT) with the exact same firmware (2.42.5000), and both cards are in PCIe 3.0 x16 slots.

TL;DR - I've configured both systems with both Windows Server 2016 and CentOS 7, with the cards tuned for single-port traffic (Server 2016) and the High Throughput profile (CentOS 7), and I've made sure RoCE works on both OSes. I still don't know where the transfer-speed problem lies on either OS. Is it a hardware bottleneck on my ITX system, or is it somewhere in the configuration of either system? I've posted some benchmarks below along with some attached outputs. I can provide links to the guides I've used if necessary.
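For reference, the CentOS 7 "High Throughput" tuning was done roughly like this (this assumes the mlnx_tune utility from MLNX_OFED; profile names can differ between OFED releases, so treat it as a sketch rather than my exact steps):

    # apply the high-throughput profile from MLNX_OFED's tuning tool
    mlnx_tune -p HIGH_THROUGHPUT
    # sanity-check the card/port state afterwards
    ibstat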


Detailed version


The ITX system will eventually be a high-capacity backup server, but it will also be used as a regular system when needed. In terms of OS, I've configured Windows Server 2016 and CentOS 7 on both systems, with VMware as a secondary boot on the R620. I'm also using KVM on the CentOS 7 install on the server, as I eventually want to create a virtual environment and make use of 40Gb speeds (or close to it).

In terms of benchmarking, everything I've tested points to me getting close to 40Gb speeds across both devices. Here's what I've used for benchmark testing (rough example invocations are below the list):

- ntttcp (Windows)
- ib_send_bw (Linux)
- ATTO (Windows)
- iperf (both OSes)
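
The network tests were run roughly like the commands below. The IP address and the mlx4_0 device name are placeholders rather than my exact invocations, and I'm showing iperf3 syntax here:

    # on the R620 (acting as server/listener)
    ib_send_bw -d mlx4_0 -F --report_gbits
    iperf3 -s

    # on the ITX box (client), pointing at the R620's IP
    ib_send_bw -d mlx4_0 -F --report_gbits 192.168.10.1
    iperf3 -c 192.168.10.1 -P 4 -t 30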

However, when I try an actual file transfer between the two systems (whether it's Windows or Linux), I get around 550MB/s, which is nowhere near the speed I should be getting (anything above 3.3GB/s). With all of the disk benchmarking done, I noticed that transferring from the ITX system to the R620 is fine, but going the other way slows things down significantly.

I've posted nearly all of the benchmarks on the Mellanox community page in a post I created a couple of weeks ago (with the exception of the iperf results), but didn't get any response after a week.

I'm posting the screenshots and outputs of the testing here:

Atto ITX to R620.png Atto R620 to ITX.png

(Ignore the RAID 5 testing; my arrays on the server are all RAID 1 as of now.)

Note that going from the server to the ITX system yields some bad results, and that's where I think the issue is, but again I'm not sure.

What I also found interesting was that ntttcp (at least for me) did not want to run multi-threaded tests and would just hang after printing the version. A single-threaded test runs with no issues, as seen in the ntttcp output file.
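
For what it's worth, this is the shape of the multi-threaded run I was attempting with Microsoft's ntttcp.exe (thread count and IP are placeholders; as I understand it, the receiver has to be started first and the -m mapping has to match on both sides):

    # on the receiving side (start first): 8 threads, any CPU, receiver IP 192.168.10.2
    ntttcp.exe -r -m 8,*,192.168.10.2 -t 15
    # on the sending side, with the same mapping
    ntttcp.exe -s -m 8,*,192.168.10.2 -t 15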

Lastly, iperf comes back with nearly 40Gbit speeds when I run it (I'll post the command if needed).

I've followed some guides to get RoCE working on both OSes on both systems and can paste the links if necessary, but this is where I'm stuck at the moment.
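
For anyone checking my RoCE setup on the Windows Server 2016 side, these are the standard PowerShell RDMA/SMB Direct checks I can re-run and post output from (the last one while a copy is in progress):

    # confirm the ConnectX-3 is listed as RDMA-capable
    Get-NetAdapterRdma
    # confirm SMB sees an RDMA-capable interface on client and server
    Get-SmbClientNetworkInterface
    Get-SmbServerNetworkInterface
    # during a transfer, check whether the connections are actually using RDMA
    Get-SmbMultichannelConnection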

Any help/advice/guidance is greatly appreciated.

Thank you.
 


i386

Well-Known Member
Mar 18, 2016
Germany
Iperf and ntttcp test network performance, not storage performance!

Samsung 860 EVOs are SATA3 devices (6 Gbit/s max!); in RAID 1 you have the write performance of a single drive.
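
Quick back-of-the-envelope with rounded numbers: 6 Gbit/s SATA3 with 8b/10b encoding is about 4.8 Gbit/s of payload, roughly 600 MB/s, and an 860 EVO does around 520-550 MB/s sequential in practice. Your ~550 MB/s transfers look like a single SATA SSD running flat out.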

I'm not sure if there is an implementation of RoCE for Samba. The last time I checked on the progress, it was still a "todo" task.
 

Siren

New Member
Sep 9, 2019
i386 said:
"Iperf and ntttcp test network performance, not storage performance! Samsung 860 EVOs are SATA3 devices (6 Gbit/s max!); in RAID 1 you have the write performance of a single drive. I'm not sure if there is an implementation of RoCE for Samba. The last time I checked on the progress, it was still a 'todo' task."

Thanks for the reply. I realize I failed to mention the test types (network vs. disk), as I was writing this very late. And yes, I understand the SSDs are limited to 6 Gbit/s.

However, the ATTO benchmarks should at least give an idea of the disk speeds on both systems. I included the networking benchmarks as well to make sure there wasn't something causing problems on the links themselves (and there doesn't seem to be, based on the outputs).

When I was able to do some testing last night, I did move off the RAID 1 array and tested BOTH a RAID 10 and a RAID 0 array on my R620. The results (via ATTO benchmarking and actual transfers) did NOT change at all; in fact, they were almost identical to the benchmarks in my first post. I understand I'm going to be limited by SATA3 speeds, but the fact that the speeds aren't even close to 10Gb on an actual file transfer, when many people have 10Gb working fine with SSDs, is concerning to me.
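
When I'm back on it I also want to benchmark the array directly on the R620 under CentOS, outside of any network copy, with something along these lines (fio, with placeholder paths and sizes):

    # sequential read/write against the array mount, bypassing the page cache
    fio --name=seqread --directory=/mnt/array --rw=read --bs=1M --size=8G --direct=1 --ioengine=libaio --iodepth=16
    fio --name=seqwrite --directory=/mnt/array --rw=write --bs=1M --size=8G --direct=1 --ioengine=libaio --iodepth=16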

The file I used for testing was a CentOS 7 Everything ISO, since that's about a 10GB file on its own.

What I haven't tried yet is raising the MTU to see if that improves the transfer speeds a bit. I also didn't post any ramdisk performance results, but those hit 4GB/s consistently. I'll post some more results later tonight.
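
For the MTU test, this is roughly what I have in mind (interface/adapter names are placeholders, both ends plus anything in between would need to match, and the Windows property keyword can vary by driver):

    # CentOS 7 side (not persistent across reboots)
    ip link set dev ens1 mtu 9000
    # Windows Server 2016 side
    Set-NetAdapterAdvancedProperty -Name "Ethernet 3" -RegistryKeyword "*JumboPacket" -RegistryValue 9014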

Thanks.
 