Fluctuating speeds on 40Gb Mellanox


djdrock

New Member
I finally got my MCX353A-FCBT ConnectX-3 cards to connect at 40Gb, but I am having some fluctuating speed issues. I was hoping I could get some help figuring this out.

System 1 (PC): Threadripper 1950X, 64 GB RAM, X399 mobo, Samsung 970 Evo Plus M.2, Windows 10 Education (basically Enterprise). The Mellanox card is installed in a PCIe x16 slot.

System 2: (Server):" Xeon W2102m 16gb RAM, Supermicro X11SRA-F mobo, Samsung 970 Evo Plus M.2. Windows Server 2019. Mellanox is installed in a PCIe x16 slot. LSI raid10 with eight 6tb SAS drives.

Points of interest:

I am using Ethernet mode instead of InfiniBand.
The latest firmware and drivers are installed.
I have confirmed that RDMA is enabled (the check I used is shown below).
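
For reference, here is how I verified RDMA from an elevated PowerShell prompt (these are the standard inbox cmdlets; the adapter name is just an example):

    # List RDMA state per network adapter; "Enabled" should be True for the Mellanox NIC
    Get-NetAdapterRdma

    # If it is off, it can be switched on per adapter
    # ("Ethernet 2" is a placeholder; use your Mellanox adapter's name)
    Enable-NetAdapterRdma -Name "Ethernet 2"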

Going from the server to the PC, I am getting about 11 Gbit/s consistently transferring a 20 GB file.
Going from the PC to the server, I get bursts of about 25 Gbit/s for roughly half of the transfer, then it drops to a stop, resumes at a fast speed again, and repeats this until the transfer is finished. Copying from M.2 to M.2, or from the M.2 (PC) to the RAID 10 array, shows the same issue.

I have been tweaking some things in the cards' configuration (jumbo packets, interrupt moderation, send and receive buffers, large send offload, etc.) without any luck.

It feels like a buffering issue of some kind, but I am not having any luck pinning it down.

Any help would be greatly appreciated.
 

fossxplorer

Active Member
RDMA (RoCE in this case) is not used by a standard copy with the OS copy utilities, as they don't support RDMA.
You should do some network testing using iperf (non-RDMA tests) or the Mellanox-supplied tools (RDMA tests) and see how the system behaves; an example run is below.
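
For instance, a minimal iperf3 run could look like this (the address is a placeholder for your server's 40GbE interface):

    # On the server (listener):
    iperf3 -s

    # On the PC: 4 parallel streams for 30 seconds
    # (192.168.1.10 is a placeholder address)
    iperf3 -c 192.168.1.10 -P 4 -t 30

    # Same test in the reverse direction without swapping roles:
    iperf3 -c 192.168.1.10 -P 4 -t 30 -R

If iperf3 holds near line rate in both directions, the network itself is fine and the stalls are on the storage/cache side.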
The issue might be related to storage as @i386 mentions.
 

djdrock

New Member
@fossxplorer, I did not know that Windows does not use RDMA with the standard copy utilities. When doing the transfer from the PC to the server, I kept an eye on memory usage.

See these images (the insert-image function on the forum is not working):

[Image: memory]
[Image: speed3]

As you can see, the peaks and valleys of the LAN and memory utilization coincide. I am not sure the memory graph is showing exactly what it should if the LAN is choking. What I am wondering is whether the memory (I only have 16 GB) is causing the issue. In other words, is the memory causing the transfer to choke, or is the LAN choking and the memory graph simply showing me when the choke occurs? One way I could watch both at once is below.
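
To separate the two, I could sample the relevant performance counters side by side while the copy runs (these counter paths are the standard Windows ones):

    # Sample network input, cached dirty pages, and disk writes once per second
    Get-Counter -Continuous -SampleInterval 1 -Counter @(
        '\Network Interface(*)\Bytes Received/sec',
        '\Memory\Modified Page List Bytes',
        '\PhysicalDisk(_Total)\Disk Write Bytes/sec'
    )

If the modified page list balloons while disk writes lag behind the network input, the RAM-side write cache is the bottleneck.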
 

mb300sd

Active Member
Looks like you're filling up your write cache in RAM, and waiting for the disk to catch up.
 

djdrock

New Member
For reference, adding more memory did the trick; no more buffering/pausing. With this said, I am now getting triple the speed transferring files from the Threadripper PC to the server compared with the reverse direction, and I am not quite sure why. If anyone has any ideas, I would appreciate it.
 

kiteboarder

Active Member
Hi, some info in this thread where I was learning about the different versions of Windows and their write cache limitations:
https://forums.servethehome.com/ind...ncrease-memory-size-of-the-write-cache.19712/

The read/write speeds of your drives, the amount of RAM, the cache on the SSDs, etc. all play a big role in copy speeds. This isn't always a simple question with a simple answer. You'll also find limitations in the Windows file-copy system... basically, this can be a big rat hole.

I suggest using Resource Monitor to see what's happening with the specific I/O on your specific systems.
 

djdrock

New Member
@kiteboarder thank you for your reply. Over the last day I have been doing some reading, and you nailed it when you said this can be (is) a big rat hole. Unlike when I copy from my PC to the server, when going from the server to the PC with the same kind of file (a larger 50 GB one this time), the memory usage on my PC only goes up by around 4 or 5 GB, so that direction is obviously handled better. Going TO the server, it chews up 50% of the RAM, writes to disk, then repeats until the transfer is done. Windows has always been plagued with networking issues, but one would think these things would be easier to manage.

I really want to understand RDMA and make use of this technology. As has been pointed out, Windows will not use RDMA with the standard file transferring/copy utility. Do you have any suggestions on where I should start in getting RDMA working for transfers between these two systems?
 

djdrock

New Member
Another thought here... since it seems nearly impossible to adjust the write cache (disabling it did not help me), I am wondering if there are tweaks that let the RAM write the data to disk earlier in the transfer process. The write cache fills up (to 50% in Server 2019), then writes that data to disk, then repeats. Can we tweak Windows so that the cache starts writing to disk before it gets full? One workaround I am considering is below.
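
One thing I may try (my own idea, not something anyone here has confirmed) is sidestepping the cache manager entirely with an unbuffered copy, so the data streams to disk instead of piling up in RAM first:

    # robocopy's /J switch copies with unbuffered I/O, which Microsoft
    # recommends for very large files.
    # D:\source, \\server\share and bigfile.bin are placeholder names.
    robocopy D:\source \\server\share bigfile.bin /J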
 

i386

Well-Known Member
djdrock said: "As has been pointed out, Windows will not use RDMA with the standard file transferring/copy utility."

Copying files with Explorer will use SMB Direct/RDMA. On the client versions of Windows it seems not to work, at least not with Mellanox ConnectX-3 NICs.
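
Whether a live SMB session actually negotiated RDMA can be checked with the inbox SMB cmdlets (standard since Windows 8.1 / Server 2012 R2):

    # Active SMB multichannel connections; shows whether each
    # client/server interface pair is RDMA capable
    Get-SmbMultichannelConnection

    # Client-side view: look at the RDMA Capable column for the Mellanox NIC
    Get-SmbClientNetworkInterface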
 

kiteboarder

Active Member
My understanding is that you must have the most expensive Windows 10 in order to get all of the RDMA stuff to work: Windows 10 Pro for Workstations (or Windows Server).

A couple other threads:
https://forums.servethehome.com/index.php?threads/windows-10-pro-for-workstations-rdma.21426/
https://forums.servethehome.com/index.php?threads/40-gbe-on-win10-only-getting-10gb-s.24937/

You asked: "I am wondering if there are tweaks that let the RAM write the data to disk earlier in the transfer process?"

With regard to the memory filling up: the write is being pushed to the destination drive at the same time as the write cache is filling. No tweak required. It's just that your read speed is much faster than your write speed, and thus the memory write cache fills up. Use Resource Monitor and you'll see it writing to disk. The rough arithmetic below shows why the stalls are periodic.
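
Back-of-the-envelope numbers (the rates here are illustrative guesses, not measurements from your systems):

    # Inbound ~25 Gbit/s from the network vs. a disk that sustains ~1 GB/s
    $inbound  = 25e9 / 8      # ~3.1 GB/s arriving over the wire
    $diskRate = 1e9           # assumed sustained write speed of the array
    $cache    = 8e9           # ~50% of 16 GB RAM acting as write cache

    # The cache fills at the difference between the two rates
    $secondsToFill = $cache / ($inbound - $diskRate)
    "{0:N1} seconds until the cache is full and the copy stalls" -f $secondsToFill

With numbers like these the cache fills in roughly 4 seconds, the copy stalls while the disk drains it, and the cycle repeats, which matches the sawtooth you described.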

My experience has been that the only way to improve the situation is to get a faster write-destination HDD/SSD solution. You don't really know the limits of your write destination until you try to write 50-100 GB in one go. A great many consumer SSDs don't have the write cache to handle it and fall on their face once their write buffer is full; QLC is really bad in this regard. Obviously I don't know the details of your setup, but honestly I would bet a dollar that your RAID array can't take it sustained. Please don't take this as an insult in any way; I didn't know my write solution's limits either. I had to upgrade to NVMe SSDs to handle 40 Gb writes. A sustained test like the one below will show the array's real limit.
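
Microsoft's diskspd tool is a straightforward way to measure sustained sequential write speed (the flags below are standard diskspd options; the test path is a placeholder on the RAID 10 array):

    # 60-second, 100% sequential-write test against a 50 GB file,
    # with software and hardware caching disabled (-Sh) so RAM can't mask the result
    diskspd -c50G -d60 -w100 -b1M -t4 -o8 -Sh E:\test.dat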

Good luck...
 

djdrock

New Member
kiteboarder said: "My understanding is that you must have the most expensive Windows 10 in order to get all of the RDMA stuff to work: Windows 10 Pro for Workstations (or Windows Server)."

Directly from Microsoft (Compare Windows 10 business editions - Windows):

"SMB Direct (RDMA) is checked for both the Workstations and Enterprise editions."

This table (Windows 10 editions - Wikipedia) confirms that the Education version is essentially Enterprise.

And I can confirm that I have SMB Direct installed on my Windows 10 Education machine:

[Image: SMB-Direct]

I am not finding SMB Direct in either Features or Roles on Server 2019. Any pointers?
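
Update: from what I have read since, SMB Direct is built into Windows Server (2012 and later) with no separate role or feature to install, so I can just check it from PowerShell instead:

    # Server-side view: the RDMA Capable column should be True for the Mellanox NIC
    Get-SmbServerNetworkInterface

    # And confirm the adapter itself still reports RDMA enabled
    Get-NetAdapterRdma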
 

kiteboarder

Active Member
That's cool if that's the case with Education. I just remember that about a year ago I had to upgrade from regular Pro.

Wikipedia grabbed this documentation from somewhere:

Windows 10 editions - Wikipedia

Pro for Workstations
Windows 10 Pro for Workstations is designed for high-end hardware for intensive computing tasks and supports Intel Xeon, AMD Opteron and the latest AMD Epyc processors; up to four CPUs; up to 6 TB RAM; the ReFS file system; Non-Volatile Dual In-line Memory Module (NVDIMM); and remote direct memory access (RDMA).[4][5][6]

And MS website:
https://www.microsoft.com/en-us/mic...rosoft-announces-windows-10-pro-workstations/

Faster file sharing: Windows 10 Pro for Workstations includes a feature called SMB Direct, which supports the use of network adapters that have Remote Direct Memory Access (RDMA) capability. Network adapters that have RDMA can function at full speed with very low latency, while using very little CPU. For applications that access large datasets on remote SMB file shares, this feature enables:

I have no idea about Server 2019; I haven't used it yet.
 

nikey22

New Member
Use a RAM disk on each machine (server and PC), transfer a 50 GB file across the network, and see what happens; a quick way to time it is below.
Use JPerf/iperf, as @fossxplorer suggested, between the two machines and see what happens.
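
To put a number on the RAM-disk test, the copy can be timed from PowerShell (R:\big.bin and \\server\ramdisk are placeholder paths for wherever the RAM disks are mounted):

    # Time a 50 GB copy between the RAM disks over the network
    $t = Measure-Command { Copy-Item R:\big.bin \\server\ramdisk\ }
    $gbits = (50 * 8) / $t.TotalSeconds
    "{0:N1} Gbit/s average" -f $gbits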

What cable do you have connecting the two?