10GB CPU bottleneck help


Airz

New Member
Jan 22, 2014
23
2
3
So I've recently installed 10Gb Ethernet in the homelab and I'm having some issues with a receive-side bottleneck which looks to be caused by the CPU.
The card in the PC is the Supermicro AOC-STG-i2T (Intel X540-based) and I'm using the latest drivers from Intel's website. The PC is an X99 ASUS Rampage V Extreme with a 5960X clocked at 4.4GHz, so not very new but still a decently powerful CPU.
When copying a large file to the server I get the full 1.12GB/s transfer speed; however, when copying down to the local computer I get about 700MB/s and a single core is pegged at 100% load. I've tried applications like robocopy and simulating traffic with NTttcp, but I see exactly the same behaviour.

So onto my question: is this normal and the CPU just isn't up to the task, or is there something I should be looking at to optimise the configuration? I've tried playing with receive side scaling (RSS), but that's had no effect. I've also got jumbo frames set at 4000 at the moment, which got me to my current speeds; it was horrendous at 1500.
Lastly, if it is just CPU-bound, do the newer 10Gb cards make things any better on that front? E.g. would swapping it out for an X550 make any difference? (I don't know if they provide any better offloading.)
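For reference, this is the sort of disk-free test I mean with the NTttcp runs; a rough Python equivalent for anyone without the tool (loopback by default — swap the addresses for the server's IP to test the real link, and treat the 1 MB chunk size as a guess):

```python
import socket
import threading
import time

def run_throughput_test(total_bytes=64 * 1024 * 1024, chunk=1024 * 1024):
    """Push total_bytes over a TCP socket (loopback here) and report MB/s.

    No disk involved, so the result reflects only the network stack and CPU.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # any free port on loopback
    server.listen(1)
    port = server.getsockname()[1]

    received = [0]

    def receiver():
        # Drain the connection until the sender closes it.
        conn, _ = server.accept()
        while True:
            data = conn.recv(chunk)
            if not data:
                break
            received[0] += len(data)
        conn.close()

    t = threading.Thread(target=receiver)
    t.start()

    sender = socket.create_connection(("127.0.0.1", port))
    payload = b"\0" * chunk
    start = time.monotonic()
    sent = 0
    while sent < total_bytes:
        sender.sendall(payload)
        sent += chunk
    sender.close()
    t.join()
    server.close()

    elapsed = time.monotonic() - start
    return received[0], received[0] / elapsed / 1e6  # (bytes received, MB/s)

if __name__ == "__main__":
    nbytes, rate = run_throughput_test()
    print(f"received {nbytes} bytes at {rate:.0f} MB/s")
```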
 

Deslok

Well-Known Member
Jul 15, 2015
1,122
125
63
34
deslok.dyndns.org
Some of the Mellanox cards are less hard on the CPU than the Intel ones for 10GbE (there are a few threads about that somewhere here). You're also seeing an increase in CPU usage from the file copy itself, especially if you're writing to something like a software RAID 5; even a simple RAID 1 SSD setup will show more load since the CPU is processing more data moving through it, though I'm having a hard time finding exactly how much CPU gets eaten up by IO on a SATA-based drive.
 

RageBone

Active Member
Jul 11, 2017
617
159
43
The protocol used matters a lot.
With SMB, make sure you are actually on 3.1.1; that could maybe help.
I don't think your NIC can do RDMA, and I have no clue about its offloading features.
On the Mellanox side, I think the card does most of the IP stack work, or some RDMA magic.
RDMA in general reduces CPU load and the number of times data gets copied around.

SMB can do RDMA and other magic, bundled under the SMB Multichannel and SMB Direct names.
That would be worth a try in my opinion.
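All of this is checkable from PowerShell on the Windows side; something like this (these are the stock cmdlets on Win10/Server 2016, output columns vary by build — run a copy first so there's an active connection to inspect):

```
# What dialect did the connection actually negotiate? (look for 3.1.1)
Get-SmbConnection | Select-Object ServerName, ShareName, Dialect

# Is Multichannel actually in use, and over which interfaces?
Get-SmbMultichannelConnection

# Does Windows consider the NIC RSS/RDMA capable?
Get-SmbClientNetworkInterface
Get-NetAdapterRdma
```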
 

Airz

New Member
Jan 22, 2014
23
2
3
Thanks for the quick replies guys :) Regarding the potential software RAID: I'm copying the large file to a single Samsung 970 Evo Plus, which I've tested at 3.2GB/s, so it's not a RAID or throughput bottleneck at the storage layer.
From a protocol perspective I'm running a fully patched Win10 on the desktop and a fully patched Server 2016, and I've checked that both have SMB Multichannel and SMBv3.
I've tried to look around for X540-T2 firmware in case the card is running an old version, but all I can find is the NVM update software for the 700 and 550 series, which is frustrating.
Weirdly, I'm running an X540-T2 in the server (ESXi 6.7U2) as well and it seems to load the CPU equally when doing large copies, so I wonder whether a) the other card I have needs an update, or b) the Intel drivers for Win10 aren't as well optimised.
 
  • Like
Reactions: nikalai

i386

Well-Known Member
Mar 18, 2016
4,250
1,548
113
34
Germany
Samsung 970 Evo plus which I've tested at 3.2Gb per sec so it's not a raid or throughput bottleneck at the storage layer.
Can you run this benchmark for me and post the results here or PM them to me?
It will test 4K performance for 2 minutes on a 20GB file with 4 threads and a queue depth of 8 IO requests/thread, at 80% reads & 20% writes, with hardware and software cache disabled.
Code:
diskspd -b4K -c20G -d120 -L -o8 -r -Sh -t4 -w20 testfile.dat
 

Airz

New Member
Jan 22, 2014
23
2
3
As a small update: I disabled the Spectre and Meltdown patches in Windows 10 (for testing only) and my copy test has jumped from 600MB/s to 850-900MB/s???
Can't believe those patches have such a huge impact on network throughput.
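For anyone wanting to reproduce the test: the toggle Microsoft documents for disabling the mitigations is a pair of registry values (requires a reboot; obviously delete the values again to re-enable protection after testing):

```
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
```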