What can I do to fine-tune my 10Gbps network?


Road Hazard

Member
Feb 5, 2018
I have two boxes (running Mint 19.2 on both) using 10Gbps Mellanox ConnectX-3 cards connected via a DAC cable, and iPerf reports 9.8Gbps both ways. No network switch between these boxes, just the DAC cable.

Server A is a 24 bay SuperMicro box. I think I have twelve 4TB drives in RAID 6 via MDADM. The backplane in the SuperMicro is SAS2 and the SFF cable goes into a 9207-8i HBA. When I copy a 10 gig file from the SSD to the RAID 6 array, or from the RAID 6 array back to the SSD, I'm getting a hair over 500MB/s.

Server B is a 12 bay Rosewill box. It has a bunch of drives in RAID 5 (MDADM, 9207-8i HBA) and an SSD for booting. Same deal: when I copy between the RAID 5 array and the SSD in this box, I'm seeing 500+MB/s.

At this point, the storage subsystem in both servers is capable of maxing out the read/write performance of my SSDs. But when copying over the 10Gbps link, I'm only seeing about 300+MB/s. I did some digging around and found references to adjusting the MTU. I raised it from 1500 to 9000 on the 10Gb NICs in both systems; the first file copy test got me speeds around 350MB/s, but every test copy after that was roughly 70MB/s! I restarted both boxes, the MTU is back to 1500, and speeds are consistent at 300+MB/s.
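For reference, if I revisit jumbo frames, the change itself can be made non-persistently with something like this (the interface name is just an example, mine differs; it reverts on reboot):

# raise the MTU on the 10Gb NIC only (interface name is an example)
sudo ip link set dev enp3s0f0 mtu 9000
# verify the new setting
ip link show dev enp3s0f0 | grep mtu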

Both cards have the latest firmware and are using v4.00 of the driver (Mint default). I can't apply the absolute latest driver because Mellanox's installer is hard-coded to look for only "officially" supported operating systems and Mint isn't supported.

In case it matters, both systems use their 1Gbps NICs to route out through the switch for internet and LAN access, so if I need to make some universal kernel/boot change, I need to make sure it doesn't negatively impact the 1Gbps NICs.
 

Road Hazard

Member
Feb 5, 2018
Take a look at my post here and the whole thread.
Thanks for the link. Interesting read but unfortunately, I'm no better off. :(

iPerf shows nearly 10Gbps both ways, so I guess the network stack is optimal? (I've tried all sorts of settings this afternoon):

sudo sysctl -w net.core.rmem_max=134217728
sudo sysctl -w net.core.wmem_max=134217728
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
sudo sysctl -w net.ipv4.tcp_wmem="4096 87380 134217728"
sudo sysctl -w net.core.netdev_max_backlog=300000
sudo sysctl -w net.ipv4.tcp_moderate_rcvbuf=1
sudo sysctl -w net.ipv4.tcp_no_metrics_save=1
sudo sysctl -w net.ipv4.tcp_congestion_control=htcp
sudo sysctl -w net.ipv4.tcp_mtu_probing=1

No help with any of that. (Made things worse.)

I used bonnie++ to test the speed of my array and sequential writes are a bit over 500MB/s. When a copy is happening over the 10 gig link to my array, iostat shows disk utilization across all 12 drives in the array is about 50-60%, so the incoming traffic is not coming close to maxing out my write performance. mpstat shows CPU usage is in the single digit range for all cores during the file transfer. Heck, I even copied a 15 gig file back and forth between RAM drives on both systems over the 10 gig link and speeds were in the 250-300MB/s range.
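For anyone who wants to reproduce the RAM drive test, something like this should do it (the mount point and sizes are just examples):

# create a RAM-backed scratch area on both boxes (example size and path)
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=20G tmpfs /mnt/ramdisk
# build a 15GB test file on the sending side
dd if=/dev/zero of=/mnt/ramdisk/test.bin bs=1M count=15360
# then copy it across the 10Gb link (Samba/NFS mount, scp, rsync, etc.)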

I'm using the latest firmware and drivers for the cards. Samba shares, NFS... no difference. Not sure what else to try except maybe buying some Intel X520 NICs, Chelsio cards or some super cheap HP NC523SFP cards.
 

azev

Well-Known Member
Jan 18, 2013
Have you tried multiple copy jobs from client to server? I mean, start a copy of the 15GB file and then run another separate copy and see if you can load up your server to its max write speed.
 

Road Hazard

Member
Feb 5, 2018
Have you tried multiple copy jobs from client to server? I mean, start a copy of the 15GB file and then run another separate copy and see if you can load up your server to its max write speed.
I carried out your test scenario, copying the same file from 2 SSDs in my server to the RAID 6 array at the same time, and was seeing about 350MB/s on each copy.
 

azev

Well-Known Member
Jan 18, 2013
Well there you go; if you can't get the result you want with a single-threaded copy, then sometimes all you have to do is run multiple threads.
Using tools that allow parallel file transfers could help your case as well.
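A minimal sketch of what a parallel transfer could look like, assuming rsync over SSH with example paths, host, and job count:

# hypothetical example: push several large files in parallel with rsync over SSH
ls /data/bigfiles/*.bin | xargs -P 4 -I{} rsync -a {} user@10.0.0.2:/backup/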
 

Road Hazard

Member
Feb 5, 2018
Well there you go; if you can't get the result you want with a single-threaded copy, then sometimes all you have to do is run multiple threads.
Using tools that allow parallel file transfers could help your case as well.
Maybe I didn't explain that well. When I copy locally (from SSD to array), I get over 500MB/s. When I had two copies going at the same time, the combined throughput was around 650MB/s. But let's go back to the single copy and 500MB/s. Why can't I get that speed when copying over the 10Gb network to the same array? When I copy from a RAM drive on the client PC to the server array, I'm not getting anywhere near 500MB/s. More like 2Gbps (roughly 250MB/s).
 

oddball

Active Member
May 18, 2018
So from RAM to RAM you're getting 2Gbps?

We run a mixed 10GbE and 40GbE network. Out of the box with default Linux/Windows 2016 we can max a 10GbE pipe when copying to SSD or NVMe. No tuning required, default OS installs.

To hit 40GbE you need a lot more tuning and work. But hitting 15-20GbE is fairly easy as well.

90% of the time the storage subsystem is the limiter. Sometimes a slow CPU will do it too. But with a reasonable CPU (E5-2680v2 and up) you can saturate a 10GbE pipe with an array of SSDs to write to. Storage is almost always the limiter.

Do you have a RAID controller that's bottlenecking? What if you write to a RAID 10 or RAID 0?
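If the question is whether the array itself can keep up, a quick synthetic write test is one way to check; a rough sketch with fio (the target directory and size are made-up examples):

# sequential 1MB writes, bypassing the page cache, against the array mount point
fio --name=seqwrite --directory=/mnt/array --rw=write --bs=1M --size=10G --direct=1 --group_reporting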
 

Road Hazard

Member
Feb 5, 2018
So from RAM to RAM you're getting 2Gbps?

We run a mixed 10GbE and 40GbE network. Out of the box with default Linux/Windows 2016 we can max a 10GbE pipe when copying to SSD or NVMe. No tuning required, default OS installs.

To hit 40GbE you need a lot more tuning and work. But hitting 15-20GbE is fairly easy as well.

90% of the time the storage subsystem is the limiter. Sometimes a slow CPU will do it too. But with a reasonable CPU (E5-2680v2 and up) you can saturate a 10GbE pipe with an array of SSDs to write to. Storage is almost always the limiter.

Do you have a RAID controller that's bottlenecking? What if you write to a RAID 10 or RAID 0?
I just got done conducting another test. I pulled the Linux boot SSD and replaced it with another SSD (rated for over 500MB/s read/write) and installed Windows 10 on there, leaving the 10Gbps NIC in. I pulled the 2nd 10Gbps NIC out of the other Linux Mint box and placed it into my Windows 10 PC (which also has an SSD capable of 500+MB/s read/write). I copied about 50 gigs worth of data over the 10Gbps connection (Windows 10 on both PCs) and... I was maxing out my SSD and getting about 480-500MB/s! This was with default Windows 10 settings (1500 MTU, etc., no tweaking). So it looks like my 10Gbps cards and DAC cable are 100% fine; it's a software tuning problem with Linux that needs figuring out.
 

acquacow

Well-Known Member
Feb 15, 2017
Since you mentioned Samba, have you enabled SMB multichannel in your config?

That will get you a big boost.
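For reference, a minimal sketch of what that looks like on the server side in smb.conf (the option name is from Samba 4.x; on older releases multichannel was marked experimental):

[global]
    # enable SMB3 multichannel
    server multi channel support = yes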
 

acquacow

Well-Known Member
Feb 15, 2017
Thought about it, but after reading that it could POSSIBLY lead to data corruption, I gave up on pursuing that option.
I don't think that's been an issue for a while.

It's all baked-in and supported in the latest versions last I checked.
 

Road Hazard

Member
Feb 5, 2018
I don't think that's been an issue for a while.

It's all baked-in and supported in the latest versions last I checked.
Ok, I might give that a shot but here's another piece of the puzzle.

So when I did my experiment of moving one of the 10Gbps NICs to another PC and using Windows 10 on both boxes, I was maxing out my SSDs and thinking the problem was totally Linux related (which it MIGHT still be?). Then I did another test: I booted the main server back into Linux Mint, had it talk to my Windows 10 box over 10Gbps, and going from RAM drive to RAM drive I was maxing out the 10Gbps link at 1.02GB/s. So Linux on my main server is in the clear, as is the hardware.

It looks like my backup server (running Mint 19.2) was the culprit. So I trashed the install and put a fresh copy of Mint 19.2 back on it and I solved half my problem.

When I copy files FROM the main server TO the freshly installed copy of Mint 19.2 on the backup server, I get about 500MB/s. I'm 100% fine with that speed!

Unfortunately, when I copy files FROM the newly installed copy of Mint TO the main server, I get about 250MB/s to the array, and about the same speed when copying to a RAM drive on the main server. So, the big question: why is the 10Gbps link only fast in one direction?
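One way to separate the network direction from the filesystems is to re-check raw TCP in both directions after the reinstall; a rough sketch with iperf3 (the 10.0.0.x addresses are just examples):

# on the main server (example address 10.0.0.1)
iperf3 -s
# on the backup server: backup -> main direction
iperf3 -c 10.0.0.1
# reverse direction (main -> backup) without swapping server and client
iperf3 -c 10.0.0.1 -R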

So close, yet so far away. :)
 

itronin

Well-Known Member
Nov 24, 2018
I did another test. I booted the main server back into Linux Mint and had it talking to my Windows 10 box over 10Gbps and going from ram drive to ram drive, I was maxing out the 10Gbps link at 1.02Gb/s. So Linux on my main server is in the clear as is the hardware.
...
So close, yet so far away. :)
For maxing out your 10Gb link, did you mean 1.02Gb/s or 1.02GB/s? 1.02GB is about 8.25Gb, though still an improvement. Without tweaking on Linux or ESXi, using 1500 MTU, I only see about 1.02-1.03GB/s for storage performance. But yeah, that's usually "good enough" in most instances on a 10Gb link.
 

Road Hazard

Member
Feb 5, 2018
For maxing out your 10Gb link, did you mean 1.02Gb/s or 1.02GB/s? 1.02GB is about 8.25Gb, though still an improvement. Without tweaking on Linux or ESXi, using 1500 MTU, I only see about 1.02-1.03GB/s for storage performance. But yeah, that's usually "good enough" in most instances on a 10Gb link.
Whoops, corrected. Yes, I meant a BIG 'B'. Still Googling around to try and see if I can crack this last part of the puzzle (asymmetrical speed).

Since a fresh install of Mint fixed half my problem, when Mint 19.3 drops (in a few weeks) I'm going to put a fresh copy on both servers and hopefully that will magically fix things. :)
 

Road Hazard

Member
Feb 5, 2018
For maxing out your 10Gb link, did you mean 1.02Gb/s or 1.02GB/s? 1.02GB is about 8.25Gb, though still an improvement. Without tweaking on Linux or ESXi, using 1500 MTU, I only see about 1.02-1.03GB/s for storage performance. But yeah, that's usually "good enough" in most instances on a 10Gb link.
Saw lots and lots of these errors on the sending and receiving computers with Wireshark. Could all the retransmissions be the cause?

Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)
Severity level: Error
Group: Malformed

(screenshot: wireshark error.png)
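For what it's worth, the kernel's own counters should show whether retransmits are really piling up outside of Wireshark; something like this (a rough sketch) on both ends, before and after a copy:

# cumulative TCP retransmission counters
netstat -s | grep -i retrans
# per-connection details (rtt, cwnd, retransmits) while a copy is running
ss -ti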
 

itronin

Well-Known Member
Nov 24, 2018
If you are retransmitting "lots and lots", things will slow down. Have you undone the tuning tweaks to see what your performance is? Could you have accidentally tweaked something that is now part of the problem? You ought to be able to get 1.0GB/s or so without any tuning, using 1500 MTU. Check the switch port as well: MTU settings, see if anything looks amiss, and check card settings on both systems. Try changing out the DAC; a flaky DAC might lead to retransmits and/or a problem in one direction.
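On the card side, the NIC error and drop counters are worth a look on both systems; a rough sketch (the interface name is an example):

# link state, speed and negotiated settings
ethtool enp3s0f0
# per-NIC hardware counters: look for rx/tx errors, drops and CRC errors
ethtool -S enp3s0f0 | grep -Ei 'err|drop|crc'
ip -s link show enp3s0f0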
 

Road Hazard

Member
Feb 5, 2018
If you are retransmitting "lots and lots", things will slow down. Have you undone the tuning tweaks to see what your performance is? Could you have accidentally tweaked something that is now part of the problem? You ought to be able to get 1.0GB/s or so without any tuning, using 1500 MTU. Check the switch port as well: MTU settings, see if anything looks amiss, and check card settings on both systems. Try changing out the DAC; a flaky DAC might lead to retransmits and/or a problem in one direction.
All the tuning I did was non-persistent; a reboot and I'm back to stock settings. I'm not using a network switch, just a Mellanox DAC cable directly connecting the two systems. I ordered some Intel X520 cards as I'm thinking it's something with the Mellanox cards and Linux Mint? (Grasping at straws.)

Not sure I need a new DAC cable since I can put the cards in Windows machines and get 1.08GB/s both ways when using RAM drives.
 

Road Hazard

Member
Feb 5, 2018
Wanted to revive this thread with an update.

I installed some Intel X520 cards and used an Intel-branded DAC to hook my servers together and... no change. I can send files one way at 500MB/s, but sending the same file back is around 250MB/s, sometimes even slower! I really think that if I had a better understanding of how to set up routes in Linux, I might be able to resolve this. I'm wondering if those Wireshark retransmissions are a result of ACKs or something going out over the 1Gbps link and then back to the 10Gbps card.
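A quick way to check which interface the kernel actually uses for the other server's traffic (the addresses are examples for the two 10Gb endpoints):

# which route/interface is chosen for the other server's 10Gb address
ip route get 10.0.0.2
# and the full routing table, to spot anything that might go via the 1Gb NIC
ip route show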

My DAC cable is 5m long; would a shorter cable help?!

Another strange twist. I didn't keep detailed notes on this, but I could have sworn I flipped the read/write speeds one time depending on which server booted up last. (Maybe I'm hallucinating, because when I observed this I had been working on the problem for 12 hours straight.)

Something else I might try: stop hooking the PCs directly together via a DAC and use a switch instead, something like the Dell 5524, plugging all my 1Gbps stuff into that and the two servers into the SFP+ ports. Also, on the servers, remove the 1Gbps CAT5 cables and just let them talk via their 10Gbps NICs.

So I guess I'll give up on troubleshooting these problems for now and bounce a few questions off everyone. I'm looking at getting a used Dell 5524 switch (don't need PoE). Should I consider something better? I need a switch with the following:

1) Something simple (prefer a GUI) to set up. I'm not a network engineer but I know how to use Google. I don't need any fancy routing or VLANs or anything of the sort. Just want a simple switch to plug things into and go.

2) Since I already have some DAC cables, I'd prefer to keep using them instead of buying transceivers and fiber. (Both are sorta cheap, so it isn't the end of the world if I need to buy new cabling, but I'd really like to use my existing DAC cabling. I see lots of people liking the Aruba gear but have read that compatibility with DACs is hit or miss, so maybe scratch Aruba recommendations?)

3) Cheap. Would like to keep the switch to under $200 and closer to $100.

4) Must have easy/free access to any firmware updates. I know that some enterprise gear needs keys/dongles/paid accounts to access firmware.

5) Is the Mikrotik CSS326-24G-2S+RM the best bet? (Old Dell 5524 or a new Mikrotik?)

6) Juniper gear any good?

Thanks for reading. :)
 

mutluit

New Member
Apr 2, 2020
Hi, it could be a routing issue where the packets go over the wrong interface... :) I once had such a headache, but was able to solve it.
To test this, just pull the cable(s) of the other interfaces, i.e. have only the 10G interface(s) active...
If that is indeed the reason for the problem, then I may be able to help you further: you need to define so-called "user defined routing tables" (policy routing) in Linux for the affected interface(s)... Just let us know.

Take a look at these:
4.8. Routing Tables
Two Default Gateways on One System - Thomas-Krenn-Wiki
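A minimal sketch of such a user-defined routing table, assuming the 10G interface is enp3s0f0 with address 10.0.0.1/24 and the peer on the same subnet (all names, table IDs and addresses are examples):

# add a dedicated routing table for the 10G interface
echo '100 tengig' | sudo tee -a /etc/iproute2/rt_tables
# send the 10G subnet through that table and select it by source/destination
sudo ip route add 10.0.0.0/24 dev enp3s0f0 src 10.0.0.1 table tengig
sudo ip rule add from 10.0.0.1/32 table tengig
sudo ip rule add to 10.0.0.0/24 table tengig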
 