Breaking the 1GBps barrier...


Quaraxkad

New Member
Jun 10, 2017
I have a 4U server with 24 drives (36 actually, including a second chassis, but for testing and simplicity's sake I'm working with just the 24 in the main chassis for now). I've had a number of hardware configurations in this system: motherboards, CPUs, HBA controller cards, RAID controllers, SAS expanders, etc. The current configuration is:

Motherboard: Intel S2600CP2J
CPU: Two Intel Xeon E5-2670s
RAM: 64GB Samsung ECC PC3-8500R, 16x4GB
HBA: Three LSI SAS9201-16e
HDD: 24 mixed SATA drives, each ranging from 2TB to 6TB

Each HBA is installed in a full PCI-Express x8 slot (one of the motherboard slots operates at x4 under certain conditions; I'm not using that one).

In running various tests utilizing simultaneous disk reads, I am unable to break a total combined read speed of a little over 1GBps. This has been the case no matter what hardware configuration I've used; I simply cannot break through that 1GBps barrier. My primary benchmarking/testing software is a combination of IOMeter, HDTune, and a Java tool written by the developer of FlexRAID. The Java tool is the easiest to use in my opinion, and roughly mirrors what I have seen in my experiments to date with IOMeter, so I'm pretty sure it's a solid tool for this purpose. Individually, the slowest disk's read performance is 73MBps and the fastest is 201MBps (as tested repeatedly with the Java tool).

Realistically, I don't expect the simultaneous combined read speed to be 100% equal to the sum of the individual speeds (which works out to roughly 3GBps), but I also don't expect this level of drop-off. When reading a 5GB random-data test file (true random data, not an empty file) from each disk simultaneously, starting with 2 disks and adding one more disk with each subsequent test, up to the final test reading from all 24 at once: performance ramps up essentially at a 1:1 ratio up to ~12 simultaneous disk reads. At ~12 disks, the total combined read speed is 1.3GBps, which is more or less equivalent to the sum of those drives' individual read speeds. For every disk added beyond that, instead of improving the combined read speed, each disk's read is slowed down such that the final combined read speed never breaks 1.3GBps. Between each test, the Standby List cache is cleared using the tool 'EmptyStandbyList.exe' from the developer of Process Hacker. If this step is skipped, the test files are read near-instantaneously from cache instead of from the HDDs.
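
For reference, the test does roughly the following, sketched below in Python rather than the actual Java tool (the drive paths, file names, and function names are placeholders I made up): one pre-created 5GB test file per drive, all files read sequentially at the same time from separate threads, with combined throughput taken as total bytes read divided by wall-clock time.

Code:
# Minimal sketch of the simultaneous-read test (assumptions as noted above;
# the paths are placeholders for one test file per physical drive).
import threading
import time

TEST_FILES = [r"D:\test5GB.bin", r"E:\test5GB.bin", r"F:\test5GB.bin"]  # one per drive
BLOCK_SIZE = 1024 * 1024  # 1MiB sequential reads

def read_whole_file(path, totals, index):
    # Read the file front to back, unbuffered, and record how many bytes were read.
    bytes_read = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(BLOCK_SIZE)
            if not chunk:
                break
            bytes_read += len(chunk)
    totals[index] = bytes_read

def run_simultaneous_read(paths):
    totals = [0] * len(paths)
    threads = [threading.Thread(target=read_whole_file, args=(p, totals, i))
               for i, p in enumerate(paths)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    combined_kbps = sum(totals) / elapsed / 1024  # KB/s, same unit as the numbers below
    print(f"{len(paths)} drives: {combined_kbps:,.0f} KB/s combined")

if __name__ == "__main__":
    # Clear the standby list (e.g. with EmptyStandbyList.exe) before each run,
    # otherwise the files come from RAM cache instead of the disks.
    run_simultaneous_read(TEST_FILES)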

Here's some raw data of the results from my most recent test (combined read speed in KBps while reading n drives simultaneously):
2=297,296
3=425,171
4=534,397
5=668,829
6=785,337
7=894,563
8=1,027,959
9=1,144,467
10=1,259,989
11=1,340,224
12=1,414,289
13=1,440,753
14=1,433,548
15=1,450,578
16=1,424,043
17=1,412,460
18=1,384,576
19=1,389,126
20=1,367,885
21=1,290,037
22=1,331,943
23=1,378,285
24=1,378,545
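
Re-expressing those same numbers as per-drive averages and marginal gains (a quick sketch, no new measurements; it just shows the marginal gain collapsing past ~12 drives):

Code:
# Per-drive average and marginal gain computed from the combined KB/s figures above.
results = {
    2: 297_296, 3: 425_171, 4: 534_397, 5: 668_829, 6: 785_337, 7: 894_563,
    8: 1_027_959, 9: 1_144_467, 10: 1_259_989, 11: 1_340_224, 12: 1_414_289,
    13: 1_440_753, 14: 1_433_548, 15: 1_450_578, 16: 1_424_043, 17: 1_412_460,
    18: 1_384_576, 19: 1_389_126, 20: 1_367_885, 21: 1_290_037, 22: 1_331_943,
    23: 1_378_285, 24: 1_378_545,
}

previous_total = 0.0
for drives, kbps in results.items():
    total_mbps = kbps / 1024            # combined MB/s
    per_drive = total_mbps / drives     # average MB/s per drive
    gain = total_mbps - previous_total  # MB/s gained over the previous row
    print(f"{drives:2d} drives: {total_mbps:6.0f} MB/s total, "
          f"{per_drive:5.1f} MB/s per drive, {gain:+6.1f} MB/s vs previous")
    previous_total = total_mbps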

During these tests, not even one CPU core/thread is maxed out, and the total usage across all cores doesn't exceed roughly 35%. I'm hoping that somebody out there can help me either find my bottleneck, or assist in optimizing my system to improve performance. This is not purely for academic purposes; my end goal is to improve the performance of operations in SnapRAID, which access all drives simultaneously. If there's any more information I can provide, or specific tests I should try, please let me know!
 

i386

Well-Known Member
Mar 18, 2016
Germany
1.3 GByte/s -> ~10 Gbit/s

Maybe it's a really stupid question, but do you run the test on the server or over the network on a client?
 

pyro_

Active Member
Oct 4, 2013
Have you tried running something like iperf to verify the network speed when you take the disks out of the equation? That will at least tell you for sure whether it's the disk subsystem that's the issue and not something else.
 

frogtech

Well-Known Member
Jan 4, 2016
i386 said:
1.3 GByte/s -> ~10 Gbit/s

Maybe it's a really stupid question, but do you run the test on the server or over the network on a client?
All of his metrics are notated in bytes; I think he knows 1.3 GB is 10 Gb.

The problem he is illustrating is that, past a certain number of disks, he hits diminishing returns relative to the performance that's technically expected.
 

frogtech

Well-Known Member
Jan 4, 2016
pyro_ said:
Have you tried running something like iperf to verify the network speed when you take the disks out of the equation? That will at least tell you for sure whether it's the disk subsystem that's the issue and not something else.
I'm a little confused by his post as he says he has all 24 disks in his "main chassis" but the HBAs he listed are 9201-16e cards. I'm guessing this was a typo. If the disks are direct-attached, whether via some disk shelf or internally (think Supermicro 846), then he doesn't need to run iperf.
 

Quaraxkad

New Member
Jun 10, 2017
I ran all tests on the server itself, of course. Not over the network. I would feel pretty stupid if that's what I had done... On my 1Gbps network...

I did experiment with IOMeter again last night and got much better results... so I may have spoken too soon regarding the reliability of that Java test tool on such a large-scale test. In previous comparisons on my old configurations they would both drop off in performance in roughly the same place. It may have been a coincidence that the limit of my previous hardware was also the limit of the Java tool.

But now, utilizing all 35 drives in IOMeter, with 5MB sequential reads on 5GB test files, I ended up getting a combined total of 3,057MBps. That's an average of 87MBps per drive. The average drive speed tested individually is 92MBps. So I guess everything is as it should be! I really should have re-tried IOMeter before posting...
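
For the quick arithmetic behind that conclusion (using only the figures above):

Code:
# Combined throughput per drive vs. the average individual drive speed.
combined_mbps = 3057       # total from the 35-drive IOMeter run
drives = 35
avg_individual_mbps = 92   # average speed of the drives tested one at a time

per_drive = combined_mbps / drives
print(f"{per_drive:.1f} MBps per drive under simultaneous load")            # ~87.3
print(f"{per_drive / avg_individual_mbps:.0%} of the individual average")   # ~95%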
 

Quaraxkad

New Member
Jun 10, 2017
frogtech said:
I'm a little confused by his post as he says he has all 24 disks in his "main chassis" but the HBAs he listed are 9201-16e cards. I'm guessing this was a typo.
Nope, not a typo, I just left out one detail. I am using three 16e cards, and each 24-bay chassis has a pair of 3x SFF-8088 to SFF-8087 adapters on the back panel. So there are six SFF-8087 ports inside each case.