Playing with my SAS expander


TuxDude

Well-Known Member
Sep 17, 2011
After some recent discussions around here about link aggregation, I decided to do a bit of playing around and share the results. The hardware for this experiment consists of an IBM M1015 HBA, a SuperMicro SC846 chassis with the BPN-SAS2-846EL1 backplane, and lots of spinning disks.

Note: references to connector numbers below are according to the manual for the backplane, available at: http://www.supermicro.com/manuals/other/bpn-sas2-846el.pdf

The first thing I wanted to know was whether the 8-port-wide link aggregation back to the HBA would work - using two SFF-8087 cables from the HBA, connected to ports 7 and 8 on the backplane, did result in an 8-wide link.

Code:
# ls /sys/class/sas_host/host0/device/
bsg  phy-0:0  phy-0:1  phy-0:2  phy-0:3  phy-0:4  phy-0:5  phy-0:6  phy-0:7  port-0:0  sas_host  scsi_host  subsystem  uevent
# ls /sys/class/sas_host/host0/device/port-0\:0/
expander-0:0  phy-0:0  phy-0:1  phy-0:2  phy-0:3  phy-0:4  phy-0:5  phy-0:6  phy-0:7  sas_port  uevent
# cat /sys/class/sas_host/host0/device/port-0\:0/sas_port/port-0\:0/num_phys
8
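For anyone wanting to double-check that each lane actually trained at full speed, the Linux SAS transport class also exposes per-phy link rates in sysfs. Something like this quick loop (a rough sketch - the attribute name is from memory and may vary slightly by kernel version) should print the negotiated rate for each of the 8 phys:

Code:
for phy in /sys/class/sas_phy/phy-0:*; do
  echo "$(basename $phy): $(cat $phy/negotiated_linkrate)"
done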
The next thing I wanted to know was whether I could also use the extra SFF-8087 connectors on the backplane to connect additional drives. So I connected an SFF-8087 breakout cable to port 9 on the backplane (ports 7 and 8 both still connected to the HBA) and plugged 4 more HDDs into the other end. Happy to report that all 4 drives came up without issue - with a few of the hotswap bays still empty, I've now got 25 drives connected through that backplane, with a 48Gbps link from the backplane back to my HBA.
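If you want to verify the same thing on your own box, every drive hanging off the expander is registered as a SAS end device, so a quick count (rough sketch - assumes nothing else SAS-attached is on the host, and note the onboard SATA drives won't show up here) should match the number of populated bays plus the breakout drives:

Code:
# every disk behind the expander shows up as a SAS end device
ls /sys/class/sas_end_device/ | wc -l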

The final bit was to put a load on it and verify the performance. I quickly tossed together this tiny script, which, combined with the existing monitoring I have on the server, put a few nice big spikes into my throughput graphs.

Code:
# cat perftest.sh
# Fire off a ~10GB sequential read (bypassing the page cache) on each of the
# 25 expander-attached disks in parallel, then wait for them all to finish.
for disk in a b c d e f g h i j k l m n o p q r s t u v w x y
do
  dd if=/dev/sd$disk of=/dev/null bs=1M count=10000 iflag=nocache &
done
wait
It's not pretty, but it should be very easy for anyone to modify to run against any subset of drives in a Linux box - in my case I've left out the drives connected to the onboard SATA ports of the motherboard (yes, they do extend into sdaa, sdab, etc. on my system now). On the other hand, the graphs generated by Grafana are pretty - damn you @vanfawx for getting me addicted to this software. raintank
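If you'd rather not hard-code the drive letters, here's an untested variant that picks out just the expander-attached disks by checking whether each block device's sysfs path runs through an expander - on the assumption that the path contains "expander-", which automatically skips anything on the onboard SATA ports:

Code:
#!/bin/sh
# run the parallel read test against only the disks that sit behind a SAS expander
for dev in /sys/block/sd*; do
  if readlink -f "$dev" | grep -q 'expander-'; then
    dd if=/dev/$(basename "$dev") of=/dev/null bs=1M count=10000 iflag=nocache &
  fi
done
wait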


Edit: As noted below, I had an error in the Grafana graph definition which caused the sum of bandwidth to be significantly off. I've left the post above with the bad graph alone so that the replies below still make sense - the correct graph is here: raintank
 

nephri

Active Member
Sep 23, 2015
Paris, France
If I'm not wrong, the M1015 is a PCIe 2.0 x8 card, which means the maximum throughput of 8 PCIe 2.0 lanes is around 32 Gbit/s?
That means you couldn't get more than 4 GB/s of throughput - am I wrong? How can you reach 5 GB/s from your link raintank?
 

TuxDude

Well-Known Member
Sep 17, 2011
The numbers I'm getting from Grafana here do seem to be a little higher than is theoretically possible, and I haven't been able to figure out exactly why just yet. That graph does include an additional 4 drives that are connected through the onboard SATA ports, but there is little to no activity on them (e.g. zoom in on the area between the two large spikes) - those drives would account for a maximum of 30 to 40 MB/s, which is nothing when we're looking at multiple GB/s of bandwidth coming off the HBA.

The PCIe 2.0 bus has a maximum bandwidth of 5Gbps per lane (500 MB/s usable after 8b/10b encoding), so 40Gbps (4 GB/s) for the 8-wide link to the HBA. Each SAS2 lane has a maximum bandwidth of 6Gbps (600 MB/s usable), so the 8-wide link from the HBA to the expander should have 48Gbps (4.8 GB/s) of bandwidth. For reference, I've taken all of these bandwidth numbers from here: List of device bit rates - Wikipedia, the free encyclopedia, which according to the conventions section at the top uses the proper definition of 1000 bytes per KB, 1000 KB per MB, etc.

The graph output from Grafana (and the export at raintank above) has its unit listed as GiB/s, so we need to do some conversion to get our units in order - 1 GiB is NOT 1 GB, unless you're Microsoft and decide to follow your own standard instead of the IEC's. Doing a bit of quick math, 5.1 GiB/s (around the max seen on the graph in the first post) is almost 5.5 GB/s. Hmm... we're getting even further above the theoretical maximums - this is going to take a bit more research.
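For anyone who wants to check the conversion themselves, it's just a factor of 1024^3 / 1000^3 (roughly 1.074) - using bc here, but any calculator works:

Code:
# GiB/s -> GB/s: multiply by 1024^3, divide by 1000^3
echo "5.1 * 1024^3 / 1000^3" | bc -l     # ~5.48 GB/s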

----------

OK - I went back over the graph definition and found the problem. I was accidentally summing together statistics for every disk plus every partition on each disk, so the results were about double what they should be. The fixed graph maxes out at around 2.5 GiB/s, or about 2.7 GB/s. That seems more reasonable, working out to about 108 MB/s per disk - slightly lower than I would expect for streaming sequential reads, but there are a few rather old disks in there. I'll need to get some SSDs to reach the PCIe / SAS maximum bandwidths. It is still more than the maximum bandwidth of a 4-wide SAS link (which would be only 2.4 GB/s), so it does confirm that the SAS link aggregation is working as intended.
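If anyone wants to sanity-check their own numbers outside of Grafana, something along these lines should work (rough sketch - assumes the stats ultimately come from /proc/diskstats, 512-byte sectors, and that whole disks are named sdX/sdXX with no trailing digit). It sums the read throughput over a one-second window across whole disks only, so partitions don't get counted twice:

Code:
# sectors-read delta over 1 second for whole sd* disks only (field 6 of /proc/diskstats)
before=$(awk '$3 ~ /^sd[a-z]+$/ { s += $6 } END { print s }' /proc/diskstats)
sleep 1
after=$(awk '$3 ~ /^sd[a-z]+$/ { s += $6 } END { print s }' /proc/diskstats)
echo "$(( (after - before) * 512 / 1000000 )) MB/s"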

New fixed graph: raintank
 

PnoT

Active Member
Mar 1, 2015
Texas
Thank you so much for coming through on this from the previous thread and doing the work to verify whether this actually works. I know SuperMicro has been all over the place with their responses on this subject. I look forward to more of your testing if and when you can get some SSDs in the mix. Just for the hell of it, can you run the same exact test with 1 cable connected? I know you've proven you're above the theoretical max, but meh, couldn't hurt, right?

A complete guide on setting up your monitoring and Grafana would be wicked ;)
 