9400-16e bottlenecking at 4GB/s?

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
I'm currently running badblocks on 24 x 10TB SAS3 HDDs and seem to be hitting a bottleneck that limits me to 4GB/s of aggregate I/O (see screenshot below). When I run badblocks on just a few disks at a time, I get upwards of 250 MB/s per drive. The HBA is connected to a PCIe 3.0 x8 slot on a Supermicro X11SCH motherboard.

Any ideas why I'm only hitting about half the theoretical bandwidth of the PCIe slot? Is the chipset limiting me?
 

Attachments

UhClem

just another Bozo on the bus
Jun 26, 2012
435
249
43
NH, USA
Insufficient information.
What SAS expander? (full model#) Dual-linked?
Full topology of drive connections? [expander vs HBA-direct]
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
What SAS expander? (full model#) Dual-linked?

QCT JB4242 Dual Linked

Full topology of drive connections? [expander vs HBA-direct]

Not sure what you're asking for here.
 

Sogndal94

Senior IT Operations Engineer
Nov 7, 2016
114
72
28
Norway
He/she/it/A-10 wants to know how your HDDs are connected to your card. Is each HDD connected directly to your card with its own cable, or are they connected to a SAS expander (usually a PCB or card that the HDDs plug into, with one or two cables running from it to the SAS card itself)? :)
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
I figured that was covered in my first answer being that it's an enterprise SAS3 disk shelf. See the link below.

 

mirrormax

Active Member
Apr 10, 2020
225
83
28
Can it only do x4 per connector? That's what it sounds like anyway, hitting 4GiB/s.
 

UhClem

just another Bozo on the bus
Jun 26, 2012
435
249
43
NH, USA
OK, so all 24 HDDs are on the JB4242. The specs say it has 2 SAS Interface Modules, but the documentation is piss-poor. Is it the case that you have a total of 2 miniSAS-HD cables from the 9400-16e to the JB4242 (and the other 2 9400 ports are unused)?

A max of 4000 MB/sec (w/2 cables) smells like SAS-2 (though 4300-4400 MB/sec is the real-world max for 2xSAS-2). Let's try to verify that your HDDs are actually connecting at SAS-3 (12Gbps):
Please do: sg_logs -p24 /dev/sda | grep bps
(and try it on a few: sdf, sdk, ...)
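
For reference, the ceilings behind that "smells like SAS-2" guess can be worked out roughly as follows. This is only a sketch: it assumes 4 lanes per miniSAS-HD cable and 8b/10b line encoding for both SAS-2 and SAS-3, and it ignores protocol overhead, which is why real-world figures land lower. The function name is made up for illustration.

```shell
#!/bin/bash
# Theoretical payload ceiling, in MB/s, of a group of SAS wide ports.
# Args: line rate in Gbps, lanes per cable, number of cables.
# Assumes 8b/10b encoding (8 payload bits per 10 line bits), 8 bits per byte.
sas_ceiling_mbs() {
    awk -v rate="$1" -v lanes="$2" -v cables="$3" \
        'BEGIN { printf "%d\n", rate * 1000 * 8 / 10 / 8 * lanes * cables }'
}

sas_ceiling_mbs 6 4 2     # SAS-2, two x4 cables: prints 4800
sas_ceiling_mbs 12 4 2    # SAS-3, two x4 cables: prints 9600
```

An observed ceiling of about 4000 MB/s sits under the SAS-2 figure and far below the SAS-3 one, which is why checking the negotiated link rate is the next step.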
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
Yes, I have two cables going from the 9400-16e to the JB4242 and the other two ports are unused.

All 24 drives return negotiated logical link rate: 12 Gbps
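
A check like this can be run across every drive in one go. The sketch below just wraps the `sg_logs -p24 ... | grep bps` command from the request in a loop; it assumes sg3_utils is installed and that the drives are sda through sdx, and `link_rates` is a made-up helper name.

```shell
#!/bin/bash
# Print the negotiated link rate line(s) for each drive, prefixed with the
# device path. Device paths are read from stdin, one per line.
link_rates() {
    local dev
    while read -r dev; do
        # </dev/null keeps sg_logs from consuming the device list on stdin
        sg_logs -p24 "$dev" 2>/dev/null </dev/null | grep bps | sed "s|^ *|$dev: |"
    done
}
```

Usage would be something like `printf '%s\n' /dev/sd{a..x} | link_rates`, run as root.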
 

Attachments


Bjorn Smith

Well-Known Member
Sep 3, 2019
876
481
63
49
r00t.dk
I might be wrong, but pcie 3 x8 has a total bandwidth of 7.88GB/s, but since you are only using two of the ports, the card might have some kind of "optimisation" that uses 2 lanes per port, and that could explain your 4GB/s.

So at most you could suck 7.88 GB/s out of the card, even if the disks themselves could provide more bandwidth.

But I guess the documentation should say that.
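
As a rough sanity check on those figures, the slot bandwidth can be computed per lane count. This sketch assumes 8 GT/s per lane with 128b/130b encoding (PCIe 3.0) and ignores packet overhead, so real throughput will be somewhat lower; `pcie3_gbytes` is just an illustrative name.

```shell
#!/bin/bash
# Raw PCIe 3.0 bandwidth in GB/s for a given number of lanes.
# 8 GT/s per lane, 128 payload bits per 130 line bits, 8 bits per byte.
pcie3_gbytes() {
    awk -v lanes="$1" 'BEGIN { printf "%.2f\n", lanes * 8 * 128 / 130 / 8 }'
}

pcie3_gbytes 8    # full x8 slot: prints 7.88
pcie3_gbytes 4    # only four lanes in use: prints 3.94
```

Four lanes (2 per port across the 2 ports in use) would come out at about 3.94 GB/s, which is what makes the "2 lanes per port" idea at least numerically plausible.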
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
I don't believe that to be the case, since a single miniSAS-HD cable/port supports 48Gbps on its own. I've yet to see anything saying that each port supports only 16Gbps of PCIe bandwidth. But I could be wrong.
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
876
481
63
49
r00t.dk
True - but it might be some "optimisation" they have done inside the card to make each cable get its fair share of the theoretical bandwidth.
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
876
481
63
49
r00t.dk
Documentation says:
" • Up to 8 storage interface PCIe links. Each link supporting x4 or x2 link widths up to 8.0 GT/s (PCIe Gen3) per lane "

Which could be interpreted to mean that you have to use all links to get the full bandwidth available - but perhaps someone at Broadcom can tell you.

This has some interesting information about bonding of ports, so perhaps there is some configuration you can do on the card to set how each port should work?
 

UhClem

just another Bozo on the bus
Jun 26, 2012
435
249
43
NH, USA
IamSpartacus said:
Yes, I have two cables going from the 9400-16e to the JB4242 and the other two ports are unused.

All 24 drives return negotiated logical link rate: 12 Gbps

Was afraid of that ...
This isn't looking good ...

(I hope I'm wrong, but) My suspicion is that Quanta cheaped out on the SAS-3 Expander chip(s) they used (i.e. not LSI or PMC). It appears that the JB4242 "talks the talk" [SAS-3 protocol] but does not "walk the walk" [SAS-3 throughput] ;).
==>[EDIT (19Apr22)] Yes, I am happy to report that I was wrong. I located a JB4242 Expander firmware update on Quanta's site with the filename 12G-EXPANDER_FW-Broadcom_1.06.zip -- probably safe to say that it does use an LSI chip. (My apologies, QCT.)[/EDIT]

Your (screenshot) evidence covered write throughput (24x badblocks during the write phase of a test_rw). For completeness, we should test read throughput; (rather than firing up badblocks again,)
Try this script (copy/paste into foo & chmod 755 foo):
Code:
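#!/bin/bash
# Time a direct (uncached) sequential read on sda..sdx, all 24 drives in parallel.
for i in {a..x}
do
    # Keep only the throughput figure that follows "= " in hdparm's timing line.
    echo "$i $(hdparm -t --direct "/dev/sd$i" 2> /dev/null | sed -n -e 's/^.*= \(.*\)$/\1/p')" &
done
# Block until every background hdparm has reported.
wait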
[Note that the MBs in the output are really MiB]
 

UhClem

just another Bozo on the bus
Jun 26, 2012
435
249
43
NH, USA
I'm actually currently in the read phase of badblocks so see below.
Very convenient!
And even more pessimistic evidence. Note how there is more variance in the read speeds (vs write speeds in first post). When an expander is bottlenecking, reading is more stressful.

Just for grins, do a lsscsi -v | grep enclos
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
Code:
# lsscsi -v | grep enclos
[1:0:32:0]   enclosu QCT      JB4242 SIM0      1600  -
Also, I have a second one of these units connected the exact same way (2 miniSAS-HD cables from 9400-16e to JB4242). In that system are 24 SATA disks and 8 x SAS3 SSDs. I ran your script on that system and it came back with the output below, which adds up to roughly 7,126 MiB/s (7.47 GB/s), pretty close to the theoretical bandwidth of PCIe 3.0 x8.

Code:
af 473.83 MB/sec
d 137.28 MB/sec
m 135.16 MB/sec
ae 479.33 MB/sec
j 136.54 MB/sec
i 136.62 MB/sec
l 136.46 MB/sec
z 476.87 MB/sec
n 137.20 MB/sec
ac 484.38 MB/sec
u 137.12 MB/sec
ab 484.79 MB/sec
al 135.76 MB/sec
x 138.42 MB/sec
ad 474.37 MB/sec
y 481.95 MB/sec
k 135.90 MB/sec
v 137.90 MB/sec
q 138.50 MB/sec
h 137.23 MB/sec
s 137.92 MB/sec
p 139.10 MB/sec
aa 487.15 MB/sec
o 137.21 MB/sec
r 137.76 MB/sec
g 132.60 MB/sec
c 136.62 MB/sec
am 137.16 MB/sec
t 137.76 MB/sec
w 137.75 MB/sec
a 136.43 MB/sec
b 137.08 MB/sec
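
Adding up a listing like that by hand is error-prone, so here is a small sketch that totals it. It assumes each line has the form `<drive> <value> MB/sec` as printed by the script, and treats hdparm's "MB" as MiB per the note with the script; `sum_throughput` is a hypothetical helper name.

```shell
#!/bin/bash
# Total the per-drive figures read from stdin and report the aggregate
# in MiB/s and in decimal GB/s (1 MiB = 1048576 bytes).
sum_throughput() {
    awk '$3 == "MB/sec" { total += $2; n++ }
         END { printf "%d drives, %.2f MiB/s, %.2f GB/s\n",
                      n, total, total * 1048576 / 1000000000 }'
}
```

Assuming the script was saved as foo, `./foo | sum_throughput` would print the aggregate once all drives have reported.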
 

itronin

Well-Known Member
Nov 24, 2018
1,233
793
113
Denver, Colorado
Regarding the system and shelf that is misbehaving... any chance that one of your cables and/or connector pairs is not passing traffic (for whatever reason) and/or has become rate limited in some fashion? I'm more inclined to think a cable is misbehaving.

It may be useful to use a single cable and measure your performance and see if you get 4GB/s and then test the other cable by itself. Make sure to use the same connections in/out of the shelf for each cable as you are when two cables are in use.
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
Will have to wait until I'm home to start messing around with physical connections for testing. But I will do so.
 

UhClem

just another Bozo on the bus
Jun 26, 2012
435
249
43
NH, USA
itronin said:
Regarding the system and shelf that is misbehaving... any chance that one of your cables and/or connector pairs is not passing traffic (for whatever reason) and/or has become rate limited in some fashion? I'm more inclined to think a cable is misbehaving.

It may be useful to use a single cable and measure your performance and see if you get 4GB/s and then test the other cable by itself. Make sure to use the same connections in/out of the shelf for each cable as you are when two cables are in use.

Bingo!
In light of the (excellent!) numbers from the second system, I do believe you've nailed it.
[I'm still puzzled why it only reached a ceiling of 4000 M/s, and not 4300-4500 ... any chance that the screenshot #s are MiB?? (that is iotop, right?) [but I'll get over it]]

I've never worked with any Expansion Subsystems, but is it possible that the troublesome first system has been "Management Configured" such that the connectivity/function of its connectors/SIMs no longer matches those of (the probably Default) second system? [just a WAG]
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,515
650
113
Well, I've narrowed it down to there indeed being a problem, just not exactly what its source is. I will log into the management console on both systems and compare them to see if there are any configuration differences. I will also do some moving around and swapping of cables/ports as @itronin suggested to see if I can isolate the issue.
 