Dell PERC H755N - NVMe HW RAID


jpmomo

Active Member
Aug 12, 2018
I am trying to get some confirmation on how many PCIe lanes this RAID card allocates to each of the 8 supported NVMe drives. To me it looks like there is a bottleneck of only 8 lanes between all 8 drives and the CPUs. I can't get any clear answer from Dell. This is on a Dell R7525 with their new H755N hardware NVMe RAID controller. The docs and support folks claim that it supports x2 link width (2 Gen4 PCIe lanes) per SSD. Even that would limit the Gen4 drives to half their potential throughput. The drives are Dell-branded Samsung PM1735 6.4TB Gen4 NVMe and Intel Optane P5800X. Both are U.2 Gen4 and would be throttled if only able to run x2 vs x4. As far as I can tell, they aren't even able to give all 8 drives x2 at the same time. Please see the diagrams below for some detail. Also let me know if there is something I might be overlooking, as I am not that familiar with Dell or RAID controllers in general.
[Attached diagrams: 1621950231725.png, 1621950294324.png]
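For reference, here is the rough lane math I'm working from (my own back-of-envelope numbers, assuming roughly 2 GB/s of usable bandwidth per Gen4 lane; the PM1735 is spec'd at close to 7 GB/s sequential reads if I remember right, so x2 clearly cuts it short):

  # back-of-envelope Gen4 bandwidth check (assumes ~1.97 GB/s usable per lane)
  lane=1.97
  echo "x2 per drive : $(echo "$lane * 2" | bc) GB/s ceiling"
  echo "x4 per drive : $(echo "$lane * 4" | bc) GB/s ceiling"
  echo "x8 host link : $(echo "$lane * 8" | bc) GB/s shared across all 8 drives"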
 

i386

Well-Known Member
Mar 18, 2016
Germany
PCIe switches.
And usually with NVMe drives you don't need the maximum throughput so much as the low latency. I think even with PCIe switches in the topology, the latency should still be lower than SAS or SATA.
 

jpmomo

Active Member
Aug 12, 2018
531
192
43
Thanks for the quick reply. That's what I thought they were using. But even with PCIe switches built into the RAID controller, how can they claim 2 lanes x 8 drives? I understand the benefits of low latency, but it should be even lower if the RAID card is bypassed altogether. It also seems like this could severely limit the max throughput of the system if all 8 drives are in use. The only benefit I can see is the RAID functionality itself, but at a pretty big performance hit. They should have designed it with 4 x8 connectors on the card (or 2 x16) so as not to bottleneck the drives. That is the way most of the other PERC cards are designed.
 

UhClem

just another Bozo on the bus
Jun 26, 2012
NH, USA
Pictures are nice, but you need to dig deeper ("The devil is in the details.")
From the PERC 11 User's Guide [link]:
Pg. 8:
● Supported drive speeds for NVMe drives are 8 GT/s and 16 GT/s at maximum x2 lane width.
Pg. 15:
Non-Volatile Memory Express
Non-Volatile Memory Express (NVMe) is a standardized, high-performance host controller interface and a storage protocol for communicating with non-volatile memory storage devices over the peripheral component interconnect express (PCIe) interface standard. The PERC 11 controller supports up to 8 direct-attach NVMe drives. The PERC 11 controller is a PCIe endpoint to the host, a PowerEdge server, and configured as a PCIe root complex for downstream PCIe NVMe devices connected to the controller.
NOTE: The NVMe drive on the PERC 11 controller shows up as a SCSI disk in the operating system, and the NVMe command line interface will not work for the attached NVMe drives.

Conditions under which a PERC supports an NVMe drive
● In NVMe devices the namespace identifier (NSID) with ID 1, which is (NSID=1) must be present.
● In NVMe devices with multiple namespace(s), you can use the drive capacity of the namespace with NSID=1.
● The namespace with NSID=1 must be formatted without protection information and cannot have the metadata enabled.
● PERC supports 512-bytes or 4 KB sector disk drives for NVMe devices.
To dig even deeper, I would like to see an lspci -vv for a fully-configured R7525 (to ascertain whether the PCIe subset described above is actually visible, or is just an implementation note about the (invisible) behind-the-scenes arrangement inside the H755N).
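Something along these lines (an untested sketch, assuming a stock Linux lspci) would capture everything I'm after:

  # full dump, plus a quick summary of negotiated vs. maximum link widths
  lspci -vv > lspci-vv.txt
  grep -E '^[0-9a-f]|LnkCap:|LnkSta:' lspci-vv.txt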
 

jpmomo

Active Member
Aug 12, 2018
Thanks again for replying. Yes, I did read through all of the docs and any notes I could find; that was why I was referencing their claim of x2 per drive. I still haven't gotten any feedback on how they would be able to do this with only one x8 cable connected. Unless they meant UP to 8 drives at x2 (not all 8 at the same time). They have an equivalent of lspci -vv in iDRAC, and that is where I first noticed that the link width was only x2 and not x4 (LnkCap vs LnkSta in lspci -vv speak).

For more detail on how things are connected: there is only one x8 cable from the motherboard to the PERC card. There are then 2 cables from the PERC card, each splitting into 2 ends, for a total of 4 ends that connect to the 4 PCIe connectors on the backplane. So it looks like we have 8 lanes into the card, split into 2 x4 links at the 2 internal PERC connectors; each of those cables then splits into 2 x2 ends. Each of these x2 ends connects to one of the 4 PCIe connectors on the backplane (those connectors support x8, but they are only being fed x2!).

One thought is to find a cable that would connect one of the x8 PCIe connectors on the backplane to the 2 x4 connectors on the internal side of the PERC card. This cable would be x8 on the backplane side and split into 2 x4 on the PERC card side. That would at least allow 2 drives to run at the full x4 rate and still take advantage of the RAID card. I am not sure if this is possible, as I don't know whether these connectors are standard or Dell proprietary.

Thanks again for taking the time to help me understand how the foot bone is connected to the shin bone!
 

UhClem

just another Bozo on the bus
Jun 26, 2012
NH, USA
Ah-hah!! So you actually have this already. [I had assumed you were attempting to "figure it out" [pre-purchase] from the docs (& support) alone.]

The best chance I have to answer (all?) your questions is by seeing an (actual) lspci -vv output. If you can produce one, pls .zip it and upload as an attachment. If not, then please peruse your iDRAC thingy and try to identify exactly (model #) what (PCIe) switch is being used. [Probably a PLX (e.g., PEX88024) or PMC (e.g., PFX 28xG4)]
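If lspci does end up being available, filtering by vendor ID should also reveal the switch directly (10b5 is PLX/Broadcom; 11f8 is PMC/Microsemi, if memory serves):

  lspci -nn -d 10b5:   # PLX / Broadcom PCIe switches
  lspci -nn -d 11f8:   # PMC-Sierra / Microsemi PCIe switches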
 

jpmomo

Active Member
Aug 12, 2018
Yes, I purchased this server from the Dell Outlet because it had 2 x 7763 Milan CPUs. I didn't really know about the RAID controller or how it was actually cabled. It also came with 2 x 6.4 TB NVMe Gen4 SSDs, which Dell had configured in RAID 1. I also have a few other Gen4 NVMe drives and I want to make sure they are not throttled. I have VMware on it now, and unfortunately its lspci does not have an equivalent of the -vv parameter. I can install a Linux distro and run that command to check the output. Regardless of the PCIe switch, I still think the card is only capable of x8 in total. The plan would be to just cable via the xGMI cables directly to the backplane, which would allocate 32 PCIe lanes.
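Once it is cabled direct and the drives show up as real NVMe devices, I'm assuming something like this from a Linux live environment would confirm the negotiated link on each drive (the sysfs paths are my assumption, not verified on this box):

  for d in /sys/class/nvme/nvme*/device; do
      echo "$d: $(cat $d/current_link_speed), width x$(cat $d/current_link_width)"
  done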
Thanks again.
 

jpmomo

Active Member
Aug 12, 2018
I also ran across the Broadcom 9560-16i, which has 2 x8 SlimSAS connectors on the RAID card. That card supports U.3 (SATA, SAS and NVMe), but there also seems to be some fine print regarding max throughput: "Eight-lane aggregate bandwidth of up to 16GB/s (16,000 MB/s)". I am not sure how to interpret that statement and can't seem to find anyone who can give any details.
 

uldise

Active Member
Jul 2, 2020
There are then 2 cables from the PERC card, each splitting into 2 ends, for a total of 4 ends that connect to the 4 PCIe connectors on the backplane.
So, I assume this is a direct-attach backplane? And you connect 4 NVMe drives to each H755N? If you have 16 drives in total, then you need 4 H755Ns?
 

jpmomo

Active Member
Aug 12, 2018
No. There is an 8 x 2.5" drive cage, with a total of 8 drives. The backplane these drives slide into supports x4 lanes for each drive. On the internal side of the backplane there are 4 x8 SlimSAS connectors that support 2 drives each. If I connect directly to the MB with 4 x8 cables, I will have the full bandwidth for all 8 drives. If I connect to the PERC card, I am limiting the 8 drives potentially down to either x2 each or x1 each.
 

uldise

Active Member
Jul 2, 2020
Thanks! I'm wondering whether such an NVMe backplane exists with a built-in expander. If so, there will be a bandwidth bottleneck anyway for, say, 16 drives.
I am limiting the 8 drives potentially down to either x2 each or x1 each.
But that's Gen4 x1 or x2: 2 GB/s or 4 GB/s of theoretical bandwidth, a bit slower in real life. Agreed, x1 will be a bottleneck, and even x2 can be if very fast SSDs are used.
 

jpmomo

Active Member
Aug 12, 2018
The Dell R7525 has a backplane option for 24 NVMe Gen4 SSDs at x4 each. That backplane is connected to 160 PCIe 4.0 lanes.
 

NateS

Active Member
Apr 19, 2021
Sacramento, CA, US
I think you're exactly right: that HBA has 8 x 2-lane connections to the drives and then uses only one 8-lane connection for the backhaul, so if every drive were in use at once, they'd all effectively be limited to 1 lane, and each individual drive will never exceed 2-lane speeds. In a sense, it's a bit like the NVMe version of a SAS expander, in that you're trading bandwidth to any individual drive for the ability to connect more drives in total.

The 2-lane-only connections to the drives make me wonder if this card is intended to be used as one half of a dual-port NVMe setup. With dual port, each drive gets a 2-lane connection to each of two different HBAs. With that sort of setup, each drive would get a full 4 lanes of bandwidth, and they'd effectively share a 16-lane backhaul.
 

jpmomo

Active Member
Aug 12, 2018
531
192
43
How would you physically connect a dual-port NVMe drive to the 2 different HBAs? Keep in mind that the current setup is as follows:

8 drives connect into the backplane. The 4 SlimSAS connectors on the backplane are connected to the 2 SlimSAS (LP) connectors on the H755N RAID card. One SlimSAS (LP) connector is then connected from the H755N to the server's motherboard.
 

NateS

Active Member
Apr 19, 2021
Sacramento, CA, US
How would you physically connect a dual-port NVMe drive to the 2 different HBAs? Keep in mind that the current setup is as follows:

8 drives connect into the backplane. The 4 SlimSAS connectors on the backplane are connected to the 2 SlimSAS (LP) connectors on the H755N RAID card. One SlimSAS (LP) connector is then connected from the H755N to the server's motherboard.
I think the backplane would need to support it. Most likely there'd be separate connectors for each port, or maybe custom cables to split one port to two HBAs. I'm not sure your particular server supports it, but I bet there's some Dell server out there that does, and reuses these same HBAs.
 

jpmomo

Active Member
Aug 12, 2018
I ran a few more tests and noticed something interesting. The throughput (regardless of how many drives I have in RAID 0) always seems to max out at 14 GB/s, which is close to x8 at Gen4 rates. This is probably due to the 8 GB of cache on the PERC card. When I configure one of these drives as non-RAID, it drops down to around 3.3 GB/s (the x2 cap). I was also trying to help another user on the Dell forums who needed max IOPS. For RND4K Q32T16, we had to set the RAID settings to No Read Ahead and Write-Through in order to get around 700K R/W IOPS. Or we could just configure the drives as non-RAID and that metric would also be very high. If we configured RAID 0 with the default settings (Read Ahead, Write-Back), the IOPS for just the RND4K Q32T16 test would drop to around 150K. Not sure why this is happening, as the other IOPS metrics are higher with the normal RAID 0 config compared to the non-RAID setup.
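For anyone who wants to reproduce the numbers, I believe the RND4K Q32T16 test roughly corresponds to an fio job like this (my best-guess equivalent; the /dev/sdX target is just a placeholder for the PERC virtual disk):

  fio --name=rnd4k-q32t16 --filename=/dev/sdX \
      --rw=randread --bs=4k --iodepth=32 --numjobs=16 \
      --ioengine=libaio --direct=1 --time_based --runtime=60 \
      --group_reporting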
 

johanmmmm

New Member
Feb 16, 2022
How are RAID devices exposed to Linux with the H755N? Just like any PERC (as /dev/sdX)? I assume Linux is unaware that there are NVMe drives (and the NVMe protocol) behind the scenes?
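In other words, would something like this just show them as plain SCSI disks behind the usual megaraid driver (that's my assumption; I have not verified it on an H755N):

  lsblk -o NAME,TRAN,MODEL,SIZE
  lsscsi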