Set up a 6x SSD RAIDz1 array ... and it is SLOW!


TrumanHW

Active Member
Sep 16, 2018
253
34
28
From the looks of it, this isn't raidz1, and I can't say from this what it is.

(please run 'zfs version')

In my case:
zfs-2.1.4-1
zfs-kmod-2.1.4-1

This is how a raidz1 should look:

View attachment 28312

This is an example of performance (on this setup):

View attachment 28313




Sure, but they aren't, and the controller thinks they are spinning rust, resulting in lower performance.


Once I got SHIT results in RAIDz1 ... I reverted to a plain ZFS stripe (the RAID-0 equivalent) ... I mean, if it can't outperform my RAIDz2 of spinning drives with no parity protection at all? There's SOMETHING wrong; whether it's the optimization, the SAS controller!?, something ... because there's just no excuse for it to be slower with no parity and all drives working in parallel than an array that's given up two drives to parity.

Anyway, I'm on the warpath seeking out either an R7415 or, if I can find a good enough deal, an R7515 ... think it's worth the "future proofing" of being able to get the most out of PCIe 4.0 NVMe SSDs for the modest difference in price between the two systems..? So far it looks like it'll cost me about $500 extra to get the version compatible with EPYC 2 and EPYC 3, which off the bat (thanks to EPYC 2) upgrades the PCIe to 4.0 ...?

Thanks for your response and time, as always!

Truman
 

CyklonDX

Well-Known Member
Nov 8, 2022
846
279
63
be slower with no parity and all drives working in parallel than an array that's given up two drives to parity.
It's very likely a combination of the SAS controller, the backplane (how signaling works on it), and your raidz setup.
(As mentioned, it doesn't look like your normal raidz setup.)

think it's worth the "future proofing" of being able to get the most out of PCIe 4.0 NVMe SSDs for the modest difference in price between the two systems..
There's no such thing as future proofing.
You should consider the Micron 3400 as an entry-grade PCIe Gen4 drive with power loss protection. Most prosumer and consumer grade NVMe drives won't have power loss protection. However you plan to use them, keep in mind they may not be the best solution for a write-heavy workload, as that will chew right through your NVMe drives - and they often don't have the best endurance. The next thing would be keeping them cool; the server cooling might be fine as long as you don't mind the noise.
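To put "chew right through" in rough numbers: rated endurance in total bytes written is roughly capacity × DWPD × 365 × warranty years. A minimal back-of-envelope sketch (the 3.84 TB capacity, 1 DWPD rating and 5-year warranty are hypothetical figures, not taken from any particular datasheet):

# TBW ≈ capacity (TB) x DWPD x 365 days x warranty years
echo "3.84 * 1 * 365 * 5" | bc
# => ~7008 TB (~7 PB) of writes before the rating is used up; check the real datasheet for your model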

Depending on the server configuration, you won't have a full PCIe x16 slot; typically in a Dell PowerEdge, PCIe lanes are shared between the daughter cards (network on one side, RAID controller on the other).
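If you want to verify what a given slot or drive actually negotiated, a quick check on Linux with stock lspci (the 41:00.0 address below is only an example) is to compare the link capability against the trained link state:

# Find the PCIe address of the NVMe drive / HBA you care about
lspci | grep -iE 'nvme|sas'
# LnkCap is what the device supports; LnkSta is what the slot actually negotiated
sudo lspci -vv -s 41:00.0 | grep -E 'LnkCap:|LnkSta:'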
 

TrumanHW

Active Member
Sep 16, 2018
253
34
28
It's very likely a combination of the SAS controller, the backplane (how signaling works on it), and your raidz setup.
(As mentioned, it doesn't look like your normal raidz setup.)
How would you go about experimenting to determine what it is? There are only 2 SATA ports on the unit ... and these SSDs are just SATA III ... is there actually a controller better suited to this? Maybe I'd be wasting my money getting an NVMe array and the machine designed for it ...

There's no such thing as future proofing.
Guess the quotes failed to show my awareness that there's almost no such thing ... but getting PCIe 4.0 in this case really does constitute "future proofing" given this scenario.

There are quite a few more upgrades to the R7525 than just the CPU. If I can find a unit inexpensive enough I'd grab one, but now they've gone to charging a massive surcharge if you want the 24 SFF variant. :'(

You should consider the Micron 3400 as an entry-grade PCIe Gen4 drive with power loss protection.
Ha! I was about to buy the Micron 7300 Pro in 7.68TB:

Most prosumer and consumer grade NVMe drives lack power loss protection and the endurance for the wear ZFS places on them.
Security for Your Data
Solid, firmware-based security includes options for TCG Opal 2.0 and IEEE-1667 support for SEDs (self-encrypting drives). The 7300 also includes power-loss protection for data at rest and in flight, as well as data-center-class data path protection for user data and metadata.

As far as endurance, it's rated at 1 DWPD, or about 26 petabytes written... which is WAY more than I'm ever looking at moving around.

7300 Pro / Max SSD Drives


Depending on the config, you may lack a full PCIe x16 slot. PowerEdge PCIe slots are shared between daughter cards (i.e., their NIC & PERC.)
I plan on removing the PERC and just using the BOSS (Dual M.2 for bootup)...

What would you say is a good price for the 7.68TB 7300 Pro (used) roughly ..?

Thanks for all of your input. In fact, I only checked for power loss protection because you reminded me ... it would've been pure luck if the feature happened to be present.
 

TrumanHW

Active Member
Sep 16, 2018
253
34
28
From the looks of it, this isn't raidz1, and I can't say from this what it is.

(please run 'zfs version')

In my case:
zfs-2.1.4-1
zfs-kmod-2.1.4-1

This is how a raidz1 should look:

View attachment 28312

This is an example of performance (on this setup):

View attachment 28313




Sure, but they aren't, and the controller thinks they are spinning rust, resulting in lower performance.

SHIT!!! I forgot to update this thread earlier and mention that these results were tested while it's in RAID-0.
I (so!) apologize for not updating this when I reformatted it, but I wanted to eliminate even the minor 'parity penalty' and give the drives / array their best possible chance to turn in decent numbers.

I'm going to have to study that Excel file later tonight ...

Isn't that what all computers think? What SSD (excluding NVMe) doesn't pretend to be a spinner? Is that to say SATA SSDs are incapable of performing to their capacity because the computer is "tripped out" with a Capgras delusion? (Meant to be a funny metaphor; if you have a moment to see something entertaining, check this out: )
 

TrumanHW

Active Member
Sep 16, 2018
253
34
28
It's very likely a combination of the SAS controller, the backplane (how signaling works on it), and your raidz setup.
(As mentioned, it doesn't look like your normal raidz setup.)
Yeah, I totally forgot to mention earlier that I put it in RAID-0 (the ZFS equivalent, at least), as it's supposed to be "the fastest" RAID config (even if RAID-0 makes "redundant" a misnomer, obviously).

Anyway, I just purchased this:

www.ebay.com/itm/285148284088

I could of course run it with the SATA SSDs when it arrives ... but otherwise I have 4 (enterprise) NVMe SSDs ... and I'm probably going to buy another 7 or 8 NVMe drives at 7.68TB each (as my current enterprise NVMe drives are only 3.84TB). Likely the Micron 7300 Pro, as it has power loss protection and can also withstand 1 DWPD ... etc.
 

CyklonDX

Well-Known Member
Nov 8, 2022
846
279
63
How would you go about experimenting to determine what it is ?
I would do the following (rough commands sketched below):
What is this system?
Check 'zfs version'.
Determine if this is some custom/older/outdated flavor of ZFS. (The latest package is zfs-2.1.10.)
Upgrade ZFS to that latest package (don't worry, you won't lose data - but it will ask you to upgrade your pool after you do that).
Create the pool manually, and don't use it for your OS install. I'll be frank, those OS-on-ZFS setups are dodgy as ... better to use btrfs, ext4 or xfs for the OS, and better to make it sit on separate disks (preferably NVMe/SSD on PCIe or some such, set up as RAID-1).
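A minimal sketch of those steps on a generic Linux box (the pool name "tank" and the /dev/sd[b-g] device names are placeholders; adjust for your disks and your distro's ZFS packaging):

# 1. Confirm what you're actually running and how the pool is laid out
zfs version
zpool status

# 2. After updating the zfs packages, upgrade the on-disk pool feature flags
sudo zpool upgrade tank

# 3. Re-create the data pool by hand as a 6-disk raidz1 (this destroys the existing pool and its data!)
sudo zpool destroy tank
sudo zpool create -o ashift=12 tank raidz1 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
zpool status tank

(ashift=12 is the commonly suggested 4K-sector setting for SSDs; verify what your drives report before relying on it.)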


is there actually a controller better suited to this ?
For SSDs I would recommend a 9300-8i or better.
(You can use the following cables to connect your SAS controller to the backplane if it's using an SFF-8087 plug.)

What would you say is a good price for the 7.68TB 7300 Pro (used) roughly ..?
Isn't the 7300 Pro a Gen3 device? A used 8TB NVMe would most likely cost around $800-1k USD, if you could even find one.
If you want to run Gen4 you would need the Micron 7400 Pro (hard to say - likely no used ones yet).

Running a Gen3 device on Gen4 PCIe will at most help with temps (it will still run at half the Gen4 speed at best); it won't grant you any performance benefits.
 

CyklonDX

Well-Known Member
Nov 8, 2022
846
279
63
Isn't that what all computers think? What SSD (excluding NVMe) doesn't pretend to be a spinner? Is that to say SATA SSDs are incapable of performing to their capacity because the computer is "tripped out" with a Capgras delusion? (Meant to be a funny metaphor; if you have a moment to see something entertaining, check this out:
SAS controllers (especially older ones) didn't differentiate between SSD and HDD.
They would run the same algorithm for reading and writing data.

That includes queues, timing windows, and so on... how they work.
It's mostly secret sauce, and it can get really complicated to explain. But at the end of the day the algorithm running on older controllers will try to do a few things that are unnecessary, and sometimes too slow, for SSDs - one of the most commonly known was to set a specific data chunk size it will try to read, with statically spaced timers before it attempts to read from another disk.
(With HDDs that's not a problem, because those timings were calculated for HDDs, and read-ahead will send the request to the disks and wait for them to be ready to read back to the controller or to write.)

Tighter refresh timers can help, but the chip will overheat quickly and result in really bad performance - as it will internally downclock to a snail's pace.

There is plenty ... plenty of beef in this.
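You can't see the controller's internal firmware timers, but Linux does expose the host-side knobs that play a similar role, which is a cheap sanity check before blaming the card. A rough sketch using standard sysfs paths (sda is just an example device behind the HBA):

# Queue depth the HBA advertises for this target
cat /sys/block/sda/device/queue_depth
# I/O scheduler in use; 'none' (or 'noop' on older kernels) is usually preferred for SSDs behind an HBA
cat /sys/block/sda/queue/scheduler
# Switch the scheduler for testing (not persistent across reboots)
echo none | sudo tee /sys/block/sda/queue/scheduler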
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
SHIT!!! I forgot to update this thread earlier and mention that these results were tested while it's in RAID-0.
I (so!) apologize for not updating this when I reformatted it, but I wanted to eliminate even the minor 'parity penalty' and give the drives / array their best possible chance to turn in decent numbers.
I would still look at the Samsung Evo SSDs and blame them for the low numbers before trying anything else / replacing other hardware.
Isn't that what all computers think? What SSD (excluding NVMe) doesn't pretend to be a spinner? Is that to say SATA SSDs are incapable of performing to their capacity because the computer is "tripped out" with a Capgras delusion?
No, computers see storage devices (like HDDs & SSDs) as "block" devices that are addressed via logical block addressing (LBA, which is already an abstraction of the real media).
All my SSDs report correctly to the host or controller as SSDs; do you have a link to an SSD that pretends to be an HDD (I'm interested in learning more about this :D)?
My Intel S3700 400GB is capable of reaching all its advertised numbers under sustained workloads. That's slower than the maximum SATA3 / 6Gbit/s speed, but still faster than other (SATA) consumer SSDs. So no, SATA SSDs are not "crippled" in any way.
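For what it's worth, you can check how a drive presents itself to a Linux host with stock tools; nothing here is vendor-specific, and the device name is just an example:

# 0 = reports as non-rotational (SSD), 1 = reports as rotational (HDD)
cat /sys/block/sda/queue/rotational
# Or list everything at once, including the transport (sata, sas, nvme)
lsblk -d -o NAME,ROTA,TRAN,MODEL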
 

TrumanHW

Active Member
Sep 16, 2018
253
34
28
View attachment 28421

They do not pretend to be HDDs; it's just that the controller never heard of SSDs, so it sees them as normal HDDs.
I wasn't writing a white paper. I was just speaking to be understood in simple terms.
I do somewhat understand this topic, as I'm decent with the ACE Lab PC-3000 for repairing spinning drives (e.g., the FW, RAID, etc.), since (amongst other services) I recover people's data professionally.

To be more accurate, I would've said it's a combination of legacy drivers (kexts) in operating systems (certainly HFS+, and Windows 7 & 10 if I recall) which weren't updated as SSD tech proliferated, requiring the ad hoc addition of TRIM, and of bandwidths which were artificially limited to SATA, as we can see from the rapid bandwidth gains of NVMe. This is a product of using logic that anticipated only the use of spinning drives.

But if you feel that's still incompatible with what I said, I accept your opinion.
 

TrumanHW

Active Member
Sep 16, 2018
253
34
28
SAS controllers (especially older ones) didn't differentiate between SSD and HDD.
They would run the same algorithm for reading and writing data.

That includes queues, timing windows, and so on... how they work.
It's mostly secret sauce, and it can get really complicated to explain. But at the end of the day the algorithm running on older controllers will try to do a few things that are unnecessary, and sometimes too slow, for SSDs - one of the most commonly known was to set a specific data chunk size it will try to read, with statically spaced timers before it attempts to read from another disk.
(With HDDs that's not a problem, because those timings were calculated for HDDs, and read-ahead will send the request to the disks and wait for them to be ready to read back to the controller or to write.)

Tighter refresh timers can help, but the chip will overheat quickly and result in really bad performance - as it will internally downclock to a snail's pace.

There is plenty ... plenty of beef in this.
Being able to find / see the SAS controller's commands and understand them is impressive.

Very good info. How did you end up resolving this?
I assume upgrading the SAS controller to a newer model mitigates the issue?
 

CyklonDX

Well-Known Member
Nov 8, 2022
846
279
63
(For SSDs) On older SAS controllers you can often change the advanced device properties and adapter timing properties.

(This will be similar for almost every LSI SAS controller - starting from pages 3-4, "Advanced Adapter Properties".)
By changing those settings you can get it to behave better for SSDs.

Here's a good resource (that person talks a bit - and shares some interesting info on controllers)

and this guy (he has some deep knowledge of how storage, backplanes, and such work)


Myself, I use an LSI SAS 9300.
 

TrumanHW

Active Member
Sep 16, 2018
253
34
28
I've since purchased an R7415 (Epyc server) & several NVMe drives.
While the drive speeds in Ubuntu & Windows (synthetic benchmarks) are good:


Micron 7300 Pro:
Write: 2 GB/s
Read: 3 GB/s

Micron 9300 Pro:
Write: 3 GB/s
Read: 3 GB/s


In TrueNAS (where I'd expect to get 2-3 GB/s) they don't even reach the performance of a SINGLE drive:


8x Micron 9300 Pro (RAIDz2):
Write: 650 MB/s
Read: 750 MB/s


4x Micron 7300 Pro (RAIDz1):
Write: 450 MB/s
Read: 550 MB/s


4x Samsung Evo 870 (RAIDz1):
Write: 450 MB/s
Read: 550 MB/s


Again ... the befuddling part is that I've gotten up to 1.2GB/s with 8 spinning drives that probably R/W at 200MB/s max!
(Granted, the consistency with the NVMe drives at 650MB/s would still trounce the sporadic 800MB/s - 1GB/s of the spinning drives)

While we could write off the performance through a SAS2 card to timing / drivers, etc...
the R7415's slots are NVMe only; it wasn't until I purchased an HBA330 that I had SATA/SAS access at all.
 

mattventura

Active Member
Nov 9, 2022
447
217
43
Start simple and change one variable at a time.

Start with a zpool with a single drive. This should get close to the non-ZFS performance. Then try simpler configurations first (2x single-drive vdevs, then 3x, etc., then move on to mirrors) before moving on to any RAIDZ levels.
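A rough sketch of that progression (the pool name "test" and the /dev/nvme0n1... device names are placeholders; fio is assumed as the benchmark tool, and primarycache/compression are turned down only so ARC and compression don't inflate the numbers while testing):

# Single-drive pool as the baseline (destroys whatever is on the disk!)
sudo zpool create test /dev/nvme0n1
sudo zfs set primarycache=metadata test
sudo zfs set compression=off test
sudo fio --name=seq --directory=/test --rw=write --bs=1M --size=20G --ioengine=psync --numjobs=1
# Add a second single-drive vdev (now a 2-wide stripe) and repeat the test
sudo zpool add test /dev/nvme1n1
# ... then tear down and try mirrors / raidz once the stripes look sane
sudo zpool destroy test
sudo zpool create test raidz1 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1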
 
  • Like
Reactions: TrumanHW

TrumanHW

Active Member
Sep 16, 2018
253
34
28
Start simple and change one variable at a time.
Agreed:

Start with a single drive zpool. This should get close to the non-ZFS performance.
I tested an individual drive of the type the RAID-5 (below) is made from...
And then tested a 3-drive RAID-5 array in Ubuntu (built with the same drive type)...
Both are shown in the pictures below, & I got the same crap performance in Ubuntu as in ZFS.

[Attachments: Ubuntu 1 NVMe Perf.jpg, Ubuntu 3 NVMe RAID Perf.jpg]
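(If anyone wants to reproduce the Ubuntu-side test, here's a minimal sketch with stock mdadm and fio; the device names are examples and the 3-drive layout is just what's described above:)

# Raw single-drive baseline (read-only, so it won't touch the data)
sudo fio --name=raw --filename=/dev/nvme0n1 --rw=read --bs=1M --size=20G --direct=1 --readonly
# Build the 3-drive md RAID-5, then run the same test against the md device
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1
sudo fio --name=md --filename=/dev/md0 --rw=read --bs=1M --size=20G --direct=1 --readonly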



The machine (in case I didn't mention it already) ...

Dell R7415
EPYC (first gen)
256GB DDR4 ECC
Sold with 24x SFF NVMe slots
(adding an HBA330 was required to use SATA / SAS boot drives in the first 8 bays (0-7))
I've been using the last 8 bays (15-23) for the NVMe device tests ...


Thank you very much in the meantime ... I'm grateful for all ideas.
 