What kind of R/W perf can I hope for with 4x NVMe drives over SFP28 using TrueNAS..?


Rand__

Well-Known Member
No.
It's almost certainly not a TrueNAS problem (or only one in combination with your specific HW), since other TNC boxes perform much better.

Now that you've established how to test, it's time to establish a baseline.
Test a single drive or a single mirror, then test a double pair (4 drives) with the same jobs and with double the number of jobs; something like the sketch below.
The goal is to see whether there is a total limit or a per-drive limit.
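A minimal sketch of that comparison, assuming fio is installed and the pools are mounted at hypothetical paths /mnt/tank (single drive/mirror) and /mnt/tank4 (4 drives); posixaio works on both the FreeBSD and Linux sides:

Code:
# Baseline: one job against a dataset on the single-drive (or single-mirror) pool
fio --name=baseline --directory=/mnt/tank/test --size=10G \
    --rw=write --bs=1M --ioengine=posixaio --iodepth=16 \
    --numjobs=1 --runtime=60 --time_based --group_reporting

# Same test against the 4-drive pool, first with the same job count...
fio --name=scale1 --directory=/mnt/tank4/test --size=10G \
    --rw=write --bs=1M --ioengine=posixaio --iodepth=16 \
    --numjobs=1 --runtime=60 --time_based --group_reporting

# ...then with double the jobs, to see whether throughput scales or hits a ceiling
fio --name=scale2 --directory=/mnt/tank4/test --size=10G \
    --rw=write --bs=1M --ioengine=posixaio --iodepth=16 \
    --numjobs=2 --runtime=60 --time_based --group_reporting

If the 4-drive numbers barely move versus the single drive, that points at a total (per-pool or per-bus) limit rather than a per-drive one.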
 

ano

Well-Known Member
ZFS = CPU!!!!

Hence why it spins up the fans. 2-3 GB/s sounds reasonable.
 

Rand__

Well-Known Member
Don't think he's getting that; that's theoretical single-drive perf he's quoting ;)
 

TrumanHW

Active Member
ZFS = CPU!!!!
Hence why it spins up the fans. 2-3 GB/s sounds reasonable.
Sorry it's taken me a while to reply...

Yup, that's what I thought: 2GB/s – 3GB/s

My CPU and Temp stats:

Max CPU utilization: 7% (at up to 800 MB/s)
Maximum system load: 3%
CPU temp (°C):

min: 45.45
avg: 46.11
max: 47.95

But the CPU stayed at 3% during the transfer, and only hit 7% for a split-second.

In the post I made about fans, I was lamenting Dell's stupid new IPMI policy.
Basically, it seems Dell removed the administrative rights to adjust fan speeds.
With exhaust, CPU & SSD temps no higher, some drives cause the fans to go to 10k RPM.
With the same mfr (Micron), but the 7300 Pro instead of the 9300 Pro, they stay under 4k RPM.
Seems there's a tendency of people saying "CPU" without really meaning it.




A single 7300 Pro gets 80% of the performance of 4x 7300 Pro in RAIDz1.
4x 7300 Pro in RAIDz1 gets 85% of the performance of 8x 9300 Pro in RAIDz2.

At that, the single-drive performance is only 20% of the drive's actual performance.
I'm getting 1/4 of the drive's rated performance while the CPU never exceeds 7%.
Copying the same data off the array 10x never seems to be accelerated by ARC/L2ARC caching.
R7415, 256GB DDR4 ECC: reading a 20GB folder the 10th time is the same speed as the first.
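For what it's worth, cache behavior is easy to check directly; a quick sketch, assuming your build ships the standard OpenZFS arcstat/arc_summary utilities (on some builds they're named arcstat.py / arc_summary.py):

Code:
# Watch ARC hits/misses once per second while re-copying the 20GB folder;
# a fully cached re-read should show a hit rate near 100%
arcstat 1

# One-shot summary of ARC size and hit ratios
arc_summary | head -40

If the hit rate is high but the copy is still slow, the bottleneck is past the cache (protocol, network, or client), not the pool.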

Might install Ubuntu & set up ZFS to see if it gets the same shit performance.


I've gotten up to 1.2GB/s from a T320 with 8x SPINNING drives.
Far faster than a single drive's maximum performance.
Obviously, the spinning drives have a much wider range of performance...
It's just disappointing that NVMe drives get less than a single drive's performance when grouped,
whereas spinning drives reliably perform faster than a single drive can when grouped.

As mentioned, the performance tests of each SSD (synthetic benchmarks via Windows/Ubuntu, and real-data ZFS tests) are:


Configuration              | Read (Synthetic) | Write (Synthetic) | Read (ZFS Test) | Write (ZFS Test)
9300 Pro (single drive)    | 3.2 GB/s         | 3.2 GB/s          | --              | --
9300 Pro (8 SSD, RAIDz2)   | --               | --                | 750 MB/s        | 650 MB/s
7300 Pro (single drive)    | 3.2 GB/s         | 2.2 GB/s          | 680 MB/s        | 550 MB/s
7300 Pro (4 SSD, RAIDz1)   | --               | --                | 680 MB/s        | 550 MB/s
Evo 870 (single drive)     | 500 MB/s         | 500 MB/s          | --              | --
Evo 870 (4 SSD, RAIDz1)    | --               | --                | 500 MB/s        | 400 MB/s
(all of the numbers shown above are average results from tests on a Dell R7415 EPYC with 256GB RAM)
 

Rand__

Well-Known Member
Why don't you try mirrors at all? Or at least scale slowly? And try different numbers of jobs and queue depths?
What's the test command? What are the dataset options? 7% CPU util is nice and dandy, but is that 1 core at 100%? Total utilization potentially doesn't matter unless you scaled up to 1 NVMe/core/job; check it per core, roughly as sketched below.

Also, again, it is not TNC, it's your particular HW; I can get more than 750 MB/s with NVMe easily.
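A rough way to answer the per-core question and dump the dataset options, assuming a Linux-based install (TrueNAS SCALE or plain Ubuntu) with sysstat, and a hypothetical dataset named tank/test; on CORE/FreeBSD, top -P shows per-core load instead of mpstat:

Code:
# Per-core utilization during the transfer; one saturated core shows up
# as a single CPU near 100% even when the average stays in single digits
mpstat -P ALL 2

# Dataset options that most affect throughput tests
zfs get recordsize,compression,sync,atime,primarycache tank/test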
 

ano

Well-Known Member
Something is weird; you should pretty much max those drives out with that CPU on LZ4 and the right fio benchmark, even at 128k.
You should easily rock 50-75% CPU.
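For reference, a 128k fio run of the sort being described might look like this; a sketch only, with a hypothetical dataset path and sizes:

Code:
# Sequential 128k writes through the filesystem (so LZ4 is in the path),
# with several jobs to give ZFS something to parallelize
fio --name=zfs128k --directory=/mnt/tank/test --size=8G \
    --rw=write --bs=128k --ioengine=posixaio --iodepth=16 \
    --numjobs=4 --runtime=60 --time_based --group_reporting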
 

Rand__

Well-Known Member
Are those Dell drives or non-Dell drives? Maybe some proprietary mumbo-jumbo limiting performance? Didn't you say you wanted to open a ticket?
 

TrumanHW

Active Member
Something is weird; you should pretty much max those drives out with that CPU on LZ4
and the right fio benchmark, even at 128k. You should easily rock 50-75% CPU.
Now this is a comment I'm in COMPLETE agreement with. Thank you. :)

Run fio on devices directly as well of course
I have... it's not much better. Rand__ and I discussed that (maybe on page 2 or 3 of this thread).
When I started, I'd made a mistake in not selecting the location of the array, but after that..? Same results as these.

With a RAIDz1 of 4x NVMe drives (which get over 2GB/s individually on the same machine in Windows or Ubuntu), I got ≤ 175 MB/s.
With a RAIDz2 of 8x NVMe drives (which get over 3GB/s individually on the same machine in Windows or Ubuntu), I got ≤ 87.5 MB/s !!!

And I rather doubt that an R7415, which is sold supporting up to 24 NVMe drives in parallel, has a max perf of a single drive. ¯\_(ツ)_/¯
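For anyone following along, "directly on devices" means bypassing ZFS entirely and pointing fio at the raw disk. A sketch for the Ubuntu side, with a hypothetical (Linux-style) device name; double-check the name before running, because this overwrites the drive:

Code:
# Raw sequential write to one NVMe namespace -- DESTROYS data on that device
# (FreeBSD names drives /dev/nvd0 or /dev/nda0 instead of /dev/nvme0n1)
fio --name=rawwrite --filename=/dev/nvme0n1 --rw=write --bs=1M \
    --ioengine=libaio --iodepth=32 --direct=1 \
    --runtime=60 --time_based --group_reporting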
 

TrumanHW

Active Member
What's the test command?
I'm no longer testing via fio. But when we did, contrary to a point you made (and I do trust you):
When I tested a RAIDz1 of 4 NVMe, I got about 125 MB/s according to ZFS I/O performance reporting.
When I tested a RAIDz2 of 8 NVMe, I got about 87 MB/s according to ZFS I/O performance reporting.


Why don't you try mirrors at all? Or at least scale slowly? And try different numbers of jobs and queue depths?
I can't right now; I had to pack my server up because I'm moving VERY soon. But what difference does that make??
Isn't that for fine-tuning? If people got this performance, NO ONE would use SSDs. There's another real problem somewhere.

Are those Dell drives or non-Dell drives?
The 7300 Pros are Dell... which is why the fans don't go to 10,000 freaking RPM with them, and why I'm limited to 4 drives until I finish moving.

Maybe some proprietary mumbo-jumbo limiting performance?
That's why I tested them in 3 different operating systems, and I get good performance for both the Dell and non-Dell drives. ;-) Remember?


Didn't you say you wanted to open a ticket?
I focused on the IDIOTIC fan issue, in which, with nominal CPU and exhaust temps, they punish you and cry (via their fans) because you didn't pay the extortion money... a ransom for arbitrarily overpriced drives with their sticker. :)


I also get the same dog-crap performance using SATA SSDs in RAIDz1 (4x Evo 870) via the HBA330.
E.g.: despite each Evo 870's individual performance of ~500 MB/s R/W, in ZFS as RAIDz1 the set of 4 drives' R/W performance was:
~500 MB/s write (real-world tests of media files greater than 1GB, not synthetic; the same as a single drive would get)...
~680 MB/s read (real-world tests of media files greater than 1GB, not synthetic; an iota more than a single drive gets)...


I didn't test SSDs in mirrors, but I did baseline single-drive vdevs (not fio, but actual tests), which get approximately the same performance as an array.

The synthetic benchmarks & actual 'real-world' performance (copying large (1GB+) video files) in Windows and Ubuntu are the same.
The single-drive ZFS benchmarks shown are from actual testing, which is already vastly off from the drives' performance.

Doesn't the fact that the actual single-device performance matches the synthetic tests in Windows and Ubuntu rule out some Dell BS?

Is there anything "appropriate" or "predictable" about drives which individually get ≥ 2GB/s getting a third of that as an array on this (R7415) hardware..?
When an array of 8x spinning 7200-RPM drives on a T320 gets not a fraction of, but multiples of, the constituent devices it's comprised of..?
When Windows and Ubuntu confirm that the devices can perform at 2GB/s or greater, the CPU is hardly used, and there are 128 PCIe 3.0 lanes..?


Also, again, it is not TNC, it's your particular HW; I can get more than 750 MB/s with NVMe easily.
I think so too, especially after testing on other operating systems, which ruled out several potential issues (IMO)...
And... the first thing it illuminates is that a single device gets less than 1/3 of what that device gets in other OSes, which seems wrong to me. Why?
Because I have (literally) seen up to 1.2GB/s from a RAIDz2 array of 8 spinning drives which at best can get 200MB/s each.
As in, they get 75% of their individual performance in aggregate... and a single NVMe drive? It has no parity to even write out.


Future testing plans:
When I get set back up (there's no time to learn this today), I'll set up a ZFS-on-Ubuntu RAIDz1 array (which will take me time); roughly as sketched below.
I might just unpack & set up Ubuntu real quick to see if I can make a mirror or RAID volume to test perf in Ubuntu...
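For what it's worth, the ZFS-on-Ubuntu side of that plan is only a few commands; a sketch assuming Ubuntu's stock packages and hypothetical device and pool names (this destroys data on the member drives):

Code:
# Install OpenZFS and build a 4-drive RAIDz1 pool
sudo apt install zfsutils-linux
sudo zpool create -o ashift=12 tank raidz1 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
sudo zfs set compression=lz4 tank

# Then repeat the same large-file copies / fio runs against /tank
zpool status tank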
 

ano

Well-Known Member
How are you testing? Exact commands? I've shared mine with fio, I think? Can reshare.

I always verify with iostat.
Sadly, zpool iostat etc. are usually very off.
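A minimal sketch of that kind of cross-check on Linux, with hypothetical device names (FreeBSD has its own iostat with different flags):

Code:
# Per-device throughput in MB/s, refreshed every 2 seconds, while the
# benchmark runs; compare the sum across drives to what fio/zpool report
iostat -xm 2 nvme0n1 nvme1n1 nvme2n1 nvme3n1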
 

TrumanHW

Active Member
How are you testing? Exact commands? I've shared mine with fio, I think? Can reshare.

I always verify with iostat.
Sadly, zpool iostat etc. are usually very off.
Well, I'm testing with large video files, as I mentioned. I did perform tests with fio... but ultimately, it doesn't matter what I get with synthetic tests (although they're about the same as, or slightly worse than, what I get with video). Why..? Because it doesn't matter if I get 50 GB/s on synthetics... if I can't get performance that's at least somewhat proportionate to the cost / technology, what's the point!? Real world is all that matters. And large video files give it a rather good chance of getting a high score, at that.

Again... 4 SATA SSDs in RAIDz1 get nearly the same performance...
as 4x NVMe with 4x the individual drive performance (2GB/s vs 500MB/s)...
as 8x NVMe with 6x the SATA array's individual performance (3GB/s vs 500MB/s)...
2x the drives at 6x the speed suggests 12x the SATA bandwidth... yet they get roughly equivalent performance.

There's NOTHING diagnostically indicative about that!? Really?

A single 7300 Pro vdev only gets about 500MB/s write in TrueNAS, but 2GB/s write in Windows and Ubuntu.
A single 7300 Pro vdev only gets about 600MB/s read in TrueNAS, but 3GB/s read in Windows and Ubuntu.

As in, just to write the checksum data, a single 7300 Pro vdev suffers a loss of 75% (in write) and 80% (in read).

While the 4x SATA Samsung Evo 870s get closer to their appropriate performance, it's still utter dog crap.

My cheap old T320 with spinning rust gets 400MB/s at its low average writing large video files, and up to 800MB/s.
As in, the 8-device vdev gets between 1/4 and 1/2 of its aggregate performance...

If I got the same proportional performance with this configuration...
I'd be getting 2GB/s - 4GB/s from the 4x NVMe (or better, as the spinning array is RAIDz2 vs RAIDz1 with these 4 drives)
- and -
6GB/s - 12GB/s from the 8x NVMe (CPU- and networking-limited, obviously... so probably still that 2-3GB/s), but in which:
the drives have that IOPS bandwidth available... meaning that smaller, IOPS-limited transfers would stay near 2-3GB/s longer.


But instead, faster drives? Hardly any improvement.
More drives that are faster? Same shit. lol

This is HARDLY an issue of which benchmarking method I'm using...
This is a HUGE problem in which I'm off by MULTIPLES of what I should be getting.

Car analogy:
You DON'T need dyno tuning to figure out why a car that makes 300 HP stock is putting 75 at the wheels.

You need tuning to get the last 5-10% out.

Seriously... performance of maybe 1/4 to 1/6, and we're acting like mirrored vs RAIDz1 is the issue?

They are ALL getting the same speeds... and people have already told me that this machine has yielded 2-3GB/s (with slower drives).
 

Rand__

Well-Known Member
The reason we ask for the tests you run in synthetics is that something like CrystalDiskMark uses different QDs and block sizes depending on the test, so just throwing out a number without saying how (with whatever tool) you got to it is not really helpful :)

Also, RAIDZ vs mirror makes a huge difference, in writes at least (single vdev), and, depending on how data is distributed on the drives, in reads too (total test size dependent).
Anyhow, I totally agree that real-life tests are much more meaningful than fancy benchmarking.

So we've established that a single device works fine outside TNC. We also know that TNC can do better on different hardware.
That means it's either a HW compatibility issue or a driver problem (since the HW itself is fine in Windows/Ubuntu).

So, what HBA are you using to connect the NVMe drives? Latest BIOS? Does it use a generic driver in TNC? Can you use a Dell one?

The HBA330 is not particularly fast, and 870 EVOs are not either; depending on how large your files are, you might easily oversaturate them. And again, Z1 = single-drive write speed in a single vdev; only reads are faster if distributed. You can go up with multiple threads (= writing/reading multiple video files at the same time).

I'd run another ticket, using the 7300s, which are supported. I assume TNC is not supported, but maybe FreeBSD is? Else Linux/Windows: run an OS array and see if synthetics match aggregated performance (e.g. along the lines of the sketch below); if not, that's your use case for the ticket. If aggregated perf matches, then try with TNC only, but you're in a worse position since the OS is not supported; maybe they'll help anyway.
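A minimal sketch of such an OS-level array on Linux, using md RAID so ZFS is out of the picture entirely; device and array names are hypothetical, and this destroys data on the member drives:

Code:
# Build a 4-drive RAID-5 out of the NVMe devices
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1

# Sequential write to the md device itself (no filesystem, no ZFS)
sudo fio --name=mdtest --filename=/dev/md0 --rw=write --bs=1M \
    --ioengine=libaio --iodepth=32 --direct=1 \
    --runtime=60 --time_based --group_reporting

If the md array scales and the ZFS pool doesn't, it points at ZFS/TNC; if neither scales, it points at the platform.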
 

ano

Well-Known Member
You need the synthetic fio tests so you have a baseline you can use to debug what's underneath.

You still have 300 at the wheels, but maybe your wheels are plastic Hot Wheels wheels and having trouble putting it down.

Clients are a huge thing.
 

TrumanHW

Active Member
DAMMIT!! Why do you guys have to always be right?? :'(
Tested the stupid thing with RAID in Ubuntu; same shit!!

W: 730MB/s (RAID-5)
R: 850MB/s (RAID-5)

W: 2GB/s (1 drive)
R: 3GB/s (1 drive)

I don't understand this... but something makes this machine "defecate the bed" in parallel.
Now to get Dell to agree it's stupid for a single drive to outperform 3 in a RAID-5 or ZFS.


[Attached screenshots: Ubuntu 1 NVMe Perf.jpg, Ubuntu 3 NVMe RAID Perf.jpg]
 

Rand__

Well-Known Member
That sounds like a resource contention problem, as if multiple drives are sharing the PCIe bus and thus limit themselves... but that would be stupid. Not impossible, though.

Else Dell is indeed your best bet. Make sure to use a supported OS, drivers, and drives to avoid discussions :)
 

TrumanHW

Active Member
I THINK... the 24 slots are in banks of 8... so far I've been using the third bank (because I added an HBA330 to slots 0-7)... think it'll make any difference to try slots 8-15..?
 

TrumanHW

Active Member
That sounds like a resource contention problem, as if multiple drives are sharing the PCIe bus and thus limit themselves... but that would be stupid. Not impossible, though.

Else Dell is indeed your best bet. Make sure to use a supported OS, drivers, and drives to avoid discussions :)
But doesn't EPYC provide 128 PCIe 3.0 lanes..?

(Rhetorical)... How do you get LESS than a single drive's performance just because several are working at once?
 

ano

Well-Known Member
Benchmark 1, 2, 3, and 4 drives at a time; you can do it easily with fio in the same benchmark, to make sure it's not a bus issue.
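A sketch of that in fio's job-file form (hypothetical device names again; comment job sections in or out to step from 1 to 4 drives, since all sections run in parallel by default):

Code:
; scale-test.fio -- raw sequential writes, one job section per drive
[global]
rw=write
bs=1M
ioengine=libaio
iodepth=32
direct=1
runtime=60
time_based
group_reporting

[nvme0]
filename=/dev/nvme0n1
[nvme1]
filename=/dev/nvme1n1
[nvme2]
filename=/dev/nvme2n1
[nvme3]
filename=/dev/nvme3n1

Run it with fio scale-test.fio; if per-drive throughput drops as you enable more sections, something upstream (bus, backplane, switch) is the limit.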
 

Rand__

Well-Known Member
I THINK... the 24 slots are in banks of 8... so far I've been using the third bank (because I added an HBA330 to slots 0-7)... think it'll make any difference to try slots 8-15..?
Did we ever talk about the drive-to-PCIe-card/multiplier-to-slot layout? Maybe there is your problem?
The CPU provides enough lanes, but they might not "arrive" at the slots. 24 drives × 4 lanes = 96 lanes needed for the drives alone; are those provided?

And spreading out might be a good idea, especially if that uses multiple cards...
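One way to check how the lanes actually arrive at the drives (from the Ubuntu side; bus addresses are hypothetical and will differ per system):

Code:
# Tree view of the PCIe topology: look for a switch/multiplier that
# all the NVMe drives hang off of
lspci -tv

# Negotiated vs maximum link width/speed for one NVMe controller
# (substitute the bus address reported for your drive)
sudo lspci -vv -s 41:00.0 | grep -E 'LnkCap|LnkSta'

If four x4 drives sit behind a single x8 uplink, parallel throughput caps at the uplink no matter how many lanes the CPU offers.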
 