Benchmarking RaidZ2 vs vdevs


wdp

New Member
May 1, 2021
25
2
3
Thanks for the extra info @Rand__ . That helps me narrow my train of thought a lot.

In fio, what does the number of jobs simulate? They aren't running concurrently, are they?
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
#jobs is basically what it says: the number of concurrent processes being executed.
In your case that should be 3 or 4, depending on how many editors are working on the same box at a time (or run with fewer jobs but different usage patterns to simulate some random activity).
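As a concrete sketch of the point above (the mount path, file size, and job count here are assumptions, not from the thread), a fio run where numjobs stands in for concurrent editors might look like:

```shell
# Hypothetical fio invocation: --numjobs=4 forks four concurrent worker
# processes, each streaming its own 10G file, roughly approximating
# four edit bays reading from the pool at once.
# /mnt/tank/media is a placeholder path.
fio --name=editbay \
    --directory=/mnt/tank/media \
    --rw=read --bs=1M --size=10G \
    --numjobs=4 --group_reporting \
    --runtime=60 --time_based
```

With --group_reporting, fio aggregates the four jobs into a single result line, which is the number to compare against the pool's expected streaming throughput.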
 

wdp

So many 2600 v4 SKUs... ugh.

Looking at 2630 v4, 2640 v4, 2650 v4, 2667 v4, 2690 v4.

I think the E5-2690 v4 chips are about $300 used. That looks like the sweet spot.

The TDP cap is 145W, and I think those are all around 135W, besides the 2630. Just not sure if it's worth sinking $700 into a bigger chip, or just getting out of the 2603s to see if I can even notice when it's under load.
 

Rand__

CPU - the 2603 is basically the entry-level chip - I'd definitely recommend upgrading that - higher clock is preferred over many cores in your case. Think 2 + (1-2 cores/user).
I think the E5-2690 v4 chips are about $300 used. That looks like the sweet spot.
1620511422704.png

Hmmm ?

But do as you like o/c; you can also look at the 1650/60/80...
 

wdp

Hah. Was that approval or concern? Budget isn't really a concern, I just don't want to waste money for no reason.

The board is an X10DRH-iT | Motherboards | Products | Super Micro Computer, Inc., and seeing as this is the first server board I've ever worked with, as old as it is, the page says E5-2600 v3/v4. So I was sticking to that range and TDP cap. And since it's a dual-socket board..
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,625
2,043
113
The 2667 v3 or v4 is my vote: not a really low core count, but high frequency.

Bump that RAM to 8x32GB, or at least 8x16GB, and you can try even more configuration options.
 

Rand__

Hah. Was that approval or concern? Budget isn't really a concern, I just don't want to waste money for no reason.
Concern...
I said 2 (basic OS functions) + 1-2/user. Given you said a couple, I don't see how you need a 14C/28T CPU (which operates at a medium clock level)? Don't get me wrong, it's a nice CPU and a good choice if you need lots of cores (a step up from the 2680 you mentioned before), but for only a couple of users? Especially if you have a dual board you can/want to use.

The 2667s @T_Minus mentioned are usually a great choice for a filer that needs a bit more single-thread performance (or, for low core requirements, get the 16XXs; a single one runs on a dual board too). You can also consider the 2687W, but that might go over the TDP cap.

Also, I too think you should spend some cash on memory.
 

wdp

Upgraded the RAM today (to 128GB), racked the server, built out the pool, set up the datasets, configured ACLs, set up SMB, and wrote 13TB of data in about 4.5-5 hours, which is expected for my edit bay's onboard Aquantia 10GbE. Read/write tests seem good, iperf is solid. So the basics were done.

Everything seemed great. Until...

I loaded up a video file in a timeline and hit play. 3 seconds, stall, 3 seconds, stall...3 seconds, stall...

Opened the same file on a stock Synology 12-bay and it plays fine. The file doesn't seem to matter; it's all playback. I assume it's attempting to be loaded into cache/ARC?

Where does one go from here?
 

wdp

Concern...
I said 2 (basic OS functions) + 1-2/user. Given you said a couple, I don't see how you need a 14C/28T CPU (which operates at a medium clock level)? Don't get me wrong, it's a nice CPU and a good choice if you need lots of cores (a step up from the 2680 you mentioned before), but for only a couple of users? Especially if you have a dual board you can/want to use.

The 2667s @T_Minus mentioned are usually a great choice for a filer that needs a bit more single-thread performance (or, for low core requirements, get the 16XXs; a single one runs on a dual board too). You can also consider the 2687W, but that might go over the TDP cap.

Also, I too think you should spend some cash on memory.
I don't really have a true user limit in the end. At any given time I have 10 edit bays on the network via an Arista 10GbE switch. 12 spindles won't handle that stream load no matter what hardware it's backed by. I do plan to add an expansion chassis and continue adding storage at a later date and upgrade, or replace it all with a more modern server.

My main concern was whether the CPU would bottleneck. But CPU load is relatively low during playback, and pegged when doing high-speed writes from a client.

And playback has a frequent lag/stutter issue on first playback, which I've yet to resolve and mentioned in my previous post.
 

Rand__

Where does one go from here?
Identify the cause;)

If you're lucky it's networking, because then you can simply increase the send/receive buffers on both ends, and that might help.

If it's the disks, there are also buffers you can tune, or you can try moving metadata to a special vdev to lighten the load on the disks. But keep in mind that disk performance deteriorates as used space grows...
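For reference, the two knobs mentioned above might look like the following on TrueNAS CORE / FreeBSD; the values, pool name, and device names are assumptions, so check your platform's defaults before applying anything:

```shell
# Network side: raise the TCP send/receive buffer ceilings to 16 MiB
# on both ends of the 10GbE link (FreeBSD sysctls; values are examples).
sysctl kern.ipc.maxsockbuf=16777216
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216

# Disk side: add a mirrored special vdev so metadata reads stop
# competing with media streams on the spinning disks.
# NOTE: only data written after the vdev is added benefits, and
# "tank", nvme0, and nvme1 are placeholder names.
zpool add tank special mirror nvme0 nvme1
```

A special vdev is pool-critical: if it fails, the pool fails, which is why it is mirrored here.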
 

wdp

Read/write tests (Blackmagic) show 650/670 MB/s, as expected from the stock Aquantia 10GbE; iperf is solid; server-to-edit-bay upload is fast (I moved 13TB in 5 hours via robocopy; it was 6 with Explorer).

Download in Windows 10 is slow (120MB/s when pulling a file; the same file from another server is fast, 600-700MB/s). But TrueNAS through the edit bay to another btrfs server (Synology, stock SMB3 config) on the 10GbE is 600MB/s sustained read/write. I'm not sure how that pass-through works vs. pulling down to a local drive, or why that is so slow, but it could be the root of the playback stutter if it's effectively capping max speed.

The fact that other tasks are fast leads me to think the pool itself is fine.
 
Last edited:

Rand__

I actually found this very difficult to follow and don't think I understood it correctly...
 

wdp

Sorry, I was on a tangent earlier. I was finally at a point where I thought everything was working how I wanted, and then in real-world usage it buckles.

Gonna go wrench on the problem some more and run tests against multiple clients before I post again.
 

wdp

Well, after two days of real-world testing, I couldn't get mirrored vdevs to stabilize for read performance on media content.

Something may have been wrong, and being new to ZFS, there may be behaviors I can't account for yet. But we couldn't play back a lot of RAW or high-resolution content with only 12 spinning drives in the array. We'd see constant playback lag even with a single client hitting the server. All the benchmarks and testing said it was operating fast enough, but when I actually loaded footage into the dataset and attempted to edit, the whole system bucked, even at drastically reduced playback quality in the NLE.

After I ran out of ideas, I slept on it, blew the pool away in the morning, and fell back to 2x RaidZ2 across the 12 bays for more comparison testing. Oddly enough, I got the opposite results from what I expected: my writes slowed down a lot and my reads were more consistent on RaidZ2. It's still operating much slower than our QNAP or Synology NAS or enterprise SAN systems, which range from 8 to 36 drive bays, but it's at least usable now, at least from one client. We'll see how it holds up as we hammer it harder over the next week or two.
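For anyone comparing, the two 12-bay layouts described in this thread would be created roughly like this (the pool name "tank" and the da0..da11 device names are placeholders, not from the thread):

```shell
# Layout 1: six 2-way mirrors (the original attempt).
# Usable capacity of 6 disks; any given block lives on one mirror pair,
# so a single large sequential read only touches those two drives.
zpool create tank \
    mirror da0 da1  mirror da2 da3  mirror da4 da5 \
    mirror da6 da7  mirror da8 da9  mirror da10 da11

# Layout 2: two 6-disk RAIDZ2 vdevs (the fallback that read better here).
# Usable capacity of 8 disks; each large block is striped across the
# 4 data disks of a vdev, so a single stream pulls from more spindles.
zpool create tank \
    raidz2 da0 da1 da2 da3 da4 da5 \
    raidz2 da6 da7 da8 da9 da10 da11
```

This matches the observed trade-off: RAIDZ2 gives steadier single-client sequential reads by striping wider, while mirrors buy IOPS for many small concurrent readers at the cost of capacity.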
 

Rand__

My impression (based on personal gut feeling, not actual scientific measurement) is that a mirrored-vdev pool will not distribute writes (and subsequently reads of newly written data) equally across all drives in the array when using only a single thread. Accordingly, reads of that data will not come from all drives, since it only resides on a few of them.

If you run multiple threads this will o/c be better since more drives are involved.

And o/c the IOPS are better since each operation can have access to a fresh spindle with empty disk cache and idle heads...

Observe the input/output bandwidth and IOPS for the Z2 to see what your requirements are for single/multiple users.
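A minimal way to do that observation on the running pool (the pool name "tank" is a placeholder):

```shell
# Per-vdev and per-disk read/write ops and bandwidth, sampled every
# 2 seconds; during single-client playback, an uneven read column
# across the drives shows how unevenly the stripe is being used.
zpool iostat -v tank 2

# ARC hit/miss counters (FreeBSD/TrueNAS CORE sysctl names): a high
# miss rate on first playback would match the "stutters once, then
# plays fine from cache" symptom described earlier in the thread.
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
```

Running the iostat loop while one edit bay scrubs a timeline separates "the disks are saturated" from "the network or SMB path is the bottleneck".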
 

wdp

A single camera file can be up to 200-300 MB/s at full quality in a timeline, at least a big one. But most are around 100 MB/s or less. Anything over 100 MB/s buckled on mirrored vdevs, stuttering in the timeline until it was cached for playback on the edit bay. Maybe if it spanned more platters, had more spindles; but with 12 bays it was really only hitting a few drives on most reads. The only reason I wanted to test mirrored vdevs is that the iX engineering crew said it was the way to go for video editing, and I've never really had a server I could just play around with. We usually rack them, RAID 6, and start working right away.

So far so good on 2x RaidZ2, though. I haven't hit it hard with multiple clients yet, but it is working. I can scrub a timeline fine, and only our biggest 8K camera seems to push it over the edge. CPU usage during reads is relatively low.

I guess I just expected mirrored vdevs to distribute more? What would happen is that large chunks of data were only in 1-2 places, it seems. For small files and lots of users, that would be fine. For something big, a single machine would buckle that section of the server.
 

Rand__

I guess I just expected mirrored vdevs to distribute more? What would happen is that large chunks of data were only in 1-2 places, it seems. For small files and lots of users, that would be fine. For something big, a single machine would buckle that section of the server.
Me too, learned the hard way.


Run what's working for you; as long as you do realistic tests, that should be fine.
Just remember performance will drop as the pool gets fuller.