Correct way to set up storage for home.


ttabbal

Active Member
Mar 10, 2016
Well, with RAID6 I'll get a performance upgrade anyway, right?
The idea of buying a bunch of used 2TB drives is tempting, but I still have doubts about it.

Performance upgrade from what? A single mirror pair? A single drive? Maybe, depending on the access pattern. Performance of an array depends greatly on how it's accessed. That's one reason there are so many types; there is no one-size-fits-all. If you want to store movies for simple playback on a single client, performance isn't an issue and any of them will do. Start running 6 clients and it won't keep up. At least, it doesn't for me. Basic file serving? Probably OK most of the time. VM/database storage? It will likely cause issues.

The used drives are a good, cheap way to get started. If you thoroughly test them, they are fine. If you try to blindly use them, you might run into problems later. No different from new drives. I had a similar failure rate with those vs. new, and so have most people. I had one DOA, and the seller took the return with no issues. The full tests take a few days to run. There are no shortcuts here, and you should be doing the same with new drives if you value your data. Particularly with ZFS and reasonably frequent scrubs, I like them. If whatever filesystem/RAID you use can't verify data, they are probably about the same as anything else, really.
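
For what it's worth, here's a minimal burn-in sketch of the kind of thing I mean, written as a small Python wrapper around smartctl and badblocks (assumes Linux with smartmontools and e2fsprogs installed; the device list is a placeholder, and the badblocks -w pass is destructive, so only point it at drives with nothing on them):

import subprocess

# Placeholder list of drives to burn in. Every device listed here will be WIPED.
DEVICES = ["/dev/sdb", "/dev/sdc"]

for dev in DEVICES:
    # Destructive write-mode badblocks pass over the whole drive.
    # Four patterns, whole surface - this is the part that takes days on large disks.
    subprocess.run(["badblocks", "-wsv", dev], check=True)

    # Kick off a SMART extended self-test; it runs on the drive itself in the background.
    subprocess.run(["smartctl", "-t", "long", dev], check=True)

# Once the self-tests finish, check each drive with 'smartctl -a /dev/sdX' and look for
# reallocated or pending sectors before trusting it with real data.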

What is your use pattern? How is the server connected? Local processes hitting the array? What are your priorities (data integrity, speed, space efficiency, etc.)?

If you're not sure about those, one option is to pick up, say, 4 of the used drives, make an array and copy your data to it. Now hit it with all the things you want to do with it, and a couple of other things for testing. Watch various stats like I/O wait, speed per disk, speed per client, IOPS, etc. Then try a different style. Maybe do RAID6 first, then mirrors, then raidz and ZFS mirrors, then md or snapraid. Just get an idea of the pros/cons of each type for your needs. Then consider expansion, error handling, etc. before deciding which way you want to go. That's how I did it when I got started, and it was helpful to see the various options in action.
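
fio is the usual tool for this, but if you just want a rough feel for random IOPS vs. sequential throughput on whatever layout you're testing, a quick Python sketch like this works too (the file path is a placeholder, and the test file needs to be much larger than RAM or you'll mostly be measuring the page cache):

import os, random, time

PATH = "/mnt/testpool/testfile.bin"   # placeholder: a large file on the array under test
BLOCK = 4096                          # 4K random reads
DURATION = 10                         # seconds per phase

fd = os.open(PATH, os.O_RDONLY)
size = os.fstat(fd).st_size

# Random-read phase: count 4K reads at random offsets completed in DURATION seconds.
ops = 0
end = time.time() + DURATION
while time.time() < end:
    os.pread(fd, BLOCK, random.randrange(0, size - BLOCK))
    ops += 1
print(f"random 4K reads: ~{ops / DURATION:.0f} IOPS")

# Sequential phase: read 1 MiB chunks front to back and report MB/s.
chunk, done, off = 1 << 20, 0, 0
end = time.time() + DURATION
while time.time() < end and off + chunk <= size:
    os.pread(fd, chunk, off)
    off += chunk
    done += chunk
print(f"sequential reads: ~{done / DURATION / 1e6:.0f} MB/s")

os.close(fd)

Run it against the same data on each layout, locally and from a client over the network, and the differences show up quickly.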
 

TLN

Active Member
Feb 26, 2016
Performance upgrade from what? A single mirror pair? A single drive? Maybe, depending on the access pattern. Performance of an array depends greatly on how it's accessed. That's one reason there are so many types; there is no one-size-fits-all. If you want to store movies for simple playback on a single client, performance isn't an issue and any of them will do. Start running 6 clients and it won't keep up. At least, it doesn't for me. Basic file serving? Probably OK most of the time. VM/database storage? It will likely cause issues.
I know no two patterns are the same, and all of us will get different results.

I expect that RAID 6 (of 6 drives) will be faster than a mirror of 2 drives or a single drive for sequential reads.
Sequential writes should also be faster on RAID 6, while mirror and single-drive write speeds are about the same.

For random read/write there's no simple answer, but I expect RAID 6 (again, of 6 drives) to be faster than a single drive.
How it compares to a mirror is the interesting question. I can see some scenarios where a mirror will be better. It's more obvious if I run two mirrors: not RAID 10, but one RAID 1 and another RAID 1 serving different purposes.

What is your use pattern? How is the server connected? Local processes hitting the array? What are your priorities (data integrity, speed, space efficiency, etc.)?

If you're not sure about those, one option is to pick up, say, 4 of the used drives, make an array and copy your data to it. Now hit it with all the things you want to do with it, and a couple of other things for testing. Watch various stats like I/O wait, speed per disk, speed per client, IOPS, etc. Then try a different style. Maybe do RAID6 first, then mirrors, then raidz and ZFS mirrors, then md or snapraid. Just get an idea of the pros/cons of each type for your needs. Then consider expansion, error handling, etc. before deciding which way you want to go. That's how I did it when I got started, and it was helpful to see the various options in action.
Home use only; I've mentioned this here.
For mirrors I'd do: one RAID 1 for critical data, another RAID 1 for less critical data, and some drives without mirrors as temp storage.
It's pretty easy to upgrade and should be pretty fast. This is the "default" option.

I expected RAID 6 to be a better option because it provides better performance and redundancy, and you also get more usable space. But it seems it's not that obvious, and if someone streams, it will affect other users.

I was also thinking about some SSD caching option. Is there something like that? SSDs are pretty cheap, and if there's a solution to speed up RAID 6 with them, it would be really great.
 

TuxDude

Well-Known Member
Sep 17, 2011
...
For random read/write there's no simple answer, but I expect RAID 6 (again, of 6 drives) to be faster than a single drive.
Big differences here depending on whether you are looking at traditional block-level RAID-6, or ZFS's RAID-Z2.

For traditional RAID-6, random reads will be faster than a single drive - you should expect about N-2 times single-disk performance, maybe even slightly more. To read a single random block from a RAID-6, you just read that block from the drive holding it - no overhead, and all drives in the array can be handling reads simultaneously, assuming a high enough queue depth against the array. Random writes are the big hit here - you should expect about (N-2)/6 performance. To write a single random block on a RAID-6 array, the controller (hardware or software) has to first read the current data stored at that block, as well as both parity blocks associated with it. It can then do the math using old value + new value + old parity to calculate what the new parity values should be, after which it can finally write the new data and both new parity blocks to disk. So to update a single block on the array, the controller has to do 6 IOs against the member disks. This is the 'read-modify-write' cycle, and it is why random write performance on parity RAID sucks - RAID-5 has a penalty multiplier of 4, and RAID-6 has a multiplier of 6. Assuming your RAID controller is reasonably smart (most are, hardware or software), you can bypass the penalty for sequential writes - if you are writing a full stripe of data to the array, you already have all of the information you need to calculate the new parity, so you can just do the math and write the data + parity to disk.
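
To put rough numbers on that, here's a back-of-envelope sketch in Python using the figures above (the 6-disk array and 100 random IOPS per spindle are just assumed example values, not measurements):

# Rough random-IOPS estimates for parity RAID, using the read-modify-write
# penalty multipliers described above (RAID-5 = 4, RAID-6 = 6).
DISKS = 6          # assumed array size
DISK_IOPS = 100    # assumed random IOPS of a single 7200rpm spindle

def parity_raid_iops(disks, parity_disks, disk_iops, write_penalty):
    reads = (disks - parity_disks) * disk_iops                   # ~ (N - parity) x one disk
    writes = (disks - parity_disks) * disk_iops / write_penalty  # each logical write costs `write_penalty` disk IOs
    return reads, writes

for name, parity, penalty in (("RAID-5", 1, 4), ("RAID-6", 2, 6)):
    r, w = parity_raid_iops(DISKS, parity, DISK_IOPS, penalty)
    print(f"{name}: ~{r:.0f} random read IOPS, ~{w:.0f} random write IOPS")

# For comparison, striped mirrors with the same 6 disks: reads from every spindle,
# and each logical write only hits the 2 disks of one mirror pair.
print(f"RAID-10: ~{DISKS * DISK_IOPS} random read IOPS, ~{DISKS * DISK_IOPS // 2} random write IOPS")

With those assumed numbers a 6-disk RAID-6 lands around 400 random read / 67 random write IOPS, against roughly 600 / 300 for the same disks as striped mirrors.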

ZFS, on the other hand, does things very differently, and I've got very little experience with it so my explanations might be slightly off. The key is that ZFS only ever reads or writes full stripes of data, regardless of whether the access is sequential or random. So for a random read of a single block, ZFS reads in the entire stripe from all of the drives in the array - the checksums it uses to guarantee data integrity are calculated at the stripe level, and it needs all of that extra data to verify that the one block you are interested in is actually correct. Random writes, on the other hand, are first absorbed by the intent log (ZIL, for sync writes) and RAM, and then written to the array in batches, a full stripe + parity at a time. ZFS never does a 'read-modify-write' by design, making it immune to the so-called 'RAID-5 write hole', but it also involves every disk in the array for every IO, so the disks can't be using that time to do other things - hence all the statements that raid-z is the speed of the single slowest disk in the array.
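
Following the same back-of-envelope style, the "raid-z is the speed of the single slowest disk" point looks roughly like this for random reads (again, assumed example numbers; a pool of mirror vdevs built from the same disks is shown for contrast):

DISKS = 6
DISK_IOPS = 100   # assumed random IOPS per spindle

# RAID-Z2: every disk in the vdev participates in each block read,
# so the whole vdev delivers roughly one disk's worth of random read IOPS.
raidz2_read_iops = DISK_IOPS

# Three 2-way mirror vdevs from the same 6 disks: the vdevs serve reads independently,
# and within a mirror either side can satisfy a read.
mirror_read_iops = DISKS * DISK_IOPS

print(f"6-disk RAID-Z2 vdev:   ~{raidz2_read_iops} random read IOPS")
print(f"3x 2-way mirror vdevs: ~{mirror_read_iops} random read IOPS")

For single-client sequential streaming either layout is usually fine; it's the random/multi-client case where the difference shows up.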
 

ttabbal

Active Member
Mar 10, 2016
If you use ZFS, you can add SSD cache drives quite easily. That would help with various speed issues on the arrays. I think some of the other setups can do similar caching.

I honestly don't see a lot of point in 2 separate mirrors. Combining them gives more space and performance.
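
In ZFS terms, both of those are one-liners. A minimal sketch, wrapped in Python; the pool name "tank" and the device paths are placeholders for whatever your hardware actually is:

import subprocess

# Add a second mirror vdev: the pool then stripes across both mirrors,
# which is where the extra space and IOPS come from versus two separate pools.
subprocess.run(["zpool", "add", "tank", "mirror", "/dev/sdc", "/dev/sdd"], check=True)

# Add an SSD as an L2ARC read cache device.
subprocess.run(["zpool", "add", "tank", "cache", "/dev/nvme0n1"], check=True)

# A separate log (SLOG) device only helps synchronous writes (NFS, databases, VMs),
# so only add one if that's actually your workload:
# subprocess.run(["zpool", "add", "tank", "log", "/dev/nvme1n1"], check=True)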
 

TLN

Active Member
Feb 26, 2016
I honestly don't see a lot of point in 2 separate mirrors. Combining them gives more space and performance.
How does it give me more space?
I don't think it's a good idea to mix different models in an array (we're talking RAID here).
For example, I can simply buy 2x 4TB drives and put them in RAID 1, and use the old pair in another RAID 1. Plenty of space and fewer potential problems with random access.
 

TuxDude

Well-Known Member
Sep 17, 2011
The more space part I can understand - having a single 4TB volume is easier to make use of than two 2TB volumes - especially if you have a 3TB file to store. It's technically not any more space, but you will make better use of it by not having the free space fragmented across many smaller volumes.

But the performance part depends on the workload - let's use streaming video as an example. A single user streaming a video off a volume is a sequential-read workload, which anything can handle quite easily. But the moment a second user starts watching a different video, the disk (or array, whatever) now has to bounce back and forth between the two streams - two sequential workloads on the same device equal one random workload, and this is where having multiple arrays can give increased performance. Two users watching videos held on two different RAID-1 arrays would not affect each other, and each array would have a simple sequential-read workload to deal with. Taken further to the extreme, this is where snapraid can give very good performance - my snapraid media library is spread across 15 or so drives now; I have far more drives than typical concurrent users, so any given user can usually have a spindle to themselves and get very good sequential speeds. Back when my media library was on a traditional RAID-6 array, any write activity at all added to the usually non-stop read workload would cause the entire array to start doing a lot of 'read-modify-write' activity, each drive spending far more time seeking than reading/writing, and overall array throughput would drop to <20MB/s.
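
A toy model of why interleaving hurts so much on a single spindle (all the numbers here are assumed round figures, just to show the shape of the effect):

# One drive serving k video streams round-robin.
SEQ_RATE = 150e6   # assumed ~150 MB/s sequential rate for one spindle
SEEK = 0.015       # assumed ~15 ms seek + rotational latency per hop between streams
CHUNK = 1 << 20    # assume the drive reads ~1 MiB for a stream before hopping to the next

def total_throughput(streams):
    if streams <= 1:
        return SEQ_RATE                        # pure sequential, no hopping
    return CHUNK / (SEEK + CHUNK / SEQ_RATE)   # every chunk now costs a seek

for k in (1, 2, 4):
    t = total_throughput(k)
    print(f"{k} stream(s): ~{t / 1e6:.0f} MB/s total, ~{t / k / 1e6:.0f} MB/s per stream")

With those assumptions, one stream gets the full ~150 MB/s, while two or more interleaved streams drag the whole drive down to roughly 45-50 MB/s total; smaller chunks, or writes mixed in, push it down further still, which is the <20MB/s behaviour described above.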
 

Keljian

Active Member
Sep 9, 2015
Melbourne Australia
My intention (re RAID-Z2) is to do a gradual upgrade: as disks fail, replace them with bigger ones, until they are all replaced or I have the cash to replace the remainder.

I like Z2 because there is a statistically greater chance that both drives in a RAID 1 vdev will die than 2 of the 7 others.
 

ColPanic

Member
Feb 14, 2016
ATX
Another option worth considering is Windows Server 2016 Storage Spaces. They've made some big improvements over 2012 R2 and it's a viable option for hosting ESXi datastores. I'm moving my personal data over from FreeNAS now and am getting much better performance, especially for reads.

Here are some of the advantages:
ReFS adds a lot of the resiliency features that ZFS has historically had, and it's optimized for storing VHDs.
Up to 3 levels of tiering: NVMe as a write cache if you have it, then lots of SSDs for your fast tier, then spinners for the slow tier.
You can mix redundancy levels between tiers, e.g. striped and mirrored SSDs, then parity for the spinners. This gives very fast writes all the time (provided you have the SSD space).
SMB3 if you are sharing to Windows
Reasonable memory requirements
Much easier to expand pools and add capacity later on
DEDUPE!!! That really works!!!!
You never have to listen to that cyber guy tell you what an idiot you are for not using 64GB ram for your NAS box.

But the big one for me is that MS has a ton of engineers developing this. FreeNAS seems to be just a few guys relying on the old ZFS stack from before Oracle took it closed source, and there is an odd cult-like hostility to questioning their methods.

I'm still testing and evaluating both but so far I'm leaning toward Windows as my next storage os.
 

katit

Active Member
Mar 18, 2015
Another option worth considering is Windows Server 2016 Storage Spaces. They've made some big improvements over 2012 R2 and it's a viable option for hosting ESXi datastores. I'm moving my personal data over from FreeNAS now and am getting much better performance, especially for reads.
I see people mention a lot of good stuff about 2016
Where do I get it? I understand it's not public yet? How do you plan to upgrade when it goes public?
 

TLN

Active Member
Feb 26, 2016
I don't really want to go with Windows. I just want to be able to plug it into a random PC and get my data in case I need it.
 

ColPanic

Member
Feb 14, 2016
ATX
I see people mention a lot of good stuff about 2016
Where do I get it? I understand it's not public yet? How do you plan to upgrade when it goes public?
RTM is not public yet but is available through several beta programs. Supposedly you can upgrade to RTM from the TP but at the very least you'd be able to import pools and volumes. I did that from TP5 to RTM.
 

Franko

New Member
Oct 21, 2014
Another option worth considering is Windows Server 2016 Storage Spaces. They've made some big improvements over 2012 R2 and it's a viable option for hosting ESXi datastores. I'm moving my personal data over from FreeNAS now and am getting much better performance, especially for reads.

...
How about doing a write-up on your findings? I am very interested in seeing how 2016 Storage Spaces has been improved compared to 2012r2 and I am sure other readers are as well.
 

ColPanic

Member
Feb 14, 2016
ATX
I've been thinking about a write-up and will try to find the time.
In addition to the things I mentioned above, another great feature is rebalancing. In FreeNAS, if you add a vdev (stripe) to an already full pool, it does nothing to rebalance the existing data. This means most of your new writes (and reads) go to the new disks, which can cripple performance. WS2016 will rebalance. You can also tell it to empty a disk for removal, which can be very handy.

FreeNAS has always felt like a hack (despite its underlying ZFS pedigree).
 

katit

Active Member
Mar 18, 2015
2016 sounds very good. I played with Storage Spaces in Win 10, did some very primitive performance testing, and I like it.
The only question is what the street (eBay) price will be for Server 2016. It might be cost-prohibitive for home use.
 

Franko

New Member
Oct 21, 2014
If an article is too much commitment, perhaps we can just start up a discussion thread or two once Server 2016 is officially released.
 

TLN

Active Member
Feb 26, 2016
I'm still debating the build (read this as zero progress).
What's your opinion on storing all the data on an external device? It kills the whole idea of an AIO system and makes things a bit more complicated, so that's just a thought. Most likely it would require a 10Gb network, right? 1Gb is perfectly fine while all the VMs are stored locally, but 1 gig is not that fast nowadays, right?