Storage space mirror vs RAID10 for a large array

ca3y6 · Feb 19, 2025

I asked this question on Reddit, but that seems to be the wrong audience for something this technical. Perhaps someone here knows the answer.

My understanding of how Windows Storage Spaces allocates stripes across disks is that it is primarily based on capacity. So if I have a two-way mirror virtual disk with a single column (so every stripe is written to two disks for redundancy), the stripes may end up distributed on any combination of two disks in the array. So some will be on disks A and B, some on B and C, some on A and C, etc.

RAID10, on the other hand, is basically a set of pairs of disks, each in RAID1. What it means is that every stripe is saved on the same two disks of a RAID1 pair, i.e. always A and B, or C and D, but never A and D, etc.

Now both give you redundancy for the loss of a single disk, that's fine and understood. My question is what happens when you lose more than one disk at a time.

For RAID10, it is basically down to whether you are unlucky enough that the other failed disk is part of the same RAID1 pair as the first disk to fail. If you lose both A and B, that's the end. But on a large array (think 24 disks), chances are the other disk will be from another pair (say you lose A and D). So in most cases you can recover from the loss of more than one disk, and if you are extremely lucky, you could in theory lose up to half of your disks before losing any data (a remote scenario).
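To put rough numbers on the RAID10 argument above, here is a minimal sketch. It assumes the failed disks are picked uniformly at random and fail at the same time (no rebuild in between), which is a simplification. The function name is made up for illustration.

```python
from math import comb


def raid10_survival_probability(total_disks: int, failed_disks: int) -> float:
    """Chance that a RAID10 array survives `failed_disks` simultaneous failures.

    Assumption: every set of failed disks is equally likely.
    """
    if total_disks % 2 != 0:
        raise ValueError("RAID10 needs an even number of disks")
    pairs = total_disks // 2
    if failed_disks > pairs:
        # More failures than pairs: at least one pair has lost both disks.
        return 0.0
    # Survival means every failed disk sits in a different RAID1 pair:
    # choose which pairs are hit, then which of the two disks in each pair.
    surviving_patterns = comb(pairs, failed_disks) * 2 ** failed_disks
    return surviving_patterns / comb(total_disks, failed_disks)


for failures in (1, 2, 3, 4, 12):
    probability = raid10_survival_probability(24, failures)
    print(f"{failures} failed disks of 24: survival {probability:.4f}")
```

For two failures in a 24-disk array this works out to 22/23: once the first disk is gone, only one of the 23 remaining disks is its mirror partner.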

For a Storage Spaces mirror, on the other hand, it seems that if the pairs of stripes are indeed distributed across all disks, it is very likely that if you lose two disks, some of the stripes will span those exact two disks, resulting in data loss. Unless the Storage Spaces algorithm is smart enough to avoid using too many different pairs of disks, i.e. it tries to favour spanning stripes between A and B, and C and D, but never A and D if it can avoid it. My question is, is it that smart?

This is what Storage Spaces would look like if it were that smart, i.e. if it limited the number of distinct pairs of disks:

Because if it isn't, what it means, if I understand correctly, is that a Storage Spaces mirror is basically no better than RAID5 in terms of redundancy, i.e. you can lose one disk but you are dead on the second disk failure, while having the capacity penalty of RAID10 (i.e. you sacrifice half of the storage capacity). RAID10, by contrast, can most often tolerate more disk failures in a large array, unless you are unlucky enough to have two disk failures in the same RAID1 sub-array (and that risk goes down as the array has more disks).
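The worry above can be sketched numerically. This is not how Storage Spaces is known to allocate; it is a worst-case model that assumes each mirrored allocation unit (called a "slab" in the code) lands on a uniformly random pair of disks. The slab counts and function names are illustrative assumptions.

```python
import random
from math import comb


def mirror_loss_probability(total_disks: int, slabs: int) -> float:
    """Chance that two random failed disks share at least one mirrored slab.

    Assumption: each slab is placed on a uniformly random pair of disks.
    """
    distinct_pairs = comb(total_disks, 2)
    # A slab misses the failed pair with probability 1 - 1/distinct_pairs.
    return 1.0 - (1.0 - 1.0 / distinct_pairs) ** slabs


def simulate_mirror_loss(total_disks: int, slabs: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo check of the closed form above, under the same assumption."""
    rng = random.Random(seed)
    disks = range(total_disks)
    losses = 0
    for _ in range(trials):
        used_pairs = {frozenset(rng.sample(disks, 2)) for _ in range(slabs)}
        failed_pair = frozenset(rng.sample(disks, 2))
        if failed_pair in used_pairs:
            losses += 1
    return losses / trials


for slab_count in (10, 100, 1000):
    print(slab_count, round(mirror_loss_probability(24, slab_count), 4))
print("simulated, 100 slabs:", simulate_mirror_loss(24, 100, trials=2000))
```

Under this model, the more slabs a virtual disk has, the closer a double failure gets to certain data loss. Whether the real allocator behaves this way is exactly the open question.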

gea · Feb 19, 2025

Storage Spaces is the most advanced method to pool disks of any type or size. Unlike Raid it does not build redundancy over disks. Think of it like a basket where you throw in what you have. When you create Spaces, you can individually define data location, resiliency or auto tiering over hd, ssd or NVMe. Resiliency like mirror, parity or dual parity is not based on disks but on data copies over multiple disks. Without resiliency, data is lost when any related disk fails; with resiliency, all disks holding independent data copies must fail. This means there is no array or pool repair, only add/remove disks (remove when no data is on them) and an option to repair unhealthy Spaces with resiliency.

Main advantage is flexibility as you can define Spaces per use case, main disadvantage is that you need Powershell for many settings and it is quite complicated to oversee and manage larger arrays.

Main alternative or better add on is OpenZFS on Windows as a disk based software raid solution. It is currently a prerelease beta (2.3 rc6e, nearly ready) optionally with a web-gui for Storage Spaces and ZFS (napp-it cs)

Captain Lukey · Mar 13, 2025

It's "Storage Spaces is a glorified software RAID5". This is what their product team says:

"Storage Spaces does not replace hardware RAID

Storage Spaces helps protect your data from drive failures and extend storage over time as you add drives to your PC. You can use Storage Spaces to group two or more drives together in a storage pool and then use capacity from that pool to create virtual drives called storage spaces."

In other words, software RAID 5 for failed drives.

101

gea · Mar 13, 2025

Hardware raid lacks protection for atomic writes (these are the smallest writes that must be processed or discarded together), such as writing a data block plus updating metadata, or writing a raid stripe over several disks. Only Copy on Write can protect them, and there is no hardware raid option with Copy on Write in an array. ReFS and ZFS offer Copy on Write on all disks in an array, but only with software raid.

Hardware raid also lacks end-to-end checksums that protect against bitrot and errors due to bad cables or connectors. You can have checksum protection on Windows with ReFS (and ZFS in the near future) but only with hardware raid and single disks. In a hardware raid array, checksum errors can only be detected but not auto repaired.

Hardware raid is inferior to modern software raid, while I would consider ZFS Z1/2 software raid superior to a Storage Space with parity/dual parity regarding performance, handling and failure management.

gea · Mar 13, 2025

ca3y6 said:
Because if it isn't, what it means, if I understand correctly, is that a Storage Spaces mirror is basically no better than RAID5 in terms of redundancy, i.e. you can lose one disk but you are dead on the second disk failure, while having the capacity penalty of RAID10 (i.e. you sacrifice half of the storage capacity). RAID10, by contrast, can most often tolerate more disk failures in a large array, unless you are unlucky enough to have two disk failures in the same RAID1 sub-array (and that risk goes down as the array has more disks).

This is a general point:
Any mirror, Raid 5 or Storage Space with single parity protects against a single disk failure.
A Raid 10 allows two disks to fail, but not in the same mirror.

Only a 3way mirror, Raid 6 or a Storage Space with dual parity allows any two disks to fail.
ZFS Z3 or a 4way+ mirror allows more disks to fail.

ca3y6 · Mar 13, 2025

I get that, but what I am saying is that people are uncomfortable using RAID5 with 24 disks because the risk of more than one disk failing at the same time is very high. RAID10 is what people usually go for, which allows for more disks failing, not always but in the vast majority of cases.

What I am saying is that a Storage Spaces mirror has the storage penalty of RAID10 but only the redundancy level of RAID5, which isn’t great. In fact I hardly see the point of storage space parity if that’s the case.

gea · Mar 13, 2025

Raid-5 with 24 disks is stupid, and even Raid-6/Z2 with so many disks is bad. Z3 is an option, but dual Z2/Raid 60 is usual then. Multiple striped mirrors (Raid 10, and more so n * mirror ZFS vdevs) have a bad capacity ratio. While they improve iops of disk based pools, the usual method is hybrid pools, either with hot/cold data tiering (Storage Spaces) or special vdevs for small io (ZFS), to combine cheap high capacity disks with fast and expensive flash.

bugacha · Mar 13, 2025

So I have 4 equally fast NVMes, what's the best way to organize them with Storage Spaces?
I guess my primary consideration is read speed. I do have a regular backup in place to a slow-moving z2 array.

gea · Mar 13, 2025

It depends. There is no single best method for a Storage Spaces Pool from 4 x NVMe.

You can create a Space that organizes data in a Raid-0 manner over 4 disks, which means fastest but without redundancy. You can also choose to configure a Space similar to a Raid-1 or Raid-10, which means slower but with protection against the loss of a single drive.

A Space with single parity (slower) is also possible. A 3way mirror or double parity Space is not possible as this requires more disks.

Captain Lukey · Mar 14, 2025

100% agree, RAID does not stop bit rot (or even silent data corruption over time) or support atomic writes. There have been cases where ZFS and the ZIL still fail.

1+1+1: lose the processor, lose the data on a write.

If you really want high end then I would look at ...
Firmware PLP – Firmware PLP protection is also designed to reduce the likelihood of data loss by ensuring the firmware’s ability to rebuild the mapping table upon the next power-on following a power loss event. A conceptual overview of firmware based PLP protection would look something like this:

The SSD’s mapping table is stored in Flash memory and is updated in DRAM
When new data is written to the SSD, the firmware updates the mapping table
The new data that is written is always written with tags (or spare bytes) which include LBA, ECC and other structure data information
Sudden power loss occurs
The spare bytes that contain data structure information combined with the original mapping table enable the SSD firmware to rebuild the SSD's mapping table upon the next power on

Firmware PLP protection is a highly effective method for preventing data loss in enterprise storage applications. For example, it is essential that SSDs configured in RAID arrays are able to recover and return to a healthy state after a power fail event to retain the integrity of the RAID array. One or several failed array members can result in an off-line array with a high potential for data loss.

Then you can look at box failure: if the server fails, you use two-way mirroring or dispersed RAID or even, like the hyperscalers, erasure coding + pNFS + protection level 10.

ca3y6 · Mar 15, 2025

bugacha said:
So I have 4 equally fast NVMes, what's the best way to organize them with Storage Spaces?
I guess my primary consideration is read speed. I do have a regular backup in place to a slow-moving z2 array.

Well, the fastest will be Simple (~RAID0) with 4 columns. Of course if one disk dies you lose the whole array, but that may be a risk you are happy to take (particularly with a small number of SSDs). Mirror with 2 columns (~RAID10; the columns don't include the mirror copies, so you read from 4 disks in parallel) will be very fast too, in both reads and writes. Parity with 4 columns (~RAID5; the columns include parity) will have decent read speeds but not great write speed. Write speed will look horrendous in CrystalDiskMark, which bypasses the Windows write-back cache. Real-life writes will look very different (up to 10x faster), as they must do some optimisations with the cache.
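The column arithmetic in this post can be written down as a tiny helper. It is only a sketch of the rule as described here (mirror columns exclude the copies, parity columns include the parity), and the function and parameter names are made up for illustration rather than taken from any Storage Spaces API.

```python
def disks_per_stripe(resiliency: str, columns: int, data_copies: int = 2) -> int:
    """Number of disks one stripe touches, following the rule described above."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    if resiliency == "simple":
        # No redundancy: one disk per column.
        return columns
    if resiliency == "mirror":
        # Each column is written `data_copies` times, on different disks.
        return columns * data_copies
    if resiliency == "parity":
        # The column count already includes the parity column.
        return columns
    raise ValueError(f"unknown resiliency: {resiliency}")


for layout, column_count in (("simple", 4), ("mirror", 2), ("parity", 4)):
    print(layout, column_count, "columns ->", disks_per_stripe(layout, column_count), "disks")
```

All three layouts suggested for the 4 NVMe drives end up touching 4 disks per stripe, which is why each of them can use the whole pool.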

Best is to test it yourself. Create a storage pool with the 4 drives, then create a thin-provisioned virtual disk with each of these configurations. And test them with real-life workloads (like copying files), not just CrystalDiskMark.
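A file-copy test like the one suggested above could be timed with a small script. This is a rough sketch, not a proper benchmark: it measures a single copy, the `flush` option is only a best-effort request to push cached writes to the device, and the function name is made up for illustration.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def copy_throughput_mb_s(source: Path, dest_dir: Path, flush: bool = True) -> float:
    """Copy `source` into `dest_dir` and return the observed throughput in MB/s."""
    dest = dest_dir / source.name
    size_mb = source.stat().st_size / 1_000_000
    start = time.perf_counter()
    shutil.copyfile(source, dest)
    if flush:
        # Ask the OS to write cached data out before stopping the clock.
        with open(dest, "rb+") as handle:
            os.fsync(handle.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size_mb / elapsed


if __name__ == "__main__":
    # Self-contained demo on temporary directories; for a real test, point
    # dest_dir at a folder on each virtual disk you want to compare.
    with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as dst_dir:
        sample = Path(src_dir) / "sample.bin"
        sample.write_bytes(os.urandom(8_000_000))
        print(f"{copy_throughput_mb_s(sample, Path(dst_dir)):.1f} MB/s")
```

Running it with `flush=False` and again with `flush=True` gives a feel for how much the write-back cache contributes. Use files much larger than RAM-sized caches for numbers that mean anything.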
