MegaRAID RAID-1 consistency check issues


lunadesign

Active Member
Aug 7, 2013
256
34
28
I have a few RAID-1 volumes on a MegaRAID 9270-8i that has been solid for 2 years. I just set up a new RAID-1 array a few days ago with two 1 TB Samsung SSD 850 Pro SSDs. A few days later I started noticing that some data I had just copied onto the virtual drive didn't match the original. I ran a consistency check and it found 70+ issues and apparently corrected all of them.

I temporarily connected the physical drives to a test system and checked the raw SMART values and saw nothing wrong. That doesn't completely rule out a drive problem but it's still interesting.

This brings up a few questions:

1) Since RAID-1 has no parity, I believe it's just comparing the two physical drives byte-by-byte. When it finds a discrepancy, how does it know which physical drive has the "correct" copy of the byte?

2) How do I determine the root cause of this inconsistency?

Thanks in advance!
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
1. Without a filesystem that includes checksums, like btrfs, ReFS or ZFS, you have no way to decide which half of the mirror contains good data and which contains bad data.

2. Besides hardware, power or cabling problems, you have
- the write hole problem

On a crash during a write, one disk may have been updated while the other was not, or only partly, which leaves you with an inconsistent raid. A cache with a BBU can reduce the problem. Only CopyOnWrite filesystems like btrfs, ReFS or ZFS are completely resistant to this, because they write a data block either completely on all disks of a raid or not at all.

- silent data errors
Random bit flips occur on disks at a low statistical rate, and they can only be detected and repaired with checksums... see 1.

With current disk sizes, you really need btrfs, ReFS or ZFS if your data is in any way important, no matter whether you use a single disk, a mirror or a higher raid level.
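To make the write hole concrete, here is a minimal Python sketch. It assumes a toy two-disk mirror modeled as plain dictionaries; `mirrored_write` and `CrashBeforeSecondWrite` are hypothetical names made up for this illustration and do not correspond to anything a real controller exposes.

```python
class CrashBeforeSecondWrite(Exception):
    """Stands in for a power loss between the two mirror writes."""


def mirrored_write(disks, block_no, data, crash_after_first=False):
    """Write `data` to the same block on every disk, one disk at a time."""
    for i, disk in enumerate(disks):
        disk[block_no] = data
        if crash_after_first and i == 0:
            # Assumption: the crash hits after disk 0 is updated, before disk 1.
            raise CrashBeforeSecondWrite()


# Both halves of the mirror start out identical.
disk_a = {0: b"old"}
disk_b = {0: b"old"}

try:
    mirrored_write([disk_a, disk_b], 0, b"new", crash_after_first=True)
except CrashBeforeSecondWrite:
    pass  # the system "reboots" here

inconsistent = disk_a[0] != disk_b[0]
print(disk_a[0], disk_b[0], inconsistent)
```

After the simulated crash the two halves disagree, and nothing stored on either disk says which version the application intended to keep.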
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
Thank you very much....this is scary but helpful.

I read up on ReFS and it seems very interesting but how safe is it to use, especially on Windows 8.1? From what I've read, not a lot of people are using it yet and it doesn't seem like it's very mature.

I've heard about random bit flips but are SSDs as susceptible to those as HDDs?

What other alternatives do I have? RAID 5 or (for larger HDDs) RAID 6? I realize they don't protect against some of the scenarios but does RAID 5 protect against random bit flips? I'm guessing RAID 6 does.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,708
515
113
Canada
ReFS is maturing, but it's early days yet imo. I wouldn't use it yet in production unless I had a very specific reason to.

Random bit flips can occur at any time, both with data in transit and after it has been written to disk, and they can affect both magnetic and solid state media. It's not the raid level that protects against random bit flips, it's the way in which the file system checks and heals itself. So basically it's the filesystem that's responsible for keeping that in check.

Raid only provides disk redundancy in case of a disk failure. With raid 5, a single disk can fail and your array will survive until it is replaced; with raid 6, two disks can fail and your array will survive. It should be noted though that with raid 5, if any additional disk fails or simply does not respond in time during a rebuild after replacing the original faulty disk, your array is toast and so is your data. The same holds true for raid 6 if two drives fail and something goes wrong with a third during a rebuild.
Always take regular backups, if it all goes sideways, only a backup is going to save your ass :)
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
Thanks. I totally understand and am very regular with backups. Of course, backups don't help with bit flips.....the corrupted data simply gets backed up and preserved. :(

I guess I should re-state my RAID 5/6 question regarding bit flips. Assuming the controller supports a Check Consistency type function, I'm wondering if the fact that RAID 6 has two parity blocks means if there's a disagreement between the live block and the two parity blocks, there's a 2 vs 1 way to determine which one is bad and correct it. Unlike RAID 1 when it's 1 vs 1 and the controller has no idea which is right. I hope this makes sense.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,708
515
113
Canada
There is no raid level or controller that I know of that can protect you from corruption after the fact. The raid controller will check, for example, to see if the stored data is identical across all members in a mirror, and if there's something different, it will flag the array as being inconsistent, but the controller has no way of knowing which is the good copy and which is the bad one. Only the filesystem can determine that with the use of a checksum on the original data, because it is determined at the same time as the data is written. If that data later changes due to bit rot, then it no longer matches the checksum, will be flagged by the filesystem and can be repaired :)
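The difference between a plain mirror comparison and a checksum recorded at write time can be sketched in a few lines of Python. This is only an illustration: SHA-256 is used as a stand-in checksum, and `mirrors_agree` and `pick_good_copy` are hypothetical helper names, not functions of any real filesystem or controller.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in checksum; real filesystems may use other algorithms."""
    return hashlib.sha256(data).hexdigest()


def mirrors_agree(copy_a: bytes, copy_b: bytes) -> bool:
    """All a plain mirror check can see: the halves match or they don't."""
    return copy_a == copy_b


def pick_good_copy(copies: list[bytes], stored_checksum: str) -> bytes:
    """With a checksum recorded at write time, the matching copy can be found."""
    for copy in copies:
        if checksum(copy) == stored_checksum:
            return copy
    raise ValueError("no copy matches the stored checksum")


original = b"important data"
stored = checksum(original)      # recorded when the data was written

copy_a = original
rotted = bytearray(original)
rotted[0] ^= 0x01                # simulate a single flipped bit
copy_b = bytes(rotted)

agree = mirrors_agree(copy_a, copy_b)            # mismatch detected, nothing more
good = pick_good_copy([copy_b, copy_a], stored)  # checksum identifies the good half
```

The comparison alone only reports a 1 vs 1 disagreement; the stored checksum is what breaks the tie.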
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
The write hole problem is the same with raid-1 and striped raid 5/6.
If, for example, you edit a text file and replace "house" with "houses", then you must store the additional byte and you must update all metadata. If the byte is written and the metadata is not, you have a corrupted filesystem.

With raid-5/6 you have not fewer but more problems. Each data block must be written as a stripe, more or less sequentially, across all disks. If something goes wrong during a write, you may end up with a corrupted raid and a corrupted filesystem.
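As a rough illustration of why an interrupted stripe write is dangerous, the following Python sketch uses simple XOR parity over three tiny data blocks, in the style of raid-5. The block contents and helper names are made up for the example, and a real controller works on much larger stripes.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks: list[bytes], parity: bytes) -> bool:
    """A consistency check can only verify that data and parity still agree."""
    return xor_blocks(data_blocks) == parity


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
ok_before = stripe_is_consistent(data, parity)

# With intact parity, a lost block can be rebuilt from the others.
rebuilt_ok = xor_blocks([data[1], data[2], parity])

# Interrupted update: new data reaches one disk, the parity update never happens.
data[1] = b"BBBX"
ok_after = stripe_is_consistent(data, parity)

# Rebuilding a *different* block from the stale parity now yields wrong data.
rebuilt_bad = xor_blocks([data[1], data[2], parity])
```

The check notices that the stripe no longer adds up, but XOR parity by itself does not say whether a data block or the parity block is the stale one.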

All these problems can only be addressed at the filesystem level with CopyOnWrite. Microsoft is aware of that. This is why they adopted the basic ZFS principles for ReFS. ReFS is quite usable as a filesystem now, but you cannot yet boot from it and it has performance problems, mainly because of the Storage Spaces concept, which lacks features and performance. It also lacks functions like the cache and the higher raid levels of ZFS like Z3. Sometimes I am under the impression that Microsoft has completely lost its focus on professional storage ideas, as I think that Storage Spaces is the wish to merge the enterprise-class ZFS pooling ideas with the home idea of "you should be able to just add single disks" to a pool. Now they do not have the best of both but the worst of both.

But ReFS is the future at Microsoft and it is needed for current disk sizes.
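The CopyOnWrite idea itself fits in a short Python sketch. This is a toy model under stated assumptions, not how ZFS or ReFS are implemented: `CowStore` is a hypothetical class, blocks live in a dictionary, and the "commit" is a single pointer update.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> contents, written once
        self.root = {}       # file name -> block id (the live pointer)
        self._next_id = 0

    def write(self, name: str, data: bytes, crash_before_commit: bool = False) -> None:
        new_id = self._next_id
        self._next_id += 1
        self.blocks[new_id] = data      # step 1: write new data to a fresh block
        if crash_before_commit:
            return                      # simulated crash: pointer never switched
        self.root[name] = new_id        # step 2: switch the pointer in one step

    def read(self, name: str) -> bytes:
        return self.blocks[self.root[name]]


store = CowStore()
store.write("note.txt", b"a house")
store.write("note.txt", b"houses", crash_before_commit=True)
after_crash = store.read("note.txt")    # still the old, complete version

store.write("note.txt", b"houses")
after_commit = store.read("note.txt")   # new version visible only after commit
```

Because the old block is left untouched until the pointer moves, a crash leaves either the complete old version or the complete new one, never a mix.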

btw.
Have you ever thought about virtualising Windows Server, e.g. with ESXi combined with a ZFS NAS/SAN storage server and NFS? This mainly solves the backup and recovery problems and makes the setup hardware independent and movable, but it also gives you higher data reliability.
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
Thanks gea....much appreciated!

Regarding ReFS:
1) I can live with the boot drive being on NTFS on my Windows workstation. I image that drive periodically so if a bit flip or other surprise messes it up, I can restore from the most recent image.
2) I'm intrigued with using it for the local data drive (including many desktop OS VMs for testing in VMware Workstation) but want to be sure it's rock solid on Win 8.1. I've been burned by way too many bleeding edge things (especially things that weren't supposed to be bleeding edge but turned out to be).

Regarding using a ZFS NAS/SAN:
1) Other than my Windows workstation and laptop, almost everything I have is already virtualized on ESXi and I'm actually in the process of doing a hardware refresh so this timing is interesting.
2) I've been reading a lot about BTRFS and ZFS since your initial reply and learned about napp-it by clicking on your signature. I can definitely see the value of checksum-ing and copy-on-write. BTRFS looks especially interesting but doesn't seem ready for production use yet. So, I'd probably play it safe and go with ZFS.
3) I've never set up a NAS/SAN but am interested. I guess the biggest thing stopping me is the impression that disk performance has to be slower, because the disk still has to be read and then the data has to be transmitted over the network. What kind of networking infrastructure would I need to get NAS/SAN to near-DAS levels?
4) I'm hesitant to throw away my recent investment in MegaRAID cards as it appears HW RAID can't be used with checksum-based filesystems. That said, my RAID 1 issues appear to be yet another LSI incompatibility with Samsung SSDs, not a bit flip, so I'm keeping an open mind.
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
Quasduco - That crossed my mind but from what I can tell, it's not possible with the ones I have (9260-8i and 9270-8i).
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
about 3
DAS storage, especially the new NVMe disks, is unbeatable. They offer a sequential performance of 1000-2000 MB/s. But when it comes to random reads, a ZFS storage server with enough RAM can be even faster, as it can deliver reread content from a RAM cache. It can also be faster with random writes, as it collects a few seconds of small random writes and writes them as one large sequential write.
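The write-collecting behaviour can be pictured with a small Python sketch. It is a deliberate simplification: `WriteCoalescer` is a hypothetical class, and it merely sorts buffered writes by offset before flushing, whereas a real ZFS server groups writes into transactions and allocates space quite differently.

```python
class WriteCoalescer:
    """Buffer small writes in RAM and flush them as one offset-sorted batch."""

    def __init__(self, backing: bytearray):
        self.backing = backing
        self.pending = {}        # offset -> data, newest write wins
        self.flush_count = 0

    def write(self, offset: int, data: bytes) -> None:
        # Nothing touches the "disk" yet; the write is only remembered.
        self.pending[offset] = data

    def flush(self) -> list[int]:
        order = sorted(self.pending)
        for offset in order:
            data = self.pending[offset]
            self.backing[offset:offset + len(data)] = data
        self.pending.clear()
        self.flush_count += 1
        return order


disk = bytearray(32)
buf = WriteCoalescer(disk)
# Four small writes arrive in random order...
for off, payload in [(24, b"dd"), (0, b"aa"), (16, b"cc"), (8, b"bb")]:
    buf.write(off, payload)
# ...and go to the disk in a single ordered pass.
order = buf.flush()
```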

But mostly performance is not the reason to switch from DAS to a ZFS storage server. It's mainly data security (crash-resistant filesystem due to CopyOnWrite), no write hole problem with raid ("Write hole" phenomenon in RAID5, RAID6, RAID1, and other arrays), multiuser access, versioning with unlimited snaps, and expandability of capacity with storage pooling (software defined storage, storage virtualisation).

If you use a 10G network with a pool of fast disks, enough CPU and RAM, and some tuning, you can get close to the limits of 10G ethernet, which means 700-800 MB/s. Without tuning you can reach 300-400 MB/s with SMB, NFS or iSCSI transfers.
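For orientation, the quoted figures can be put next to the raw line rate with a few lines of Python. The conversion uses decimal units and ignores all protocol overhead, so it is an upper bound for comparison, not a prediction; `line_rate_mb_per_s` is just a helper name chosen for this sketch.

```python
def line_rate_mb_per_s(gbit_per_s: float) -> float:
    """Convert a raw link speed in Gbit/s to MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


raw_10g = line_rate_mb_per_s(10)

# Compare the throughput figures from the post with the raw 10G line rate.
for observed in (300, 400, 700, 800):
    print(f"{observed} MB/s is {observed / raw_10g:.0%} of the raw 10G line rate")
```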

If you virtualise a storage server under ESXi you can achieve 300-700 MB/s on internal transfers over a vmxnet3 vnic in software over the ESXi virtual switch.

Read my napp-it howto, all-in-one and SMB tuning and performance manuals at
napp-it // webbased ZFS NAS/SAN appliance for OmniOS, OpenIndiana, Solaris and Linux : Handbücher

about 4.
For a barebone setup, even onboard Sata may be a better and faster solution for ZFS than a hardware raidcontroller. There are also quite cheap options like a Dell H200 or IBM M1015 that are OEM versions of an LSI HBA like the LSI 9211. Even a new LSI 9207 that comes in a raidless IT mode (no reflash needed) is affordable.

A hardware raid controller is perfect for a bootraid with ntfs or ESXi vmfs so maybe you find a use case there. You will hardly find a raidless IT firmware. Some controllers allow a Raid-0 config from a single disk for ZFS but this is not optimal. Without a use case, sell it and buy a cheaper LSI HBA.
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
Hi gea,

Thanks again for your thoughts!

I'd be more than happy if a NAS/SAN could provide SATA SSD class performance, as seen by my Windows workstation.

While sequential performance is important I'm more worried about what the extra latency would do to random read/write performance. It sounds like caching can help but can I realistically get in the neighborhood of SATA SSD random read/write performance?

10G seems intriguing but the 10G switches all seem very pricey except that $800ish Netgear one that I hear has a fan that's similar to a jetliner.

What about a 4G connection (each one created by teaming four 1G connections in a LAG group)?
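One rough way to reason about the teamed-link idea is the sketch below. It rests on a loudly stated assumption: that each individual transfer (flow) is pinned to a single member link, which is how link aggregation commonly distributes traffic, and that flows spread perfectly across links. Actual behaviour depends on the switch's hashing policy and the protocol in use, and `lag_throughput_mb_per_s` is a hypothetical helper.

```python
def lag_throughput_mb_per_s(links: int, gbit_per_link: float, flows: int) -> float:
    """Upper bound, assuming each flow is pinned to exactly one member link."""
    per_link = gbit_per_link * 1000 / 8      # raw MB/s per link, no overhead
    return per_link * min(links, flows)


# One large file copy from a single workstation: a single flow.
single_copy = lag_throughput_mb_per_s(4, 1, flows=1)

# Four independent transfers that happen to land on four different links.
four_flows = lag_throughput_mb_per_s(4, 1, flows=4)

print(single_copy, four_flows)
```

Under that assumption the aggregate only helps when several transfers run in parallel; a single copy still sees roughly one 1G link.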

http://napp-it.org/manuals/index_en.html
Thanks!...I'll take a look at this.

My Supermicro motherboards only have 2 6Gbps ports so I'd likely be looking at some LSI 9207's...they seem pretty reasonably priced.
 

Quasduco

Active Member
Nov 16, 2015
129
47
28
113
Tennessee
Well, the 9260-8i is a 2108 chip card. I just got a couple of servers with in-built 2108 cards, and I too, wanted IT mode for them, so I did some searching, and found:

SAS2108 (LSI 9260) based firmware files - Projects, Tools, Utilities & Customized INFs

I have not yet flashed mine, as I have not had the need (booting on SATADOM, then using NAS), but this is the path I will likely take when I have a little more time...

As for the 9270-8i, I do not know specifically, quick google was not as positive as the 9260...

Hope this helps.
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
Thanks....I also ran into this URL but didn't see any indication that any of these turn the card into HBA/IT mode. Unless I'm missing something, this looks like a collection of older RAID firmware releases.
 

lunadesign

Active Member
Aug 7, 2013
256
34
28
One follow-up thought on 10G - if I'm only connecting 2 or 3 priority systems to the NAS/SAN, couldn't I just connect NIC-to-NIC and skip purchasing a 10G switch until I need to connect more systems?
 

gea

Well-Known Member
Dec 31, 2010
3,141
1,184
113
DE
You can directly connect 10G adapters, just like you can with 1G adapters.
I also use some of the Netgear XS708 switches. They are not as loud as some 1G switches.
You can use them if they are not on the desk directly beside you.