SAS vs SATA in Terms of Bit Rot


Samir

Post Liker and Deal Hunter Extraordinaire!
Jul 21, 2017
3,284
1,471
113
49
HSV and SFO
For about 2 decades now I've had to move away from the reliability of SCSI drives in favor of the cheaper and more available IDE and then SATA drives. But recently I've been able to get back to the enhanced reliability of SAS, now that the hardware has finally come into our price range.

We've been using SATA drives in manually initiated 3-way mirroring for several years now, upgrading drives about every 2 years or sooner as our storage needs grow. We've been using enterprise-grade SATA drives from WDC and HGST with 5-year warranties. Out of about a dozen drives, only one has failed so far.

But a bigger problem is bit rot. Most of our files are PDFs of documents scanned at 600dpi in full color. Occasionally we can't open a file that was almost certainly openable at one point in time. I've done some research on this, sometimes pulling backups of the file from several stored generations of drives (we keep the drives we upgrade from as cold storage) and finding no evidence of bit rot. But I have also caught one or two genuine cases.

So my question: now that we can use SAS drives, do you see any perceivable difference in bit rot on SAS drives vs SATA when used in the same application? I know RAID 5 as well as smarter file systems can mitigate and sometimes even prevent the issue from surfacing, but I'm really trying to see if there's a difference at the drive hardware level.

I've always wondered about this (even before we could afford SAS), but didn't know who to ask. If anyone knows, you guys do! :) Thank you in advance for any feedback.
 

ajs

Active Member
Mar 27, 2018
101
36
28
Minnesota
This really comes down to your definition of "bit rot". Data that was written correctly to a drive should (barring a firmware bug) either return the correct data, or not return any data (a read error). I think when most people talk about bit rot, what they are really experiencing is data being corrupted "in flight", normally due to a bad HBA, a bad cable, or bad system memory. Bit rot where a bit is flipped after being written to the media doesn't really happen; the ECC and CRC checks on the drive media don't allow for it.

In-flight corruption is where a SCSI feature called "protection information" comes into play. PI offers end-to-end data integrity protection from the host to the drive. This would probably be the biggest benefit of SAS over SATA in your case.

https://www.hgst.com/sites/default/files/resources/End-to-end_Data_Protection.pdf
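The end-to-end idea is easy to sketch in software. The sketch below is only a loose analogy, not the real mechanism — actual T10 PI appends an 8-byte field to each 512-byte sector (a CRC-16 guard tag, an application tag, and a reference tag tied to the LBA), computed and verified in hardware along the path — but it shows what a guard plus a reference tag buys you:

```python
import struct
import zlib

SECTOR = 512  # bytes of user data per "sector" in this toy model

def protect(sector_data: bytes, ref_tag: int) -> bytes:
    """Append a PI-style tag: a checksum of the data (guard) plus the
    sector number it was meant for (reference tag)."""
    guard = zlib.crc32(sector_data) & 0xFFFFFFFF
    return sector_data + struct.pack(">II", guard, ref_tag)

def verify(stored: bytes, expected_ref_tag: int) -> bytes:
    """Re-check both tags on read: an in-flight bit flip fails the guard,
    and a write that landed on the wrong LBA fails the reference tag."""
    data, tag = stored[:-8], stored[-8:]
    guard, ref_tag = struct.unpack(">II", tag)
    if zlib.crc32(data) & 0xFFFFFFFF != guard:
        raise IOError("guard tag mismatch: data corrupted in flight")
    if ref_tag != expected_ref_tag:
        raise IOError("reference tag mismatch: misdirected write")
    return data

sector = protect(b"\xAB" * SECTOR, ref_tag=42)
assert verify(sector, expected_ref_tag=42) == b"\xAB" * SECTOR
```

The point is that a flip anywhere between host memory and the platter fails the guard check on the next read instead of being silently returned.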
 

Samir

Thank you for the great answer and link.

For us, what I've coined 'bit rot' is a spontaneous change in a bit, usually after a few years. I've seen this with my own personal photography archive as well. I copy each file to 3 drives and then compare all 3 to the original when the image is offloaded. Before the image is deleted from the original (usually after about another week or so), it is again compared against the 3 drives. Then about every year or so, all 3 drives are compared in their entirety. What I've found (contrary to what you posted above about drive reads) is that sometimes one file will differ from the copies on the other two drives. On a detailed file comparison, it is usually a single bit that has changed in that file. After checking the file again and again to ensure this is a real change, I re-copy the good file from the 2 drives over the corrupted one and compare them again to ensure it is correct.

With images, a single bit change usually has zero ill effect, since the bit is one of millions that make up the image. But in the PDFs we're storing today, a single bit can render the file unreadable. SAS definitely does have a huge advantage with EDP, assuming it is active on the older host adapters we're using (PERC 5/iR).
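For what it's worth, the 2-of-3 vote described above could be sketched in a few lines of Python (a rough sketch with hypothetical paths, and it assumes the three copies are the same length):

```python
from collections import Counter
from pathlib import Path

def majority_repair(paths, chunk_size=1 << 20):
    """Compare three copies chunk by chunk; wherever one copy disagrees,
    overwrite it with the version the other two agree on."""
    blobs = [Path(p).read_bytes() for p in paths]
    if len({len(b) for b in blobs}) != 1:
        raise ValueError("copies differ in length; repair by hand")
    repaired = []
    for off in range(0, len(blobs[0]), chunk_size):
        chunks = [b[off:off + chunk_size] for b in blobs]
        winner, votes = Counter(chunks).most_common(1)[0]
        if votes < 2:
            raise ValueError(f"no 2-of-3 majority at offset {off}")
        repaired.append(winner)
    good = b"".join(repaired)
    for path, blob in zip(paths, blobs):
        if blob != good:
            Path(path).write_bytes(good)  # re-copy over the corrupted copy
    return good
```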
 

Evan

Well-Known Member
Jan 6, 2016
3,346
598
113
If your PDFs are static, why not set up md5deep or a file checksum for each file and compare, say, monthly?

I was going to say that the drive mechanics between SAS and SATA for the same model drive are the same, but the SAS signaling is more robust.
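A rough Python equivalent of that workflow, for the thread (md5deep/hashdeep can do the same job recursively from the command line; the function names here are just illustrative):

```python
import hashlib
from pathlib import Path

def build_manifest(root):
    """Hash every file under root, keyed by relative path."""
    return {str(p.relative_to(root)): hashlib.md5(p.read_bytes()).hexdigest()
            for p in Path(root).rglob("*") if p.is_file()}

def verify_manifest(root, manifest):
    """Return the paths whose current hash no longer matches."""
    current = build_manifest(root)
    return sorted(p for p, h in manifest.items() if current.get(p) != h)

# monthly run: load last month's manifest, report any mismatches,
# then save a fresh manifest for next month
```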
 

BackupProphet

Well-Known Member
Jul 2, 2014
1,088
643
113
Stavanger, Norway
olavgg.com
SAS still has better CRC protection, though big players like Dropbox use SATA drives in their data centers.

I believe SATA is good enough, and a filesystem like ZFS certainly adds the extra protection needed. I have had errors with SATA cables, but it seems like the hard drives themselves are really good at picking up errors and telling the operating system to retry.
 

Samir

If your PDFs are static, why not set up md5deep or a file checksum for each file and compare, say, monthly?

I was going to say that the drive mechanics between SAS and SATA for the same model drive are the same, but the SAS signaling is more robust.
We've considered the md5 route, but it's actually easier to just run a compare between the drives since we're already making multiple copies at once. But even that only alerts us to the problem. Preventing the problem would be more useful.

Yeah, I can feel the difference between SAS and SATA drives, even across different capacities. There's a quality you can see and feel in a SCSI or SAS drive that I can't really describe, and that feels lacking otherwise.
 

Samir

SAS still has better CRC protection, though big players like Dropbox use SATA drives in their data centers.

I believe SATA is good enough, and a filesystem like ZFS certainly adds the extra protection needed. I have had errors with SATA cables, but it seems like the hard drives themselves are really good at picking up errors and telling the operating system to retry.
I used to think SATA was good enough as well, but I think I'm about to change my mind. SAS seems to be higher quality and designed for data retention, especially in the 10K and 15K variants at the 73, 146, 300, 450, and 600GB sizes. A local hosting owner I met said he had more failures with the 7200 RPM class SAS than with the faster ones, which makes sense to me, as the 7200 RPM SAS drives seem like the SATA ones with just an interface change.
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,394
511
113
A colleague and I wrote a hashing blockchain system for our forensic file captures 15 years ago, and I'm still using much of that code today. If your files are mostly static, then why not use md5deep/hashdeep?

In all my experience, SAS is no better quality-wise than SATA, and I've yet to experience supposed bit rot on any active hardware myself.
 

Samir

A colleague and I wrote a hashing blockchain system for our forensic file captures 15 years ago, and I'm still using much of that code today. If your files are mostly static, then why not use md5deep/hashdeep?

In all my experience, SAS is no better quality-wise than SATA, and I've yet to experience supposed bit rot on any active hardware myself.
Interesting. I just read up on md5deep. We regularly add new files and move files around, so we would have to generate new MD5s each day, and there's no easy way to check the hashes in that scenario.

I never saw bit rot on SCSI devices or the smaller IDE ones back in the day. But once the 128GB barrier was broken and the drives got larger, I started to see it at about the 250GB mark (some files on my video archive drives had bit changes compared to the other 2 copies).

Fast forward to today, and drives have changed quite a bit in the last 20-30 years. But it seems the bit rot I saw early on is still here, at least on the SATA drives I've been using thus far.
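That said, the moving-files objection has a possible workaround (just a sketch of the idea, not something we run): key the index by content hash instead of by path, so a moved or renamed file keeps its entry and only genuinely changed bytes show up:

```python
import hashlib
from pathlib import Path

def content_index(root):
    """Map content hash -> paths currently holding that content.
    Moving or renaming a file does not change its hash."""
    index = {}
    for p in Path(root).rglob("*"):
        if p.is_file():
            digest = hashlib.md5(p.read_bytes()).hexdigest()
            index.setdefault(digest, []).append(str(p))
    return index

def vanished(old_index, new_index):
    """Hashes present last time but gone now. A legitimate edit or
    deletion also lands here, so these are leads, not verdicts."""
    return set(old_index) - set(new_index)
```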
 

ttabbal

Active Member
Mar 10, 2016
746
207
43
47
I've had some problems before, but I switched to ZFS and haven't had problems since. If you are already willing to use triple mirrors, changing filesystems isn't a big deal.
 

Samir

I've had some problems before, but I switched to ZFS and haven't had problems since. If you are already willing to use triple mirrors, changing filesystems isn't a big deal.
The only problem I have with alternate file systems is that recovery (worst-case scenario) seems to cost a whole lot more than with the mainstream ones. That's why I wanted to examine whether a hardware change would make a difference.
 

gea

Well-Known Member
Dec 31, 2010
3,155
1,193
113
DE
The point with ZFS is that it offers real end-to-end data protection with data/metadata checksums and self-healing behaviour on access or scrub (real data on disk <-> data in the ZFS subsystem, not only SAS HBA <-> SAS disk). With copy-on-write it adds crash-resistant behaviour (no corrupted filesystem or RAID after a crash during a write). As it has been in production use for more than 10 years, it is very safe. In the nearly 10 years I have used ZFS, I have not had any data loss with it.

The newest OpenZFS systems, e.g. the upcoming OmniOS 151026 stable, add improved data recovery, e.g. restoring the part of the data that is still valid even if a vdev is missing; see https://github.com/omniosorg/omnios-build/blob/r151026/doc/ReleaseNotes.md and "Turbocharging ZFS Data Recovery".

You should have a disaster backup anyway, even with ZFS.
 

mrkrad

Well-Known Member
Oct 13, 2012
1,244
52
48
You just need a MegaRAID or Adaptec/HP Smart Array controller and drives that support PI (protection information) per sector. The PI is checked when the data is read, so any bit rot or uncorrectable error in transmission can be identified — the same way ZFS does it, but using hardware to compare. Many SAS drives support PI, and it provides better end-to-end error detection than SATA, which will happily push bad data back to the controller without any notification of error (including errors from wiring, bad RAM, or a UCE).
 

ttabbal

The only problem I have with alternate file systems is that recovery (worst-case scenario) seems to cost a whole lot more than with the mainstream ones. That's why I wanted to examine whether a hardware change would make a difference.

If your data is that important to you, you should have many layers of redundant backups. Data restored by one of those recovery services is iffy at best. And they cost so much, even for common filesystems, that I'm surprised they have any customers at all. It's far cheaper to buy some LTO drives and tapes than to use that sort of thing. Cheaper still to use multiple servers with ZFS snapshots and triple mirrors or raidz2/3. If it's really that important, I'd do both an offsite server array and on- plus off-site tape.

At the end of the day, it's your data, so you have to decide. I hope you never have to use any of them. :)
 

Samir

Thank you very much for the replies and information about ZFS and recovery.

We also have an HP P400 controller as well as the Dell PERC, so we're able to run a lot of simultaneous backups. And we have off-site backups in two different states, each a day or more away geographically.

I didn't know recovery results were that sketchy. Sounds like something to definitely avoid at all costs.
 

jfeldt

Member
Jul 19, 2015
48
9
8
54
Bit rot is something I have experienced as well, and also worry about. I was scared of getting locked into proprietary hardware methods to combat it, so I settled on a combination of ZFS with scrubs, PAR2 with 10% recovery data on the things I care about most (like all of the photos I take), and on-site and off-site backups. I've looked at SAS/SATA since I was also part of the SCSI fold in the early 90s, and the last time I looked, it seemed like most SAS drives were still an order of magnitude better rated for errors (one in 10^14 versus 10^13 if I remember correctly). So if you don't mind the cost, I think a combo of those plus ZFS plus PAR2 plus on- and off-site backups (multiples on different storage media if you can) would leave you very well set.
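As a toy illustration of what PAR2's recovery data does (PAR2 itself uses Reed-Solomon codes and can rebuild several missing blocks at once; the single XOR parity block below can only rebuild one):

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks; used both to build the
    parity block and to rebuild a missing data block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover(blocks, parity, missing_index):
    """XOR the surviving blocks with the parity to get the missing one."""
    survivors = [b for i, b in enumerate(blocks) if i != missing_index]
    return xor_blocks(survivors + [parity])

data = [b"PDF chunk 1.", b"PDF chunk 2.", b"PDF chunk 3."]
parity = xor_blocks(data)
assert recover(data, parity, missing_index=1) == data[1]
```

PAR2's real redundancy works on the same principle, just with stronger math, which is why a 10% recovery set can fix scattered corruption anywhere in the archive.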
 

i386

Well-Known Member
Mar 18, 2016
4,240
1,546
113
34
Germany
it seemed like most SAS drives were still an order of magnitude better rated for errors (one in 10^14 versus 10^13 if I remember correctly)
Newer nearline SAS and enterprise SATA HDDs and SSDs have the same error rates (ex: HGST DC 14TB -> 1 in 10^15 bits). There are no big differences anymore, as SATA gained similar features/functions over the years to what SCSI had for enterprise use cases.
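Assuming independent bit errors (a simplification), those ratings translate into very different odds of hitting at least one unrecoverable error over a full read of a 14TB drive:

```python
def p_ure(capacity_bytes: float, bit_error_rate: float) -> float:
    """Chance of at least one unrecoverable read error (URE) when
    reading every bit once, assuming independent bit errors."""
    bits = capacity_bytes * 8
    return 1 - (1 - bit_error_rate) ** bits

TB = 1e12
for ber in (1e-14, 1e-15):  # consumer-class vs enterprise-class rating
    print(f"BER {ber:g}: {p_ure(14 * TB, ber):.1%} chance of a URE")
```

By this naive model the 10^15 rating is the difference between a better-than-even chance of a URE and roughly a one-in-ten chance, so the rating still matters even when the mechanics are shared.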
 

Samir

Newer nearline SAS and enterprise SATA HDDs and SSDs have the same error rates (ex: HGST DC 14TB -> 1 in 10^15 bits). There are no big differences anymore, as SATA gained similar features/functions over the years to what SCSI had for enterprise use cases.
This may be the case in terms of specs, but when I recently spoke to the owner of a datacenter, he found that SATA and SAS drives in the 7200 RPM class had more problems than 10K/15K SAS drives.
 

Evan

This may be the case in terms of specs, but when I recently spoke to the owner of a datacenter, he found that SATA and SAS drives in the 7200 RPM class had more problems than 10K/15K SAS drives.
I would agree with this also.

But my point was that for any given drive, e.g. an HGST He8, the SAS and SATA versions should be, for all intents and purposes, identical.
 

Samir

I would agree with this also.

But my point was that for any given drive, e.g. an HGST He8, the SAS and SATA versions should be, for all intents and purposes, identical.
Interesting that you agree with my friend on this.

And taking your point into consideration, maybe this is why there are no 15K SATA drives: no one will pay a premium for a more reliable SATA drive versus SAS.