Is ECC still considered an absolute must with ZFS/Btrfs?


SRussell

Active Member
Oct 7, 2019
327
152
43
US
I have a SuperMicro Workstation that I am building into an unRAID server for Plex, Docker, and Home Assistant.

The board: Supermicro X12SAE-5

The plan is 4x 10TB ZFS raidz1 - I have two SuperServers that run ECC and are my primary and secondary backup. I have a third system being deployed as remote backup. They all run the exact same hardware and configuration - ZFS with ECC.

The unRAID box will store microscope and horticulture demonstration and training videos. Do I really need to use ECC in this system or is desktop RAM perfectly safe for my use case?

Any pro or con opinions on using an 11th gen i5 or i7 vs Xeon W in a LGA 1200 socket?
 

alaricljs

Active Member
Jun 16, 2023
199
74
28
If you're not running TBs of RAM or a 'fleet' of systems, then generally it's not a big deal to run without ECC. There is still a risk, but as long as you have backups and aren't running mission-critical "something", the ROI is likely not there, even if you do end up restoring a backup once or twice. In a fleet of ~1000 machines running 24x7, with a large percentage of them overclocked or at least pegged at 100% boost, we have ECC failures only in the standard timeframes: infant (1-2 months) or elder (4+ years) mortality. And they are still few and far between.

*edit - DDR4... we've only just started rolling DDR5 systems
 

i386

Well-Known Member
Mar 18, 2016
4,250
1,548
113
34
Germany
(I'm currently using a Threadripper Pro system with non-ECC RAM. It works, but I would feel and sleep a lot better if it were ECC RAM. At some point I will change it to ECC RAM.
If the ZFS gurus were to be believed, all my data would have been lost or damaged because I'm not using ZFS and not using ECC RAM. So far this hasn't happened, and the md5 and sha256 checksums are still the same...)
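For anyone who wants to do the same kind of check, here is a minimal sketch of a checksum manifest in Python. It only uses the standard `hashlib` and `pathlib` modules; the helper names (`sha256_of`, `build_manifest`, `changed_files`) are made up for illustration and are not from any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List manifest entries whose current digest differs or is missing."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

Build the manifest once, store it somewhere safe, and rerun `changed_files` later. Keep in mind this only shows that a file changed after it was hashed; a bit that flipped in RAM before the first hash was taken would go unnoticed.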

I would say no, you don't need it. But if your platform supports it and you have the budget, go for ECC RAM for peace of mind and extra safety* :D

*Reliable information, and especially statistics, about ECC vs. non-ECC is pretty hard to find. There are some papers on this topic, but most of them are behind expensive paywalls.
 

ericloewe

Active Member
Apr 24, 2017
295
129
43
30
It's really simple, there are only two types of data in this world: Things you need to store, and things you don't need to store. For the former category, since you need to store it safely, it only makes sense to pull out all the stops (and ECC memory is only a small piece - backups, reliable storage, there are many other things to worry about).
For the latter category, who cares, just delete it already.
 

louie1961

Active Member
May 15, 2023
171
69
28
You'll never get a consensus on this topic. In my mind a robust backup strategy gives you all the protection you need. I think ECC only really makes sense for situations when you are actively processing data and need reliability (i.e., a server). Lots of people run NAS devices with no ECC (think of all the Synology devices out there without ECC), and I have never heard of an example of bit rot actually happening in the wild and trashing a ton of data. Likewise for the supposed ZFS scrub of death.
 

SRussell

Active Member
Oct 7, 2019
327
152
43
US
I figured I had two systems running ZFS with ECC, a third coming online, and I have my 3-2-1. The difference in pricing, for what is a content server, is about $350.
 

reasonsandreasons

Active Member
May 16, 2022
133
88
28
This question always gets muddled because there are two competing points:
  • ZFS does not uniquely require ECC more than any other filesystem; the "scrub of death" scenario is basically bunk.
  • Generally speaking, people who run ZFS and deal with its quirks (especially around expansion) are very concerned about data integrity. ECC RAM prevents a type of error that ZFS can't effectively protect against, so it's (understandably) popular amongst ZFS enthusiasts.
In your case, I'd suggest just getting the requisite Xeon and ECC unless the pricing is prohibitive. If you didn't already have the W580 board it'd be more marginal, but in my book it's worth the peace of mind, as you'll presumably be storing the first copy of everything there. If any bits are flipped on this box, having everything backed up doesn't mean much, since the backups would just copy the corrupted data.
 

Stephan

Well-Known Member
Apr 21, 2017
945
714
93
Germany
Attitude-wise I'm on the extreme ECC/ZFS-side of the reliability spectrum. I also run tape backups to LTO drives.

I want reliability through checksumming over the entire data path. From untrusted drive to untrusted cabling to barely trusted controller to trusted CPU (multiple ECC-like systems for busses, caches etc.) to trusted ECC RAM, and back.

ECC RAM support in CPUs has error counters. You can diagnose slowly dying RAM sticks without rebooting to memtest86.
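As a rough illustration of reading those counters on Linux: when an EDAC driver is loaded, the kernel exposes per-memory-controller corrected and uncorrected error counts as `ce_count` and `ue_count` under `/sys/devices/system/edac/mc/`. The sketch below assumes that layout; the function name `read_edac_counts` is just something made up for this example, and a machine without ECC or without the EDAC driver will simply return nothing.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    Assumes the Linux EDAC sysfs layout (mc0, mc1, ... directories).
    A counter that cannot be read or parsed is reported as None.
    """
    counts = {}
    for mc in sorted(Path(edac_root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, "corrected:", entry["ce_count"],
              "uncorrected:", entry["ue_count"])
```

A corrected count that keeps climbing on one controller is the kind of early warning meant here: the stick can be swapped before it produces uncorrectable errors.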

Radioactive decay of elements within the chip, put there during manufacturing as contamination, sometimes has enough energy to flip a bit. Some machines I just never turn off, like my main workstation. In such a decay event I want everything to just continue chugging along.

Sometimes all of this is thwarted by
  • Unstable CPU microcode updates from Intel or AMD
  • CPU bugs (bet you never heard of that one: linux/arch/x86/kernel/cpu/mce/core.c at master · torvalds/linux)
  • Bad motherboard designs, 4000+ LGA pin count really has gotten out of control
  • Bad Linux kernel patches and filesystem or block layer data corruption roughly every decade
  • ZFS bugs like the recent sparse file handling issue
  • Compiler bugs
Do I want to add another line item, "Non-ECC RAM messing up a random bit every quarter"? No.
 

SRussell

Active Member
Oct 7, 2019
327
152
43
US
@reasonsandreasons This is a very good point. The price difference is about $380 for the same amount of RAM. I am a home user, but my video content does generate revenue. In the big picture, if I lost two hours of video to corruption or bit rot, the loss would exceed $380 in revenue.

I appreciate this insight.
 

nabsltd

Well-Known Member
Jan 26, 2022
431
293
63
Any pro or con opinions on using an 11th gen i5 or i7 vs Xeon W in a LGA 1200 socket?
The only thing that I haven't seen pointed out is that many motherboards have a lower limit on total RAM if you don't use ECC. This is not an issue for the motherboard you have, but it's something to remember for the future.
 

NPS

Active Member
Jan 14, 2021
147
44
28
That's more a matter of UDIMM vs. RDIMM, and as those are completely (even mechanically) incompatible with DDR5, this will probably be a non-issue in the future. Even with DDR4 there are many RDIMM-compatible platforms that don't work with UDIMMs. So it's starting to be a thing of the past.