ZFS on Hardware-Raid

gea

Your current problem is not caused by ZFS itself; it is OS, configuration or hardware related.
Here is what you should consider:

The fastest resilver is the sequential resilver on Oracle Solaris, but that is not free.
There is work under way to implement it in Open-ZFS.

Iops and fragmentation are the main factors for resilver time, so these are what you
should care about. Iops depend on the vdev layout, especially the number of vdevs.
A Raid-Z vdev has the iops of a single disk, while a mirror has read iops of 2 x disk and write iops of 1 x disk.
So a pool built from many mirrors is much faster than a Raid-Z pool with fewer vdevs.
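
To put rough numbers on that rule of thumb, here is a minimal Python sketch. The names `Vdev` and `pool_iops` are made up for illustration, and the 100 iops per disk is only an assumed figure for a spinning disk, not a measurement. It just adds up the per-vdev estimates from above.

```python
from dataclasses import dataclass


@dataclass
class Vdev:
    kind: str   # "mirror" or "raidz"
    disks: int  # number of disks in this vdev


def pool_iops(vdevs, disk_iops):
    """Return (read_iops, write_iops) for a pool, using the rule of thumb:
    a Raid-Z vdev counts as one disk, a mirror reads from every side
    (2 x disk for a two-way mirror) and writes like one disk."""
    read = 0
    write = 0
    for vdev in vdevs:
        if vdev.kind == "mirror":
            read += vdev.disks * disk_iops
            write += disk_iops
        elif vdev.kind == "raidz":
            read += disk_iops
            write += disk_iops
        else:
            raise ValueError(f"unknown vdev kind: {vdev.kind}")
    return read, write


DISK_IOPS = 100  # assumption: rough figure for one spinning disk

# The same 12 disks, laid out in two different ways
six_mirrors = [Vdev("mirror", 2) for _ in range(6)]
two_raidz = [Vdev("raidz", 6) for _ in range(2)]

print(pool_iops(six_mirrors, DISK_IOPS))  # (1200, 600)
print(pool_iops(two_raidz, DISK_IOPS))    # (200, 200)
```

With the same twelve disks, the estimate for the mirror pool is several times that of the Raid-Z pool, which is the whole point of preferring many vdevs.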

Fragmentation depends on the fill rate. If you want a fast pool, stay below, say, 60% fill rate.
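
A small Python sketch of that check, assuming you already have the used and total bytes of the pool from your own tooling. The function names and the 100 TB / 40 TB figures are hypothetical; the 60% is the rough target from above, not a hard limit.

```python
def fill_percent(used_bytes, total_bytes):
    """Fill rate of the pool in percent."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100 * used_bytes / total_bytes


def headroom_bytes(used_bytes, total_bytes, target_percent=60):
    """Bytes that can still be written before the pool crosses the target
    fill rate. Negative means the pool is already above the target."""
    return total_bytes * target_percent // 100 - used_bytes


TB = 10**12
total = 100 * TB  # hypothetical pool size
used = 40 * TB    # hypothetical used space

print(fill_percent(used, total) < 60)     # True
print(headroom_bytes(used, total) // TB)  # 20
```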

With Raid-Z, always use Z2 or Z3 so that any two disks (any three with Z3) can fail. On mirrors
this level of redundancy is expensive, but a failure of both disks in the same mirror is not very likely. A backup is always a good idea.

A larger recordsize reduces metadata but can slow things down if your use case is, for example, iSCSI
or VM storage, where the application works with a small blocksize. Ashift=12 is the default
on current setups (512e or 4Kn disks).
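
For reference, ashift is the base-2 exponent of the sector size the pool works with, so ashift=12 means 4096-byte sectors. A tiny Python sketch (the function name `ashift_for` is made up for illustration):

```python
def ashift_for(sector_bytes):
    """Return the ashift value (log2 of the sector size) for a power-of-two sector size."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


print(ashift_for(4096))  # 12, physical sector size of 512e and 4Kn disks
print(ashift_for(512))   # 9, old 512-byte native disks
print(1 << 12)           # 4096, going the other way
```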

With many disks, especially behind an expander, use SAS disks. A single bad SATA disk can
cause blocking or resets on an expander.

RAM as a read cache for metadata (or, with less RAM, an L2ARC, e.g. an Intel Optane) reduces
reads from the pool, which makes a resilver faster.

Avoid dedup (at least outside Solaris and dedup2) and stay with lz4 compression.

Prefer a Solarish-based OS for ZFS storage, or a FreeBSD-based one.

With this in mind, a resilver is a matter of, say, a day or less.
 

ttabbal

Or reduce the random load on the pool while repairs are in progress. It was mentioned that this is a backup target. Pause the backups while the resilver runs.

One of the reasons an array rebuild is fast on hardware RAID is that it is often done in BIOS or EFI with no other load. Keeping the disks in sequential mode helps a LOT.

I recently replaced a mirror disk in OpenZFS on Linux. With the load kept down, I was near the max sequential write speed of the drive most of the time, and it took hours, not days, to complete. In this case it was a 6TB disk, but the array is not full. I believe it had about 3TB to copy.
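
As a back-of-the-envelope check in Python: the 3TB is from my case above, but the 150 MB/s sustained write speed is only an assumed figure for a spinning disk, and the function name is made up for illustration. This is the best case where the resilver stays sequential with no other load.

```python
def resilver_hours(data_bytes, write_bytes_per_sec):
    """Best-case resilver time in hours, if the new disk is written at a steady rate."""
    if write_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return data_bytes / write_bytes_per_sec / 3600


data_to_copy = 3 * 10**12  # about 3TB to copy
write_rate = 150 * 10**6   # assumption: 150 MB/s sustained sequential write

print(round(resilver_hours(data_to_copy, write_rate), 1))  # 5.6
```

Random load from other work on the pool pulls the effective write rate well below the sequential figure, which is how hours turn into days.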