Your current problem is not caused by ZFS itself; it is related to the OS, the configuration or the hardware.
What you should consider:
The fastest resilver is the sequential resilver in Oracle Solaris, but Solaris is not free.
There is work under way to implement it in OpenZFS.
IOPS and fragmentation are the main factors that determine resilver time, so these are
what you should care about. IOPS depend on the vdev layout, especially the number of vdevs.
A RAID-Z vdev has the IOPS of a single disk, while a mirror has 2x disk IOPS for reads and 1x for writes.
So a pool built from many mirrors is much faster than a RAID-Z pool with fewer vdevs.
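To put rough numbers on the mirror vs. RAID-Z comparison, here is a minimal Python sketch of that rule of thumb. It assumes 2-way mirrors and that pool IOPS simply add up across vdevs; the function name pool_iops and the 100 IOPS per disk figure are made up for illustration and are not measurements.

```python
# Rule-of-thumb estimate only, not a benchmark.
# Assumptions (not from any tool): 2-way mirrors, IOPS add up across vdevs,
# and a hypothetical 100 IOPS per disk.

def pool_iops(vdev_count, vdev_type, disk_iops=100):
    """Return a (read_iops, write_iops) estimate for a whole pool."""
    if vdev_type == "raidz":
        # A RAID-Z vdev behaves like a single disk for IOPS.
        read_per_vdev, write_per_vdev = disk_iops, disk_iops
    elif vdev_type == "mirror":
        # A 2-way mirror reads from both disks but writes to both at once.
        read_per_vdev, write_per_vdev = 2 * disk_iops, disk_iops
    else:
        raise ValueError("vdev_type must be 'raidz' or 'mirror'")
    return vdev_count * read_per_vdev, vdev_count * write_per_vdev


# 12 disks as 6 mirrors vs. 12 disks as 2 RAID-Z2 vdevs of 6 disks each
print(pool_iops(6, "mirror"))  # (1200, 600)
print(pool_iops(2, "raidz"))   # (200, 200)
```

With the same 12 disks the mirror layout comes out at several times the IOPS of the RAID-Z layout in this simple model, which is why the number of vdevs matters so much for resilver time.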
Fragmentation depends on the fill rate. If you want a fast pool, stay below a fill rate of, say, 60%.
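A quick way to check a pool against that guideline, as a small Python sketch. The 60% target is only the rough value suggested above, and the helper names and capacity figures are hypothetical; you would feed in the used and total sizes reported for your own pool.

```python
# Minimal sketch: compare a pool's fill rate with a target percentage.
# The default of 60 is the rough guideline from the post, not a hard limit.

def fill_rate_percent(used_bytes, total_bytes):
    """Return the fill rate of a pool in percent."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def is_below_target(used_bytes, total_bytes, target_percent=60):
    """True if the pool is still below the target fill rate."""
    return fill_rate_percent(used_bytes, total_bytes) < target_percent


TB = 10**12
print(is_below_target(30 * TB, 60 * TB))  # 50% full
print(is_below_target(45 * TB, 60 * TB))  # 75% full
```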
With RAID-Z, always use Z2 or Z3 so that any two disks can fail. With mirrors this is expensive,
but a failure of both disks in the same mirror is not very likely. A backup is always a good idea.
A larger recordsize reduces metadata but can slow things down if your use case is, for example, iSCSI
or VM storage where the application works with a small blocksize. ashift=12 is the default
on current setups (512e or 4Kn disks).
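For reference, ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so ashift=12 means 4096-byte sectors. A tiny Python sketch of the conversion (the helper names are made up for illustration):

```python
# ashift is log2 of the sector size in bytes: 2**12 = 4096.

def sector_size(ashift):
    """Sector size in bytes for a given ashift value."""
    return 1 << ashift


def ashift_for(sector_bytes):
    """ashift value for a power-of-two sector size in bytes."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


print(sector_size(12))   # 4096
print(sector_size(9))    # 512
print(ashift_for(4096))  # 12
```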
With many disks, especially behind an expander, use SAS disks. A single bad SATA disk can
cause blocking or resets on an expander.
RAM as a read cache for metadata (or, with less RAM, an L2ARC, e.g. an Intel Optane) reduces
reads from the pool, which makes a resilver faster.
Avoid dedup (at least outside Solaris and dedup2) and stay with lz4 compression.
Prefer a Solarish-based OS for ZFS storage, or a FreeBSD-based one.
With this in mind, a resilver is a matter of, say, a day or less.