How to speed up resilver? (ZFS + Ubuntu bionic)


mrjayviper

Member
Jul 28, 2017
My setup:

  • 8x 2TB RAIDZ, built with ashift=12 (confirmed using zdb)
  • using built-in SATA ports (AMD chipset with 8 ports)
I replaced one drive with a 3TB one and the resilver rate is around 5 MB/s. According to zpool status, it will take around 240 hours to complete!

Am I doing something wrong?

I can rebuild/destroy the array, as I've copied its contents to other disks.

Thanks
 

zxv

The more I C, the less I see.
Sep 10, 2017
If it's a single vdev (8 disks as a single raidz stripe), then yes, that will be slower than say, 4 mirrored pairs. But even so, 5MB/s seems low.

Can you paste the output of "zpool status -v" and "zfs get all <poolname>" and we can try to see what's going on.
 

fjes82

New Member
Jan 4, 2020
For the record, I had the same problem and spent two nights very worried.

I tried everything: replace, detach/attach, reboots, adjusting parameters (resilver_delay, resilver_min_time_ms, top_maxinflight) and tweaking zfs_vdev_async_write_min_active (which was already at sensible values in the Ubuntu 18.04 defaults), with no improvement.
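For reference, on ZFS-on-Linux tunables like these (where they exist on your version) are exposed as module parameters under /sys/module/zfs/parameters, so they can be inspected and adjusted at runtime. A hedged sketch, assuming ZoL 0.7/0.8 parameter names; check that the file exists on your release before writing to it:

```shell
# zfs_resilver_min_time_ms is the minimum time spent resilvering per txg
# (default 3000 ms on ZoL); the name may differ on other versions.
p=/sys/module/zfs/parameters/zfs_resilver_min_time_ms
if [ -f "$p" ]; then cat "$p"; fi
# As root, raising it lets the resilver claim more of each txg
# (remember to restore the default once the resilver finishes):
# echo 5000 > "$p"
```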

In the end, I discovered that the replacement disk was running far too hot: smartctl reported it hitting 72°C, the rated limit for my ST8000VN drive. I couldn't even touch it; you could have fried eggs on it.

I had installed it very close to the other disks, and the ventilation there was insufficient.
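If overheating is a suspect, the drive temperature can be polled mid-resilver with smartctl. A minimal sketch; /dev/sdX is a placeholder device, and the attribute name can vary slightly between vendors:

```shell
# Print the current temperature from the SMART attribute table
# (field 10 is the raw value in typical smartctl -A output).
smartctl -A /dev/sdX 2>/dev/null | awk '/Temperature_Celsius/ {print $10}'
```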

After I fixed the cooling, my 2.67TB resilver estimate dropped from 562h to 13h.

Strangely, "iostat -x" reported nearly 99% utilisation with very little data transferred in the first scenario, and roughly 20% utilisation in the second.

Hope it helps any poor soul wasting his weekend like I did.
 

vanfawx

Active Member
Jan 4, 2015
Vancouver, Canada
Also worth noting: ZFS 0.8 introduced more sequential scrub/resilver logic that should help here. On one of my systems, a resilver went from 8 hours to 3.
 

gea

Well-Known Member
Dec 31, 2010
DE
There are tuning options for resilver time, but they bring very little improvement. A resilver must read all metadata and data, which means resilver time has traditionally been bound by the pool's performance for small, random I/O.

To improve this, Oracle introduced sequential resilvering in 2015: the metadata is read first and used to sort the data reads, making them more sequential than random (see "Sequential Resilvering").

This improvement is now available in Open-ZFS (current Illumos, FreeBSD and ZoL), where it is called sorted resilvering. This alone can reduce resilver/scrub times dramatically.
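As a hedged way to check whether you have the sorted scan code on ZoL: in 0.8+ it is on by default and controlled by the zfs_scan_legacy module parameter (0 = sorted, 1 = old block-order scan); the parameter simply doesn't exist on older releases:

```shell
# 0 means the new sorted scan is in use; 1 forces the legacy behaviour.
cat /sys/module/zfs/parameters/zfs_scan_legacy 2>/dev/null \
  || echo "zfs_scan_legacy not found (pre-0.8 ZFS?)"
```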
 

vangoose

Active Member
May 21, 2019
Canada
I just did a scrub on a zpool of 24 x 3TB disks (3x 8-disk raidz2 vdevs) holding 28TB of data; the scan rate was 1.5GB/s.
CentOS 8, ZFS 0.8.2