Replacing a ZFS Drive, Something Seems Wrong

pinkanese

New Member
Jun 19, 2014
I have a ZFS array attached to my Proxmox host for media and general storage. One of the drives had a large number of reallocated sectors and came up faulted, so I got a replacement. I ran the replace command, but the completion percentage hasn't moved since I started and the ETA keeps going up.

I know it is a slightly strange setup, but I have seven 3TB drives in a RAIDZ2 and a pair of 4TB drives in a mirror. One of the 3TB drives failed.

I tried to offline the faulted drive first, but that didn't seem to have any effect. So I just ran zpool replace PoolofThrees /dev/disk/by-id/wwn-0x5000039ff4f5a07c /dev/disk/by-id/wwn-0x5000039ff4cb4a1c
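
The offline attempt was just the plain command against the faulted disk, something like this (from memory, so treat it as approximate):

Code:
  zpool offline PoolofThrees wwn-0x5000039ff4f5a07c

This is what zpool status shows at the moment: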

Code:
  pool: PoolofThrees
state: UNAVAIL
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec 22 18:04:33 2020
        2.56T scanned at 228M/s, 917G issued at 79.8M/s, 17.3T total
        169M resilvered, 5.17% done, 2 days 11:58:32 to go
config:

        NAME                          STATE     READ WRITE CKSUM
        PoolofThrees                  UNAVAIL      0     0     0  insufficient replicas
          raidz2-0                    UNAVAIL      0     0     0  cannot open
            wwn-0x5000039fe3cac490    ONLINE       0     0     0
            wwn-0x5000039fe3cb445e    ONLINE       0     0     0
            wwn-0x5000039fe3c400e1    ONLINE       0     0     0
            wwn-0x5000039ff4f2df57    ONLINE       0     0     0
            wwn-0x5000039ff4f2df60    ONLINE       0     0     0
            wwn-0x5000039ff4f595db    ONLINE       0     0     0
            replacing-6               DEGRADED     0     0     0
              wwn-0x5000039ff4f5a07c  FAULTED     76     0     0  too many errors
              wwn-0x5000039ff4cb4a1c  ONLINE       0     0     0  (resilvering)
          mirror-1                    ONLINE       0     0     0
            wwn-0x5000cca250ebf410    ONLINE       0     0     0
            wwn-0x5000cca23dd2ea40    ONLINE       0     0     0

errors: 1616047 data errors, use '-v' for a list
When I add -v to see the errors, I get "List of errors unavailable: pool I/O is currently suspended", and when I check the event log, all the errors are marked as "pool_failmode=wait".
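
For reference, I was looking at the errors and the event log with roughly these (standard commands, nothing fancy):

Code:
  zpool status -v PoolofThrees
  zpool events -v PoolofThrees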

I expect the resilvering to take a few days for a 3TB drive, but it has been sitting at 5.17% for a few hours now. I am also a little confused as to why the pool is unavailable.

Did I break something?
 

andrewbedia

Active Member
Jan 11, 2013
I'm not terribly familiar with Proxmox (I know what it is, but I've never used it), but I know a lot about ZFS.

Your I/O is completely halted because a top-level vdev has gone UNAVAIL. You don't have a lot of data here (2.57TB isn't a terribly high amount). What I'd recommend is exporting the pool (if it will let you) and then trying to re-import it. If the resilver kicks back in, see if it will complete. If it does, scrub the pool afterwards to see whether all of the permanent errors get recovered. It might not let you export gracefully, in which case you'll need to reboot, and you might have to hard-reboot because of D-state processes holding up the entire system.
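
In command terms, roughly this (only if the pool will cooperate; adjust if you're copying and pasting):

Code:
  zpool export PoolofThrees     # may hang while I/O is suspended; reboot if it does
  zpool import PoolofThrees     # the resilver should pick back up on its own
  zpool status PoolofThrees     # keep an eye on progress
  zpool scrub PoolofThrees      # once the resilver finishes, to re-check the flagged errors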

If it will not complete (the pool goes UNAVAIL again), I would see if you can import it with the replacement drive out (cancelling the resilver and just leaving the vdev DEGRADED). At that point, snapshot all of your datasets/filesystems, then re-import the pool read-only, e.g. `zpool export PoolofThrees; zpool import -o readonly=on PoolofThrees`. From there, evacuate the data: create a new pool on a different drive or drives (maybe a USB drive from Best Buy, or drives you have lying around) and send all of the datasets over. Verify it's all good on the new pool, destroy and re-create the old pool with the new drive already in place, then sync the data back. I've sketched the evacuation steps below.
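
(The scratch pool name and spare device here are just placeholders.)

Code:
  # snapshot everything first, while the pool is still writable
  zfs snapshot -r PoolofThrees@evac

  # re-import read-only so nothing else touches the sick pool
  zpool export PoolofThrees
  zpool import -o readonly=on PoolofThrees

  # build a scratch pool on whatever spare drive(s) you have
  zpool create scratch /dev/disk/by-id/<spare-drive>

  # send everything over, then spot-check it on the receiving side
  zfs send -R PoolofThrees@evac | zfs recv -u scratch/evac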
 

pinkanese

New Member
Jun 19, 2014
Exporting didn't work, so I forcibly shut down the server, pulled the degraded drive plus the new one, and rebooted; the pool came up looking normal with the bad drive marked as offline. I did a zpool clear, added the replacement drive back in, wiped it, and re-ran the replace command. Everything seems to be working fine now.
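
For anyone who finds this later, the sequence after the reboot was roughly this (the wipe step is from memory, so treat it as approximate):

Code:
  # clear the error state on the now-degraded pool
  zpool clear PoolofThrees

  # wipe the partial resilver labels off the replacement drive
  wipefs -a /dev/disk/by-id/wwn-0x5000039ff4cb4a1c
  zpool labelclear -f /dev/disk/by-id/wwn-0x5000039ff4cb4a1c

  # kick off the replace again
  zpool replace PoolofThrees wwn-0x5000039ff4f5a07c /dev/disk/by-id/wwn-0x5000039ff4cb4a1c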

The ETA is now running down instead of up. I am very pleased. Thank you very much, andrewbedia.

The only thing I don't understand is what went wrong. Should I have detached the failing drive before I tried to replace it? I am guessing ZFS was trying to pull data off the failing drive to rebuild the array and ran into issues.
 

andrewbedia

Active Member
Jan 11, 2013
698
247
43
Glad to be of help.

As for what happened... hmm, I'm honestly not too sure. I'm not great at digging far under the hood when things go sideways like that. There is logging in ZFS for events like this, but I've never looked into it.

There's no exact answer, I suppose, for deciding whether to remove the drive before or after the replace. On paper, the safest way is to leave the "bad" drive in the array in case the new drive dies mid-resilver. This is helpful if you're pre-emptively replacing a drive that is showing signs of going bad but hasn't totally fallen off into hopeless territory.

At the same time, the bad drive can be so "bad" that it just hangs up the whole replace process (maybe hangs up the entire controller?) or makes it extremely slow (e.g. if it's having a problem where it can only read at ~1MB/s), in which case it is FAR better to rip out the "bad" drive, or offline/detach it, and then do the replace. I have also seen situations where leaving the drive in there will cause the resilver to go in a loop and never complete. The commands for that approach are sketched below.
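
In command terms, the "pull it first" approach is basically this (the disk names are just examples):

Code:
  # take the bad drive out of service so it can't stall the resilver
  zpool offline PoolofThrees wwn-0xBADDISK

  # then replace it; the resilver rebuilds from the surviving raidz2 members
  zpool replace PoolofThrees wwn-0xBADDISK /dev/disk/by-id/wwn-0xNEWDISK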

Sorry, but that's the best I got.