Hi. I hope someone here can help me.
Since 2015 I have run a zpool of five WD Reds in raidz2 on Ubuntu Server 16.04. Recently one of the drives started reporting checksum errors. After a scrub the error count went up, so I decided to check whether I should replace the drive. smartctl reported that everything is fine, and there were no errors even with the long self-test. I then removed the suspect drive to run some disk surface tests with Windows tools (I'm not that good with Linux disk tools); in the process all of its partitions were deleted and the disk was formatted. No disk problems were reported, so I decided the disk is healthy enough and that the SATA cable was possibly the cause of the checksum errors. I also thought it could be my power supply, but in that case I would expect errors on more than one disk. So I replaced the cable and put the disk back in its place. Since all the partitions were gone, I had to resilver the pool with zpool replace (the exact command is in the second code block below).
I use ashift=9 for some odd reasons I had back in 2015. I would choose ashift=12 now, but it is what it is.
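For anyone not familiar with the setting: as far as I understand it, ashift is the base-2 exponent of the sector size ZFS assumes for a vdev, so the two values work out as in the small sketch below. The helper name `ashift_to_bytes` is made up for this post and is not part of any ZFS tool.

```shell
# Hypothetical helper (not a ZFS command): convert an ashift value
# to the sector size in bytes, i.e. 2 raised to the power of ashift.
ashift_to_bytes() {
    echo $((1 << $1))
}

ashift_to_bytes 9    # prints 512
ashift_to_bytes 12   # prints 4096
```

So my pool is laid out for 512-byte sectors, while ashift=12 would mean 4096-byte sectors.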
After 35.5 hours the resilver finished. The resulting zpool status output is in the first code block below.
And here is my problem: the disk is stuck in the replacing state with an "insufficient replicas" message. For some reason zpool says the drive is removed, but it is actually online: I can see it in the device list, read its SMART data, and so on.
So is there any way to bring that disk back online and finish the replacement?
Code:
  pool: Zdata
 state: DEGRADED
  scan: resilvered 552K in 35h34m with 0 errors on Tue May 28 13:07:29 2019
config:

NAME                                            STATE      READ WRITE CKSUM
Zdata                                           DEGRADED      0     0     0
  raidz2-0                                      DEGRADED      0     0     0
    replacing-0                                 UNAVAIL       0     0     0  insufficient replicas
      old                                       UNAVAIL       0     0 4,79K
      ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1JVVJ7K  REMOVED       0     0     0
    ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1JVVZFK    ONLINE        0     0     0
    ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E3CLYA8T    ONLINE        0     0     0
    ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E7TPZJ0K    ONLINE        0     0     0
    ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E7TPZL3C    ONLINE        0     0     0

errors: No known data errors
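In case it helps with reading the output, here is a rough sketch of how the entries that are not ONLINE could be pulled out of the status text. The function name `unhealthy_vdevs` is made up for this post, and the sketch assumes the column layout shown above (name first, state second, table between the NAME/STATE header and the "errors:" line).

```shell
# Hypothetical helper: read `zpool status` text on stdin and print
# "NAME STATE" for every row of the config table whose state is not ONLINE.
unhealthy_vdevs() {
    awk '
        # The header row marks the start of the vdev table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line marks the end of the table.
        /^errors:/ { in_config = 0 }
        # Inside the table, report every row that is not ONLINE.
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Usage on the live system (assumes the pool name Zdata from this post):
#   zpool status Zdata | unhealthy_vdevs
```

On my output this should list the pool, raidz2-0, replacing-0, old, and the ...JVVJ7K disk, which is the chain I am trying to get back to ONLINE.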
Code:
zpool replace Zdata /dev/disk/by-id/ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E1JVVJ7K -f -o ashift=9