Zpool, same drive designation stuck in 2x z2 sub

Somehow I have stuffed up (story not added due to lack of comprehension)

My pool topgear has been degraded, and today it went offline (C3T13D1 did not start at all; removing and reinserting it got it back, but it is a time bomb).

After managing to get the pool back to degraded (I can't access the data from the network yet), I realised that the same drive designation appears in 2 of the raidz vdevs.

[Screenshot: SSH status -lx]
[Screenshot: Napp-it Pool Page]


My head is currently fuzzy as anything, so words aren't coming, but does anyone have ideas on how to at least get the duplicated drive SN removed?


Physically, slots C4T16D1 and C3T14D1 are empty.
SN G8RHD is in the forbidden pile (last iostat I saw: S:0 H:526 T:291).

Extra q: drives I may have pulled out of pools at various stages - how do I 'reset' them so they can go in a 'new' pool?
 

gea

Well-Known Member
Dec 31, 2010
2,809
970
113
DE
Newer HBAs use the WWN to identify disks. This is a manufacturer ID, similar to the MAC address of a NIC. Such an ID, e.g. c1t5000CCA0BCE3CE1Cd0, remains the same when you attach the disk to another physical HBA port or even to another server.

"Short numbers" like c0t0d0 reflect a physical port of your controller. With many disks this is not a good idea. In your case it seems that you have moved disks around, with the result that a certain disk is now expected at another slot.
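To make the difference between the two naming styles concrete, here is a minimal sketch that sorts a device name into "wwn" or "port". The function name and the regular expressions are my own assumptions based on the two examples above (16 hex digits after the t for a WWN, a short decimal number for a port); real systems may show other variants.

```python
import re

# Solaris-style disk names: c<controller>t<target>d<disk>, optional slice suffix.
# Assumption: a WWN-based target is 16 hex digits, a port-based target is 1-3 decimal digits.
_WWN_NAME = re.compile(r"^c\d+t[0-9a-f]{16}d\d+(s\d+)?$", re.IGNORECASE)
_PORT_NAME = re.compile(r"^c\d+t\d{1,3}d\d+(s\d+)?$", re.IGNORECASE)


def classify_disk_name(name: str) -> str:
    """Return 'wwn', 'port', or 'unknown' for a device name string."""
    if _WWN_NAME.match(name):
        return "wwn"
    if _PORT_NAME.match(name):
        return "port"
    return "unknown"


if __name__ == "__main__":
    for name in ("c1t5000CCA0BCE3CE1Cd0", "c0t0d0", "C3T13D1"):
        print(name, "->", classify_disk_name(name))
```

Names that come back as "port" are the ones that can end up pointing at a different disk after drives are moved between slots.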

How to fix:
Export + import the pool. During import all disk labels are read, which solves this problem.
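As a rough sketch of the export + import step: the two commands are `zpool export <pool>` followed by `zpool import <pool>`. The small Python wrapper below is hypothetical (the function names and the dry-run default are my own); it only prints the commands unless you explicitly switch dry-run off, and it assumes it runs as root on the storage server.

```python
import subprocess


def reimport_commands(pool: str) -> list[list[str]]:
    """Build the export and import commands for one pool, in the order to run them."""
    return [["zpool", "export", pool], ["zpool", "import", pool]]


def reimport(pool: str, dry_run: bool = True) -> list[str]:
    """Return the command lines; execute them only when dry_run is False."""
    shown = []
    for cmd in reimport_commands(pool):
        shown.append(" ".join(cmd))
        if not dry_run:
            # check=True stops after a failed export instead of trying the import anyway
            subprocess.run(cmd, check=True)
    return shown


if __name__ == "__main__":
    for line in reimport("topgear"):
        print(line)
```

With the pool name from this thread, the dry run shows `zpool export topgear` and then `zpool import topgear`.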
 
That 3059178084820097818 isn't a WWN though, is it?

I have exported and imported the pool (I was thinking I should try to disable the drives while I work on getting space ready to migrate the data).
Now there are three lines with numbers like the one above.

I thought perhaps I'd be able to find the drives that match them in my suspicious pile (if not the 2 red-flagged ones), but the numbers don't really look like WWNs, and my drives predate WWN stickers.

I'm not sure which firmware mode the M1015s are in, nor which driver Solaris 11 is using that keeps the physical port naming.

-- Have to decide which method is best to 'move' the files to the Temp8Tb pool - possibly with a post-copy check that the transfer didn't corrupt anything
-- Cannot start any moves to the new server until I:
-- -- Sort one more Sata Power cable
-- -- Choose firmware for M1015
-- -- Define HW Settings for Nappit AIO VM
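For the post-copy check mentioned in the list above, one possible approach (a sketch, not the only method) is to hash every file on the source and compare it with the copy. The function names below are made up for illustration, and SHA-256 is just one reasonable choice of hash. Note that this reads every file twice, so it is slow on large pools.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(src: Path, dst: Path) -> list[str]:
    """List files under src that are missing from dst or whose contents differ."""
    problems = []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel}")
        elif file_sha256(src_file) != file_sha256(dst_file):
            problems.append(f"differs: {rel}")
    return problems
```

An empty result from `compare_trees(Path(source_dir), Path(copy_dir))` means every source file has a byte-identical copy; it does not report extra files that exist only on the destination.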
 

gea

3059178084820097818 is a drive GUID.
This is an internal ZFS number that is shown when the disk is dead or missing.
Power down, check all cables and connectors, then retry.
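If you want to list the GUID entries without reading the status output by eye, a small parser can pick out the rows whose name is a bare number. This is only a sketch: the function name is hypothetical, the 10 to 20 digit range is my own assumption (GUIDs are 64-bit numbers), and the sample text is an illustrative mock-up in the usual `zpool status` layout, not the real output from this pool.

```python
import re

# A vdev shown by GUID is a bare decimal number; the 10-digit floor is an
# arbitrary assumption to avoid matching small counters.
_GUID_ROW = re.compile(r"^\s*(\d{10,20})\s+(\S+)(.*)$")

# Illustrative sample only, made up for this sketch.
SAMPLE = """\
  pool: topgear
 state: DEGRADED
config:

        NAME                     STATE     READ WRITE CKSUM
        topgear                  DEGRADED     0     0     0
          raidz2-0               DEGRADED     0     0     0
            c3t13d1              ONLINE       0     0     0
            3059178084820097818  UNAVAIL      0     0     0  was /dev/dsk/c3t14d1s0
"""


def find_guid_vdevs(status_text: str) -> list[dict]:
    """Return GUID, state, and former device path (if shown) for each GUID row."""
    found = []
    for line in status_text.splitlines():
        m = _GUID_ROW.match(line)
        if not m:
            continue
        guid, state, rest = m.groups()
        was = re.search(r"was\s+(\S+)", rest)
        found.append({"guid": guid, "state": state, "was": was.group(1) if was else None})
    return found


if __name__ == "__main__":
    for entry in find_guid_vdevs(SAMPLE):
        print(entry)
```

The "was" path, when present, tells you which slot name the missing disk last had, which helps when matching it against the pile of pulled drives.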

Then check the disks with a manufacturer's tool, e.g. WD Data Lifeguard, using an extended test.
You can use a Hiren's USB boot stick (Windows 10 PE with WD Data Lifeguard among others).

The disks are very old. If you get the pool back, replace them with newer disks.

I suppose you are using an older IR firmware that uses short port numbers instead of WWNs.
The current firmware version is shown during boot. An update to the newer firmware 20.00.07.00 gives you WWN naming.