mdadm is behaving in a way that I don't fully understand. I have a seven-drive RAID 6 array of 6TB drives (sda through sdg) plus a hot spare (sdh). With two drives' worth of parity for RAID 6, that leaves five drives of usable storage, or 30TB. More on that in a moment.
Drive sdc showed some read errors (pasted at the bottom) that seem to have triggered the spare to be brought into the array. However, after the rebuild was complete (I think it was a 'rebuild'), mdadm doesn't show any drive in a degraded state, and sdh is now just an active drive. I'm fairly certain the size of the array grew as well to 36TB.
Note that I didn't issue any commands to mdadm, other than mdadm -D to monitor the progress of what was going on.
Does this make any sense to anyone? It seems like mdadm should show a degraded drive if it's going to activate the spare. And it also makes little sense to me that the array size grew from 30 to 36 TB.
It almost feels like mdadm decided to move sdh from a spare into the RAID array, but I certainly didn't ask it to do that.
Finally, dmesg shows some odd messages about sdh involving power-on and device resets (the relevant log messages are again at the bottom).
Thanks in advance for any thoughts you have!
/dev/md124:
Version : 1.2
Creation Time : Mon Apr 20 00:08:20 2015
Raid Level : raid6
Array Size : 35162339328 (32.75 TiB 36.01 TB)
Used Dev Size : 5860389888 (5.46 TiB 6.00 TB)
Raid Devices : 8
Total Devices : 8
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Aug 2 14:46:52 2023
State : clean
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : localhost:export
UUID : 94dfc16f:5ba9e1a2:e31dda07:482141e3
Events : 3174657
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
8 8 49 3 active sync /dev/sdd1
4 8 65 4 active sync /dev/sde1
5 8 81 5 active sync /dev/sdf1
6 8 97 6 active sync /dev/sdg1
7 8 113 7 active sync /dev/sdh1
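As a sanity check on the 30TB versus 36TB figures, here is a minimal Python sketch of the RAID 6 capacity arithmetic. It assumes the usual rule that usable space is (number of raid devices minus two) times the per-device size, and that the sizes mdadm -D prints are in KiB. The function names are made up for illustration; the numbers are copied from the output above.

```python
# Figures copied from the mdadm -D output above (assumed to be in KiB).
USED_DEV_SIZE_KIB = 5860389888
REPORTED_ARRAY_SIZE_KIB = 35162339328
PARITY_DEVICES = 2  # RAID 6 sets aside two devices' worth of space for parity


def raid6_capacity_kib(raid_devices, dev_size_kib=USED_DEV_SIZE_KIB):
    """Usable capacity of a RAID 6 array with the given number of raid devices."""
    return (raid_devices - PARITY_DEVICES) * dev_size_kib


def kib_to_tb(kib):
    """Convert KiB to decimal terabytes (10**12 bytes)."""
    return kib * 1024 / 1e12


if __name__ == "__main__":
    for n in (7, 8):
        size = raid6_capacity_kib(n)
        print(f"{n} raid devices: {size} KiB = {kib_to_tb(size):.2f} TB")
    # Compare the eight-device figure against what mdadm reports.
    print("matches reported Array Size:",
          raid6_capacity_kib(8) == REPORTED_ARRAY_SIZE_KIB)
```

The eight-device result equals the reported Array Size exactly, which would be consistent with Raid Devices having gone from 7 to 8 rather than sdh standing in for a failed member.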
sdc errors in dmesg
[Sun Jul 30 11:32:23 2023] sd 0:0:2:0: [sdc] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=4s
[Sun Jul 30 11:32:23 2023] sd 0:0:2:0: [sdc] tag#1 Sense Key : Medium Error [current] [descriptor]
[Sun Jul 30 11:32:23 2023] sd 0:0:2:0: [sdc] tag#1 Add. Sense: Unrecovered read error
[Sun Jul 30 11:32:23 2023] sd 0:0:2:0: [sdc] tag#1 CDB: Read(16) 88 00 00 00 00 01 bd 96 6e 00 00 00 01 00 00 00
[Sun Jul 30 11:32:23 2023] blk_update_request: critical medium error, dev sdc, sector 7475719856
[Sun Jul 30 13:55:37 2023] sd 0:0:2:0: [sdc] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
[Sun Jul 30 13:55:37 2023] sd 0:0:2:0: [sdc] tag#1 Sense Key : Medium Error [current] [descriptor]
[Sun Jul 30 13:55:37 2023] sd 0:0:2:0: [sdc] tag#1 Add. Sense: Unrecovered read error
[Sun Jul 30 13:55:37 2023] sd 0:0:2:0: [sdc] tag#1 CDB: Read(16) 88 00 00 00 00 02 24 c9 a5 00 00 00 01 00 00 00
[Sun Jul 30 13:55:37 2023] blk_update_request: critical medium error, dev sdc, sector 9207129536
[Sun Jul 30 13:55:41 2023] sd 0:0:2:0: [sdc] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
[Sun Jul 30 13:55:41 2023] sd 0:0:2:0: [sdc] tag#0 Sense Key : Medium Error [current] [descriptor]
[Sun Jul 30 13:55:41 2023] sd 0:0:2:0: [sdc] tag#0 Add. Sense: Unrecovered read error
[Sun Jul 30 13:55:41 2023] sd 0:0:2:0: [sdc] tag#0 CDB: Read(16) 88 00 00 00 00 02 24 c9 a5 c0 00 00 00 40 00 00
[Sun Jul 30 13:55:41 2023] blk_update_request: critical medium error, dev sdc, sector 9207129536
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127488 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127496 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127504 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127512 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127520 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127528 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127536 on sdc1)
[Sun Jul 30 13:55:41 2023] md/raid:md124: read error corrected (8 sectors at 9207127544 on sdc1)
[Sun Jul 30 13:55:45 2023] sd 0:0:2:0: [sdc] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=4s
[Sun Jul 30 13:55:45 2023] sd 0:0:2:0: [sdc] tag#1 Sense Key : Medium Error [current] [descriptor]
[Sun Jul 30 13:55:45 2023] sd 0:0:2:0: [sdc] tag#1 Add. Sense: Unrecovered read error
[Sun Jul 30 13:55:45 2023] sd 0:0:2:0: [sdc] tag#1 CDB: Read(16) 88 00 00 00 00 02 24 c9 a8 00 00 00 01 00 00 00
[Sun Jul 30 13:55:45 2023] blk_update_request: critical medium error, dev sdc, sector 9207130176
[Sun Jul 30 13:55:49 2023] sd 0:0:2:0: [sdc] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=4s
[Sun Jul 30 13:55:49 2023] sd 0:0:2:0: [sdc] tag#0 Sense Key : Medium Error [current] [descriptor]
[Sun Jul 30 13:55:49 2023] sd 0:0:2:0: [sdc] tag#0 Add. Sense: Unrecovered read error
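For anyone wanting to cross-check the CDB lines against the reported failing sectors, here is a small Python sketch that decodes a SCSI Read(16) command block. It relies on the standard Read(16) layout (opcode 0x88 in byte 0, an 8-byte big-endian starting LBA in bytes 2 to 9, and a 4-byte transfer length in bytes 10 to 13). The function name is made up for illustration, and the inputs are two of the CDBs from the log above.

```python
def decode_read16(cdb_hex):
    """Return (starting LBA, transfer length in blocks) for a READ(16) CDB."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")
    length = int.from_bytes(cdb[10:14], "big")
    return lba, length


if __name__ == "__main__":
    # (CDB, failing sector from the blk_update_request line that follows it)
    samples = [
        ("88 00 00 00 00 01 bd 96 6e 00 00 00 01 00 00 00", 7475719856),
        ("88 00 00 00 00 02 24 c9 a5 c0 00 00 00 40 00 00", 9207129536),
    ]
    for cdb_hex, bad_sector in samples:
        lba, length = decode_read16(cdb_hex)
        in_range = lba <= bad_sector < lba + length
        print(f"LBA {lba}, {length} blocks, "
              f"sector {bad_sector} inside request: {in_range}")
```

In both cases the sector named in the error falls inside the range the read command asked for, so the CDB and blk_update_request lines appear to describe the same failed reads on sdc.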
sdh errors in dmesg
[Sat Jul 29 09:50:05 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 09:57:45 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 09:57:46 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:05:21 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:05:22 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:05:23 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:05:24 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:05:39 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:05:39 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:15:50 2023] sd 0:0:7:0: Power-on or device reset occurred
[Sat Jul 29 10:15:50 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:38:24 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:38:24 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:39:39 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:39:40 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:43:28 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:43:28 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:43:30 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:43:30 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:52:17 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:52:17 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:52:45 2023] sd 0:0:7:0: Power-on or device reset occurred
[Wed Aug 2 13:52:45 2023] sd 0:0:7:0: Power-on or device reset occurred