mdadm raid lost


pwr94

Hi

I have a commercial NAS, a Netgear ReadyNAS 104. My network shares have disappeared, and when I open the web application that manages the NAS, nothing is listed. I only get a message about a degraded disk.

That may be the problem, so I logged in via SSH to get more information.

4.4.218.armada.1 #1 SMP Mon Mar 14 21:47:14 UTC 2022
I can see only two active arrays: presumably the base one that hosts the system, and the one for the swap partitions.

~# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid10 sda2[0] sdc2[3] sdd2[2] sdb2[1]
1044480 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

md0 : active raid1 sda1[0] sdb1[5] sdc1[4]
4192192 blocks super 1.2 [4/3] [UUU_]
One member is missing from the md0 array ([4/3], [UUU_]); apparently some partitions are damaged.
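
The [4/3] and [UUU_] fields can also be checked mechanically. The following is a minimal sketch, assuming POSIX sh and awk are available; degraded_arrays is a hypothetical helper name, and the sample input is the mdstat output quoted above.

```shell
#!/bin/sh
# degraded_arrays is a hypothetical helper: it reads /proc/mdstat-style text
# on stdin and reports every array whose [total/active] counter shows fewer
# active members than the array was built with.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            # Strip the brackets and split "total/active".
            split(substr($0, RSTART + 1, RLENGTH - 2), c, "/")
            if (c[2] + 0 < c[1] + 0)
                print name, "has", c[2], "of", c[1], "members"
        }
    '
}

# Sample input copied from the output above.
degraded_arrays <<'EOF'
md1 : active raid10 sda2[0] sdc2[3] sdd2[2] sdb2[1]
1044480 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

md0 : active raid1 sda1[0] sdb1[5] sdc1[4]
4192192 blocks super 1.2 [4/3] [UUU_]
EOF
```

On the NAS itself the same function could presumably be fed with `degraded_arrays < /proc/mdstat`.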

~# mdadm --assemble --scan -v
mdadm: looking for devices for further assembly
mdadm: no recogniseable superblock on /dev/md/1
mdadm: no recogniseable superblock on /dev/md/0
mdadm: /dev/sdd2 is busy - skipping
mdadm: Cannot read superblock on /dev/sdd1
mdadm: no RAID superblock on /dev/sdd1
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdc2 is busy - skipping
mdadm: /dev/sdc1 is busy - skipping
mdadm: No super block found on /dev/sdc (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdc
mdadm: /dev/sdb2 is busy - skipping
mdadm: /dev/sdb1 is busy - skipping
mdadm: No super block found on /dev/sdb (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdb
mdadm: /dev/sda2 is busy - skipping
mdadm: /dev/sda1 is busy - skipping
mdadm: No super block found on /dev/sda (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sda
mdadm: No super block found on /dev/mtdblock4 (Expected magic a92b4efc, got ffffffff)
mdadm: no RAID superblock on /dev/mtdblock4
mdadm: No super block found on /dev/mtdblock3 (Expected magic a92b4efc, got 22d44a6e)
mdadm: no RAID superblock on /dev/mtdblock3
mdadm: No super block found on /dev/mtdblock2 (Expected magic a92b4efc, got 2a00000a)
mdadm: no RAID superblock on /dev/mtdblock2
mdadm: No super block found on /dev/mtdblock1 (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/mtdblock1
mdadm: No super block found on /dev/mtdblock0 (Expected magic a92b4efc, got 2b023063)
mdadm: no RAID superblock on /dev/mtdblock0
mdadm: /dev/sdd3 is identified as a member of /dev/md/data-0, slot -1.
mdadm: /dev/sdc3 is identified as a member of /dev/md/data-0, slot 2.
mdadm: /dev/sdb3 is identified as a member of /dev/md/data-0, slot 1.
mdadm: /dev/sda3 is identified as a member of /dev/md/data-0, slot 0.
mdadm: added /dev/sdb3 to /dev/md/data-0 as 1
mdadm: added /dev/sdc3 to /dev/md/data-0 as 2 (possibly out of date)
mdadm: no uptodate device for slot 3 of /dev/md/data-0
mdadm: added /dev/sdd3 to /dev/md/data-0 as -1
mdadm: added /dev/sda3 to /dev/md/data-0 as 0
mdadm: /dev/md/data-0 assembled from 2 drives and 1 spare - not enough to start the array.
mdadm: looking for devices for further assembly
mdadm: /dev/sdd2 is busy - skipping
mdadm: /dev/sdc2 is busy - skipping
mdadm: /dev/sdc1 is busy - skipping
mdadm: /dev/sdb2 is busy - skipping
mdadm: /dev/sdb1 is busy - skipping
mdadm: /dev/sda2 is busy - skipping
mdadm: /dev/sda1 is busy - skipping
mdadm: No arrays found in config file or automatically
Partition 3 of each disk (the *3 partitions) is where the RAID 5 with all my data should be, but apparently the array does not assemble, so it is not mounted.

mdadm -D /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Tue Jun 25 01:22:54 2013
Raid Level : raid1
Array Size : 4192192 (4.00 GiB 4.29 GB)
Used Dev Size : 4192192 (4.00 GiB 4.29 GB)
Raid Devices : 4
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Sat Feb 4 16:16:52 2023
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Consistency Policy : unknown

Name : 0e345282:0 (local to host 0e345282)
UUID : e22023fd:0e9aac50:725bf721:0f84047e
Events : 967102

Number   Major   Minor   RaidDevice   State
   0       8       1         0        active sync   /dev/sda1
   4       8      33         1        active sync   /dev/sdc1
   5       8      17         2        active sync   /dev/sdb1
   -       0       0         3        removed


mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Thu Feb 2 16:41:00 2023
Raid Level : raid10
Array Size : 1044480 (1020.00 MiB 1069.55 MB)
Used Dev Size : 522240 (510.00 MiB 534.77 MB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Thu Feb 2 17:25:49 2023
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0

Layout : near=2
Chunk Size : 512K

Consistency Policy : unknown

Name : 0e345282:1 (local to host 0e345282)
UUID : de5ad081:1c7d2a99:0618a920:625f66ba
Events : 19

Number   Major   Minor   RaidDevice   State
   0       8       2         0        active sync set-A   /dev/sda2
   1       8      18         1        active sync set-B   /dev/sdb2
   2       8      50         2        active sync set-A   /dev/sdd2
   3       8      34         3        active sync set-B   /dev/sdc2

mdadm --examine /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
/dev/sda3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : d8a15531:745d2545:2333dc4d:419eb9b7
Name : 0e345282:data-0 (local to host 0e345282)
Creation Time : Tue Jun 25 01:22:54 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 3897325681 (1858.39 GiB 1995.43 GB)
Array Size : 5845987968 (5575.17 GiB 5986.29 GB)
Used Dev Size : 3897325312 (1858.39 GiB 1995.43 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262064 sectors, after=369 sectors
State : clean
Device UUID : a2770411:80e5fbaf:b11d7184:d43723dc

Update Time : Thu Dec 15 15:01:47 2022
Checksum : a79232df - correct
Events : 84389

Layout : left-symmetric
Chunk Size : 64K

Device Role : Active device 0
Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : d8a15531:745d2545:2333dc4d:419eb9b7
Name : 0e345282:data-0 (local to host 0e345282)
Creation Time : Tue Jun 25 01:22:54 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 3897325681 (1858.39 GiB 1995.43 GB)
Array Size : 5845987968 (5575.17 GiB 5986.29 GB)
Used Dev Size : 3897325312 (1858.39 GiB 1995.43 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262064 sectors, after=369 sectors
State : clean
Device UUID : 3f21d155:5fec0ece:7a7b7907:b8a6e495

Update Time : Thu Dec 15 15:01:47 2022
Checksum : 473bc1e9 - correct
Events : 84389

Layout : left-symmetric
Chunk Size : 64K

Device Role : Active device 1
Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : d8a15531:745d2545:2333dc4d:419eb9b7
Name : 0e345282:data-0 (local to host 0e345282)
Creation Time : Tue Jun 25 01:22:54 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 3897325681 (1858.39 GiB 1995.43 GB)
Array Size : 5845987968 (5575.17 GiB 5986.29 GB)
Used Dev Size : 3897325312 (1858.39 GiB 1995.43 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=262064 sectors, after=369 sectors
State : clean
Device UUID : a4172ab0:bf85504a:8817884d:1939a351

Update Time : Thu Dec 15 14:56:59 2022
Checksum : 1fa17124 - correct
Events : 83531

Layout : left-symmetric
Chunk Size : 64K

Device Role : Active device 2
Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x8
Array UUID : d8a15531:745d2545:2333dc4d:419eb9b7
Name : 0e345282:data-0 (local to host 0e345282)
Creation Time : Tue Jun 25 01:22:54 2013
Raid Level : raid5
Raid Devices : 4

Avail Dev Size : 3897325681 (1858.39 GiB 1995.43 GB)
Array Size : 5845987968 (5575.17 GiB 5986.29 GB)
Used Dev Size : 3897325312 (1858.39 GiB 1995.43 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Unused Space : before=261864 sectors, after=369 sectors
State : clean
Device UUID : 160d3f60:27b10fef:e0b8a1f2:43d3f6e8

Update Time : Thu Dec 15 15:01:47 2022
Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.
Checksum : b0eccbad - correct
Events : 84389

Layout : left-symmetric
Chunk Size : 64K

Device Role : spare
Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
Apparently the sdd3 partition has bad blocks. I'm not familiar with btrfs, but I don't understand why the data array cannot be brought up from the *3 partitions, since it should have redundancy.
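
It is easier to see why mdadm stops at "assembled from 2 drives and 1 spare" with the Events counter and Device Role of each member side by side. The sketch below assumes POSIX sh and awk; examine_summary is a hypothetical helper name, and the input is trimmed to the relevant fields of the --examine output above.

```shell
#!/bin/sh
# examine_summary is a hypothetical helper: it condenses `mdadm --examine`
# output to one line per member (device, Events counter, Device Role).
examine_summary() {
    awk '
        /^\/dev\//      { dev = $1; sub(/:$/, "", dev) }
        /Events :/      { events = $NF }
        /Device Role :/ { sub(/.*Device Role : /, "")
                          print dev, "events=" events, "role=" $0 }
    '
}

# Fields copied from the --examine output above (all other lines omitted).
examine_summary <<'EOF'
/dev/sda3:
Events : 84389
Device Role : Active device 0
/dev/sdb3:
Events : 84389
Device Role : Active device 1
/dev/sdc3:
Events : 83531
Device Role : Active device 2
/dev/sdd3:
Events : 84389
Device Role : spare
EOF
```

Read this way, sda3 and sdb3 are current active members, sdc3 is an active member that is 858 events behind (its last update is 14:56:59, the others 15:01:47), and sdd3 is current but only recorded as a spare. That leaves two current active members out of four, one fewer than a 4-device RAID 5 needs to start.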

Any ideas?

Thanks.
 

CyklonDX

Well-Known Member
Nov 8, 2022
850
279
63
Just some questions.

You have a RAID 1 with 3 disks?
md0 : active raid1 sda1[0] sdb1[5] sdc1[4]

If this RAID 10 is correct, then you should just unplug your dead/dying disk.
md1 : active raid10 sda2[0] sdc2[3] sdd2[2] sdb2[1]

(mdadm --examine /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3) This output also makes me really unsure which values are correct, as it shows RAID 5 and reports missing disks.

Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc3:

Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb3:

2 active and 2 missing on this one.
 

pwr94

Thanks for your help.

Apparently Netgear implements a system called "X-RAID", which expands automatically, runs in the background, and so on. It is simple until it fails and you don't know what it is doing.

It does seem correct that all the data is located in partition 3 of each disk, four disks in this case.

Correct, two members are missing from the RAID 5 data array, and a RAID 5 only has one disk of redundancy, so this is possibly the fault.
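
The one-disk redundancy can be cross-checked against the figures in the --examine output. This is a small arithmetic sketch, assuming a shell with 64-bit arithmetic; the variable names are made up for illustration, and the values are the "Raid Devices" and "Used Dev Size" fields quoted earlier.

```shell
#!/bin/sh
raid_devices=4                # "Raid Devices" from --examine
used_dev_sectors=3897325312   # "Used Dev Size", in 512-byte sectors

# RAID 5 spends one member's worth of space on parity, so the usable size
# is (n - 1) members. Dividing by 2 converts 512-byte sectors to KiB.
array_kib=$(( (raid_devices - 1) * used_dev_sectors / 2 ))
echo "expected Array Size: $array_kib KiB"

# The same (n - 1) is the minimum number of current members needed to start.
members_needed=$(( raid_devices - 1 ))
echo "members needed to start: $members_needed"
```

The result, 5845987968 KiB, matches the "Array Size" line reported for every member, which supports reading data-0 as a plain 4-disk RAID 5 that needs three current members.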

I have checked the disks with smartctl, and it doesn't look very good.

~# smartctl -l selftest /dev/sda
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-4.4.218.armada.1] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 13529 -
# 2 Extended offline Completed without error 00% 12823 -
# 3 Extended offline Completed without error 00% 12079 -
# 4 Extended offline Completed without error 00% 11359 -
# 5 Extended offline Completed without error 00% 10614 -
# 6 Extended offline Completed without error 00% 9894 -
# 7 Extended offline Completed without error 00% 9150 -
# 8 Extended offline Completed without error 00% 8425 -
# 9 Extended offline Completed without error 00% 7705 -
#10 Extended offline Completed without error 00% 6961 -
#11 Extended offline Completed without error 00% 6241 -
#12 Extended offline Completed without error 00% 5498 -
#13 Extended offline Completed without error 00% 4826 -
#14 Extended offline Completed without error 00% 4082 -
#15 Extended offline Completed without error 00% 3338 -
#16 Extended offline Completed without error 00% 2618 -
#17 Extended offline Completed without error 00% 1873 -
#18 Extended offline Completed without error 00% 1153 -
#19 Extended offline Completed without error 00% 409 -
#20 Extended offline Completed without error 00% 65201 -
#21 Extended offline Completed without error 00% 64481 -



~# smartctl -l selftest /dev/sdb
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-4.4.218.armada.1] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 12818 9457920
# 2 Extended offline Completed: read failure 90% 12074 9457920
# 3 Extended offline Completed: read failure 90% 11354 9457920
# 4 Extended offline Completed: read failure 90% 10609 9457920
# 5 Extended offline Completed: read failure 90% 9889 9457920
# 6 Extended offline Completed: read failure 90% 9145 9457920
# 7 Extended offline Completed: read failure 90% 8420 9457920
# 8 Extended offline Completed: read failure 90% 7700 9457920
# 9 Extended offline Completed: read failure 90% 6956 9457920
#10 Extended offline Completed: read failure 90% 6236 9457920
#11 Extended offline Completed: read failure 90% 5493 9457920
#12 Extended offline Completed: read failure 90% 4821 9457920
#13 Extended offline Completed: read failure 90% 4077 9457920
#14 Extended offline Completed: read failure 90% 3333 9457920
#15 Extended offline Completed: read failure 90% 2613 9457920
#16 Extended offline Completed: read failure 90% 1868 9457920
#17 Extended offline Completed: read failure 90% 1148 9457920
#18 Extended offline Completed: read failure 90% 404 9457920
#19 Extended offline Completed: read failure 90% 65196 9457920
#20 Extended offline Completed: read failure 90% 64476 9457920
#21 Extended offline Completed: read failure 90% 63732 9457920


~# smartctl -l selftest /dev/sdc
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-4.4.218.armada.1] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 10% 12820 3486899176
# 2 Extended offline Completed without error 00% 12076 -
# 3 Extended offline Completed without error 00% 11356 -
# 4 Extended offline Completed without error 00% 10611 -
# 5 Extended offline Completed without error 00% 9891 -
# 6 Extended offline Completed without error 00% 9147 -
# 7 Extended offline Completed without error 00% 8423 -
# 8 Extended offline Completed without error 00% 7703 -
# 9 Extended offline Completed: read failure 10% 6958 3687044808
#10 Extended offline Completed without error 00% 6239 -
#11 Extended offline Completed without error 00% 5495 -
#12 Extended offline Completed without error 00% 4823 -
#13 Extended offline Completed without error 00% 4080 -
#14 Extended offline Completed without error 00% 3336 -
#15 Extended offline Completed without error 00% 2616 -
#16 Extended offline Completed without error 00% 1871 -
#17 Extended offline Completed without error 00% 1151 -
#18 Extended offline Completed without error 00% 407 -
#19 Extended offline Completed without error 00% 65199 -
#20 Extended offline Completed without error 00% 64479 -
#21 Extended offline Completed without error 00% 63735 -
1 of 2 failed self-tests are outdated by newer successful extended offline self-test # 2

~# smartctl -l selftest /dev/sdd
smartctl 6.6 2017-11-05 r4594 [armv7l-linux-4.4.218.armada.1] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 50% 13167 73
# 2 Short offline Completed: read failure 50% 13167 73
# 3 Extended offline Completed: read failure 90% 12818 73
# 4 Extended offline Completed without error 00% 12078 -
# 5 Extended offline Completed without error 00% 11358 -
# 6 Extended offline Completed without error 00% 10613 -
# 7 Extended offline Completed without error 00% 9893 -
# 8 Extended offline Completed without error 00% 9149 -
# 9 Extended offline Completed without error 00% 8425 -
#10 Extended offline Completed without error 00% 7705 -
#11 Extended offline Completed without error 00% 6961 -
#12 Extended offline Completed without error 00% 6241 -
#13 Extended offline Completed without error 00% 5498 -
#14 Extended offline Completed without error 00% 4825 -
#15 Extended offline Completed without error 00% 4082 -
#16 Extended offline Completed without error 00% 3338 -
#17 Extended offline Completed without error 00% 2618 -
#18 Extended offline Completed without error 00% 1873 -
#19 Extended offline Completed without error 00% 1153 -
#20 Extended offline Completed without error 00% 409 -
#21 Extended offline Completed without error 00% 65201 -


~# fdisk -l /dev/sda

Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 92E3ABD0-7384-4732-B981-F888CD9F7CCF

Device Start End Sectors Size Type
/dev/sda1 64 8388671 8388608 4G Linux RAID
/dev/sda2 8388672 9437247 1048576 512M Linux RAID
/dev/sda3 9437248 3907025072 3897587825 1.8T Linux RAID

~# fdisk -l /dev/sdb

Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 2D7A1461-E0C9-489C-ADF4-563D5F892914

Device Start End Sectors Size Type
/dev/sdb1 64 8388671 8388608 4G Linux RAID
/dev/sdb2 8388672 9437247 1048576 512M Linux RAID
/dev/sdb3 9437248 3907025072 3897587825 1.8T Linux RAID

~# fdisk -l /dev/sdc

Disk /dev/sdc: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: AE53452F-FE2D-4CFF-B3FC-EE3181C7B1C9

Device Start End Sectors Size Type
/dev/sdc1 64 8388671 8388608 4G Linux RAID
/dev/sdc2 8388672 9437247 1048576 512M Linux RAID
/dev/sdc3 9437248 3907025072 3897587825 1.8T Linux RAID

~# fdisk -l /dev/sdd

Disk /dev/sdd: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 993832C6-FFA1-47C9-A7A0-EAB63957317B

Device Start End Sectors Size Type
/dev/sdd1 64 8388671 8388608 4G Linux RAID
/dev/sdd2 8388672 9437247 1048576 512M Linux RAID
/dev/sdd3 9437248 3907025072 3897587825 1.8T Linux RAID
For example, md0 is missing one member, and md0 is built from partition 1 of each disk. Looking at the smartctl and fdisk output for disk D, sector 73 falls inside partition 1 (sectors 64 to 8388671), so this is possibly the error. ...

The disks are in bad shape; they are cheap hard drives...

Disks B and C also show errors, at sectors 9457920 and 3486899176, both of which fall inside partition 3, the RAID 5 member where all the information is.
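
The sector-to-partition mapping can be worked out from the fdisk table. The sketch below assumes that the LBA_of_first_error values reported by smartctl are absolute sector numbers on the disk; locate_lba is a hypothetical helper name, and the partition boundaries are copied from the fdisk output above (identical on all four disks).

```shell
#!/bin/sh
# locate_lba is a hypothetical helper, not a standard tool. It prints the
# partition that contains an absolute LBA and the offset inside it.
locate_lba() {
    lba=$1
    if [ "$lba" -ge 64 ] && [ "$lba" -le 8388671 ]; then
        part=1; start=64
    elif [ "$lba" -ge 8388672 ] && [ "$lba" -le 9437247 ]; then
        part=2; start=8388672
    elif [ "$lba" -ge 9437248 ] && [ "$lba" -le 3907025072 ]; then
        part=3; start=9437248
    else
        echo "LBA $lba: outside all partitions"
        return 0
    fi
    echo "LBA $lba: partition $part, offset $((lba - start)) sectors"
}

# First-error LBAs from the self-test logs of sdd, sdb and sdc.
for lba in 73 9457920 3486899176 3687044808; do
    locate_lba "$lba"
done
```

Under that assumption, the sdd error sits at offset 9 in partition 1, next to the 8-sector superblock offset used by version 1.2 metadata, which would be consistent with "Cannot read superblock on /dev/sdd1". The sdb error sits at offset 20672 in partition 3, which is below the 262144-sector Data Offset and so may lie in the unused gap before the array data. Both sdc errors fall well inside the data area.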

This is possibly the problem: a RAID 5 only has one disk of redundancy, and the disks apparently show damage in those areas.

What do you think, what could I do?

Thanks.
 

Pete.S.

What do you think, what could I do?
If you suspect that more than one drive has problems, you need to stop what you're doing immediately.

Stop running smartctl; it will make things worse, not better.

For each drive that shows I/O errors, you should get a new drive and transfer all information from the faulty drive to the replacement. Use dd with some added options I can't remember off the top of my head. Basically, you're making a forensic copy, so you extract every piece of data that can possibly be extracted.

Then you use the replacement drives instead of the faulty ones to rebuild the array.

Depending on how many sectors couldn't be recovered, you might end up with a couple of files that have errors in them. The majority will be fine, though.
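
The dd options are not named above. The ones commonly used for this kind of copy are conv=noerror,sync, and it is only an assumption that these are the ones meant. The sketch below runs against temporary files rather than real drives, so it only demonstrates the invocation; the file names are placeholders.

```shell
#!/bin/sh
# Stand-ins for the faulty drive and its replacement (placeholders, not real devices).
src=$(mktemp)
dst=$(mktemp)

# Fill the "faulty drive" with 1 MiB of test data.
dd if=/dev/urandom of="$src" bs=4096 count=256 2>/dev/null

# noerror: keep going after a read error instead of aborting.
# sync:    pad short or failed reads with zeros up to the block size,
#          so everything after a bad spot stays at the same offset.
dd if="$src" of="$dst" bs=4096 conv=noerror,sync 2>/dev/null

# On a healthy source the copy is byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    status=identical
else
    status=different
fi
echo "copy is $status"
rm -f "$src" "$dst"
```

Against real hardware, if= would be the failing drive and of= the new one, and the drives should not be in use by an array while copying. GNU ddrescue is a frequently recommended alternative for the same job, because it keeps a map of unreadable areas and can retry them later.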
 