Considering replacing RAID card, getting warnings, help me understand them?

SecCon

Arkham Asylum Server Mgmt
May 26, 2022
281
55
28
I have a Huawei RH2288Hv3 server (this one) with a mezzanine RAID card (this one) based on the SAS 3108 controller.

For the last few weeks (yes, just in time for Christmas) I have been getting warnings:

Controller ID: 0 Unexpected sense: PD
= Port A:1:1Unrecovered read error, CDB = 0x88 0x00 0x00 0x00 0x00 0x00 0x31 0x54 0xc8 0x00 0x00 0x00 0x02 0x00 0x00 0x00 , Sense = 0xf0 0x00 0x03 0x31 0x54 0xc9 0xb7 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0x00 0x00 0x00 0x00

Controller ID: 0 Consistency Check corrected medium error:
( VD 3 Location 0x2e310d2f,
PD Port A:1:1 Location 0x2e310d2f)

Controller ID: 0 Unexpected sense: PD
= Port A:1:1Unrecovered read error, CDB = 0x88 0x00 0x00 0x00 0x00 0x00 0x2e 0x31 0x0c 0x00 0x00 0x00 0x02 0x00 0x00 0x00 , Sense = 0xf0 0x00 0x03 0x2e 0x31 0x0d 0x2f 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0x00 0x00 0x00 0x00

Controller ID: 0 Consistency Check corrected medium error:
( VD 3 Location 0x62a993b7,
PD Port A:1:1 Location 0x3154c9b7)
I have attached an edited log file covering Dec 1 to today.
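
For anyone who wants to read the raw bytes in these warnings, here is a minimal Python sketch that decodes them. It assumes the CDB is a standard SCSI READ(16) (opcode 0x88) and that the sense data is in the standard fixed format (response code 0x70/0x71). The helper names `parse_hex`, `decode_read16` and `decode_fixed_sense` are made up for this example and are not part of any vendor tool.

```python
# Sketch: decode the CDB and fixed-format sense bytes from the warnings above.
# Field offsets follow the standard SCSI READ(16) and fixed-format sense layouts.
# All helper names here are hypothetical, invented for this example.

SENSE_KEYS = {0x03: "MEDIUM ERROR", 0x04: "HARDWARE ERROR", 0x0B: "ABORTED COMMAND"}


def parse_hex(text):
    """Turn a string like '0x88 0x00 ...' into a bytes object."""
    return bytes(int(tok, 16) for tok in text.split())


def decode_read16(cdb):
    """Return (start LBA, block count) from a READ(16) CDB."""
    if cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length
    return lba, blocks


def decode_fixed_sense(sense):
    """Return the key fields of fixed-format sense data."""
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "valid": bool(sense[0] & 0x80),                    # INFORMATION field is valid
        "sense_key": sense[2] & 0x0F,                      # 0x3 = MEDIUM ERROR
        "information": int.from_bytes(sense[3:7], "big"),  # failing LBA for read errors
        "asc": sense[12],                                  # additional sense code
        "ascq": sense[13],                                 # additional sense code qualifier
    }


# The two CDB/sense pairs copied from the warnings above.
EVENTS = [
    ("0x88 0x00 0x00 0x00 0x00 0x00 0x31 0x54 0xc8 0x00 0x00 0x00 0x02 0x00 0x00 0x00",
     "0xf0 0x00 0x03 0x31 0x54 0xc9 0xb7 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0x00 0x00 0x00 0x00"),
    ("0x88 0x00 0x00 0x00 0x00 0x00 0x2e 0x31 0x0c 0x00 0x00 0x00 0x02 0x00 0x00 0x00",
     "0xf0 0x00 0x03 0x2e 0x31 0x0d 0x2f 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0x00 0x00 0x00 0x00"),
]

for cdb_text, sense_text in EVENTS:
    start, count = decode_read16(parse_hex(cdb_text))
    sense = decode_fixed_sense(parse_hex(sense_text))
    key_name = SENSE_KEYS.get(sense["sense_key"], "other")
    inside = start <= sense["information"] < start + count
    print(f"read {count} blocks from LBA {start:#x}: {key_name}, "
          f"ASC/ASCQ {sense['asc']:#04x}/{sense['ascq']:#04x}, "
          f"failing LBA {sense['information']:#x} (inside read range: {inside})")
```

In both warnings the sense key is 0x3 (MEDIUM ERROR) with ASC/ASCQ 0x11/0x00, which the SCSI standard defines as "unrecovered read error", and the reported LBA matches the PD location in the consistency check messages. This sense data is normally generated by the drive itself, so it would seem to point at the disk on Port A:1:1 rather than the controller, though that is an interpretation and not a definitive diagnosis.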

1703322398174.png

The consistency checks do seem to correct the read errors, and when I logged on (this was at about 3 in the morning) everything was fine. I have e-mail notifications enabled and have been keeping an eye on this. So far, the sequence in the screenshot has happened three times that I can see.

My question is whether this is the controller, or whether it could be the backplane or the cabling. Is there a way to establish that?

I have no problem getting another controller, perhaps something newer and more conventional than a mezzanine card, but I really need to check somehow whether the backplane or the cables are at fault instead. All of that is standard hardware and was already fitted when I got the server. Maybe a thorough cleaning would help as well... I am open to suggestions.
 


SecCon

I have seen no disk-related errors, and nothing else has glitched in a way that could point to the power supply. Then again, if you see any indication of that in the attached log, please point it out. I am by no means an expert at reading those logs, but these look like controller-related warnings, not disk or power. Having said that, we did have a power outage a couple of weeks ago; I am not sure whether it shows ...
 

SecCon

I started a Patrol Read to check drives.

A patrol read periodically verifies all sectors of drives connected to a controller,
including the system reserved area in the RAID configured drives.
The enclosure, voltages and everything else show no errors. No red or yellow dots anywhere.
 

SecCon

... and the patrol read gets interrupted by more Unexpected sense errors... sigh
 

SecCon

According to Huawei, who gave me a fast response, the errors come from an HDD in slot 1.

Fortunately, CrystalDiskInfo is able to drill through the RAID controller (so to speak) and read the physical disk info. I have located the disk: it is in slot 1, a 4TB WDC WD4000FYYZ, so a replacement is not hard to find. CrystalDiskInfo also gives me a warning for that disk.


Disk located


60-80€ replacement, if I don't have one in my stash.

It is not one of the more important drives in my main storage; those are the ones in Virtual Drive 1.

I will be replacing it shortly; system function should not be affected. Here's hoping for no more errors.
 