Adaptec RAID problem


perdrix

Member
Mar 3, 2016
Cross posted from Ubuntu forums:

Came to look at the system this morning and it was sick:

I'd left a journalctl window open and was seeing lots of error reports against /dev/sda1, which is a hardware RAID array behind an Adaptec ASR-51245.

After rebooting I went back in the journal to find this in the wee small hours:

Code:
Oct 19 04:03:03 charon kernel: aacraid: Host adapter abort request.
                               aacraid: Outstanding commands on (0,0,0,0):
Oct 19 04:03:03 charon kernel: aacraid: Host adapter reset request. SCSI hang ?
Oct 19 04:03:18 charon kernel: aacraid: Host adapter reset request. SCSI hang ?
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: midlevel-0
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: lowlevel-0
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: error handler-0
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: firmware-1
Oct 19 04:03:18 charon kernel: aacraid 0000:01:00.0: outstanding cmd: kernel-0
Oct 19 04:03:48 charon kernel: sd 0:0:0:0: Device offlined - not ready after error recovery
Oct 19 04:03:48 charon kernel: sd 0:0:0:0: [sda] tag#215 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Oct 19 04:03:48 charon kernel: sd 0:0:0:0: [sda] tag#215 CDB: Read(16) 88 00 00 00 00 00 00 05 27 48 00 00 00 08 00 00
Oct 19 04:03:48 charon kernel: blk_update_request: I/O error, dev sda, sector 337736 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class 0
Oct 19 04:03:48 charon kernel: BTRFS error (device sda1): bdev /dev/sda1 errs: wr 1, rd 1, flush 0, corrupt 3, gen 0
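As a cross-check on the log above, the failed command's CDB can be decoded by hand. This is a minimal sketch, assuming the standard READ(16) layout (opcode 0x88, an 8-byte big-endian LBA in bytes 2 to 9, a 4-byte transfer length in bytes 10 to 13); the function name decode_read16 is made up for illustration.

```shell
#!/bin/sh
# Decode the LBA and block count from a READ(16) CDB given as 16 hex bytes.
# Assumes the LBA fits in a signed 64-bit shell integer.
decode_read16() {
    set -- $1   # split the hex string into positional parameters
    # $1 = opcode, $2 = flags, $3..$10 = LBA, $11..$14 = transfer length
    [ "$#" -eq 16 ] && [ "$1" = "88" ] || { echo "not a READ(16) CDB" >&2; return 1; }
    lba=$(( 0x$3$4$5$6$7$8$9${10} ))
    len=$(( 0x${11}${12}${13}${14} ))
    echo "lba=$lba blocks=$len"
}

decode_read16 "88 00 00 00 00 00 00 05 27 48 00 00 00 08 00 00"
# prints: lba=337736 blocks=8
```

The decoded LBA matches the sector 337736 in the blk_update_request line, so the I/O error is that same 8-block read timing out rather than a separate failure.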
So is this a hardware problem, a driver problem, or something else?

Has anyone hit this sort of problem before, and if so, what solution did you find?

Disks are clean as far as SMART is concerned, and a btrfs scrub on the array isn't having any problems (so far).

Help!!
David
 

perdrix

I may have an explanation, but I'd like to bounce it off the experts.

The timeout for the SCSI device /dev/sda is set to 45 seconds, which is fine if the drives are spinning. However, the Adaptec RAID controller can power the drives down after a period of inactivity (currently set to 30 minutes).

I'm guessing that the problem above occurred when the drives had been spun down by the controller and took longer than the timeout to spin back up.

How do I change the SCSI timeout for the drive permanently, and what should I set it to when the RAID controller can power the drives down?

cat /sys/block/sda/device/timeout returns the value 45
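One possible approach, sketched below, is to write the new value to the same sysfs file for an immediate change and to use a udev rule to reapply it at boot. The helper names, the SYSFS_ROOT override, the rules file name 99-scsi-timeout.rules and the value 120 are all assumptions for illustration; the right timeout depends on how long the array takes to spin up, and the udev match keys may need adjusting for a particular system.

```shell
#!/bin/sh
# SYSFS_ROOT is overridable only so the helpers can be tried on a scratch directory.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the current command timeout in seconds for a disk such as "sda".
get_scsi_timeout() {
    cat "$SYSFS_ROOT/block/$1/device/timeout"
}

# Set the timeout for the running system only (needs root on real hardware).
set_scsi_timeout() {
    echo "$2" > "$SYSFS_ROOT/block/$1/device/timeout"
}

# Print a udev rule that rewrites the timeout whenever the disk appears.
make_timeout_rule() {
    printf 'ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="%s", ATTR{device/timeout}="%s"\n' "$1" "$2"
}

# Example, as root (120 is a placeholder, not a recommendation):
#   set_scsi_timeout sda 120
#   make_timeout_rule sda 120 > /etc/udev/rules.d/99-scsi-timeout.rules
#   udevadm control --reload-rules && udevadm trigger --subsystem-match=block
```

The direct write takes effect at once but is lost on reboot, which is why the rule is needed for a permanent change. Matching on the kernel name sda is the simplest option but breaks if device naming changes, so a more stable match key may be preferable.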

David