Random drives degraded: mpt_sas0 Abort_command! error

grogthegreat

New Member
Apr 21, 2016
First the build:
Supermicro X11SSL-F motherboard (BIOS updated to latest: 1.0b)
Intel i3 6100 CPU
32GB Crucial ECC UDIMM DDR4
LSI 9212-4i4e (updated to P19)
Supermicro 826E1-R800 chassis with SAS-826EL1 backplane (3Gb SAS expander)

The issue:
I am seeing multiple drives intermittently become degraded, and on one of the mirrors I can't get a successful resilver. The degraded drives look perfectly fine from their SMART values, with no reallocated sectors. The pool will be working normally, or resilvering, when all of a sudden any drive in use starts to show a large number of hard and transfer errors. Using IPMI to pull up the KVM console, I see a large number of these error messages:
scsi: /pci@0,0/pci8086,1905@1,1/pci1000,3060@0 (mpt_sas0)
Aborted_command!

Any idea how to narrow this down? Is this a firmware issue on my 9212? Could it be a bad backplane, cable, or HBA? The voltages from the PSU look okay from IPMI. Am I stuck with replacing different hardware till the error goes away?

gea

Well-Known Member
Dec 31, 2010
DE
What I would try
- disable MPIO in mpt_sas.conf (e.g. via the napp-it menu Disks > Details) to rule out MPIO problems

Another possible problem is that SATA disks on a SAS expander can cause trouble when one disk fails. This is hard to identify, and it is the reason why professional storage vendors refuse support for expander + SATA solutions.

What you can do is
- if you have an indication of a specific problem disk, either from the logs or the activity LED, remove that disk
- otherwise remove all disks, insert them one by one, and check the console/logs for problems
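
To make the one-by-one check less tedious, here is a rough Python sketch that compares two snapshots of per-disk error counters and lists the disks whose hard/transport errors went up in between. It assumes the summary-line layout that `iostat -En` prints on Solaris/illumos ("Soft Errors: … Hard Errors: … Transport Errors: …"); the device names and numbers in the sample text are made up for illustration, and the function names are my own, not part of any tool.

```python
import re
from typing import Dict, List, Tuple

# Assumed layout of the per-device summary line from `iostat -En`, e.g.
#   c1t1d0  Soft Errors: 0 Hard Errors: 14 Transport Errors: 45
SUMMARY = re.compile(
    r"^(?P<dev>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)


def parse_error_counts(text: str) -> Dict[str, Dict[str, int]]:
    """Map each device name to its soft/hard/transport error counters."""
    counts: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        match = SUMMARY.match(line.strip())
        if match:  # detail lines (Vendor, Size, ...) simply do not match
            counts[match.group("dev")] = {
                key: int(match.group(key)) for key in ("soft", "hard", "transport")
            }
    return counts


def new_errors(
    before: Dict[str, Dict[str, int]], after: Dict[str, Dict[str, int]]
) -> List[Tuple[str, int]]:
    """Return (device, increase in hard + transport errors), worst first."""
    deltas = []
    for dev, now in after.items():
        # A disk that was not present in the first snapshot starts from zero.
        prev = before.get(dev, {"hard": 0, "transport": 0})
        delta = (now["hard"] - prev["hard"]) + (now["transport"] - prev["transport"])
        if delta > 0:
            deltas.append((dev, delta))
    return sorted(deltas, key=lambda item: item[1], reverse=True)


# Made-up sample snapshots, standing in for saved `iostat -En` output.
SAMPLE_BEFORE = """\
c1t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: EXAMPLE Revision: 0001 Serial No: X
c1t1d0 Soft Errors: 0 Hard Errors: 2 Transport Errors: 5
"""

SAMPLE_AFTER = """\
c1t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c1t1d0 Soft Errors: 0 Hard Errors: 14 Transport Errors: 45
c1t2d0 Soft Errors: 0 Hard Errors: 1 Transport Errors: 0
"""

if __name__ == "__main__":
    before = parse_error_counts(SAMPLE_BEFORE)
    after = parse_error_counts(SAMPLE_AFTER)
    for dev, delta in new_errors(before, after):
        print(f"{dev}: +{delta} hard/transport errors")
```

In practice you would save the real output to a file before and after inserting each disk and feed both files to `parse_error_counts`. Keep in mind the counters normally start again from zero after a reboot, so only compare snapshots taken within the same uptime.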