Enclosure keeps going offline


ccoager

New Member
Aug 14, 2018
I have a server running Ubuntu 16.04.5 (up to date). I recently purchased an LSI MegaRAID SAS 9286-8e (firmware 23.34.0-0019) attached to an HP SAS Expander (tried firmware 2.06 and 2.10) with 15 drives connected. The RAID card, the HBA and 8 of the drives are brand new.

The drives are configured from the MegaRAID as single, independent RAID0 virtual disks, for lack of better knowledge. I didn't figure out until months later how to configure them as JBOD from the CLI, and I will probably reconfigure them all as JBOD in the future. The storage is configured as a ZFS RAID-Z pool on the 8 new drives.
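
For reference, the JBOD switch I eventually found in StorCLI looks roughly like this; the /c0 controller and e252/s0 enclosure/slot IDs below are examples only, "storcli /c0 show" lists the real ones:

storcli /c0 set jbod=on          # enable JBOD support on the controller
storcli /c0/v0 del               # delete a single-drive RAID0 virtual disk first
storcli /c0/e252/s0 set jbod     # then expose the bare drive as JBOD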

The problem is that, intermittently and often, the alarm goes off on the LSI MegaRAID and all drives go offline simultaneously. Seconds later the enclosure reattaches. Minutes later it goes offline again, then reattaches. This repeats over and over with no set interval.

Troubleshooting I've tried so far: I've replaced the LSI MegaRAID SAS 9286-8e, tried two different HP SAS expander cards, tried HP SAS expander firmware 2.06 and 2.10, and tried both a Monoprice and a Tripp Lite SAS SFF-8088 cable. I'm pretty much out of ideas at this point. Can anyone assist?

I've uploaded the dmesg, storcli output and event logs from storcli.
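
For what it's worth, the captures were along these lines (/c0 assumes the first controller):

dmesg > dmesg.txt
storcli /c0 show all > storcli.txt          # controller, enclosure and drive state
storcli /c0 show events file=events.txt     # controller event log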
 


Jeggs101

Well-Known Member
Dec 29, 2010
Sounds like something in the SAS path rather than power. The HP SAS expander is really old now. I'd look there but I might be totally wrong.
 

gregsachs

Active Member
Aug 14, 2018
I had a very similar issue, albeit with Windows, where my RAID card was set up with PCIe link state power management enabled. The card would sleep and the external JBOD would go offline. On Ubuntu, look for PCI Express Active-State Power Management (ASPM).
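
Something like this should show and disable it on Ubuntu (standard kernel interfaces, but double-check on your kernel):

cat /sys/module/pcie_aspm/parameters/policy   # current ASPM policy, e.g. [default] performance powersave
sudo lspci -vv | grep -i aspm                 # per-device ASPM state on the PCIe links
# to turn it off outright, add pcie_aspm=off to GRUB_CMDLINE_LINUX_DEFAULT
# in /etc/default/grub, then:
sudo update-grub && sudo reboot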
 

whitey

Moderator
Jun 30, 2014
If your end goal is to use ZFS, you are doing something fundamentally wrong by using a RAID controller instead of an HBA; you definitely do NOT want any abstraction/RAID layer between ZFS and the disks. JBOD is back on the right track, but I'd just get an HBA and be done with it. If I misunderstood and only your 8 new disks and the HBA are connected and you have a raid-z, that is fine, although I would have gone for raidz2. Not sure what to tell ya on the RAID controller, not my cup o' tea.
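
If you do rebuild, raidz2 on the 8 new disks would be something like this (pool name and disk IDs are placeholders, use your own /dev/disk/by-id names):

zpool create tank raidz2 /dev/disk/by-id/ata-DISK{1..8}   # stable by-id names, never /dev/sdX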
 

ccoager

New Member
Aug 14, 2018
@whitey
I checked out the HighPoint HBAs; none of their products had enough SATA channels to meet my requirements. That is typically why you would want a RAID card.
 

Stefan75

Member
Jan 22, 2018
ccoager said:
@whitey
I checked out the HighPoint HBAs; none of their products had enough SATA channels to meet my requirements. That is typically why you would want a RAID card.
I think whitey meant that there are LSI adapters that are RAID-only, and others that become pure (dumb) HBAs in IT mode. I even deleted/removed the BIOS of my 9201-16e because I don't need it ;)
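
The usual recipe is something like this (the firmware filename is an example from the LSI package for your exact card, don't copy it blindly):

sas2flash -list                      # confirm the adapter and current firmware/BIOS versions
sas2flash -o -e 6                    # erase the existing flash (do NOT reboot before reflashing)
sas2flash -o -f 9201-16e_IT.bin      # flash IT firmware; skipping "-b mptsas2.rom" leaves the card with no boot ROM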
 

whitey

Moderator
Jun 30, 2014
@Stefan75 ...EXACTLY. LSI HBAs in the 8-16 port variants are what @ccoager should be looking at, and YES, for faster boot times leave the BIOS off if you don't need to boot off the HBA.
 

XeonSam

Active Member
Aug 23, 2018
I had this happen to me, but with OSNEXUS (QuantaStor) rather than FreeNAS, also using ZFS.
The problem was the SAS expander. These expanders are made and tested with SAS drives, not SATA drives.
Entire disk groups would disappear, and when they reappeared they would show up as degraded.
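
After a drop you can see the damage and reset the counters with something like (pool name is an example):

zpool status -v tank    # shows which vdevs came back DEGRADED/FAULTED after the expander reset
zpool clear tank        # clear the error counters once the enclosure is stable again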