Drives Drop - SAS card or Enclosure?


Bradford

Active Member
May 27, 2016
223
50
28
I posted a few months ago here, thinking my drives were failing.

The symptom: drives disappear from my Proxmox server. I have an 8x 2.5" hot swap enclosure (fits in two 5.25" front bays). Which drives disappear seems random. If I reboot, some drives are still missing, even when I query the SAS adapter (LSI2008) with sas2ircu.

What I tried, and what makes me think it's not the drives: I pulled a couple of drives out for a minute or so while the system was running. I then pushed them back in and the drives were recognized again. I did this for all the missing drives until they all showed back up.

My question: does this sound like a controller issue or an enclosure issue?

My Hardware:

* SAS: IBM M1015 flashed to LSI2008
* SAS Cables: Monoprice 0.75m 30AWG Internal Mini SAS 36-Pin Male with Latch to 7-Pin Female Forward Breakout Cable (new)
* Hot Swap Bay: AMS DS-528SSBA Anti-Vibration 8x2.5" SAS/SATA HDD Backplane Module
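
For anyone chasing the same symptom, it helps to log exactly which drives vanish and when. Here is a minimal sketch, assuming a Linux host (Proxmox qualifies) where udev populates /dev/disk/by-id. The function names `list_disks`, `diff_snapshots` and `watch` are made up for illustration, and the "-part" filter assumes the usual udev naming for partitions.

```python
import os
import time


def list_disks(by_id_dir="/dev/disk/by-id"):
    """Return the set of whole-disk names under by_id_dir.

    Partition links (names containing "-part") are skipped, which
    assumes the usual udev naming scheme.
    """
    try:
        names = os.listdir(by_id_dir)
    except FileNotFoundError:
        return set()
    return {n for n in names if "-part" not in n}


def diff_snapshots(before, after):
    """Return (missing, new) as sorted lists of disk names."""
    before, after = set(before), set(after)
    return sorted(before - after), sorted(after - before)


def watch(interval_s=60, rounds=10, by_id_dir="/dev/disk/by-id"):
    """Compare the current disks against a baseline every interval_s seconds."""
    baseline = list_disks(by_id_dir)
    for _ in range(rounds):
        time.sleep(interval_s)
        missing, new = diff_snapshots(baseline, list_disks(by_id_dir))
        if missing or new:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, "missing:", missing, "new:", new)
```

Calling `watch()` from a terminal and leaving it running gives a timestamped record to line up against reboots, load, or reseating drives. It only tells you what the OS sees, not whether the controller, cable or backplane is at fault.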
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,640
2,057
113
I wonder if 2008 + these enclosures have issues?
There are others online reporting that drives in Icy Dock enclosures drop out too.

I had drives dropping out using the same HBA, but only when it was connected to an enclosure similar to yours. Oddly, it would drop drives that were NOT on the enclosure, including ones on the 2nd channel, and they'd come back after a reboot, etc...

Now that you mention it... I had the same problem in another chassis with a 2008 based M1015.
This was a Rosewill 16x(?) hot swap chassis. It randomly only worked with certain positions: I think it's 4x 4-drive enclosures, and I have drives in 1 of each enclosure, but none of them work in 1 or 2... very strange.

I think I'm going to try swapping the HBA for a PERC310 I have flashed too, to see if that matters, and if I have any extra 3008-based cards I'll try one of those when I get time. Don't wait on me; it won't be soon, but I'm just sharing my experiences too.

On the other hand, I have the same HBAs (M1015) flashed to 2008 (P20 iirc) in the same system as above working fine with 8 other 3.5" drives, never an issue, and I have another on my test bench, never an issue... so I'm not sure if it's an HBA "THING" or simply that we're using old, old HBAs and the years of heat are getting to them???
 

Bradford

I'm willing to buy a different HBA to try. Do you have any suggestions on which model to get that's not too expensive? Likewise, I'm willing to get a different hot swap bay if this one is too cheap to be reliable. I just want this damn issue to go away; it's really hard to troubleshoot.
 

T_Minus

I've been acquiring 3008-based cards for new builds, and hopefully to replace my 2008s soon.

The deals for these go in waves... one month you can get them for $75-95, the next month $150+, it seems!
 

nthu9280

Well-Known Member
Feb 3, 2016
1,628
498
83
San Antonio, TX
One other item to check is the airflow / temp of the SAS chip & drives. If you are in a server chassis, airflow should not be a concern unless you have replaced the fans with ones that have less airflow/static pressure.
smartctl should show the drive temps. Not sure how to get the temps for an IT mode SAS2008. MegaCLI -status will not work.
Maybe you can secure a small fan blowing on the heatsink.


Bradford

The drives are cool every time I check them with smartctl, but I hadn't thought of the SAS chip. My chassis is pretty small, but the case has lots of holes right next to the heatsink for the SAS card. There's no fan directly on it, but it has lots of opportunity for passive airflow.
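
For reference, the drive temperature check can be scripted. This is a rough sketch, assuming smartmontools is installed and that `smartctl -A` prints either the ATA attribute table (attribute 194, Temperature_Celsius) or, for SAS drives, a "Current Drive Temperature" line. The helper names `parse_temperature` and `read_temperature` are invented here, and some drives report temperature under a different attribute, in which case this returns None.

```python
import re
import subprocess


def parse_temperature(smartctl_output):
    """Extract a temperature in Celsius from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        # SAS/SCSI drives: "Current Drive Temperature:     30 C"
        m = re.match(r"\s*Current Drive Temperature:\s+(\d+)\s+C", line)
        if m:
            return int(m.group(1))
        # ATA drives: attribute table row, raw value is the 10th column
        fields = line.split()
        if (len(fields) >= 10 and fields[0] == "194"
                and fields[9].isdigit()):
            return int(fields[9])
    return None


def read_temperature(device):
    """Run `smartctl -A <device>` (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return parse_temperature(result.stdout)
```

Something like `read_temperature("/dev/sda")` in a loop over all drives gives a quick table. Note this covers the drives only; it says nothing about the temperature of the SAS chip itself.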
 

Bradford

Is the 3008 the best current model for JBOD SAS? I don't care about RAID since I'm all-in on ZFS, but I do love the breakout cables the 2008 offers.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,709
517
113
Canada
I have seen this behaviour reported several times with a myriad of different enclosures, along with what folks have tried to do to overcome or work around the issue. I don't actually have any to test with, but my best guess is that these enclosures are incorrectly loading down the controller I/O, which is causing it to heat excessively. If I'm correct, merely cooling the controller chip with a fan will not help and it will always be an unreliable solution :)
 

nthu9280

SAS chips can get really hot, and it looks like the SAS2008 doesn't have a temp sensor. You may need to measure it with an external IR thermometer.
LSI SAS2008 Temp monitoring in Ubuntu
 

natelabo

Member
Jun 29, 2016
64
3
8
54
Are you using a server enclosure and power supply? Or a homebrew PC case?
The hardest storage problem I ever troubleshot was power related... Spent months trying different cables, cards, power supplies. Ended up being the stupid ATX to IDE power splitters I bought from China...

Don't know your situation exactly, but it's something to check. My drives would only drop during very heavy load, rebuilds and consistency checks. Typically a single drive. Different all the time. So extremely nerve-racking.


Bradford

Thanks for your thoughts. Mine seemed to drop very randomly - when they worked, they worked through rebuilds and scrubs. I did think about power - my enclosure takes two SATA power connectors. Originally I had a 430 Watt consumer grade PSU supplying both ports from one line. I hadn't considered the spin-up power draw, which could have been the problem if it exceeded what that one line could supply.
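
To put rough numbers on the spin-up concern, here is a back-of-the-envelope sketch. Every figure in it is an assumption for illustration, not a measurement from this build: 1.0 A per drive on the 5 V rail at spin-up (check the drive datasheet), and 4.5 A per rail per SATA power connector (the commonly quoted 3 pins at 1.5 A each). The function names are hypothetical too.

```python
import math


def peak_rail_current(drives, spinup_amps_per_drive):
    """Worst-case current if every drive spins up at the same moment."""
    return drives * spinup_amps_per_drive


def connectors_needed(total_amps, amps_per_connector):
    """Minimum number of power connectors to stay within the per-connector limit."""
    return math.ceil(total_amps / amps_per_connector)


# Assumed values, not taken from any datasheet for this hardware.
DRIVES = 8
SPINUP_AMPS_5V = 1.0        # hypothetical per-drive spin-up draw on 5 V
CONNECTOR_LIMIT_AMPS = 4.5  # assumed per-rail limit of one SATA power plug

total = peak_rail_current(DRIVES, SPINUP_AMPS_5V)
print("peak 5 V draw:", total, "A")
print("connectors needed:", connectors_needed(total, CONNECTOR_LIMIT_AMPS))
```

Under these assumptions, eight drives spinning up together would want two separately fed connectors rather than one line split in two. That only helps if the enclosure actually divides the load between its two power inputs, which is another assumption.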

I swapped out the PSU for my 800 Watt modular (Corsair consumer) supply and supplied both power ports from different lines. Even after doing that, my problem persisted, so I felt like I was able to eliminate a power supply issue. But I couldn't eliminate the enclosure being the problem, which is why I posted.

After the advice I received above, I bought a PERC H310 and flashed it to IT mode. It's been going strong for a couple of weeks now, so I pray that was the problem. I absolutely hate problems like this that are difficult to troubleshoot.