CSE-847 Drive Disconnecting Issues

Discussion in 'Chassis and Enclosures' started by Carvel, Jul 19, 2019.

  1. Carvel

    Carvel New Member

    Joined:
    Jul 19, 2019
    Messages:
    13
    Likes Received:
    0
    I'm struggling a bit getting my server running well and was hoping you could answer some questions that I have.

    Current setup:
    - Supermicro CSE-847 case
    - Supermicro SAS2-846EL1 backplane
    - Supermicro SAS2-826EL1 backplane
    - Supermicro AOC-SAS2LP-MV8 HBA
    - Intel i7-4770K CPU
    - ASRock Z97 Fatal1ty motherboard
    - 16GB Corsair Dominator Platinum DDR3
    - Samsung 960 Evo 250GB M.2
    - 22 assorted SATA drives (WD Red, WD Green, Seagate, etc., making a 64.1TB pool)
    - Windows 10 Pro w/ Stablebit Drivepool and attempting to get SnapRAID going currently

    When I was first setting it up I saw that sometimes I would get a drive disconnection during heavy load. Windows Event Viewer would show it disconnecting and then reconnecting later. I checked the SFF-8087 cables and tried replacing them. The problem seemed to go away.

    I wanted to check the firmware on the backplanes, but whenever I try to use ExpanderXTools it doesn't show anything. Is this because I'm not using a Supermicro motherboard, or is there another reason that I can never see the backplanes?

    Now it's probably a year later and I'm trying to get SnapRAID working. Whenever I perform the first sync, it gets about half a day to a day in (estimated 53 hours total to perform the first sync) and then all the drives start disconnecting.

    Questions:
    - Does anyone have any idea why my drives keep disconnecting under load?
    - Do you think a firmware update on my HBA or backplanes would help? If so, do you have any idea how I get my backplanes to show up in ExpanderXTools?
    - Would a Supermicro motherboard help? If so, can anyone recommend one, either for my existing hardware (i7-4770K, non-ECC DDR3, etc.) or for a modern system (Xeon, DDR4, built-in HBA, etc.)?

    Thanks in advance.
     
    #1
  2. Carvel

    Carvel New Member

    After a lot of reading, it seems like this could be an HBA/backplane conflict, and that an LSI 2308-based card might be a better choice. Should I grab an HP H220? They seem like a pretty good deal. Or should I go with a 3008-based one like the Supermicro 3008?
     
    #2
    Last edited: Jul 19, 2019
  3. nthu9280

    nthu9280 Well-Known Member

    Joined:
    Feb 3, 2016
    Messages:
    1,418
    Likes Received:
    358
    How is the airflow over the drives and the HBA? You may need to check and monitor the drive temps during heavy load. I'm assuming you have drives populated just in the front bays. I've heard that drives in the rear bays tend to run warmer due to less airflow.
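
One way to watch temps during a long sync is to poll each drive's SMART attributes and pull out the temperature. The sketch below assumes smartmontools is installed and that the text comes from `smartctl -A` for a single drive. The 45 C warning limit and the function names are illustrative assumptions, not values from this thread.

```python
import re

# Matches a SMART attribute row for ID 190 or 194, for example:
# 194 Temperature_Celsius 0x0022 117 098 000 Old_age Always - 33
# Columns: ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE
_TEMP_ROW = re.compile(
    r"^\s*(190|194)\s+\S+\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)",
    re.MULTILINE,
)


def drive_temperature_c(smartctl_attributes):
    """Return the raw temperature (Celsius) from `smartctl -A` text, or None."""
    temps = {
        int(m.group(1)): int(m.group(2))
        for m in _TEMP_ROW.finditer(smartctl_attributes)
    }
    # Prefer attribute 194; fall back to 190 if the drive only reports that one.
    return temps.get(194, temps.get(190))


def too_hot(temp_c, limit_c=45):
    """True if a temperature was read and it exceeds the (assumed) limit."""
    return temp_c is not None and temp_c > limit_c
```

Logging the result for every drive once a minute during a sync would show whether the disconnects line up with a temperature climb.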
     
    #3
  4. Carvel

    Carvel New Member

    Hmm, I had turned the fans down for noise reasons and the temps are fine during regular usage. This is a good question though so I just cranked the fans up and started another sync. I'll let you know how it goes. I can hear it clearly from the main floor of our house now (server is in the basement mechanical room). :)
     
    #4
  5. Carvel

    Carvel New Member

    Temps seem fine so far. We'll see how this goes.
     

    #5
  6. BLinux

    BLinux cat lover server enthusiast

    Joined:
    Jul 7, 2016
    Messages:
    2,346
    Likes Received:
    812
    I am totally clueless on Windows, but if this were on Linux, I would be looking at the kernel logs to see what the driver reports when the drives go offline. That might help identify the problem. I'm guessing there's an equivalent in the Windows event logs?
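
On the Windows side, a rough equivalent is to filter the System event log for storage-related events and see whether the drives drop one at a time or all in the same minute. The sketch below works on events that have already been exported into plain dictionaries. The field names (`time`, `source`, `event_id`) are hypothetical, and the event ID table is an assumption to check against the actual log rather than an authoritative list.

```python
from collections import Counter
from datetime import datetime

# Event IDs commonly reported for storage trouble in the Windows System log.
# Treat this table as an assumption and verify it against your own entries.
SUSPECT_EVENT_IDS = {
    11: "driver detected a controller error",
    129: "reset issued to device",
    153: "I/O operation was retried",
    157: "disk was surprise removed",
}


def storage_events(events):
    """Keep only events whose ID is in the suspect table."""
    return [e for e in events if e["event_id"] in SUSPECT_EVENT_IDS]


def events_per_minute(events):
    """Count suspect events per calendar minute (ISO 8601 timestamps)."""
    counts = Counter()
    for e in storage_events(events):
        minute = datetime.fromisoformat(e["time"]).replace(second=0, microsecond=0)
        counts[minute] += 1
    return counts


def burst_minutes(events, threshold):
    """Minutes in which at least `threshold` suspect events were logged."""
    return sorted(m for m, n in events_per_minute(events).items() if n >= threshold)
```

A burst covering many drives in the same minute would point more towards something shared (HBA, expander, cabling, power) than towards an individual disk.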
     
    #6
  7. Carvel

    Carvel New Member

    Yeah, I can check when it fails again to get the exact message from the Windows event log. But it basically just says that the drive has disconnected from the system.

    The SnapRAID error says that it's Windows error 1167, which according to Microsoft's "System Error Codes (1000-1299)" page means ERROR_DEVICE_NOT_CONNECTED, which makes sense.
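
For reference, here is a small lookup for error 1167 and a few other Win32 codes that tend to show up around storage failures. The selection of codes and the `describe` helper are illustrative assumptions; only 1167 comes from the SnapRAID output mentioned above.

```python
# A few Win32 system error codes relevant to storage problems.
# The table is a hand-picked subset, not a complete list.
WIN32_STORAGE_ERRORS = {
    21: ("ERROR_NOT_READY", "The device is not ready."),
    121: ("ERROR_SEM_TIMEOUT", "The semaphore timeout period has expired."),
    1117: (
        "ERROR_IO_DEVICE",
        "The request could not be performed because of an I/O device error.",
    ),
    1167: ("ERROR_DEVICE_NOT_CONNECTED", "The device is not connected."),
}


def describe(code):
    """Return 'NAME: message' for a known code, or a placeholder otherwise."""
    name, message = WIN32_STORAGE_ERRORS.get(
        code, ("UNKNOWN", "Not in this table.")
    )
    return f"{code} {name}: {message}"


print(describe(1167))
```

On Windows itself, `ctypes.FormatError(1167)` should return the system's own message text for the same code.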
     
    #7
    Last edited: Jul 20, 2019
  9. Carvel

    Carvel New Member

    I also noticed that the speed of the Snapraid sync was really jumping around while it was going. From a bit over 1GB/s to as low as 50MB/s. Is that normal or weird?
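
As a rough sanity check on those numbers, the 53-hour estimate from the first post implies an average rate. The sketch below assumes the sync reads the full 64.1TB pool (decimal terabytes), which overstates things if the pool is not full, so the result is an upper bound on the implied average.

```python
def average_throughput_mb_s(total_bytes, hours):
    """Average throughput in MB/s (decimal megabytes) over the given duration."""
    return total_bytes / (hours * 3600) / 1e6


# Assumption: the whole 64.1TB pool is read during the first sync.
pool_bytes = 64.1e12
estimated_hours = 53

avg = average_throughput_mb_s(pool_bytes, estimated_hours)
print(f"Implied average: {avg:.0f} MB/s")
```

Under that assumption the implied average is in the low hundreds of MB/s, which sits between the 50MB/s and 1GB/s extremes. That does not by itself say whether the swings are healthy.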
     
    #9
  10. EffrafaxOfWug

    EffrafaxOfWug Radioactive Member

    Joined:
    Feb 12, 2015
    Messages:
    1,073
    Likes Received:
    355
    Firstly, is your PSU beefy enough to support this setup? I've seen random crashes and disconnects before when too much power was drawn and the voltage dropped enough to crash various subsystems. An easy test is to run something CPU-intensive (e.g. some random all-core CPU benchmark) and see whether that triggers the behaviour or breaks anything marginal.

    Secondly, you're using a Marvell SATA controller? Personally I wouldn't trust those things as far as I could spit them for anything other than occasional usage. More specifically, there was definitely a bug in the 9200 controller series that required a firmware update to fix spurious disconnects and SMART errors - it might be worthwhile chasing down whether there was anything similar afflicting the 9400 series and whether there are any updates for them.
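
If no benchmark is handy, something like the following can stand in as an all-core load generator. It is only a sketch: the function names are made up for illustration, and a busy loop in Python will not stress the CPU as hard as a dedicated stress tool.

```python
import multiprocessing as mp
import time


def burn(seconds):
    """Spin on integer math for `seconds`; return the number of iterations."""
    deadline = time.monotonic() + seconds
    iterations = 0
    x = 1
    while time.monotonic() < deadline:
        # Cheap arithmetic to keep the core busy.
        x = (x * 1103515245 + 12345) % 2147483648
        iterations += 1
    return iterations


def load_all_cores(seconds):
    """Run one `burn` worker per logical CPU and return their iteration counts."""
    workers = mp.cpu_count()
    with mp.Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)


# Usage (the guard is needed on Windows, where multiprocessing starts
# fresh interpreter processes that re-import this module):
#
# if __name__ == "__main__":
#     print(load_all_cores(600))
```

Running it while the drives are busy raises total power draw, so a marginal supply is more likely to show itself.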

    If you find yourself running out of ports, lots of people on these forums, myself included, will point you in the direction of the LSI-based HBAs which many of us here have been running for years - they're generally cheap, easily available and very reliable workhorses.

    I don't know SnapRAID and whether this is normal behaviour or not, but my discovery of the LSI HBAs on this site stemmed from my frustration trying to find something better than the Marvell SATA ports I was using in my file server at the time; they'd often suffer inexplicable problems and the performance was mediocre at best - they held back my whole RAID array.
     
    #10
  11. Carvel

    Carvel New Member

    It's a redundant 1280W PSU so I think it's got plenty of juice there.

    I haven't heard of anything, but I am leaning towards this possibly being due to my HBA. It's a Supermicro AOC-SAS2LP-MV8, which I figured would be good with Supermicro backplanes, but I guess maybe not.

    Yeah, I don't need any more ports. I have expanders in the backplanes so I have 36 bays which should be enough for me.

    Thanks, yeah I might do this.
     
    #11
  12. Carvel

    Carvel New Member

    Would you guys get an HP H220 (LSI SAS2308) or a Supermicro AOC-S3008-L8e (LSI SAS3008) for use with Supermicro SAS2-846EL1/826EL1 backplanes?
     
    #12
  13. EffrafaxOfWug

    EffrafaxOfWug Radioactive Member

    Hmm, I hadn't spotted that the AOC-SAS2LP-MV8 HBA you're using is a Marvell controller. SM are notoriously reticent about changelogs, but are you able to check that it's running the latest firmware (v4.0.0.1812 according to their site)?

    In terms of the HBA (and I'm not saying that's definitely the cause of your woes), replacing it with the newer one based on the SAS3008 would be better and more future-proof, but there are many other models using the same chips underneath, so there might be better/cheaper cards available - you might not want to spend too much money on replacing something that might not even be broken.
     
    #13
  14. Carvel

    Carvel New Member

    It finished successfully after changing the power management settings. Hopefully that fixes the flakiness for good now. Thanks guys.
     
    #14
  15. Carvel

    Carvel New Member

    Never mind, it's still flaky. Blah. Just tried to run another sync and all the drives disconnected again.
     
    #15
  16. Carvel

    Carvel New Member

    Yup, I flashed it to the latest and it's still flaky.
     
    #16
  17. Carvel

    Carvel New Member

    So I got my Supermicro AOC-S2308L-L8e today and I still can't see my backplanes in ExpanderXTools. Does anyone have any ideas why I've never been able to use that software to see/flash them?

    I'm going to kick off another sync now and see if it's fixed my stability issues.
     
    #17