I have a fairly non-standard setup here: an Intel RS3DC040 controlling a local 4-drive RAID10 for video editing work, and I hope someone can offer guidance with a recurring problem. The controller has repeatedly killed a multitude of drives, most recently a number of WD Gold 6TB units, and I am struggling to determine the root cause. The array originally ran four 4TB WD Blacks (not ideal, I recognize this), but it was failing the Port0 drive frequently enough that I moved to WD Gold 6TBs, hoping an enterprise drive would resolve the issue (and that I could eventually migrate all four drives to 6TB Golds and increase my capacity).
There is no backplane (though there had been one at one point), just 4 directly cabled drives, and it is always the Port0 drive that fails.
Tonight it failed another drive, this one in less than 36 hours; I had not even ordered a new spare yet (a 4TB will now arrive on Saturday). The log shows Unexpected Sense, reset, and CRC errors after about 17 hours of uptime and light drive use.
There is no indication of overheating in the log; this controller reportedly has an operating temperature range up to 105C, and at the time the alarm sounded it was sitting at 89C (and I have never seen it higher, though I am now looking into ways to improve airflow to it, just in case).
Intel RWC3 shows three of the drives at ~22C but the "failed" WD Gold at ~45C; it sits in a slightly higher location than the others.
Since these issues started, I have:
Removed the backplane
Replaced the controller's SAS-to-SATA cable
Replaced the PSU power cable
Replaced the PSU
Changed from WD Black to WD Gold enterprise drives.
Each time the controller fails a drive, I replace it; it is becoming almost comical how frequently this happens. I am fairly new to maintaining this RAID, and I hope someone here can point me in the right direction and mindset for troubleshooting this frustrating scenario. I have 3 (now 4) "failed" drives sitting on my desk, waiting for me to work out the best method for even determining how/why they failed; at least one refuses to mount on another PC here.
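My current plan for the pile of pulled drives is to read each one's SMART data from another PC and compare the media-health attributes against the link-error counter, since I gather that a high raw value on attribute 199 (UDMA CRC errors) points at the cable/port side rather than the platters. Here is a minimal sketch of what I intend to run, assuming smartmontools 7.0+ is installed (for JSON output) and that the pulled drive shows up as /dev/sdb (a placeholder; adjust to taste and run as root):

```python
#!/usr/bin/env python3
# Sketch: dump the SMART attributes most relevant to a drive that a
# RAID controller has marked failed. Assumes smartmontools >= 7.0.
import json
import subprocess
import sys

# Attributes that help separate a dying drive from a bad link:
# 5/197/198 implicate the media, 199 implicates the cable or port.
INTERESTING = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",
}

def smart_report(device: str) -> None:
    out = subprocess.run(["smartctl", "--json", "-a", device],
                         capture_output=True, text=True)
    if not out.stdout.strip():
        sys.exit(f"smartctl produced no output for {device}: {out.stderr.strip()}")
    data = json.loads(out.stdout)
    passed = data.get("smart_status", {}).get("passed")
    print(device, "SMART overall:", "PASSED" if passed else "FAILED")
    for attr in data.get("ata_smart_attributes", {}).get("table", []):
        if attr["id"] in INTERESTING:
            print(f'  {attr["id"]:>3} {attr["name"]:<24} raw={attr["raw"]["value"]}')

if __name__ == "__main__":
    smart_report(sys.argv[1] if len(sys.argv) > 1 else "/dev/sdb")
```

If the attributes come back clean, I assume a long self-test (smartctl -t long /dev/sdb) would be the sensible next step before deciding whether the drive itself is actually at fault, but I would welcome correction on that approach.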