C2100 + H700 + WD Red Drives and Backplane Jumper Question

Sielbear

Member
Aug 27, 2013
31
2
8
I'm looking to put together a home media server. I have had significant success with the C2100 chassis and the H200 board, flashed into IT mode, for a Linux backup server at work. For my home case, I'm looking at running ESXi as the hypervisor and running Windows Server 2012 R2 Essentials in a VM. I found that the H700 can be had for not much money, and it's fully supported in VMware ESXi.

Secondly, I have a lot of 3 TB WD Red drives in the office, some of which have been used previously. That makes for pretty inexpensive storage for a home media server.

I installed 12 WD Red drives in a RAID 10 on the H700 card and did a fast initialize. This simply removes the beginning and end of the RAID array data, and from what I can tell, relies on a background initialization? These drives had previously been used in a ZFS pool for mass storage. I started watching the drive activity lights, and the pattern was very odd. Every now and then a drive (any one of the 12, random in pattern) would illuminate with solid activity. It would stay solid for maybe 3 seconds? Maybe a little longer, then the carrier would illuminate RED for 1 - 2 seconds, then it would go back to flashing happily.

I opened MegaRAID Storage Manager and connected to the ESXi box. I found a LOT of messages related to CRC errors being found. I'm thinking this has something to do with not doing a full initialization out of the gate. From what I've read, given these drives have been used previously, I really should have done a full initialization on them... Does this make sense? Am I on the right track? I know WD Reds have a consumer-level URE rating, but I'm running RAID 10, this is for a home server, and most importantly, the WD Reds support TLER.
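
As a rough back-of-envelope check on the URE worry, here is a minimal sketch. It assumes the commonly quoted consumer-drive spec of at most one unrecoverable read error per 10^14 bits read (an assumption here, not a figure from this thread, so check the datasheet) and treats errors as independent per bit, which is a simplification. The function name is purely illustrative.

```python
import math

def p_at_least_one_ure(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error over bytes_read.

    ure_per_bit is an ASSUMED spec-sheet worst case, not a measured rate.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p and huge n stay accurate
    return -math.expm1(bits * math.log1p(-ure_per_bit))

DRIVE_BYTES = 3e12  # one 3 TB drive (decimal terabytes)

# RAID 10 rebuild: only the surviving mirror partner is read end to end.
mirror_rebuild = p_at_least_one_ure(DRIVE_BYTES)

# For comparison, a single-parity rebuild would read the other 11 drives.
parity_rebuild = p_at_least_one_ure(11 * DRIVE_BYTES)

print(f"mirror partner only: {mirror_rebuild:.2f}")
print(f"11 remaining drives: {parity_rebuild:.2f}")
```

The spec figure is a ceiling, so these numbers are pessimistic. The point is only the relative difference: a mirror rebuild touches far less data than a parity rebuild across the whole set.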

My second question relates to the expansion backplane. At the bottom of the backplane, there are 4 jumpers. J15, J16, J17, and J18. J15 is clearly identified - the jumper is to be put in place when a PERC card is used and removed if an LSI branded card is used. Easy enough. Every photo I've seen of this backplane has J16, J17, and J18 closed (jumpered). My backplane had J18 removed / open.

I received an odd message at boot up regarding direct access mode and discovered my backplane did NOT have a jumper on J15 in spite of using a PERC card. I installed a jumper, but also jumpered J18 (as that's what was in every photo I could find).

Does ANYONE know what J16, J17, and J18 do? I'd really love to know what setting this changes. :)
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,605
469
83
Canada
If I were you, I would be running badblocks and smartctl on those disks before I put them into service. If I remember correctly, badblocks (in its destructive write mode) will run 4 patterns by default, leaving the disks zeroed and ready for use, all else being well with the reported SMART data :)
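
A minimal sketch of that sequence, assuming the disk is visible as a plain block device (for example on onboard ports rather than behind the RAID card). The helper name and the `/dev/sdX` placeholder are hypothetical. The script only prints the commands so nothing destructive runs by accident; the `-b 4096` block size is a choice for large disks, not a requirement.

```python
import shlex

def burn_in_commands(device: str) -> list:
    """Return the smart / badblocks / smart sequence for one disk as argv lists."""
    return [
        # Baseline SMART report to compare against afterwards
        ["smartctl", "-a", device],
        # DESTRUCTIVE write-mode test: -w write patterns, -s progress, -v verbose
        ["badblocks", "-wsv", "-b", "4096", device],
        # Kick off the drive's extended self-test (returns immediately)
        ["smartctl", "-t", "long", device],
        # Re-read SMART data once the self-test has finished
        ["smartctl", "-a", device],
    ]

# /dev/sdX is a placeholder; substitute the real device before running anything.
for cmd in burn_in_commands("/dev/sdX"):
    print(shlex.join(cmd))
```

On 3 TB drives the badblocks pass alone can take a day or more per disk, so it is worth running several disks in parallel if the ports are available.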
 

Sielbear

Is that because of the reputation of the disks or the CRC errors? My hope was that the drives were seeing old data that didn't match between the mirrors. The CRC errors were not limited to one drive- it was all over the board.

Also, I did upgrade the firmware before running a slow initialize this morning. One of the MANY fixes between the original 12.1 firmware and the new 12.6 was erroneous reports of single bit errors (which would trigger a CRC alert I believe).

Just clarifying your thought process with mine. Thanks!!!
 

pricklypunter

You'll want to run your tests (SMART / badblocks / SMART) first and foremost. It takes some time, but there's no point in pointing fingers at the firmware or card until you know for sure that the disks are free of defects and are definitely clean. If you have a known good card you can use to test the disks with, even better, but failing that, test them with onboard ports. It doesn't matter which manufacturer's disks you are using, despite any reputation issues, you should always ensure your disks are in good health to start with :)
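
To make the before/after SMART comparison less of an eyeball job, something along these lines could help. It is a sketch that assumes the usual ten-column attribute table printed by `smartctl -A`, with the raw value starting in the tenth column. The sample tables below are invented purely for illustration and are not readings from any real drive. A rising `UDMA_CRC_Error_Count` generally points at the link (cable, backplane, connector) rather than the platters.

```python
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "UDMA_CRC_Error_Count",
)

def parse_raw_values(smartctl_output: str) -> dict:
    """Pull raw values for the watched attributes out of `smartctl -A` text."""
    values = {}
    for line in smartctl_output.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID; the header row does not.
        if len(parts) >= 10 and parts[0].isdigit() and parts[1] in WATCHED:
            values[parts[1]] = int(parts[9])
    return values

def increases(before: dict, after: dict) -> dict:
    """Return only the attributes whose raw value went up, with the delta."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }

# INVENTED sample data in the assumed format, for illustration only.
HEADER = "ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE\n"
BEFORE = HEADER + (
    "  5 Reallocated_Sector_Ct  0x0033 200 200 140 Pre-fail Always - 0\n"
    "199 UDMA_CRC_Error_Count   0x0032 200 200 000 Old_age  Always - 2\n"
)
AFTER = HEADER + (
    "  5 Reallocated_Sector_Ct  0x0033 200 200 140 Pre-fail Always - 0\n"
    "199 UDMA_CRC_Error_Count   0x0032 200 200 000 Old_age  Always - 9\n"
)

print(increases(parse_raw_values(BEFORE), parse_raw_values(AFTER)))
```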
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,060
500
113
New York City
www.glaver.org
Sielbear said: "Also, I did upgrade the firmware before running a slow initialize this morning. One of the MANY fixes between the original 12.1 firmware and the new 12.6 was erroneous reports of single bit errors (which would trigger a CRC alert I believe)."
The latest is 12.10.7, but the only public change from 12.10.6 is a fix for RAID striping on insanely large single virtual drives (like a 1.125PB virtual drive with a 256K stripe size). 12.10.5 was a bad release as it reverted a number of previous fixes.

I second the suggestion of running SMART diagnostics and surface scans to see if the drives are good before proceeding.
 

Sielbear

Sounds good. I'll run scans this weekend.

One thing that's sticking out like a sore thumb is the link speed. My understanding is that the H700 and backplane were rated for 6 Gbps on SAS but only 3 Gbps for SATA. These WD Reds are 6 Gbps drives and I'm showing a negotiated link speed of 6 Gbps. Based on the MANY communication-type errors I'm seeing, it feels like the link speed could be contributing.

Is there a way with MegaCli to force a 3 Gbps link speed instead of 6?
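
For at least confirming what each slot negotiated, here is a small sketch. It assumes the physical-drive listing (what I believe `MegaCli -PDList -aALL` prints) contains `Slot Number:` and `Link Speed:` lines per drive; that layout is an assumption on my part, and the sample text is invented for illustration. It only reads the speed and does not change it.

```python
import re

def link_speeds(pdlist_output: str) -> dict:
    """Map slot number -> negotiated link speed string from PDList-style text."""
    speeds = {}
    slot = None
    for line in pdlist_output.splitlines():
        m = re.match(r"\s*Slot Number:\s*(\d+)", line)
        if m:
            slot = int(m.group(1))
            continue
        m = re.match(r"\s*Link Speed:\s*(\S+)", line)
        if m and slot is not None:
            speeds[slot] = m.group(1)
    return speeds

# INVENTED sample in the assumed format, for illustration only.
SAMPLE = """\
Slot Number: 0
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Slot Number: 1
Device Speed: 6.0Gb/s
Link Speed: 3.0Gb/s
"""

print(link_speeds(SAMPLE))
```

Comparing this per-slot map between a misbehaving chassis and a healthy one would show whether the errors track the 6 Gbps links specifically.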
 

pricklypunter

I think there are a couple of members here running C2100s, maybe they will have a better answer for you. I have little experience with them, but I do recall that they came with an expander backplane as standard, with an option of ordering them with a 1:1 backplane. Perhaps the expander doesn't support SATA II, although I'm sure I read that it does somewhere in the spec sheet.
 

Sielbear

Ok - I'm making progress. I ordered 5 of these chassis for a project we're looking at. I built up a second one of these boxes with the WD Reds and it seems to be running damn near perfectly! NO CRC errors, etc. I have uncovered a couple of read errors logged that are later reported as "corrected during background initialization". I'm hammering on the drives during background initialization with iometer and I'm not seeing the issues like I saw previously.

My faith in my solution is being restored quickly. There's obviously something "not right" about the backplane / cables / controller / cache that is super unhappy in the other chassis. I'll have to test later on down the road. Overall I'm feeling 100% better now. I had never run into anything like this before.

For the record, the new chassis is reporting a full 6 Gbps for SATA connection speed.