R720xd SAS controller not always detecting SATA drives during boot up

dominatorstang

New Member
Jan 6, 2017
3
0
1
42

If the images do not come through, here is the link: http://imgur.com/a/kn0XS

Please, any suggestions are welcome. Even if they sound offensive. I know the simple things are missed more often than we like to admit.

I am chasing an issue with my recently purchased R720xd 3.5”: it intermittently fails to recognize connected hard drives during initial boot. Results have been inconsistent, which has made it very challenging to troubleshoot.

During boot, when everything works correctly (typically when first powered on and/or cold), the hard drive backplane briefly flickers all of its orange LEDs, then the green HDD LED flashes a time or two at each slot where a drive is connected and then stays on solid. The drives can then be seen in the SAS controller BIOS.

During boot, when the drives are not recognized (typically after the server has been powered on for a few minutes and has warmed up), the orange LEDs either do not come on at all, stay on continuously, or stay on for a minute or two and then go off. The SAS controller then does not list any drives in its BIOS.

The things I have tried are listed below, but with each one I still get a mix of working and non-working boots, which tells me they are not the cause.

-Swapped between the OEM H710 mini and an LSI PCIe HBA

-Swapped out the SAS cables

-Connected only one SAS cable at a time

-Connected only a single HDD (2TB SATA)

-Reset the CMOS

-Disabled and enabled the OEM H710 RAID controller in BIOS while connected to the LSI controller

-Swapped out the backplane
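
With this many variables and a fault that comes and goes, one way to keep the results straight is to log every boot attempt and tally the detection rate per condition. Below is a minimal sketch in Python, assuming a hand-kept log. The field names and the sample rows are hypothetical placeholders for illustration, not results from this server.

```python
from collections import defaultdict

# Hypothetical boot log: one row per boot attempt.
# Field names and values are illustrative placeholders only.
boot_log = [
    {"controller": "H710", "location": "workshop", "detected": True},
    {"controller": "H710", "location": "indoors", "detected": False},
    {"controller": "LSI", "location": "workshop", "detected": True},
    {"controller": "LSI", "location": "indoors", "detected": True},
    {"controller": "LSI", "location": "indoors", "detected": False},
]


def detection_rate(log, field):
    """Return {value: (successes, attempts)} grouped by one field."""
    tally = defaultdict(lambda: [0, 0])
    for row in log:
        counts = tally[row[field]]
        counts[0] += int(row["detected"])  # count boots where drives showed up
        counts[1] += 1                     # count every attempt
    return {value: tuple(counts) for value, counts in tally.items()}


for field in ("controller", "location"):
    print(field, detection_rate(boot_log, field))
```

A variable whose values all show roughly the same success rate is unlikely to be the cause, while one that splits the results cleanly is worth chasing further.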

The only consistent pattern is that it works every time when I have it out in my cold workshop (currently 50 °F). When the server is at room temperature, it is hit or miss whether it shows any drives. If I directly connect only the rear 2.5” drive backplane with a single SAS cable, it registers drives every time I try. At that point, I strongly suspected the cause was a broken solder joint on the SAS expander BGA chip on the BP, which would be affected by the temperature of the air being pulled across the backplane. But replacing the backplane yesterday gave the same result as swapping cables and everything else: it still did not work every time.

I also tried to figure out what each cable connected to the backplane is for, so I can better understand why the issue appears to be localized to the BP. From what I can tell, the ribbon cable from the motherboard to the BP just passes through to the left and right rack ears for VGA, the power button, and indicators. The USB cable from the motherboard also passes through to a rack ear. Two SAS cables go to the HBA controller, and the third SAS cable just ties in the rear BP for the two 2.5” drives, along with a multi-conductor cable joining the rear BP to the front BP. The BP has two power cables from the motherboard.

What I cannot figure out among the BP connections is a signal cable (connector labeled “SIG”) with about 10 or 12 wires running from the motherboard to the front BP. Another identical cable goes from the motherboard to the rear BP (maybe it just jumpers through the motherboard from the front BP to the rear BP, but I am not sure why it would). I tried booting without this cable connected, but no LEDs flashed on the BP and, of course, no drives were detected.

Please, any suggestions are welcome. Even if they sound offensive. I know the simple things are missed more often than we like to admit.
 

Dk3

Member
Jan 10, 2014
67
21
8
SG
I am not sure if my problem is the same as yours, but I am sharing it in the hope that it helps.

My previous R720 had an intermittent hard disk connection problem. After a month or two, the server failed to start up and showed "LT0240 System BP1 5v pg voltage ...".

The solution for mine was to replace the backplane (Part no. 0J2C2D) and all the signal cables (Part no. 0G95P6, 0KV109).
 

dominatorstang

New Member
Jan 6, 2017
3
0
1
42
Thank you, Dk3. I do have a replacement front backplane that I tried; it was a pull from a brand new machine. I have not tried new I2C signal cables, but I have inspected them for damage. I will probably test the resistance of the I2C cables while flexing them to see if they have a partial open that is not visible.

I have not measured my 5VDC supply, but my 12VDC supply is spot on at 11.99 to 12VDC. I also measured the AC voltage riding on the 12VDC rail; it was only 2-5mV, which was far lower than I expected. All of this was measured with a Fluke 87V while the server was idle, at the rear backplane, which was easier to access while the server was running.
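
For anyone repeating these measurements, the arithmetic can be sketched as below. This is only a rough check under stated assumptions: the ±5% band is the figure commonly quoted for desktop ATX supplies and may not match what Dell specifies for this backplane, and a multimeter's AC reading is not the same thing as a peak-to-peak ripple figure from a scope. The function names are made up for illustration.

```python
def within_tolerance(measured_v, nominal_v, tolerance=0.05):
    """True if measured_v is within +/- tolerance (a fraction) of nominal_v.

    The 5% default is an assumption borrowed from typical ATX figures,
    not a Dell specification.
    """
    return abs(measured_v - nominal_v) <= nominal_v * tolerance


def ripple_percent(ac_mv, nominal_v):
    """AC component (in millivolts) as a percentage of the nominal DC rail."""
    return ac_mv / 1000.0 / nominal_v * 100.0


# Readings quoted in the post above: 11.99 V DC and up to 5 mV AC on the 12 V rail.
print(within_tolerance(11.99, 12.0))
print(round(ripple_percent(5, 12.0), 3))
```

The same check can be run against the 5VDC rail once it has been measured, by passing 5.0 as the nominal voltage.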
 

dominatorstang

New Member
Jan 6, 2017
3
0
1
42
I must have overlooked something when I first tried the replacement backplane. I kept plugging away at this issue, repeating my tests to get the constants locked in. I then swapped in the replacement backplane once again, and it has not had an issue since...

P.S. I still do not fully understand the SIG I2C cable, but it would make sense for it to be at least used for firmware updates and iDRAC communications.