IBM M5015 overheating due to bad resistor soldering

clcorbin

Member
Feb 15, 2014
So after reading everything on this forum about the assorted LSI-based controllers, I picked up an IBM M5015 and battery on eBay. When I first tested it in my old server (an Intel SC5299-E server chassis), the card seemed to work fine and sat there running (doing pretty much nothing) for about eight hours.

However, when I moved it to my current home server (a desktop case with much lower airflow than the Intel chassis), the card started up just fine, but stopped responding (the heartbeat LED stopped) about 3/4 of the way through boot and started giving a three-beep alert. If I rebooted right away, the card came up beeping. However, if I waited some time (5 to 10 minutes or more), it would once again come up, boot into BIOS, then fail before Windows 2012 could finish loading.

After trying all three slots on the motherboard as well as two slots in my personal desktop computer, I finally realized it was probably heat related, so I pulled the heatsink off and found some old compound. After a good cleaning and new AS5, I had the heatsink back on and the card reinstalled. This time, the card ran for about 10 minutes (I had JUST gotten the RAID manager software installed) before it shut down once more.

The next attempt was to improve cooling further by mounting a small fan on top of the heatsink, blowing down through it. That seemed to get things up to about 6 to 10 hours before it shut down.

The next step is when things got strange. After pulling the heatsink off again, I looked at the card through my 5X magnifying light and found that one resistor next to the LSI chip (the bottom right resistor of the "matrix" of resistors and capacitors right next to the chip) was sitting at a 45 degree angle and was NOT making any contact with its pad. That couldn't be right, so I put the SMD tip on my iron and repositioned it so that it was where it was supposed to be.

That was 2 days ago and it has not acted up since. There is no way this happened in the field, so the high airflow in the server it came out of probably kept it alive. Has anyone else seen this type of defect?

Clint
 

MiniKnight

Well-Known Member
Mar 30, 2012
NYC
I've never seen this. Do you have pictures? Hard to visualize without one in front of me.
 

mrkrad

Well-Known Member
Oct 13, 2012
I always suggest using a lens to go over any used boards. You'd be surprised what I've received off eBay. It is very easy to knock an SMD chip off a card during handling!
 

cilek

New Member
Jul 21, 2014
Many cards these days have temperature sensors on them. The cards generally have an operating temperature range of 0C to 80C by industrial standards, while the resistive temperature sensors themselves can sense from -20C and up. Resistive temperature sensors are generally passive components and require a pull-up resistor, so any loss of contact on that pull-up results in an erroneous reading from the sensor. In any case, your card was most likely not getting the correct temperature reading from the sensor, which made it shut down again and again.
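
To put rough numbers on the pull-up explanation, here is a minimal sketch of a thermistor divider where the firmware assumes a nominal pull-up but the real one has a bad joint. Everything in it is an assumption for illustration: the 3.3 V supply, the 10k pull-up, the 10k NTC with a beta of 3950, and the function names. None of it comes from the M5015 design, and it is not known which resistor on the card was actually loose.

```python
import math

# All values below are assumed for illustration, not taken from the M5015.
VCC = 3.3            # supply voltage in volts
R_PULLUP = 10_000.0  # nominal pull-up resistance in ohms
R0 = 10_000.0        # NTC thermistor resistance at 25 C in ohms
BETA = 3950.0        # NTC beta constant in kelvin
T0_K = 298.15        # 25 C expressed in kelvin


def thermistor_resistance(temp_c):
    """NTC resistance at temp_c according to the beta model."""
    t_k = temp_c + 273.15
    return R0 * math.exp(BETA * (1.0 / t_k - 1.0 / T0_K))


def node_voltage(temp_c, actual_pullup):
    """Divider midpoint voltage: pull-up to VCC, thermistor to ground."""
    r_t = thermistor_resistance(temp_c)
    return VCC * r_t / (actual_pullup + r_t)


def firmware_reading(voltage):
    """Temperature inferred from the voltage, assuming the nominal pull-up."""
    r_t = R_PULLUP * voltage / (VCC - voltage)
    inv_t = 1.0 / T0_K + math.log(r_t / R0) / BETA
    return 1.0 / inv_t - 273.15


def reported_temp(true_temp_c, actual_pullup):
    """What the firmware believes when the real pull-up is actual_pullup."""
    return firmware_reading(node_voltage(true_temp_c, actual_pullup))


# Healthy joint: the reading matches the real temperature.
print(round(reported_temp(40.0, R_PULLUP), 1))
# Poor joint modeled as 100k instead of 10k: the reading is far too hot.
print(round(reported_temp(40.0, 100_000.0), 1))
```

With these assumed values, a pull-up that has drifted to ten times its nominal resistance makes a 40C board look like it is well past the 80C limit, which would fit a card that shuts down even with decent cooling. A completely open pull-up is not modeled here (the node voltage would sit at zero and the logarithm would fail).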
