Seems like I bricked a Supermicro H8QG6...


PoopsPoops

New Member
Sep 16, 2017
7
0
1
42
4P board with Opteron 6272 chips installed and 8 DIMMs. I was testing overclocks with ocng5. It ran a test load for hours with no problems at refclock 240 (2.88GHz), then I tried to overclock to 245 with ocng-cu.
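For reference, the clock arithmetic behind those numbers can be sketched as below. The 12x multiplier is not stated in the post; it is an assumption inferred from 2.88GHz at a 240MHz refclock, and the function name is only illustrative.

```python
def core_clock_mhz(refclock_mhz, multiplier):
    """Core clock is the reference clock times the CPU multiplier."""
    return refclock_mhz * multiplier

# Assumption: multiplier inferred from the post's 2.88GHz at refclock 240.
multiplier = 2880 / 240

stable_mhz = core_clock_mhz(240, multiplier)   # the setting that ran for hours
attempt_mhz = core_clock_mhz(245, multiplier)  # the setting tried with ocng-cu

# Relative size of the last step, in percent.
increase_pct = (attempt_mhz - stable_mhz) / stable_mhz * 100

print(f"multiplier: {multiplier:g}x")
print(f"stable:  {stable_mhz:.0f} MHz")
print(f"attempt: {attempt_mhz:.0f} MHz (+{increase_pct:.2f}%)")
```

Under that assumption, the step from 240 to 245 is only about a 2% increase in core clock.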

I just accepted the defaults in ocng-cu: I only changed the refclock and didn't fiddle with the other settings. Then I went to power off the machine and ran the Linux command for it. Linux halted, but the motherboard did not power down like it always had.

I was worried, so I powered down via the PSU and started back up. Nothing now: the fans spin up, but DP3 (the power LED on the motherboard) never comes on, while the DP1 IPMI LED does. IPMI doesn't get an IP, however. Another LED between DP3 and DP1, not labelled in the manual, momentarily blinks red when I hit the power switch...

Things tried:
Clearing CMOS by removing battery and then shorting those contact pads for 30 seconds.
Taking it out of the case and putting it on a wooden board.
Trying different power supplies, all known working.
Unplugging all RAM, USB.

The power meter shows usage jumping from 5W standby to 80W when I hit the power switch, so something is happening. I really think I bricked it... I was always able to recover from a bad OC by just waiting for the 3 failed POSTs before it reset everything.
 

evolucian911

Member
Jun 24, 2017
36
6
8
35
You don't mention if you tried a single processor at a time. Did you?

 

PoopsPoops

New Member
Sep 16, 2017
7
0
1
42
No, I did not. I got the IPMI to get an IP (the cable wasn't shoved in far enough, d'oh) and got on the web interface, but there was nothing in the logs that said 'not recoverable'. Just a low voltage warning for VBAT, but a multimeter read 2.99V on the CMOS battery.

I also plugged an LED into JOH1 to see if an overheat sensor is faulty; it didn't light up.

I am going to remove the CPUs next, leaving just CPU1 populated and moving each CPU in turn into that socket to see which one it doesn't like. It would be sad if one of them is the problem... While overclocking, CPU2 was reaching high temps before I figured out I needed to tighten the heatsink down more. Once I did, it reported low temps and ran for hours. I really hope that brief period of high temps before I noticed didn't toast it (the machine never misbehaved during the overheats, though).

Clearing the CMOS wouldn't set the firmware back, would it? I know these boards need updated firmware to support the 62xx chips. But that doesn't explain why it wouldn't come back on right after the powerdown event.
 

evolucian911

Member
Jun 24, 2017
36
6
8
35
No. Clearing the CMOS cannot downgrade the firmware. It only resets the settings. And what's odd is that without voltage increases to the CPUs, I don't think the CPUs should be affected.

But try them one after the other with various sticks of RAM to rule things out.
