The nightmare before Christmas! (or MB/PSU failure - how can I test?)

el_pedr0

Member
Sep 6, 2016
44
1
8
47
Hi all,

My shiny new home server has suddenly packed up. I've narrowed it down to the Supermicro X11SSM-F MB (because I've managed to boot another machine using the Seasonic S1211-430 PSU). What test can I run to diagnose further?

The symptoms:
I noticed that I couldn't log on via SSH. So I went to the box and saw that the power-button LED on the case would blink for a second and the case fan would spin for a second, then everything would stop. After a short pause, the same thing would happen again.
When I took the side of the case off, I could see that the green power LED on the MB turns on for a couple of seconds and the CPU fan spins for a second. Then everything stops. After a 2-second pause, the cycle repeats. The BMC LED on the MB stays on all the time.
I can log in to the MB via IPMI, but the console unsurprisingly doesn't connect.

I've now disconnected everything from the MB bar the main power supply connector and the CPU power supply connector. The CPU and the CPU fan are both still plugged in. But RAM, drives, USB cables, cards, and now even the connections for the case front panel are removed.

My understanding is that the motherboard should beep to complain that there's no RAM. But instead there is no change to the behaviour described above.

The only other PSUs and MBs I've got are in a:
* Dell Dimension 3000
* Dell Dimension 9200
* Dell XPS 8500
None of these PSUs have the second 8-pin CPU power cable, so I don't think I can test the Supermicro board using them. I have, however, managed to boot the Dell XPS using the Seasonic PSU (albeit using the 4-pin CPU power connector, rather than the 8-pin CPU connector that I use for the Supermicro MB).

I've written to the Supermicro vendor. But is there anything I can do in the meantime to narrow down the problem?

(edit 1: to clarify LED status)
(edit 2: corrected to reflect that I've booted another system using the PSU)
 

ttabbal

Active Member
Mar 10, 2016
743
207
43
47
You might be able to boot with a single CPU without the second power connection. Or just get a "Y" cable to test with so you can use a different PSU.

The power supply might be marginal. I have one here that works, until the system loads the 12V bus just a little more. Then it falls over. It will boot up, work fine, then try to spin up a CD drive and crash. Replaced the PSU, no problem since. The fact that IPMI works, but the main system can't boot makes me wonder if it might be something like this.
 

Scott Laird

Active Member
Aug 30, 2014
312
144
43
I've had motherboards fail to do anything at all with 0 RAM installed before. They should beep, but weird stuff happens. I usually test with 1 DIMM installed anyway.

If you can connect to IPMI, then what do the sensors and event log say? Are the voltage levels okay?
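
For anyone wanting to script that check, here is a minimal sketch in Python that wraps the `ipmitool` CLI over the LAN interface. The host address, user name and password in the comments are placeholders, and the parsing assumes the usual pipe-separated table that `ipmitool sensor` prints (name, reading, unit, status, then thresholds). Treat it as a starting point rather than a tested tool for this particular board.

```python
import subprocess


def ipmi_cmd(host, user, password, *args):
    """Build an ipmitool command line for a remote BMC (IPMI over LAN)."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *args]


def parse_sensors(output):
    """Parse the pipe-separated table printed by `ipmitool sensor`.

    Returns a list of (name, reading, unit, status) tuples.
    """
    rows = []
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4 or not fields[0]:
            continue  # skip blank or malformed lines
        rows.append(tuple(fields[:4]))
    return rows


def suspicious_voltages(rows):
    """Voltage sensors whose status is anything other than 'ok'."""
    return [r for r in rows if r[2] == "Volts" and r[3] != "ok"]


def run(cmd):
    """Run a command and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, timeout=60).stdout


# Example use (placeholder address and credentials, not from this thread):
#   rows = parse_sensors(run(ipmi_cmd("192.0.2.10", "ADMIN", "secret", "sensor")))
#   print(suspicious_voltages(rows))
#   print(run(ipmi_cmd("192.0.2.10", "ADMIN", "secret", "sel", "elist")))
```

A reading of `na` is flagged here as well. My understanding is that many sensors report `na` while the main rails are down, so a wall of `na` values may just mean the board was off at the moment of the query.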
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,140
594
113
New York City
www.glaver.org
Scott Laird said: "I've had motherboards fail to do anything at all with 0 RAM installed before. They should beep, but weird stuff happens. I usually test with 1 DIMM installed anyway."

I've never had a Supermicro board beep at me for any of missing memory / CPU / EPS12V. Makes it a pain to troubleshoot those "no picture, no sound" (as we used to say in the TV repair business) problems.

Scott Laird said: "If you can connect to IPMI, then what do the sensors and event log say? Are the voltage levels okay?"

It may be powering itself down if it isn't happy, so you'd have to catch it at just the right time.
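
One way to try to catch it at the right time is to poll in a tight loop and keep the first real reading. The sketch below (Python, shelling out to `ipmitool`) is only an illustration: the sensor name `12V` is an assumption about how this board labels its rail, and a full sensor query over the LAN may well take longer than the second or two the board stays powered, so it could simply never land in the window.

```python
import subprocess
import time


def parse_reading(output, sensor_name):
    """Return the numeric reading for sensor_name from pipe-separated
    ipmitool output, or None if the sensor is missing or reads 'na'."""
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 2 and fields[0] == sensor_name:
            try:
                return float(fields[1])
            except ValueError:
                return None
    return None


def catch_reading(read_once, attempts=120, interval=0.25, sleep=time.sleep):
    """Call read_once() until it returns something other than None.

    Returns (attempt_index, value) for the first hit, or None if every
    attempt came back empty.
    """
    for attempt in range(attempts):
        value = read_once()
        if value is not None:
            return attempt, value
        sleep(interval)
    return None


def make_reader(cmd, sensor_name):
    """Wrap an ipmitool command line as a read_once() callable."""
    def read_once():
        try:
            out = subprocess.run(cmd, capture_output=True, text=True, timeout=10).stdout
        except (OSError, subprocess.TimeoutExpired):
            return None
        return parse_reading(out, sensor_name)
    return read_once


# Example use (placeholder address and credentials):
#   cmd = ["ipmitool", "-I", "lanplus", "-H", "192.0.2.10",
#          "-U", "ADMIN", "-P", "secret", "sensor"]
#   print(catch_reading(make_reader(cmd, "12V")))
```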
 

lukyjay

New Member
Jan 5, 2017
1
0
1
32
Did you find out what the problem was? Is this something you could fix?

I have this motherboard and am having a similar problem. Is this an RMA on the motherboard?
 

el_pedr0

Member
Sep 6, 2016
44
1
8
47
TLDR: RMA'ed & received new board yesterday. Server up. Happy days.

On the advice above, I went back into IPMI and saw that even though I could log on, I couldn't get any sensor readings (though maybe I wasn't catching it at the right time, as Terry suggested above).

Running the various tests to figure out the problem took a chunk of time (e.g. hooking the PSU up to a different MB, and trying to power up with various combinations of RAM and cables connected or disconnected). I had to pay for postage, which is no biggie but still a bitter pill. And then when I got the replacement, the Proxmox GUI wasn't available because Linux had given the NICs new names (eth2 and eth3 rather than eth0 and eth1), which meant I had to change a reference from eth0 to eth2 in /etc/network/interfaces. A very simple fix, but again time-consuming to figure out.
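
In case it saves someone else the head-scratching, the rename can be sketched as a small Python helper. The function name and the sample config are made up for illustration. Only the eth0/eth1 to eth2/eth3 mapping and the /etc/network/interfaces path come from my case. Running `ip link` shows which names the kernel actually assigned.

```python
import re


def rename_ifaces(text, mapping):
    """Replace whole-word interface names in a config file's text.

    All names are swapped in a single pass, so a mapping such as
    {"eth0": "eth1", "eth1": "eth0"} does not chain into itself.
    """
    if not mapping:
        return text
    # Longest names first so "eth10" is never shadowed by "eth1".
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], text)


# Example use (back up the file first, and run as root to write it):
#   path = "/etc/network/interfaces"
#   with open(path) as f:
#       old = f.read()
#   new = rename_ifaces(old, {"eth0": "eth2", "eth1": "eth3"})
#   with open(path, "w") as f:
#       f.write(new)
```

Doing the substitution in one pass, rather than one `replace` call per name, is what keeps a swap of two names from undoing itself.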

I had taken the trouble of numbering my SATA connectors before unplugging them from the old board, so I could plug them into the new board in exactly the same order. My theory was that this would make the ZFS pools accessible instantly, which they were. Not sure if there's anything behind this theory.
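
As far as I understand it, ZFS finds pool members by labels written on the disks themselves rather than by which SATA port they hang off, so the port order probably mattered less than I assumed. For anyone who wants to see how the stable disk names map to the sdX names on their own box, here is a small Python sketch. It assumes a Linux system with the usual /dev/disk/by-id symlinks, and the function name is just for illustration.

```python
import os


def by_id_map(by_id_dir="/dev/disk/by-id"):
    """Map each persistent disk name to the device node it points at.

    The names under by-id are built from model and serial number, so they
    stay the same when a disk moves to a different port. The sdX node on
    the right-hand side may change.
    """
    result = {}
    for name in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, name)
        if os.path.islink(path):
            result[name] = os.path.realpath(path)
    return result


# Example use:
#   for name, node in by_id_map().items():
#       print(name, "->", node)
```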

Supermicro did not provide any explanation about what was wrong.