Sudden server reboot troubleshooting


gtech1

So I built out a brand new Supermicro server. Everything is new except the chassis: X11SRM-F motherboard, W-2225 CPU, 128GB of DDR4-2400 Kingston RAM.
The motherboard was meant for a desktop, but I put it in a 1U chassis with 500W power supplies.

Power consumption is about 96W. IPMI showed nothing at all about the reset event, all sensors look fine, temperatures are fine, and I didn't see any errors anywhere.
The OS left no kernel core dumps and no logs whatsoever.

It simply rebooted out of the blue, which is kind of worrying.

I assembled it about a week ago; it ran fine for 7 days and then this happened. The server is not very busy: it's a standby backup server for another 'master' server, so CPU usage barely goes over 10-15%.

How can I troubleshoot this? What should I change or inspect first?
 

zhianguo

I once had this kind of thing on a desktop with a defective NVMe SSD. The kernel could not write its error messages to the log after the fault because of intermittent disk failures.
So one way to rule this out is to clone the boot disk and boot from a different drive. In my case I moved the OS from the NVMe to a SATA drive and the reboots stopped.
How is the power quality on your outlet?
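
If the logs really are empty, one cheap way to at least pin down when the box went down is a heartbeat file kept on something other than the boot disk. Below is a rough sketch in Python, not a finished tool. The path `/mnt/other-disk/heartbeat.log`, the 60 second interval and all function names are made up for illustration; it assumes you run `append_heartbeat` once a minute from cron or a timer and look at the file after the next unexpected reboot.

```python
import os
from datetime import datetime, timedelta

# Hypothetical location: ideally a disk (or network mount) that is NOT the boot disk.
HEARTBEAT_PATH = "/mnt/other-disk/heartbeat.log"


def append_heartbeat(path=HEARTBEAT_PATH):
    """Append the current time as one ISO 8601 line and force it to disk."""
    with open(path, "a") as f:
        f.write(datetime.now().isoformat(timespec="seconds") + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device now


def parse_heartbeats(text):
    """Parse one timestamp per line, skipping blank or malformed lines
    (for example a line that was cut off by the reset)."""
    beats = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            beats.append(datetime.fromisoformat(line))
        except ValueError:
            continue
    return beats


def find_gaps(beats, interval_s=60, tolerance=2.0):
    """Return (last_seen, next_seen) pairs where consecutive heartbeats
    are further apart than tolerance * interval_s seconds."""
    limit = timedelta(seconds=interval_s * tolerance)
    ordered = sorted(beats)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


if __name__ == "__main__":
    # After a reboot: read the file back and print every suspicious gap.
    with open(HEARTBEAT_PATH) as f:
        for last_seen, next_seen in find_gaps(parse_heartbeats(f.read())):
            print(f"no heartbeat between {last_seen} and {next_seen}")
```

The last timestamp before a gap gives you a time to line up against the IPMI sensor history and anything the UPS or PDU logged. If the heartbeat file on the second disk keeps going right up to the reset while the boot disk's logs stop earlier, that would point toward the boot disk rather than power.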