Help: Supermicro X11DPG-QT powers on and shuts down immediately


Bert

Well-Known Member
Mar 31, 2018
845
399
63
45
I just got this motherboard. It came with 2x Silver 4109T CPUs. Physically it looks perfect. IPMI works, but I cannot log in. The seller does not know the IPMI password, and I am not sure where the default IPMI password is located.

I am about to trash it, but I wanted to check here for ideas on how to salvage it. The symptoms indicate a dead motherboard:

- The CPU is not powering on. It stays cool.
- The MB starts and shuts down immediately, or starts and hangs. Which of the two happens keeps changing. The MB is responsive to the power button.
- No display, no beeps.
- The network chip powers on.
- The SB felt cold, but that may be normal.
- Tried all the usual combos: reseated the CPUs, removed one CPU, changed DIMMs. Didn't try clearing CMOS etc.; not sure if that would help.

It would be sad to see it get trashed. Any ideas on what to check and try?
 

nexox

Well-Known Member
May 3, 2023
678
282
63
Yeah, it requires some level of function, but I have heard of it fixing some awfully broken-sounding systems.
 

Bert

I was able to log in to IPMI and check the health logs. There is a bunch of errors from the past. All sensor readings are N/A. The event log does not say much:


ID   Severity     Timestamp        Event
479  Critical     3/26/2024 23:41  Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
480  Critical     3/26/2024 23:41  Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
481  Critical     3/26/2024 23:41  Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
482  Critical     3/27/2024 2:10   Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
483  Critical     3/27/2024 2:10   Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
484  Critical     3/27/2024 5:14   Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
485  Critical     3/27/2024 5:15   Processor(CPU1 Temp) Thermal Trip has occurred - Assertion
486  Warning      3/27/2024 5:15   ACPowerOn(OEM) First AC Power on - Assertion
487  Information  3/27/2024 5:22   Session Audit Invalid Username or Password - Assertion
488  Information  3/27/2024 5:22   Session Audit Invalid Username or Password - Assertion
489  Information  3/27/2024 5:22   Session Audit Invalid Username or Password - Assertion
490  Information  3/27/2024 5:22   Session Audit Invalid Username or Password - Assertion
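For anyone who wants to tally a longer log than this one, here is a rough Python sketch. It assumes the log was pasted from the web UI as four lines per event (ID, severity, timestamp, description), which is how it came out here. The names `RAW`, `parse_events` and `summarize` are made up for this example, and the sample holds only a few of the entries above.

```python
from collections import Counter
from datetime import datetime

# A few entries copied from the event log above, four lines per event:
# ID, severity, timestamp, description.
RAW = """\
479
Critical
3/26/2024 23:41
Processor(CPU1 Temp)Thermal Trip has occurred - Assertion
485
Critical
3/27/2024 5:15
Processor(CPU1 Temp)Thermal Trip has occurred - Assertion
486
Warning
3/27/2024 5:15
ACPowerOn(OEM)First AC Power on - Assertion
487
Information
3/27/2024 5:22
Session AuditInvalid Username or Password - Assertion
"""


def parse_events(raw):
    """Turn the pasted four-line records into a list of dicts."""
    # Remove zero-width spaces left over from copy/paste, then blank lines.
    lines = [ln.replace("\u200b", "").strip() for ln in raw.splitlines()]
    lines = [ln for ln in lines if ln]
    events = []
    for i in range(0, len(lines) - 3, 4):
        event_id, severity, stamp, text = lines[i:i + 4]
        events.append({
            "id": int(event_id),
            "severity": severity,
            "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M"),
            "text": text,
        })
    return events


def summarize(events):
    """Count how often each (severity, description) pair occurs."""
    return Counter((e["severity"], e["text"]) for e in events)


events = parse_events(RAW)
for (severity, text), count in summarize(events).most_common():
    print(f"{count:>3}  {severity:<12} {text}")
```

In the full log above, the CPU1 thermal trip is the only critical event, and it repeats across several power-on attempts.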
 

Bert

Is this a dead board, or should I try a few other things, like changing the CPU? I cannot flash the BIOS because it asks for an activation code.
 

nexox

Looks like it is mistakenly reading the CPU temperature as too high and refusing to start. I assume that sensor is on the CPU itself, so trying another CPU might be worth it. You can pretty easily find instructions to generate an IPMI key if you want to go that way.
 

Chriggel

Member
Mar 30, 2024
49
17
8
Indeed, I'd try removing the CPU that causes the Thermal Trip. The log suggests that the system does in fact power on, but then immediately shuts down again as a safety measure.

CPU1 might or might not be the one you've already removed during your tests. I assume you removed the 2nd CPU during those tests, which is the correct one to remove.

If they're numbered CPU0 and CPU1, then you've already removed the possibly faulty one. In this case it would be worth doing it again and checking whether there are other log entries when trying to power the system on like that.
If they're numbered CPU1 and CPU2, then remove CPU1, put CPU2 into the first socket, and try powering it on.
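The two cases above can be written down as a small decision helper. This is only a sketch of the reasoning in this post: `next_step` is a made-up name, the labels are whatever is printed next to the sockets on the board, and it assumes the earlier single-CPU test emptied the second socket.

```python
def next_step(socket_labels, flagged="CPU1"):
    """Suggest the next test from the socket labels and the sensor name in the log.

    socket_labels: labels in socket order, e.g. ["CPU1", "CPU2"].
    flagged: the CPU named in the Thermal Trip events.
    Assumption: the earlier single-CPU test removed the CPU in the second socket.
    """
    if flagged not in socket_labels:
        raise ValueError(f"{flagged} is not one of {socket_labels}")
    if socket_labels.index(flagged) == 0:
        # The flagged CPU sits in the first socket and was never removed.
        return ("Remove the CPU in the first socket, move the other CPU "
                "into the first socket, and power on.")
    # The flagged CPU sits in the second socket, which was already emptied.
    return ("Repeat the test with the second socket empty and check "
            "the event log for new entries.")


print(next_step(["CPU1", "CPU2"]))
print(next_step(["CPU0", "CPU1"]))
```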
 

Bert

I have extra CPUs that I know work. On the other hand, it is possible that the motherboard is tripping because it has bad communication with the CPU. I will circle back in a few weeks.
 