Supermicro H11SSL-I-O only heartbeat light, no system power light

Root.ed

New Member
Feb 6, 2020
15
0
1
Yep, that's the password for IPMI. The username should be ADMIN.
Great - I'm in IPMI on my home machine (that one works fine... I just wanted to confirm the username/password combination so I can pass it on to them for the machine that only has the heartbeat). If anyone else reads this, the username is still ADMIN and the password is the one under 'PWD' for newer motherboards produced in late 2019 and after. Since I'm able to log in successfully, I'll be able to give them the same instructions and see what we can do from there. I'll update this thread in about 12 hours or so, as soon as I get ahold of them.

ADMIN/ADMIN will NOT work for the newer motherboards, don't drive yourself crazy :).

Thanks guys for helping out with that so far... I need them to do the same on their end so we can see things. I'm guessing I should report back anything in the Miscellaneous -> Snooping section and any post codes there? Anything else I should be grabbing from the IPMI to diagnose it further?

Shall I have them re-install the CPU, insert 24-pin and 4+4 pin, and cable into the IPMI ethernet port and power it on and then try to get into the IPMI from another machine and report back?

@yesoos - got it, will consider those and will report back. The contact for these damn CPUs/boards is very hit or miss... even when using my 1/4 adjustable torque screwdriver. Crazy how finicky these things are.

Will report back ASAP after I hear back from them.

Thanks everyone - yesoos, tesla and freebit for all the help thus far... hope I find something in the IPMI on the machines they have over there that were working just fine before they were shipped.
 

yesoos

Member
Mar 10, 2020
33
3
8
PL
got it, will consider those and will report back. The contact for these damn CPUs/boards is very hit or miss... even when using my 1/4 adjustable torque screwdriver. Crazy how finicky these things are.
I would say those big chips are fragile. Do not tighten screw no. 1 to the max and then no. 2 and no. 3 (you could destroy the CPU). Go in a loop, 1-2-3, 1-2-3, tightening each screw a little, one after another, until you reach the needed pressure.
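A minimal sketch of the 1-2-3 loop described above, just to make the order explicit. The `step` size and the idea of expressing progress as a fraction of the final torque are assumptions for illustration; the actual torque value comes from the socket documentation, not from this snippet.

```python
def tightening_passes(screws=(1, 2, 3), target=1.0, step=0.25):
    """Return (screw, fraction_of_target) pairs, tightening a little per pass.

    `target` and `step` are illustrative fractions of the final torque,
    not real torque values.
    """
    sequence = []
    level = 0.0
    while level < target:
        # Raise the level a bit, then visit every screw in order.
        level = min(target, level + step)
        for screw in screws:
            sequence.append((screw, round(level, 2)))
    return sequence


if __name__ == "__main__":
    for screw, fraction in tightening_passes():
        print(f"screw {screw}: tighten to {fraction:.0%} of final torque")
```

The point is only that no screw is ever far ahead of the others, which is the opposite of taking screw 1 to full torque first.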
 

tesla100

Member
Jun 15, 2016
223
20
18
40
Shall I have them re-install the CPU, insert 24-pin and 4+4 pin, and cable into the IPMI ethernet port and power it on and then try to get into the IPMI from another machine and report back?
You don't need to turn the system on in order to get into IPMI, if that's what you meant. You can access it even if the motherboard is turned off, but obviously, the PSU which powers the motherboard must be turned on.
 

Root.ed

New Member
Feb 6, 2020
15
0
1
I would say those big chips are fragile. Do not tighten screw no. 1 to the max and then no. 2 and no. 3 (you could destroy the CPU). Go in a loop, 1-2-3, 1-2-3, tightening each screw a little, one after another, until you reach the needed pressure.
Got it. Unfortunately, they don't have a torque screw driver with the adjustable pounds... so I guess we'll have to try to do the best we can without that (for now, as they can't get one for at least 7-10 days).

You don't need to turn the system on in order to get into IPMI, if that's what you meant. You can access it even if the motherboard is turned off, but obviously, the PSU which powers the motherboard must be turned on.
I meant that if we kept the CPU in, wouldn't it help with the POST codes and possibly give insight as to whether the CPU or the motherboard is failing (or even both)? Or what exactly am I looking for in the IPMI's snooping section, and is there anywhere else I should be checking to see what is or isn't working? I'm pretty sure that to get the motherboard light (LED1) you need a WORKING CPU in there... and with the CPU in there, it wasn't giving us any motherboard power (LED1). I think we even put in a working CPU (from the 1 machine that was working) and still got no power on the motherboard... except that the CPU fan actually was spinning with the 'working' CPU. It was not spinning with the CPU that was attached to the motherboard that doesn't work.

Hope that makes some sense.
 

tesla100

Member
Jun 15, 2016
223
20
18
40
I meant that if we kept the CPU in, wouldn't it help with the POST codes and possibly give insight as to whether the CPU or the motherboard is failing (or even both)? Or what exactly am I looking for in the IPMI's snooping section, and is there anywhere else I should be checking to see what is or isn't working? I'm pretty sure that to get the motherboard light (LED1) you need a WORKING CPU in there... and with the CPU in there, it wasn't giving us any motherboard power (LED1). I think we even put in a working CPU (from the 1 machine that was working) and still got no power on the motherboard... except that the CPU fan actually was spinning with the 'working' CPU. It was not spinning with the CPU that was attached to the motherboard that doesn't work.
Hopefully Yesoos can help you with that, given that he has the same motherboard...
 

yesoos

Member
Mar 10, 2020
33
3
8
PL
PS ON (JF1 header power button, or power on from the BMC: Remote Control -> Power Control) is a must to see any CPU BIOS Q-codes. When powered off, only operations like a BIOS update or BMC firmware update are available. Server Health -> Event Log is the place to start looking for answers. A CMOS battery failure can also cause a power-on failure. I would suggest removing the motherboard from the case and trying to power it on, and also looking for possible visual damage; this is what I would do. Please go through the entire troubleshooting section on page 51: https://www.supermicro.com/manuals/motherboard/EPYC7000/MNL-2085.pdf
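The thread does all of this through the BMC web interface. As an alternative, the same three checks (power state, power on, event log) can be sketched with the `ipmitool` command line, assuming it is installed and IPMI over LAN is enabled on the BMC. The host address and password below are placeholders, and the helper names are made up for this example. The snippet only builds and prints the commands; it does not run them.

```python
import shlex


def ipmitool_cmd(host, user, password, *args):
    """Build an argv list for ipmitool over the LAN (lanplus) interface."""
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, *args]


def triage_commands(host, user, password):
    """Commands mirroring the web UI steps: power state, power on, event log."""
    return [
        ipmitool_cmd(host, user, password, "chassis", "power", "status"),
        ipmitool_cmd(host, user, password, "chassis", "power", "on"),
        ipmitool_cmd(host, user, password, "sel", "elist"),
    ]


if __name__ == "__main__":
    # Placeholder address and password, not real values.
    for argv in triage_commands("192.0.2.10", "ADMIN", "PASSWORD-FROM-PWD-LABEL"):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run` once the placeholders are replaced. Checking the power status before and after the power-on attempt shows whether the BMC thinks the board actually changed state.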
 

freebitflow

New Member
Dec 31, 2019
22
0
1
Any changes when you hit Refresh? I had to hit it like 100 times to get all the POST codes. From my point of view, the implementation of the snooping is really poor. I would have expected to see some kind of log showing every POST code one after another. On my board I had the problem that it hung in some kind of boot loop, so I was getting the same POST codes all the time.
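A rough sketch of the kind of log described above, assuming the codes are collected by hand (or by some hypothetical polling script) as a list of strings, one per Refresh. The sample codes in the usage example are made up. Repeated samples are collapsed so each transition appears once, and a repeating pattern is reported as a likely boot loop.

```python
def collapse_repeats(samples):
    """Drop consecutive duplicates so each code transition appears once."""
    log = []
    for code in samples:
        if not log or log[-1] != code:
            log.append(code)
    return log


def boot_loop_pattern(log, min_cycles=2):
    """Return the shortest pattern that the whole log keeps repeating.

    Returns None if the log does not repeat at least `min_cycles` times,
    for example when the board is stuck on a single code.
    """
    n = len(log)
    for period in range(1, n // min_cycles + 1):
        pattern = log[:period]
        if all(log[i] == pattern[i % period] for i in range(n)):
            return pattern
    return None


if __name__ == "__main__":
    # Made-up samples, one per Refresh click.
    samples = ["10", "10", "32", "32", "4F", "10", "32", "4F", "4F"]
    log = collapse_repeats(samples)
    print("transitions:", log)
    print("loop pattern:", boot_loop_pattern(log))
```

A board that never starts would show a single code (such as FF) on every sample, which collapses to one entry and is reported as no loop.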
 

yesoos

Member
Mar 10, 2020
33
3
8
PL
For real-time codes you need to buy a PC diagnostics mini card, like the TL460s Plus, and put it into the TPM port (because this motherboard doesn't have anything built in).
 

freebitflow

New Member
Dec 31, 2019
22
0
1
Got it. Unfortunately, they don't have a torque screw driver with the adjustable pounds... so I guess we'll have to try to do the best we can without that (for now, as they can't get one for at least 7-10 days).
Is having a torque screw driver really required? I just screwed them down in the order that was shown on the metal bracket holding the socket till the screws bottomed out. My board is working without any issues for 2 weeks now. I guess that I would have to loosen the screws again, if this is a real issue?!
 

TXAG26

Active Member
Aug 2, 2016
274
81
28
Any chance the machines were powered on without the heatsinks installed when they arrived at the destination? This could have toasted the CPUs if they forgot to install the heatsinks first.
 

TXAG26

Active Member
Aug 2, 2016
274
81
28
Throttling will apply to prevent thermal damage or complete shut down
I know most modern CPUs have throttle functions, but without a heatsink installed, would they be able to throttle fast enough, before too much heat was dumped into the CPU on boot-up? I'm genuinely curious, not trying to be argumentative.
 

yesoos

Member
Mar 10, 2020
33
3
8
PL
No, boot-up is at minimum clock, way below base clock, for Epyc probably around 400 MHz, and Tmax is, as I remember, around 94 degrees C, with throttling at 100 C. There is no way to burn it without a heatsink by mistake, but I think you could cook some eggs or beef steaks on it. I can check on my broken Epyc :D Well, I do not know if this question was already put to the topic author, but did you test your rigs before sending them?
 

Root.ed

New Member
Feb 6, 2020
15
0
1
PS ON (JF1 header power button, or power on from the BMC: Remote Control -> Power Control) is a must to see any CPU BIOS Q-codes. When powered off, only operations like a BIOS update or BMC firmware update are available. Server Health -> Event Log is the place to start looking for answers. A CMOS battery failure can also cause a power-on failure. I would suggest removing the motherboard from the case and trying to power it on, and also looking for possible visual damage; this is what I would do. Please go through the entire troubleshooting section on page 51: https://www.supermicro.com/manuals/motherboard/EPYC7000/MNL-2085.pdf
So right now here's what we have: we are logged into IPMI on one of the machines that only blinks with a heartbeat. Currently it has 1 CPU, 1 CPU fan/heatsink, and 1 stick of RAM in the C1 slot. The ATX (24-pin) is plugged in and a 4+4-pin is plugged into the CPU power connector; we tested both in a case and outside of a case. We also switched power supplies to one that we know is working from an identical EPYC/H11SSL-I-O setup.

We logged into IPMI successfully but don't see too much in here:
Post Snooping: Screenshot - f4a56723b5e1c5776a188f5a7a25f1ab - Gyazo
Server Health -> Event https://i.gyazo.com/23d90e27e82ad671bf1ccf2f8b142525.png
Hardware Info: Screenshot - d4b9112e6ab0111e54fee4bdce024429 - Gyazo
Irrelevant info, but fan speed is set to full because Supermicros don't like Noctuas very much.

I am not sure if I can get CPU BIOS Q-codes, because all I get is heartbeat power. There is no motherboard (LED1) light turning on currently. Does it need to be jumped? We tried that too, and it did not start. <-- This is one of my bigger questions, as we are definitely able to get the heartbeat coming on. Would using the reset pins do anything?

Also tried "Power On" under Power Control on IPMI - but didn't work, said "performing power action failed. Please check."

Do we need VGA monitor hooked up to VGA port on motherboard as well for any of this?
If the CMOS battery was dead, would we still get the heartbeat light (LEDM1) on or no? Also IPMI link is working fine as well.

Any chance the machines were powered on without the heatsinks installed when they arrived at the destination? This could have toasted the CPUs if they forgot to install the heatsinks first.
Nope - but like yesoos said, it would turn itself off (and make a loud single beep as it did). I've done it before on a different machine. It will save itself.
 

yesoos

Member
Mar 10, 2020
33
3
8
PL
FF means it has not even started; there will be no codes if you cannot power on.
Expand Hardware Info: CPU in IPMI when powered on.
You can check whether the CPU is recognized; if not, there will be some default entries.
Did you check the CMOS battery voltage? (You cannot power on with a dead battery, and you cannot update the BIOS either; this is checked.)
You do not need a monitor attached; you can even disable IPMI and VGA with the jumpers and try to power on in that setup.
Also check the SP3 socket very closely. If you short some pins by mistake and install the CPU, the results may be catastrophic, but most of the pins are power distribution, so it's not so easy to do.
IPMI is a completely separate system that keeps working if enabled (jumpers on the motherboard), for easy remote administration of the server.
 

Root.ed

New Member
Feb 6, 2020
15
0
1
FF means it has not even started; there will be no codes if you cannot power on.
Expand Hardware Info: CPU in IPMI when powered on.
You can check whether the CPU is recognized; if not, there will be some default entries.
Did you check the CMOS battery voltage? (You cannot power on with a dead battery, and you cannot update the BIOS either; this is checked.)
You do not need a monitor attached; you can even disable IPMI and VGA with the jumpers and try to power on in that setup.
Also check the SP3 socket very closely. If you short some pins by mistake and install the CPU, the results may be catastrophic, but most of the pins are power distribution, so it's not so easy to do.
IPMI is a completely separate system that keeps working if enabled (jumpers on the motherboard), for easy remote administration of the server.
I'll check if the CPU is recognized, I don't think it was though. I'll double check.
How can we check the CMOS battery voltage? These were brand new motherboards when they were sent, but I'll check those if I can find a reliable way to do so, or swap them out with a motherboard 'that is working'.

I have no clue if the pins were shorted by mistake, or how I could validate or prove that. Here are the tests we've done so far:

The 2 broken motherboards:
BROKEN MOBO #1
1) With a WORKING CPU from the system that works: Heartbeat + the CPU fan seems to be getting power and spinning. No power on the Motherboard LED1. Tried to jump it as well - no luck. Strange that the fan is working though. IPMI works.
2) With the 'broken' CPU it came with: Heartbeat only, no fan power. No power on the Motherboard LED1. Tried to jump it as well - no luck.
3) With the OTHER 'broken' CPU from mobo #2: Heartbeat only, no fan power. No power on the Motherboard LED1. Tried to jump it as well - no luck.

BROKEN MOBO #2
1) With a WORKING CPU from the system that works: Heartbeat ONLY, NO FAN POWER. No power on the Motherboard LED1. Tried to jump it as well - no luck.
2) With the 'broken' CPU it came with: Heartbeat only, no fan power. No power on the Motherboard LED1. Tried to jump it as well - no luck.
3) With the OTHER 'broken' CPU from mobo #1: Heartbeat only, no fan power. No power on the Motherboard LED1. Tried to jump it as well - no luck.

WORKING MOBO (#3):
1) With working CPU - of course it works fine.
2) With the 2 'BROKEN' CPUs: Heartbeat only, no motherboard power (I need to double-check this, but I'm pretty sure they didn't get any motherboard power LED1).

I find it VERY strange that the WORKING CPU on the broken motherboard is giving power to the CPU fan (it looks like it spins at full speed too), but there is still no motherboard power... It's hard to tell if this is a motherboard problem, a CPU problem, or BOTH. The issue I have here is that a WORKING CPU on the 'broken' motherboard didn't work... and the 'BROKEN' CPUs (2 of them) didn't WORK on the WORKING motherboard... At the end, we plugged the 'working' CPU back into the 'working' motherboard and it works fine again...

I'm not 100% sure what I'm looking for as far as the pins go, or what pictures could show any visible damage... If we can't get any "motherboard LED1" light, I don't think we can get any CPU error codes, right? So we're stuck here.
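The swap tests listed above can be laid out as a small table to see what they do and do not prove. This is only a sketch under two stated assumptions: a pair that powers on proves both parts good, and a part is suspect if it fails when paired with a proven-good counterpart. The labels (`mobo1`, `cpu_ok`, and so on) are made up for the example, and the two working-board results with the 'broken' CPUs are taken as failures even though they still need to be double-checked.

```python
def suspects(results):
    """results maps (board, cpu) -> True if that pair powered on.

    A part is flagged when it failed with a counterpart that is known
    to work in some other pairing.
    """
    good_boards = {board for (board, _), ok in results.items() if ok}
    good_cpus = {cpu for (_, cpu), ok in results.items() if ok}
    flagged = set()
    for (board, cpu), ok in results.items():
        if ok:
            continue
        if cpu in good_cpus and board not in good_boards:
            flagged.add(board)
        if board in good_boards and cpu not in good_cpus:
            flagged.add(cpu)
    return flagged


# Results as reported in the tests above (labels are illustrative).
results = {
    ("mobo1", "cpu_ok"): False, ("mobo1", "cpu1"): False, ("mobo1", "cpu2"): False,
    ("mobo2", "cpu_ok"): False, ("mobo2", "cpu1"): False, ("mobo2", "cpu2"): False,
    ("mobo3", "cpu_ok"): True,  ("mobo3", "cpu1"): False, ("mobo3", "cpu2"): False,
}

if __name__ == "__main__":
    print(sorted(suspects(results)))
```

Under those assumptions the tests flag both 'broken' boards and both 'broken' CPUs, which is consistent with the "or BOTH" possibility. It does not explain the spinning fan on mobo #1 with the working CPU.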

Any ideas?

Next we'll swap the CMOS batteries from the working motherboard to the 'broken' motherboard.
 

yesoos

Member
Mar 10, 2020
33
3
8
PL
I find it VERY strange that the WORKING CPU on the broken motherboard is giving power to the CPU fan (it looks like it spins at full speed too), but there is still no motherboard power...
Wait... if you powered on and the fans spin, how is no motherboard power possible? That makes no sense... please explain. What is the status in IPMI?
 

Root.ed

New Member
Feb 6, 2020
15
0
1
Wait... if you powered on and the fans spin, how is no motherboard power possible? That makes no sense... please explain. What is the status in IPMI?
Yes! The WORKING CPU in the "broken" motherboard... The heartbeat light will turn on, and the CPU fan (Noctua) that's connected will spin up, but the motherboard LED1 isn't turning on... I'll try to jump it as well, but why is the CPU fan spinning? I'll get a video of exactly what happens. Tested inside and outside of the case.

Utterly confused.
 

Root.ed

New Member
Feb 6, 2020
15
0
1
Wait... if you powered on and the fans spin, how is no motherboard power possible? That makes no sense... please explain. What is the status in IPMI?
For the POST code? It was FF, I think.

Also, what can I use to diagnose the CPUs? We don't get "motherboard LED1" to turn on, so I don't know how we can get any diagnosis on the CPUs themselves... or even the motherboard. Is there some part we can use? Surely there is another way to help debug this / rule things out.

Any ideas? As without motherboard power (LED1), we can't tell anything from IPMI right?