Mystery CPU overheating and RAM speed Q (Supermicro X12SPL-LN4F)


Moonshine

New Member
Mar 26, 2022
5
2
3
Hello,

I've done quite a few server builds over the years, but this is my first Supermicro and first Xeon 3rd gen. This is a new home NAS/App server build and I'm still at base level (no cards or drives added) so I've got:
  • Supermicro X12SPL-LN4F (LGA-4189)
  • Intel Xeon Silver 4310 (12 Core)
  • Dynatron N8 passive CPU cooler
  • 2 x Noctua NF-R8 80mm PWM fans (less than 2" away exhausting heat)
  • 128GB (4x32G) NEMIX DDR4-3200 ECC RDIMM 2Rx4 in slots A,C,E,G
This is in a Chenbro RM31616 chassis that has 4 x 80mm fans behind the backplane.

My main mystery -- CPU overheating:

The CPU will overheat and literally hit thermal shutdown just sitting in setup (!) on the Supermicro. Fans are all operational and I feel good airflow. The CPU cooler has a good interface w/ thermal paste -- it definitely is pulling heat and gets hot. The two 80mm fans are under 2" away and exhausting fine.

For some reason I don't see any CPU temp stats in setup (am I missing something?) and they don't seem to be available via the BMC while in setup -- so basically I'm blind, but it will hit thermal shutdown if left for several minutes, with an event logged in the BMC.

So, to get sensor stats available via the BMC I decided to just get it to boot up to something, so I connected TrueNAS install media and just let it sit at the main install menu. As soon as that boots up, I can see the temperature start (very very high like >90C) but then quickly fall back down to reasonably normal idle temps (44-46C with case cover off idling).

So it's acting as if CPU PM is basically non-existent until booted into some form of OS controlling it(?).

What I've tried:
  • Pulled the CPU just to make sure the interface was good - seemed fine and re-seated everything very carefully
  • Starting with optimized defaults in Supermicro setup (reset several times)
  • Updated the BMC firmware to latest
  • Updated the BIOS firmware to latest
  • A few settings related to CPU PM, although this is definitely beyond my knowledge
In setup I do see settings related to CPU PM being controlled by the BIOS vs OS, etc. I've been going with optimized defaults figuring that should be safe. Obviously there shouldn't be any load, and I have to imagine the setup defaults are sane, so is it some other interaction with something? (RAM?)

Anyway, any thoughts would be really appreciated here as I'm out of ideas. Obviously I didn't buy it to sit in setup and it appears it should work fine when booted, but the idea that I have a CPU heat time bomb outside a booted OS doesn't sit well at all. :(

Minor question:

In trying to figure this out, I noticed the RAM was running at some lower speed (rather than 3200MHz) when set to auto -- which I've seen before on other builds. So I tried setting it to 3200MHz manually. In the BMC I see it running at 2666MHz though. Not a huge deal, but I'm curious why that might be. Is it just falling back because it doesn't see the RAM as capable, or is it potentially some other setting?

Sorry this got long, and thanks in advance for any thoughts.

-James
 

RolloZ170

Well-Known Member
Apr 24, 2016
5,363
1,612
113
First, the Dynatron N8 is a vapor-chamber cooler. It only starts working above 55 degrees C, and it needs (very) high airflow to function properly, as in a server case with an air shroud. The cooling fins are too close to the CPU base. You will not keep this cooler.
"For some reason I don't see any CPU temp stats in setup (am I missing something?)"
The BIOS is not responsible for the sensors; the BMC is the supervisor and controls the fans.
There are two fan zones, CPU and drives. CPU fans should be on FAN1, FAN2, FAN3...
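
Since the BMC owns the sensors, readings can be pulled from it out-of-band whenever it is reporting them, for example with `ipmitool sdr type Temperature`. A minimal Python sketch for parsing that kind of output follows. The sample lines, sensor names and the 45 C limit are illustrative assumptions, not readings from this board.

```python
import re

# Illustrative sample in the layout printed by `ipmitool sdr type Temperature`.
# Sensor names and values are made up, not taken from this board.
SAMPLE = """\
CPU Temp         | 01h | ok  |  3.1 | 46 degrees C
System Temp      | 0Bh | ok  |  7.1 | 31 degrees C
Peripheral Temp  | 0Ch | ok  |  7.2 | 38 degrees C
P1-DIMMA1 Temp   | B0h | ns  | 32.64 | No Reading
"""

# name | id | status | entity | "<number> degrees C"
LINE_RE = re.compile(
    r"^(?P<name>[^|]+)\|[^|]*\|[^|]*\|[^|]*\|\s*"
    r"(?P<value>-?\d+(?:\.\d+)?) degrees C\s*$"
)


def parse_temperatures(text):
    """Return {sensor name: temperature in C} for lines that carry a reading."""
    temps = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # lines such as "No Reading" are skipped
            temps[match.group("name").strip()] = float(match.group("value"))
    return temps


def over_threshold(temps, limit_c):
    """Return the sorted sensor names whose reading is at or above limit_c."""
    return sorted(name for name, value in temps.items() if value >= limit_c)


temps = parse_temperatures(SAMPLE)
print(temps)
print(over_threshold(temps, 45.0))
```

To use it against a live BMC, the text passed to `parse_temperatures` would come from the `ipmitool` output instead of `SAMPLE`.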
 

Moonshine

Thanks! Feel stupid for overlooking that on the RAM, but that's settled.
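
For anyone finding this later, the RAM behaviour can be sketched as "memory runs at the slowest limit in the chain". This assumes the CPU's rated maximum memory speed is the limiting factor here (Intel lists the Xeon Silver 4310 at DDR4-2667), which would match the 2666 reported by the BMC. The function name and the optional board limit are hypothetical, for illustration only.

```python
def effective_memory_speed(dimm_rated_mts, cpu_max_mts, board_max_mts=None):
    """Return the speed (MT/s) the memory can run at: the slowest of the limits.

    board_max_mts is an optional, hypothetical extra limit (for example a
    board or population rule); leave it as None if it does not apply.
    """
    limits = [dimm_rated_mts, cpu_max_mts]
    if board_max_mts is not None:
        limits.append(board_max_mts)
    return min(limits)


# DDR4-3200 DIMMs on a CPU assumed to top out at 2666 MT/s
print(effective_memory_speed(dimm_rated_mts=3200, cpu_max_mts=2666))
```

Under that assumption, forcing 3200 in setup cannot raise the speed above what the CPU's memory controller is rated for.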

The fans are on 1,2,3...(not A,B) and there is good flow through (front to back). I'm still just puzzled on the CPU heat though. Don't get me wrong, I'm fine buying another cooler if it won't cool things under load. But why would it load up so much only when not booted into anything -- just sitting in setup? Are you saying it's because the fans aren't being actively controlled by the BMC while in setup? I know the four fans behind the backplane are still running, but will have to check the exhaust fans.

I've had it running TrueNAS all day, not under a lot of stress, but it's stayed a solid 56C with cover on. If I reboot into the Supermicro setup it will ramp up to shutdown temp in a matter of minutes. (Although there doesn't appear to be any way to see the temp while in there.) I've just never seen a CPU heat up so much while it should be doing so little. It makes me suspicious of the power settings like "Power Performance Tuning" which defaults to "OS Controls EPB" vs "BIOS Controls EPB". But I can't imagine people run into this with the defaults.
 

RolloZ170

"I've just never seen a CPU heat up so much while it should be doing so little."
How many CPUs have you seen on this setup?
There is also a chance of a defective cooler and/or CPU. The Noctua fans run too slowly if they are only 80mm ones. Server fans of this size typically run at 3500 RPM, and at first they run at max RPM in setup.
 

i386

Well-Known Member
Mar 18, 2016
4,245
1,546
113
34
Germany
Passive heatsinks require a lot of static pressure to push air through the heatsink fins. That's why server chassis have high-RPM fans with specific blade designs and use air shrouds to guide/force the air to the heatsink.
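
As a rough back-of-the-envelope check on the airflow side, the volume of air needed to carry away a given heat load follows from Q = P / (rho * c_p * dT). A minimal sketch, assuming typical room-temperature air properties and a hypothetical 10 C air temperature rise across the heatsink:

```python
AIR_DENSITY = 1.2          # kg/m^3, approximate for room-temperature air
AIR_SPECIFIC_HEAT = 1005   # J/(kg*K), approximate
M3S_TO_CFM = 2118.88       # 1 m^3/s expressed in cubic feet per minute


def required_airflow_cfm(power_w, air_temp_rise_c):
    """Airflow (CFM) that must pass THROUGH the fins to remove power_w watts
    while the air warms by air_temp_rise_c degrees C."""
    m3_per_s = power_w / (AIR_DENSITY * AIR_SPECIFIC_HEAT * air_temp_rise_c)
    return m3_per_s * M3S_TO_CFM


# 120 W CPU, assumed 10 C rise in air temperature across the heatsink
print(round(required_airflow_cfm(120, 10), 1))
```

That works out to about 21 CFM for 120 W. The figure is for air that actually goes through the fin stack, not a fan's free-air rating, which is why static pressure and a shroud matter more than the number on the fan's box.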
 

Moonshine

Thanks for the input guys! It sounds like my cooler choice (and exhaust fans) are more suspect than the fact it behaves differently in setup (and not booted in general). Still a little curious why it would load up so heavily with basically no activity, but no problem swapping the cooler out, as I'd love to get this racked up. :)

I haven't found a lot of interesting LGA 4189 cooler options for my 3U case, but I ordered a Dynatron N6-DYN to swap and try. I'm definitely open to other suggestions though, if that doesn't look like it will do it.
 

RolloZ170

"Still a little curious why it would load up so heavily with basically no activity"
The power-saving functions like EIST are not active in BIOS setup, though. Or you got an unlocked Ice Lake by mistake, who knows.
 

Moonshine

"Dynatron nightmare again?"
Hopefully not... But honestly, if you have a better suggestion that's commercially available and fits a 3U case my Visa is ready! :)

Supermicro had 3 active coolers for 4189: two are 4U and one is 2U (which I considered, but felt something a little larger could do better). Other than that I found some 4U Noctuas and not much else. That N6 claims support up to 270W, so I'd like to think it can handle a 120W CPU. Should know in a couple days.
 

RolloZ170

"That N6 claims support up to 270w, so I'd like to think it can handle a 120w cpu."
The N6 has 3 heatpipes (even the N11 has 5 heatpipes for 250W, with a 60mm fan at 8500 RPM). 270W at 3000 RPM? No way: this is crap.
The 2U Supermicro's fan goes up to 10000 RPM.
Go to a 4U case or live with a lot of noise.
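
One way to sanity-check a wattage claim like this is to turn it into the thermal resistance the cooler would need to reach. A minimal sketch, where the 80 C case limit and 35 C inlet air temperature are hypothetical round numbers, not the Silver 4310's specification:

```python
def required_thermal_resistance(t_case_max_c, t_inlet_c, power_w):
    """Maximum case-to-air thermal resistance (C/W) a cooler may have to hold
    the CPU case at or below t_case_max_c while dissipating power_w watts."""
    return (t_case_max_c - t_inlet_c) / power_w


# Hypothetical limits: 80 C max case temperature, 35 C inlet air
for power_w in (120, 270):
    resistance = required_thermal_resistance(80, 35, power_w)
    print(f"{power_w} W needs <= {resistance:.3f} C/W")
```

With those assumed numbers, a 270W rating demands less than half the thermal resistance that 120W does, and a cooler's thermal resistance only gets that low with a lot of air forced through it.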
 

nabsltd

Well-Known Member
Jan 26, 2022
421
283
63
"Supermicro had 3 active coolers for 4189, two are 4U"
I bought a Noctua NH-U12 for a 4U case (Intel P4000M), and it was about 3mm too tall at 158mm. The problem is that almost every cooler with a 120mm fan is over 155mm tall, which makes it too high for most 4U cases. All the "4U" Supermicro heatsinks have a 92mm fan, so they fit easily. If you had a different socket, you'd have a wide choice of 92mm fan variants, or you could even use a 120-140mm low profile cooler (where the fan points downward).
 

Moonshine

Well, for anyone who finds themselves in a similar situation, tonight I swapped things out. The Dynatron N6-DYN **barely** fit in my 3U Chenbro RM31616 -- we're talking a mm or two to spare. :oops: However, swapping that in and replacing the two 80mm Noctuas with a couple of higher-speed fans ended up dropping the CPU temp by ~32C. So thanks to everyone for their help!
 