Bought some enterprise NVMe SSDs, keep hitting temperature errors


AugustaLemke

New Member
Sep 6, 2021
20
3
3
I just bought some Micron NVMe SSDs for my Supermicro server motherboard. The server motherboard sits in a full ATX case with decent airflow.

Upon power up, I boot ESXi. Within 10 minutes of ESXi booting, one of the SSDs fails out. I pop over to the BMC web GUI and it's filled with errors like:

Code:
Temperature    [IPMI-1011] M2NVMeSSD Temp2, Upper Non-recoverable - going high - Assertion
Temperature    [IPMI-1011] M2NVMeSSD Temp1, Upper Non-recoverable - going high - Assertion
Temperature    [IPMI-1009] M2NVMeSSD Temp2, Upper Critical - going high - Assertion
Temperature    [IPMI-1009] M2NVMeSSD Temp1, Upper Critical - going high - Assertion
This is before I even put a filesystem on the drives. So there shouldn't be any disk I/O happening.
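For anyone who wants to sort through a longer BMC event log, here is a minimal Python sketch that groups events like the ones above by sensor. The line layout in the regex is an assumption inferred from these four lines only, and the function names are made up for illustration; other BMC firmware may format events differently.

```python
import re
from collections import defaultdict

# Event lines copied from the BMC log above.
LOG = """\
Temperature    [IPMI-1011] M2NVMeSSD Temp2, Upper Non-recoverable - going high - Assertion
Temperature    [IPMI-1011] M2NVMeSSD Temp1, Upper Non-recoverable - going high - Assertion
Temperature    [IPMI-1009] M2NVMeSSD Temp2, Upper Critical - going high - Assertion
Temperature    [IPMI-1009] M2NVMeSSD Temp1, Upper Critical - going high - Assertion
"""

# Assumed layout: <type> [<code>] <sensor>, <threshold> - <direction> - <state>
LINE_RE = re.compile(
    r"^(?P<type>\S+)\s+\[(?P<code>[^\]]+)\]\s+(?P<sensor>[^,]+),\s*"
    r"(?P<threshold>.+?) - (?P<direction>.+?) - (?P<state>\S+)\s*$"
)


def parse_events(text):
    """Return one dict per log line that matches the assumed layout."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def thresholds_by_sensor(events):
    """Map each sensor name to the thresholds it asserted, in log order."""
    grouped = defaultdict(list)
    for event in events:
        if event["state"] == "Assertion":
            grouped[event["sensor"]].append(event["threshold"])
    return dict(grouped)


if __name__ == "__main__":
    for sensor, crossed in sorted(thresholds_by_sensor(parse_events(LOG)).items()):
        print(sensor, "->", ", ".join(crossed))
```

In this log both sensors assert both the Upper Critical and the Upper Non-recoverable thresholds, so the BMC is seeing both M.2 temperature sensors above its highest limit.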

I didn't have this issue when I used my previous consumer Samsung NVMe SSDs in the exact same slots for years.

Is there something about enterprise NVMe SSDs that makes them less tolerant of high temperatures?

Is the BMC the thing that is shutting down the port because of some threshold value? I'm very shocked to see this and don't know what to make of it.

My theory is that the consumer SSDs and my new enterprise SSDs run at the same temperatures, BUT the enterprise SSDs report their temps to the BMC, which makes it freak out and shut down the port. Could that be possible, or is it more likely that the SSD controller is shutting itself down because of the heat?
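To make the threshold idea concrete, here is a rough Python sketch of how a reading gets compared against upper thresholds. The threshold names come from the log above; the numeric limits are purely hypothetical placeholders, since the real values live in the BMC's sensor configuration and are not shown anywhere in this thread.

```python
# Hypothetical limits in degrees Celsius. These numbers are NOT from the
# BMC in question; substitute the values your BMC actually reports.
THRESHOLDS_C = [
    ("Upper Critical", 80),
    ("Upper Non-recoverable", 85),
]


def crossed_thresholds(reading_c, thresholds=THRESHOLDS_C):
    """Return the names of all upper thresholds the reading meets or exceeds."""
    return [name for name, limit in thresholds if reading_c >= limit]


if __name__ == "__main__":
    for reading in (70, 82, 90):
        print(reading, crossed_thresholds(reading))
```

A reading above the highest limit crosses every lower limit too, which would match the log: each sensor asserts both Upper Critical and Upper Non-recoverable. It does not tell you whether the BMC or the drive itself takes the drive offline.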
 

ano

Well-Known Member
Nov 7, 2022
654
272
63
which drives? what cooling/setup? most enterprise nvme needs airflow
 

AugustaLemke

New Member
Sep 6, 2021
20
3
3
ano said: which drives? what cooling/setup? most enterprise nvme needs airflow
I previously had 1TB 970 Samsung EVOs and replaced them with 2TB Micron 7450s.

For cooling, I just have two intake fans in the front of the case and an exhaust fan in the back. It's just a generic full ATX desktop case, nothing fancy.

Seeing that the enterprise drives need extreme cooling and the consumer drives don't, is there a good solution for this with my ATX case? The drives are M.2 22110 NVMe, so I'm not sure what my options are there.
 

ano

Well-Known Member
Nov 7, 2022
654
272
63
good drives, what temp are they reporting? smartctl -a should tell you

they don't need extreme cooling, they just cannot handle still air. but it sounds like you're good, pics would help
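If you want to log the temperatures over time rather than eyeball them, a small Python sketch along these lines might help. It assumes smartmontools 7.0 or newer (for the `-j` JSON output) is installed on whatever OS you boot to check the drives, which may not be the case on a stock ESXi host. The JSON key names are what I would expect from smartctl's NVMe output and should be checked against your own output; `read_temps` and `query` are just illustrative names.

```python
import json
import subprocess


def read_temps(smartctl_json):
    """Extract temperature fields (degrees C) from smartctl JSON text.

    Key names are assumptions based on smartmontools 7.x NVMe output;
    missing keys simply come back as None or an empty list.
    """
    data = json.loads(smartctl_json)
    health = data.get("nvme_smart_health_information_log", {})
    return {
        "current": data.get("temperature", {}).get("current"),
        "sensors": health.get("temperature_sensors", []),
        "warning_temp_minutes": health.get("warning_temp_time"),
        "critical_temp_minutes": health.get("critical_comp_time"),
    }


def query(device):
    """Run `smartctl -j -a <device>` and parse its output (needs root)."""
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes for warnings too
    )
    return read_temps(result.stdout)


# Example call (device path is a placeholder): query("/dev/nvme0")
```

The warning and critical time counters are worth a look too: if they are non-zero, the drive itself has recorded time spent above its own temperature limits, independent of what the BMC thinks.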
 

DarkServant

Member
Apr 5, 2022
53
53
18
...but don't forget to use thermal pads of different thicknesses. The controller and DRAM have a different height than the NAND packages and PLP capacitors (probably some 0.5 mm). Most of these heatsinks are made for single-sided M.2 SSDs, so there is a risk of damaging some SMD parts on the back side. See: custom heatsink for the optane 905p
Of course check the temperature as already mentioned via SMART.

And update the BIOS, BMC, and SSD firmware to the latest versions available. It could also be that the BMC is causing the mess (anything is possible...).
 