Supermicro H12SSL-i & Noctua

Stofarius

Member
Aug 11, 2023
33
5
8
Hello everyone,

I've been running a server with this motherboard, an Epyc 7513, and 512GB of RAM for a while, and I recently replaced one of my case fans (a Chieftec) with a Noctua Redux NF-P12.

I am now on my second Noctua: I thought the first one was damaged and returned it, but the replacement has the same issue.

The fan's operating state in the IPMI console is red and its status is Critical. After reading these forums I changed its lower speed thresholds, but that had no effect!

For a while, setting the Fan Mode to Full Speed made it work fine, but that no longer helps. If I go back to the original fan, I have no issue.

I am running some LLM queries, so the load is pretty high; that's why I upgraded the fan.

Any idea what causes this issue and what I should do? Which other fans are you using?
 

Attachments

i386

Well-Known Member
Mar 18, 2016
4,249
1,547
113
34
Germany
Edit: after swapping fans you have to reset the BMC so that it "detects" the new fans.
Regarding the Noctua Redux NF-P12: did you read the specs?
Min rpm: 450 rpm (±20%) @ 20% PWM -> 360 to 540 rpm @ 20% PWM.
360 rpm is lower than what you have set as low CT.

The BMC on Supermicro boards runs the fans at 30% in the "Optimal" fan setting.
30% of 1700 rpm = 510 rpm; with ±20% that's 408 to 612 rpm.
408 rpm is still lower than what you have set as low CT and will trigger an alarm.
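
The arithmetic above can be checked with a short Python sketch. It is only an illustration: the function names are made up for this example, and the 420 rpm lower critical value is an assumption taken from the FAN1/FAN3 rows of the sensor listing posted later in the thread, not a figure stated in this post.

```python
def rpm_range(nominal_rpm, tolerance=0.20):
    """Return the (slowest, fastest) speed a fan may report around a nominal RPM."""
    return nominal_rpm * (1 - tolerance), nominal_rpm * (1 + tolerance)


def trips_lower_critical(nominal_rpm, lower_critical, tolerance=0.20):
    """True if the slowest in-tolerance speed falls below the lower critical threshold."""
    slowest, _ = rpm_range(nominal_rpm, tolerance)
    return slowest < lower_critical


# Assumption: 420 rpm, the lower critical value shown for FAN1/FAN3 in the listing.
LOWER_CRITICAL = 420

# Nominal speeds quoted above: 450 rpm at 20% PWM, and 30% of 1700 rpm.
for label, nominal in (("20% PWM", 450), ("30% PWM", 0.30 * 1700)):
    slowest, fastest = rpm_range(nominal)
    alarm = trips_lower_critical(nominal, LOWER_CRITICAL)
    print(f"{label}: {slowest:.0f} to {fastest:.0f} rpm, alarm possible: {alarm}")
```

Both cases come out below 420 rpm at the slow end of the tolerance band, which matches the reasoning in the post.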
 

Stofarius

Thank you for your explanation. However, I've now set the fans to "Full Speed" and I still get the warning. At first, this setting worked fine...
 

Stofarius

Do you have more details like an error message or a screenshot from ipmi?
Can you see the screenshot from my first post? That's all I have.

I also changed the values from the console like this:

ipmitool -I lan -U ADMIN -P ADMIN -H 192.168.10.60 sensor thresh FAN2 lower 800 900 1000
The health status is still Critical.

The output from ipmitool is below (no errors here, as far as I can see):
sudo ipmitool sensor
CPU Temp | 31.000 | degrees C | ok | 5.000 | 5.000 | na | na | 100.000 | 100.000
System Temp | 35.000 | degrees C | ok | 5.000 | 5.000 | na | na | 85.000 | 90.000
Peripheral Temp | 37.000 | degrees C | ok | 5.000 | 5.000 | na | na | 85.000 | 90.000
M2_SSD1 Temp | na | | na | na | na | na | na | na | na
M2_SSD2 Temp | na | | na | na | na | na | na | na | na
CPU_VRM Temp | 41.000 | degrees C | ok | 5.000 | 5.000 | na | na | 100.000 | 105.000
SOC_VRM Temp | 36.000 | degrees C | ok | 5.000 | 5.000 | na | na | 100.000 | 105.000
VRMABCD Temp | 39.000 | degrees C | ok | 5.000 | 5.000 | na | na | 100.000 | 105.000
VRMEFGH Temp | 42.000 | degrees C | ok | 5.000 | 5.000 | na | na | 100.000 | 105.000
P1_DIMMA~D Temp | 33.000 | degrees C | ok | 5.000 | 5.000 | na | na | 85.000 | 90.000
P1_DIMME~H Temp | 32.000 | degrees C | ok | 5.000 | 5.000 | na | na | 85.000 | 90.000
FAN1 | 2800.000 | RPM | ok | 280.000 | 420.000 | na | na | 35560.000 | 35700.000
FAN2 | 1680.000 | RPM | ok | 840.000 | 840.000 | na | na | 35560.000 | 35700.000
FAN3 | 1680.000 | RPM | ok | 280.000 | 420.000 | na | na | 35560.000 | 35700.000
FAN4 | na | | na | na | na | na | na | na | na
FAN5 | na | | na | na | na | na | na | na | na
FANA | na | | na | na | na | na | na | na | na
FANB | na | | na | na | na | na | na | na | na
12V | 11.984 | Volts | ok | 9.680 | 9.936 | na | na | 14.480 | 14.736
5VCC | 4.930 | Volts | ok | 3.940 | 4.030 | na | na | 5.920 | 6.010
3.3VCC | 3.293 | Volts | ok | 2.613 | 2.681 | na | na | 3.922 | 3.990
VBAT | 0x4 | discrete | 0x0400| na | na | na | na | na | na
VDDCR | 0.823 | Volts | ok | 0.400 | 0.499 | na | na | 1.732 | 1.804
VMEMABCD | 1.219 | Volts | ok | 0.979 | 1.003 | na | na | 1.465 | 1.489
VMEMEFGH | 1.228 | Volts | ok | 0.976 | 0.997 | na | na | 1.466 | 1.487
VDD_5_DUAL | 5.039 | Volts | ok | 4.019 | 4.139 | na | na | 6.029 | 6.149
VDD_33_DUAL | 3.310 | Volts | ok | 2.613 | 2.681 | na | na | 3.922 | 3.990
SOCRUN | 0.855 | Volts | ok | 0.297 | 0.489 | na | na | 1.143 | 1.341
SOCDUAL | 0.893 | Volts | ok | 0.711 | 0.725 | na | na | 1.068 | 1.082
Chassis Intru | 0x0 | discrete | 0x0000| na | na | na | na | na | na
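
For anyone who wants to read the fan rows programmatically, here is a minimal Python sketch. It assumes the usual column order of `ipmitool sensor` (name, reading, unit, status, then the LNR, LCR, LNC, UNC, UCR, UNR thresholds); the function names are hypothetical and the sample rows are copied from the listing above.

```python
SAMPLE = """\
FAN1 | 2800.000 | RPM | ok | 280.000 | 420.000 | na | na | 35560.000 | 35700.000
FAN2 | 1680.000 | RPM | ok | 840.000 | 840.000 | na | na | 35560.000 | 35700.000
FAN3 | 1680.000 | RPM | ok | 280.000 | 420.000 | na | na | 35560.000 | 35700.000
FAN4 | na | | na | na | na | na | na | na | na
"""

# Assumed threshold column order after name, reading, unit, status.
THRESHOLD_NAMES = ("lnr", "lcr", "lnc", "unc", "ucr", "unr")


def to_float(field):
    """Convert a numeric field to float; 'na', blanks and hex values become None."""
    try:
        return float(field)
    except ValueError:
        return None


def parse_sensors(text):
    """Parse pipe-delimited sensor rows into a dict keyed by sensor name."""
    sensors = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 10:
            continue  # skip anything that is not a full sensor row
        name, reading, unit, status, *thresholds = fields
        entry = {"reading": to_float(reading), "unit": unit, "status": status}
        entry.update(zip(THRESHOLD_NAMES, map(to_float, thresholds)))
        sensors[name] = entry
    return sensors


def fan_headroom(sensors):
    """RPM margin between each populated fan's reading and its lower critical value."""
    return {
        name: s["reading"] - s["lcr"]
        for name, s in sensors.items()
        if s["unit"] == "RPM" and s["reading"] is not None and s["lcr"] is not None
    }


print(fan_headroom(parse_sensors(SAMPLE)))
```

Unpopulated headers such as FAN4 are skipped because their reading is `na`.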
 

i386

low nr (not readable?): 560rpm
low ct (critical?): 700rpm

You raised the lower thresholds instead of lowering them :D
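
A rough Python sketch of how lower thresholds could be picked so that they sit below the fan's slowest expected speed. The 25/50/75% split is an arbitrary choice for illustration, the function names are hypothetical, and the BMC may store different values than the ones requested. The argument order for `ipmitool sensor thresh <id> lower` is LNR, LCR, LNC.

```python
def lower_thresholds(min_rpm, tolerance_pct=20, fractions_pct=(25, 50, 75)):
    """Pick (LNR, LCR, LNC) as fractions of the slowest in-tolerance speed.

    Integer arithmetic is used so the results are exact whole RPM values.
    """
    worst_case = min_rpm * (100 - tolerance_pct) // 100
    return tuple(worst_case * pct // 100 for pct in fractions_pct)


def thresh_command(sensor, thresholds):
    """Build the ipmitool argument list; nothing is executed here."""
    lnr, lcr, lnc = thresholds
    return ["ipmitool", "sensor", "thresh", sensor, "lower",
            str(lnr), str(lcr), str(lnc)]


# 450 rpm is the minimum speed quoted for the NF-P12 earlier in the thread.
cmd = thresh_command("FAN2", lower_thresholds(450))
print(" ".join(cmd))
```

With the 450 rpm minimum and ±20% tolerance, the worst case is 360 rpm, so all three values end up well below it.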
 

NPS

Active Member
Jan 14, 2021
147
44
28
NR is "non-recoverable". So I guess that once the reading drops below that value, the BMC could keep the red flag up "forever" (until you reset/reboot the BMC).
 

nexox

Well-Known Member
May 3, 2023
695
283
63
After enough dips below the various thresholds, the BMC may just mark a fan as failed until you reset everything; at least that's how my older Supermicro systems work. There should be an option in the web console to reboot the BMC (this leaves the system running but interrupts IPMI for a minute), which will reset the fan state. Alternatively, you can unplug the whole system from the wall for a minute.
 

lemonhead94

New Member
Mar 30, 2024
2
0
1
I'm facing the same issue: "sudo ipmitool sensor thresh FAN2 lower X X X" prints a message saying the fan has new thresholds, but this is not reflected in "sudo ipmitool sensor list all". The output remains the same.
The same happens if I run "sudo ipmitool sensor get FAN2".
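
One way to make this kind of mismatch explicit is to compare the requested values against a row of the pipe-delimited sensor listing. The Python sketch below uses the FAN2 row and the 800/900/1000 request posted earlier in the thread; it assumes that listing was captured after the command was run, and the function names are hypothetical.

```python
def reported_lower(sensor_row):
    """Extract (LNR, LCR, LNC) from one pipe-delimited sensor row; 'na' becomes None."""
    fields = [f.strip() for f in sensor_row.split("|")]
    return tuple(None if f == "na" else float(f) for f in fields[4:7])


def mismatches(requested, reported):
    """Map each threshold name to (requested, reported) where the two differ."""
    names = ("lnr", "lcr", "lnc")
    return {
        name: (want, got)
        for name, want, got in zip(names, requested, reported)
        if got is None or want != got
    }


row = "FAN2 | 1680.000 | RPM | ok | 840.000 | 840.000 | na | na | 35560.000 | 35700.000"
diff = mismatches((800, 900, 1000), reported_lower(row))
print(diff)
```

An empty result would mean the BMC reports exactly what was requested.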

Also, I'm running the newest BIOS and BMC firmware for the H12SSL-i.
And I have reset the BMC using "sudo ipmitool mc reset warm".

Another thing I've read is that one can only use FAN1-5, not FANA and FANB, which I have not used anyway.

Am I simply misunderstanding how this is supposed to work?