Temperature reporting in iLO vs lm-sensors on DL385 G10


gradinaruvasile

New Member
Jun 29, 2017
So we have a DL385 G10 server with a single CPU (EPYC 7281), running Proxmox (a Debian 9 based virtualization "distro"). The server works just fine and is stable.

Now, I compiled a kernel on it (meant for desktop computers, very generic, with thousands of modules). I was too lazy to wait for it to finish on my 4-core desktop, so I compiled it on the server with 30 threads. I took the chance to check temperatures and power usage via iLO.

To my surprise, after the whole CPU had been fully loaded for a few minutes (the compilation plus the 18 VMs running on it), iLO showed 40-something degrees for the CPU, with the fans staying at 20% (the fan profile was the "optimal" one).

Then I checked "sensors" and, surprise: there are 4 temps reported (one for each die?), all between 70 and 75 degrees.

I redid the check with "increased cooling" and got something similar: fans almost constant at about 25%, and 'sensors' reporting about 70 C (the compilation finished about the time 70 C was reached).

Here are my reported temps and fan speeds, first "sensors" then the iLO readings via ipmitool:
Code:
Every 5.0s: sensors;ipmitool sdr |grep 'CPU\|DutyCycle'                                                                      

k10temp-pci-00db
Adapter: PCI adapter
temp1:        +65.5°C  (high = +70.0°C)

k10temp-pci-00cb
Adapter: PCI adapter
temp1:        +64.0°C  (high = +70.0°C)

k10temp-pci-00d3
Adapter: PCI adapter
temp1:        +63.8°C  (high = +70.0°C)

k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +65.5°C  (high = +70.0°C)

02-CPU 1         | 40 degrees C      | ok
03-CPU 2         | disabled          | ns
Fan 1 DutyCycle  | 25.87 percent     | ok
Fan 2 DutyCycle  | 25.87 percent     | ok
Fan 3 DutyCycle  | 25.87 percent     | ok
Fan 4 DutyCycle  | 25.87 percent     | ok
Fan 5 DutyCycle  | 25.87 percent     | ok
Fan 6 DutyCycle  | 25.87 percent     | ok
CPU Utilization  | 0 unspecified     | ok
CPU_Stat_C1      | 0x00              | ok
CPU_Stat_C2      | 0x00              | ok
Eventually, once the sensors-reported CPU temps reached about 70 C, the fans did spin up a little, to 26.66%.
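
To put a number on the gap between the two sources, the pasted output can be parsed with something like the rough sketch below. It assumes the `sensors` and `ipmitool sdr` text formats exactly as they appear in the paste above; the function names and the embedded sample strings are only for illustration, and other lm-sensors or iLO versions may format their output differently.

```python
import re

def parse_k10temp(sensors_text):
    """Return {chip_name: temp1 in deg C} for each k10temp chip in `sensors` output."""
    temps = {}
    chip = None
    for line in sensors_text.splitlines():
        line = line.strip()
        if line.startswith("k10temp-"):
            chip = line                      # start of a new chip section
        elif not line:
            chip = None                      # blank line ends the section
        elif chip and line.startswith("temp1:"):
            # First number on the line is the reading; the "high" limit comes later.
            m = re.search(r"([+-]?\d+(?:\.\d+)?)\s*°?C", line)
            if m:
                temps[chip] = float(m.group(1))
    return temps

def parse_ilo_cpu(sdr_text):
    """Return {sensor_name: deg C} for 'NN-CPU N' rows of `ipmitool sdr` with a reading."""
    temps = {}
    for line in sdr_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 3:
            continue
        name, value, _status = fields
        m = re.fullmatch(r"(\d+(?:\.\d+)?) degrees C", value)
        if m and re.fullmatch(r"\d+-CPU \d+", name):
            temps[name] = float(m.group(1))  # "disabled" rows are skipped
    return temps

# Sample text copied from the paste above (two of the four dies shown).
SAMPLE_SENSORS = """\
k10temp-pci-00db
Adapter: PCI adapter
temp1:        +65.5°C  (high = +70.0°C)

k10temp-pci-00cb
Adapter: PCI adapter
temp1:        +64.0°C  (high = +70.0°C)
"""

SAMPLE_SDR = """\
02-CPU 1         | 40 degrees C      | ok
03-CPU 2         | disabled          | ns
Fan 1 DutyCycle  | 25.87 percent     | ok
"""

dies = parse_k10temp(SAMPLE_SENSORS)
ilo = parse_ilo_cpu(SAMPLE_SDR)
gap = max(dies.values()) - max(ilo.values())
print(f"hottest die: {max(dies.values())} C, iLO CPU: {max(ilo.values())} C, gap: {gap} C")
```

Feeding it the live output of the two commands instead of the sample strings would let the gap be logged over a whole compile run rather than eyeballed from `watch`.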

I also checked the CPU frequencies with "cpufreq-aperf": they were all above 2.5 GHz, up to 2.68 GHz (top boost is 2.7 GHz), so the cores don't seem to be clocking down internally.
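
For a cruder cross-check of the clocks, something like the sketch below could be used. Note that it is not what cpufreq-aperf does: it only parses the "cpu MHz" lines of /proc/cpuinfo, which is a momentary snapshot and may not match the averaged figures cpufreq-aperf reports. The function names and the 2500 MHz threshold in the usage note are my own choices for illustration.

```python
def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' values from /proc/cpuinfo text, in file order."""
    mhz = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            mhz.append(float(value))
    return mhz

def below_threshold(mhz, threshold_mhz):
    """Return (position, MHz) pairs for entries running below threshold_mhz.

    The position is simply the order of the entry in the file, which is
    assumed here to follow the order of the "processor" entries.
    """
    return [(i, f) for i, f in enumerate(mhz) if f < threshold_mhz]
```

Usage would be along the lines of reading /proc/cpuinfo into a string, calling `parse_cpu_mhz()` on it, and then `below_threshold(mhz, 2500.0)` to list any logical CPUs that dropped under 2.5 GHz during the load.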

So what's the deal here?

I read up on it, and it seems this 40 C iLO reporting affected older Intel systems too. I'm also not sure about the accuracy of lm-sensors, since some people reported artificially inflated temperatures on Ryzen systems, but I don't know whether that affects EPYC as well.

Should I be concerned?

I have the iLO, BIOS and "power management" firmware all updated to the latest versions.