MinisForum MS-01 : heating problem


plonka2000

New Member
May 20, 2024
28
14
3
So it's all built up and in use......

CoreTemp reports at idle most cores 56-58c.
Max temps, most cores 72c, but the odd few cores hit 92c (like 2).
That seems good, because that's what I believe I was seeing out of the box as well, but you might have some hotspots.
Personally, I think the paste is degrading quickly, which is why the PTM7950 replacement discussed here seems to be such a winner.

However, related to the recent discussion on nct6798 board sensors, what "System" temperature are you getting?
Are you able to post screenshots of what you see?
 

plonka2000

That's easy, you can download HWMonitor or HWInfo for free.
These are official links to their sites, beware of non-official links.
Personally, I use the "portable" versions so I don't need to install them.

HWMonitor is a bit easier to use than HWInfo, but HWInfo gives a lot of sensor info.
There are plenty of others like FanControl and MSI Afterburner (My preferred on desktop).

Have fun, let us know how you get on.
 

ProximusAl

New Member
Jan 17, 2023
14
5
3
I've got HWMonitor now, but I can't see "System" temperature anywhere.

Anything specific you want the value of?
 

wadup

Active Member
Feb 13, 2024
138
101
43
plonka2000 said:
Hey thanks, I decided to order a replacement while it was at a good sale price on Amazon, and it just arrived today.
Thankfully, I ordered through Amazon, which is cheaper (739 euros vs 779 on their own site), faster (next day with Prime) and, I think, has better support (Amazon processed my refund request even after 4 months). I'd highly recommend using Amazon for this machine; not to big them up, but their support network is so much better.

Anyway, the PTM7950 and TGP8000PT from MODDIY are arriving this week, so I'll be ready to install them immediately and migrate to the new unit. I'll take pics for anyone interested.
Then I'll send the current unit back to Amazon for the refund.

I applied the same CPU fan curve as you; my previous one was a bit different:
Temp1: 40
Temp2: 50
Temp3: 65
Temp4: 85
PWM1: 50
PWM2: 70
PWM3: 90
PWM4: 150

VS your curve:
Temp1: 20
Temp2: 30
Temp3: 50
Temp4: 60
PWM1: 50
PWM2: 80
PWM3: 110
PWM4: 255

I may look at applying a similarly more aggressive curve to the other fans.
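
For anyone wanting to compare the two curves numerically, here is a rough Python sketch. It assumes the fan controller interpolates linearly between the four temperature/PWM points and clamps outside them; that is a common way to treat a four-point curve, but it is not confirmed for the MS-01. The names `OLD_CURVE`, `NEW_CURVE` and `pwm_for_temp` are made up for illustration.

```python
from bisect import bisect_right

# Four-point curves from the post: (temperature in °C, PWM duty 0-255).
OLD_CURVE = [(40, 50), (50, 70), (65, 90), (85, 150)]   # the earlier curve
NEW_CURVE = [(20, 50), (30, 80), (50, 110), (60, 255)]  # the more aggressive curve


def pwm_for_temp(curve, temp_c):
    """Return the PWM value for temp_c.

    Assumption: linear interpolation between points, clamped to the
    first/last PWM value outside the curve's temperature range.
    """
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return curve[0][1]
    if temp_c >= temps[-1]:
        return curve[-1][1]
    i = bisect_right(temps, temp_c)  # index of the first point above temp_c
    (t0, p0), (t1, p1) = curve[i - 1], curve[i]
    return round(p0 + (p1 - p0) * (temp_c - t0) / (t1 - t0))


if __name__ == "__main__":
    for temp in (30, 40, 60, 75):
        print(temp, pwm_for_temp(OLD_CURVE, temp), pwm_for_temp(NEW_CURVE, temp))
```

Under that assumption, the more aggressive curve is already at full PWM (255) by 60 °C, while the earlier curve is still between its third and fourth points there.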

Also thanks for the powersave script. I set it to run at array start on my Unraid box and I've seen low temps, even as the ambient temps rose a little over the last few days in the heatwave, and it's been very good.
View attachment 38347
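
The powersave script itself isn't reproduced in this thread, so the following is only a guess at the general idea, written in Python rather than whatever the original uses. It assumes the script switches the Linux cpufreq governor through the standard sysfs files (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`); the function name `set_governor` is hypothetical, and writing to the real path needs root.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location; writing here needs root.
CPU_SYSFS = Path("/sys/devices/system/cpu")


def set_governor(governor="powersave", root=CPU_SYSFS):
    """Write `governor` to every cpuN/cpufreq/scaling_governor under root.

    Returns the list of files that were updated. `root` is a parameter
    only so the function can be tried against a scratch directory first.
    """
    updated = []
    for gov_file in sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        gov_file.write_text(governor + "\n")
        updated.append(gov_file)
    return updated


# Example (as root, on the real machine):
#   changed = set_governor("powersave")
#   print(f"set powersave on {len(changed)} logical CPUs")
```

The setting does not survive a reboot, which is presumably why the post runs the script at array start.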

Idle temps are much lower, even with all my services "running".
The MB temp is still alarmingly high at 117c, which wasn't the case when I got the unit at first... Not sure what's going on there.

If I look at all the sensors, I see that the components, including the NVMEs, are not this hot, but the board temps are sky high:
View attachment 38348
View attachment 38350

Next weekend I expect I'll be able to have everything done, and will be able to see the difference.

So you experienced the same thing I did: with performance instead of powersave, my temps could get nuts under big load. After applying PTM7950, my idle temps dropped like 5C and under load I saw a 15C improvement. Yeah, your MB temps are high; mine are at 35C, but I don't have intensive things running, just a dozen dockers. Do you have anything in the PCI slot?
 

plonka2000

ProximusAl said:
I've got HWMonitor now, but I can't see "System" temperature anywhere.

Anything specific you want the value of?

Hey @ProximusAl, I don't have Windows on my MS-01, but I'm sure HWMonitor and HWInfo64 support reading the nct6798 chip, so it should show up, as this old pic I found online shows. I don't have that chip in my laptop, but the best thing to do is take a screenshot and post it here. :)
I'd also advise trying HWInfo64 and checking the SENSORS panel; the temps you want should show up as "Motherboard" plus a couple of extra sensors. Both tools are free, so they won't cost you anything.
For reference this is a screenshot from my laptop, not from an MS-01 so yours will look a little different:
1723664472763.png

wadup said:
So you experienced the same thing I did: with performance instead of powersave, my temps could get nuts under big load. After applying PTM7950, my idle temps dropped like 5C and under load I saw a 15C improvement. Yeah, your MB temps are high; mine are at 35C, but I don't have intensive things running, just a dozen dockers. Do you have anything in the PCI slot?

@wadup I'm getting ridiculous temps under load, and powersave has saved my bacon big time.
I haven't seen any performance issues while using powersave, all services seem to run without any issues or hiccups.
But it looks like the PTM7950 itself arrived today (the TGP8000PT appears to have shipped separately and will arrive in a couple days):
1723664864927.png

So as it is, almost everything has arrived, so I can at least get started and take some pics.
I'll probably open the packages tonight and measure the 13900H die, and see about cutting the PTM7950 to fit.
 

plonka2000

Well I couldn't wait, so I decided to get right into preparing the CPU. More details are also available in this post by @wadup.
As it is, I'm pretty disappointed with the thermal paste application on this "brand new" unit.

Got the machine out and ready:
1723669094878.png1723669226843.png

Exposing the CPU die already showed the paste is applied thin and most of it appears to be on the edge.
In my opinion, this is very bad, I can see the die right through this layer:
1723669349462.png1723669409376.png1723669451739.png

I have no idea how my current MS-01 was prepared, but if it is like this, I can see why the temps went up and up over 4 months.
The paste was already drying and flaky when I cleaned it off:
1723669594836.png1723669632931.png

Got everything cleaned off and now it's ready for the PTM7950 and TGP8000PT when that arrives:
1723669789512.png1723669849664.png1723669887770.png

And I measured the die, I'd call it 25x10mm:
1723669949875.png1723669971518.png

On another note, the thermal pads don't seem to be making as much VRM contact as I would like.
You can see from the way the light reflects off the pads that there isn't much contact.
I ordered the 1mm thick pads, but I might advise using 2mm pads if you choose to do this... :confused:
Or at the very least, tightening the screws under the adhesive cover to increase VRM contact:
1723670109826.png1723670140795.png

So I've put the unit aside for now, and as soon as the pads arrive, I'll have them all installed.
 

plonka2000

I don't seem to have a motherboard option.

That's very strange, you don't see it anywhere in either HWMonitor or HWInfo64?
If all else fails, you can look in your boot BIOS, and it will tell you right there...
At some point I might have to get a test Windows install on mine and test myself, but I believe you should be able to see the temps in HWMonitor and HWInfo64.
I don't know if anyone else can confirm this.

Edit: Looking closer again, I see that the NCT5585D is listed, not the NCT6798, which is odd.
I'm not sure, hopefully someone else can confirm.
You can check the BIOS screen though, and you'll see it there.
 

plonka2000

Well, the PTM7950 and TGP8000PT pads are now here, the TGP8000PT having arrived today.
I had a couple of unplanned free hours, so I decided to get right into installing them...

Both arrived well packed and undamaged, so good job on MODDIY:
1724094151318.png1724094195097.png

I measured out the PTM7950 and TGP8000PT and cut to size:
1724094383761.png1724094409186.png1724094439788.png1724094533334.png

The installation was honestly very fiddly, but I took my time and it wasn't the worst.
The recommendation I've seen is to put the PTM7950 in the fridge/freezer for an hour before use to harden it up. I didn't do this, but next time I think I would, as it's definitely hard to work with. The PTM7950 tears and can flake off very easily, but in the end it will melt when in use.
So I'm not concerned about an imperfect install; I believe it will fill in correctly, as I've seen in examples of tests using PTM7950:
1724094623641.png

Something that occurred to me after putting the cooler back on is that there is too much overlap for the VRMs (pics in my previous post).
I noticed this when looking closer at the imprints on the factory pads, and I've double checked here using the screws as a rough guide.
So for anyone thinking to attempt this, you don't need to install along the entire heatsink by the heatpipe like I did:
1724095255143.png

Mine is all installed and setup now, so I'm not going to remove the heatsink unless there's a problem.
I haven't migrated my server yet, so no idea of temps yet.

I'll post back probably later once I know more.
 

Dexogen

New Member
Mar 14, 2024
1
0
1
Just wanted to chime in with my two cents. I've been struggling with overheating issues on two MS-01 servers for a while. I tried a bunch of different thermal pastes and application methods, and even used liquid metal at one point (but I ditched that because it started corroding the aluminum). The heat profile was really odd—after the CPU hit 80 degrees under load, the temperature wouldn't drop even after the load was gone. Surprisingly, disabling the E-cores solved the problem. Not only did it boost the performance of my virtual machines, but it also fixed the overheating issue. Now, as soon as the load is gone, the CPU temperature drops right back to a stable 50 degrees.
 

plonka2000

Been real busy but I've migrated my server to the new MS-01 with the PTM7950 and TGP8000PT installed.
I've been monitoring my temps over the last few days and it's pretty shocking, actually.

Considering that this was what I was looking at to begin with, even with case off:
1724241883735.png

And then after adjustments with powersave script and fan curve by @wadup, the temps dropped quite well:
1724242053689.png
1724242166280.png

Now I've migrated to a brand new box, with PTM7950 and TGP8000PT installed, and I'm sending my "old" system off today.
Temps I'm now getting are significantly lower again. Now this is what I'm seeing under "normal load".
For me normal load is AT MINIMUM running 18 dockers, 2 Windows VMs:
1724242509791.png

From what I've seen the CPU temps do not rise above ~48c at this point, and this is with THE CASE BACK ON.
I monitored it for over a day with the case off as before, and CPU temps didn't get above about ~43c.
As for the MB/MotherBoard temp, that still seems pegged in the ~114-117c region, which is still alarming, but then @PigLover pointed out that this is some kind of sensor error, and I'm inclined to agree with him after seeing this between 2 separate units. I'm not sure how to adjust this, but I will need to chase this up inside Unraid.

Something especially interesting, is when I reboot and go into BIOS and look there, I see these temps.
Left, with normal stock fan curve: Right, with adjusted fan curve:
1724243055107.png1724243099121.png

What's interesting to observe is that the MB temp is lower, yes, but also that the CPU fan speed is ramped up.
So at least in Unraid, the fan speed on the right would appear to be the CPU fan.
The powersave script of course has no effect here, as it only applies inside the Linux OS (no idea how to apply this in Windows yet). It seems a bit off that temps would be higher outside the OS without load, but the sensor readings are questionable at this point.

Lastly, I've noticed a SIGNIFICANT power usage drop between the two MS-01 units. I don't have exact numbers for individual items, because what I measure is "total system power" for my MS-01 and all attached storage devices. The drop in power is hard to explain.
Suffice to say, I'm running 8 external disks and the MS-01 itself, and the power usage has dropped from ~102-112w to ~83-91w; again, this is total system power. I have changed no other hardware, this is just swapping the MS-01 unit itself for a new one with PTM7950 and TGP8000PT installed:
1724245396468.png
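
To put those two ranges side by side, here is a small arithmetic sketch in Python. It uses only the figures quoted in the post and compares the midpoints of each range; taking the midpoint as a stand-in for the average draw is an assumption, since no averages were posted, and the function names are made up for illustration.

```python
# Total system power ranges quoted in the post, in watts
# (MS-01 plus 8 external disks).
OLD_RANGE_W = (102, 112)
NEW_RANGE_W = (83, 91)


def midpoint(rng):
    """Midpoint of a (low, high) range; used here as a rough average."""
    return sum(rng) / 2


def summarize(old, new):
    """Return (watts saved, percent saved) based on range midpoints."""
    saved = midpoint(old) - midpoint(new)
    return saved, 100 * saved / midpoint(old)


if __name__ == "__main__":
    saved_w, saved_pct = summarize(OLD_RANGE_W, NEW_RANGE_W)
    print(f"about {saved_w:.0f} W lower, roughly {saved_pct:.0f}% of the old draw")
    # Energy over a year of 24/7 operation, in kWh.
    print(f"about {saved_w * 24 * 365 / 1000:.0f} kWh per year")
```

On these numbers the drop works out to about 20 W, or roughly 19% of the earlier total.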

In conclusion, I would call the MS-01 unit swap and fresh PTM7950 and TGP8000PT installation a success. :D
I think it's important to state that I am now using a new unit, with the paste changed BEFORE I USED IT, having moved from my 4-month-old MS-01 with heat issues.
I'd highly recommend others do the same, I hope this helps to explain to others.

Dexogen said:
Just wanted to chime in with my two cents. I've been struggling with overheating issues on two MS-01 servers for a while. I tried a bunch of different thermal pastes and application methods, and even used liquid metal at one point (but I ditched that because it started corroding the aluminum). The heat profile was really odd: after the CPU hit 80 degrees under load, the temperature wouldn't drop even after the load was gone. Surprisingly, disabling the E-cores solved the problem. Not only did it boost the performance of my virtual machines, but it also fixed the overheating issue. Now, as soon as the load is gone, the CPU temperature drops right back to a stable 50 degrees.

To add something to this, in Unraid under my use case, I DO NOT use E-cores in VMs. I make sure to pin any VMs to the 6 P-cores only, and all Dockers use the 8 E-cores basically exclusively. I've played around with this setting over time, and I have 1 VM at this point that uses 6 P-cores and 4 E-cores, and a couple of dockers that use 4 P-cores and all 8 E-cores (making sure that P-cores are not shared between VMs and Docker containers).

I have 5 VMs in total, of which mainly 2 are running, and 30 Docker containers, of which mainly 18 are running.

Optimising P/E core usage is good for performance, and I find it lowers temps overall.
This is perhaps hard to describe, so here are some sample images from Unraid OS to help.
Note that there is a much longer list of Docker containers, MOST of which use no P-cores at all, just these top few:
1724244229023.png
1724244618712.png
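
For anyone who prefers to see the split written out rather than in screenshots, here is a rough Python sketch of the basic idea (VMs on P-cores, containers on E-cores). It assumes Linux numbers the 13900H's logical CPUs with the 12 P-core threads first (0-11) and the 8 E-cores after them (12-19), with sibling threads adjacent; that layout is an assumption, so check your own box (for example with `lscpu -e`) before pinning anything. `plan_pinning` and `to_cpuset` are hypothetical helpers, not part of Unraid.

```python
# Assumed logical CPU layout for an i9-13900H under Linux:
# 6 P-cores with two threads each, then 8 single-thread E-cores.
# Verify on your own machine before relying on these numbers.
P_THREADS = set(range(0, 12))
E_THREADS = set(range(12, 20))


def plan_pinning(vm_p_cores=6):
    """Split CPUs so VMs get P-core threads and containers get E-cores.

    Returns (vm_cpus, container_cpus) and checks that they do not overlap.
    """
    vm_cpus = set(sorted(P_THREADS)[: vm_p_cores * 2])
    container_cpus = set(E_THREADS)
    assert not vm_cpus & container_cpus, "VM and container CPUs overlap"
    return vm_cpus, container_cpus


def to_cpuset(cpus):
    """Format a CPU set as a comma-separated list of logical CPU numbers."""
    return ",".join(str(c) for c in sorted(cpus))


if __name__ == "__main__":
    vms, containers = plan_pinning()
    print("VMs:       ", to_cpuset(vms))
    print("Containers:", to_cpuset(containers))
```

In Unraid the same split is made through the CPU pinning settings; outside Unraid, a list in this form is what Docker's `--cpuset-cpus` option accepts.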
 

urbanduck

New Member
Jul 25, 2024
1
0
1
plonka2000 said:
That seems good, because that's what I believe I was seeing out of the box as well, but you might have some hotspots.
Personally, I think the paste is degrading quickly, which is why the PTM7950 replacement discussed here seems to be such a winner.

However, related to the recent discussion on nct6798 board sensors, what "System" temperature are you getting?
Are you able to post screenshots of what you see?

I see high temps for some sensors as well. I think the high temps are measured near the VRMs, because they were relatively stable until I pulled off the heatsink and replaced the thermal paste with the Thermal Grizzly Phase Change stuff. I didn't redo the pads on the VRMs.

The CPU temps are much better, though.

So, if I get super concerned, I'll pull the heatsink off again, redo the pads on the VRMs, redo the CPU. But for the moment, I'm not too fussed, since I've had no more CPU throttling (and getting stuck at throttle till reboot).


Code:
nct6798-isa-0a20
Adapter: ISA adapter
in0:                   440.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                     1.05 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                     3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                     3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                     1.09 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                   144.00 mV (min =  +0.00 V, max =  +0.00 V)
in6:                   128.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                     3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                     2.13 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                     1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                  144.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                  112.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                  1000.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                  144.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                    1.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                  2153 RPM  (min =    0 RPM)
fan2:                  1520 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
fan7:                     0 RPM  (min =    0 RPM)
SYSTIN:                +120.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +95.0°C)  sensor = thermistor
CPUTIN:                 +39.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +95.0°C)  sensor = thermistor
AUXTIN0:               +111.0°C  (high = +80.0°C, hyst = +75.0°C)  ALARM
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN1:               +114.0°C  (high = +80.0°C, hyst = +75.0°C)  ALARM
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN2:               +115.0°C  (high = +80.0°C, hyst = +75.0°C)  ALARM
                                 (crit = +100.0°C)  sensor = thermistor
AUXTIN3:                -52.0°C  (high = +80.0°C, hyst = +75.0°C)
                                 (crit = +100.0°C)  sensor = thermal diode
PECI Agent 0:           +39.0°C  (high = +80.0°C, hyst = +75.0°C)
AUXTIN4:               +109.0°C  (high = +80.0°C, hyst = +75.0°C)  ALARM
                                 (crit = +95.0°C)
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
PCH_CHIP_TEMP:           +0.0°C
PCH_CPU_TEMP:            +0.0°C
PCH_MCH_TEMP:            +0.0°C
intrusion0:            ALARM
intrusion1:            ALARM
beep_enable:           disabled

nvme-pci-5800
Adapter: PCI adapter
Composite:    +36.9°C  (low  = -273.1°C, high = +76.8°C)
                       (crit = +78.8°C)
Sensor 1:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +40.9°C  (low  = -273.1°C, high = +65261.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +27.8°C

mt7921_phy0-pci-5a00
Adapter: PCI adapter
temp1:        +34.0°C

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +37.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 4:        +33.0°C  (high = +100.0°C, crit = +100.0°C)
Core 8:        +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 12:       +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 16:       +37.0°C  (high = +100.0°C, crit = +100.0°C)
Core 20:       +36.0°C  (high = +100.0°C, crit = +100.0°C)
Core 24:       +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 25:       +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 26:       +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 27:       +32.0°C  (high = +100.0°C, crit = +100.0°C)
Core 28:       +36.0°C  (high = +100.0°C, crit = +100.0°C)
Core 29:       +36.0°C  (high = +100.0°C, crit = +100.0°C)
Core 30:       +36.0°C  (high = +100.0°C, crit = +100.0°C)
Core 31:       +36.0°C  (high = +100.0°C, crit = +100.0°C)

nvme-pci-0100
Adapter: PCI adapter
Composite:    +35.9°C  (low  = -273.1°C, high = +81.8°C)
                       (crit = +84.8°C)
Sensor 1:     +35.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +36.9°C  (low  = -273.1°C, high = +65261.8°C)
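
Some of the readings above look implausible on their face (SYSTIN at +120 °C while the CPU package sits at +37 °C, AUXTIN3 at -52 °C), which fits the sensor-error theory raised earlier in the thread. As a rough way to pick such lines out of `sensors` text output, here is a Python sketch; the regular expression is written against the format shown above, and the 0 °C floor and the function name `suspicious_readings` are arbitrary choices for illustration.

```python
import re

# A few lines copied from the output above.
SAMPLE = """\
SYSTIN:                +120.0°C  (high = +80.0°C, hyst = +75.0°C)
CPUTIN:                 +39.0°C  (high = +80.0°C, hyst = +75.0°C)
AUXTIN0:               +111.0°C  (high = +80.0°C, hyst = +75.0°C)  ALARM
AUXTIN3:                -52.0°C  (high = +80.0°C, hyst = +75.0°C)
Package id 0:  +37.0°C  (high = +100.0°C, crit = +100.0°C)
"""

# Captures the label, the reading, and the "high" limit on the same line.
LINE_RE = re.compile(
    r"^(?P<label>[^:]+):\s+(?P<temp>[+-]\d+\.\d)°C\s+"
    r"\(.*?high = (?P<high>[+-]\d+\.\d)°C"
)


def suspicious_readings(text, floor_c=0.0):
    """Return {label: temp} for readings above their high limit or below floor_c."""
    flagged = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # voltages, fans, continuation lines, etc.
        temp, high = float(m["temp"]), float(m["high"])
        if temp > high or temp < floor_c:
            flagged[m["label"].strip()] = temp
    return flagged


if __name__ == "__main__":
    for label, temp in suspicious_readings(SAMPLE).items():
        print(f"{label}: {temp:+.1f} °C looks suspicious")
```

This only flags candidates; whether a flagged value is a real hotspot near the VRMs or an unconnected input still has to be judged by hand.
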
VRM area temperatures:
1724982927730.png

Avg Temp across all CPU cores:
1724983450656.png