DaveInTexas,
Thanks for helping the newbie out. I thought I had quoted this post.
https://forums.servethehome.com/ind...9-x10-x11-fan-speed-control.10059/post-143279 That describes a SM tech note.
Ah, gotcha. Thanks for explaining.
Incidentally, here is a link to your mobo's manual. Supermicro tends to make these difficult to find:
https://www.supermicro.com/manuals/motherboard/C606_602/MNL-1306.pdf
Here again are my questions/comments
This is a very interesting paper. I am afraid I do not understand.
How are the thresholds determined? For the 4 different settings?
The plots are? Optimal (Orange?), Standard (Green?), Heavy I/O (not described) and Full (not described). Can RdPkgConfig(), COA and IndexValue16 be modified somehow?
Well, my understanding/experience is the exact %age of fan speed can vary depending on which BMC chip your board has. However, generally speaking they seem to approximately break down like this:
- Optimal: 30% or higher
- Standard: 50%
- Heavy I/O: 70%
- Full: 100%
For "optimal" I say "or higher" because in Optimal mode the BMC will adjust the fan speed up based on temperature sensor readings, but it will not drop it below roughly 30% fan speed. In fact, this is why so many Supermicro geeks are interested in fan controller software: the built-in fan control thresholds are rather rudimentary.
It's possible some fan controllers may split the fan power %age between fan zones when Heavy I/O is selected and there are at least 2 fan zones.
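To make the "or higher" idea concrete, here is a rough Python sketch of how I picture it. The percentages are just my approximate figures from the list above, and `thermal_demand_percent` is a made-up input standing in for whatever the BMC computes from its temperature sensors; none of this is Supermicro's actual firmware logic.

```python
# Approximate baseline fan duty per mode (rough figures; they vary by BMC chip).
APPROX_BASELINE_PERCENT = {
    "optimal": 30,
    "standard": 50,
    "heavy_io": 70,
    "full": 100,
}


def effective_duty(mode, thermal_demand_percent):
    """Return the fan duty (%) under a simple floor model.

    thermal_demand_percent is a hypothetical value: the duty the BMC would
    like to run based on temperatures alone. The mode baseline acts as a
    floor and 100% as the ceiling.
    """
    baseline = APPROX_BASELINE_PERCENT[mode]
    return min(100, max(baseline, thermal_demand_percent))


if __name__ == "__main__":
    print(effective_duty("optimal", 10))  # cool system: held at the floor
    print(effective_duty("optimal", 60))  # warm system: demand wins
```

The point is only that the mode sets a floor the BMC will not go below; the real ramp-up curve is something you can't see or edit from the outside.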
Is it possible to convert DTS to degrees C? I assume DTS is the raw uncalibrated sensor output?
Yes and no. I find the wording of that article confusing, at least relative to my understanding of DTS, CPU temperatures, and what a user is able to discern when polling CPU temperature as reported by IPMI or other tools that gather metadata from the CPU and report its temperature in a user-friendly manner.
DTS = Digital Thermal Sensors. In practical terms, it's a terrible name. It is an Intel thing. Go figure on Intel + terrible naming conventions, and Intel's innate ability to make simple concepts complicated for no particular reason. I'm not sure which company is worse: Supermicro or Intel. LoL.
The purpose of DTS is to inform the BMC of the CPU's maximum operating temperature. The DTS calculation is how far away the current CPU temp is from breaching the CPU's maximum design temp. There are actually two (2) thresholds reported by modern Intel CPUs to the BMC: a High temp and a Critical temp, though it's possible some chips only report one (critical). The Critical temperature is the one you need to pay most attention to, and that seems to be what this graph is referring to.
I don't agree with some of the comments/language in that tech note. Not that anyone cares, but I think the author makes it sound as if the temperatures involved are not measured in Celsius/Centigrade, which is not true. Semantics aside, let's focus on how this works.
As shown in the graph, the BMC is going to look at the CPU temperature metadata, and compare it to the DTS indicated Critical CPU Temperature threshold. Here, the "DTS reading" is a calculation of how far the current CPU temp is (in Celsius) from its critical threshold, expressed as a negative number (degrees Celsius below the DTS limit). That's how the BMC is looking at these numbers internally. However, if you poll the CPU temp using something like IPMI, you're going to receive the actual current CPU temperature measurement in Celsius, which has nothing to do with the DTS. Let's say you have 50 degrees C for the CPU temp, and that turns out to be DTS -40 for that particular CPU. Well, the user doesn't get told -40... the user gets told the temp is +50.
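Put as arithmetic, the relationship in that example looks like the sketch below. It assumes the DTS reading is simply "current temp minus critical threshold", both in degrees C, which is my reading of it. The 90 C critical threshold is only what the 50 C / -40 example implies, not a spec value for any particular CPU.

```python
def celsius_to_dts(temp_c, critical_temp_c):
    """DTS-style reading: degrees C relative to the critical threshold.

    Negative means the CPU is below the threshold.
    """
    return temp_c - critical_temp_c


def dts_to_celsius(dts_reading, critical_temp_c):
    """Recover the user-facing temperature from a DTS-style reading."""
    return critical_temp_c + dts_reading


# The example from the text: 50 C at DTS -40 implies a 90 C threshold.
IMPLIED_CRITICAL_C = 50 - (-40)

if __name__ == "__main__":
    print(IMPLIED_CRITICAL_C)
    print(celsius_to_dts(50, IMPLIED_CRITICAL_C))
    print(dts_to_celsius(-40, IMPLIED_CRITICAL_C))
```

So the conversion is possible, but only if you know the critical threshold for your specific CPU.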
I do understand the point of the article and the DTS graph. While users tend to focus on how hot a CPU or hard drive is, in reality the BMC fan controller doesn't care about how hot the CPU is directly, nor whether you as the user think any given temperature is good or bad. The BMC only cares about the DTS reading, i.e. the current temperature relative to the DTS limit. In other words, the BMC just wants to know, "How far is the temperature from what I've been told is its maximum operating temperature?" Users tend to be far less comfortable with their CPUs operating at critical temperature than the BMC is. However, the BMC fan controller is going to do everything in its power to keep the CPU temp from exceeding that critical temp.
When the BMC is controlling fan speeds automatically, it will ramp up the fan speeds as the CPU temp approaches the DTS limit. The graph seems to demonstrate, in part, that if the CPU temp comes within 10 to 15 degrees C of the DTS limit (a DTS reading of about -10 to -15), the fans are going to be ramped up to 100%. This will happen regardless of what mode the fans are in when the BMC is controlling the fan speeds. I'm not sure what the box that says, "5 DTS Counts Hystereses" is supposed to represent, since the fan hysteresis (correct spelling) is the interval of RPMs the BMC fan controller tracks fan speeds in. You may think of the fan hysteresis as a rounding number. For example, if the fan hysteresis is 100 RPM and the actual RPM were, say, 65 RPM, then the BMC would report the fan speed as 100 RPM (65 rounded up to 100). If the raw fan speed was actually 40 RPM, the BMC would round it down to 0 and report a speed of 0 RPM. That is a very basic, but practical explanation of fan hysteresis.
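The rounding in that example can be sketched in a few lines of Python. This is only an illustration of the 65 -> 100 and 40 -> 0 cases above. The 100 RPM step is the example value, and rounding exact halves upward is my own assumption, since the example doesn't cover that case.

```python
def reported_rpm(raw_rpm, step=100):
    """Round a raw fan speed to the nearest multiple of `step` RPM.

    Exact halves (e.g. 50 with step=100) round up; that tie rule is an
    assumption, not documented BMC behavior.
    """
    return ((raw_rpm + step // 2) // step) * step


if __name__ == "__main__":
    for raw in (40, 65, 4480):
        print(raw, "->", reported_rpm(raw))
```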
The %fan that you can control using the IPMI is the lower left starting point?
Yes and no. When controlling the fan speeds manually - presuming your board allows it - you can set the fan speed as low as 0% or 1% (the minimum depends on the BMC; for most, it's zero). However, if you set the CPU fans to 0%, the BMC is going to quickly go into panic mode and ramp up the fans to maximum whether you want it to or not. So, what I am surmising the graph is trying to show on the left side is some minimal fan speed percentage (i.e. 20% on this graph).
Does that make sense? To answer your question a bit more straight-forwardly, in manual fan mode you can set the starting point to whatever you like, but if it's too low then the BMC is going to force the fans higher. At what point that happens depends on another group of settings; namely, the Lower Fan Speed Thresholds.
You can set this for both the I/O fans or the CPU zone fans?
Yes, if your board has 2 fan zones, and its BMC controller and motherboard allow manual fan speed control.
I should probably add another point of clarification on that: In some (rare) circumstances, the BMC on a board CAN support manual fan speed manipulation via IPMI, but it's disabled by the board manufacturer. Supermicro has quite a few boards that don't allow manual fan speed control. Most of them do not have a BMC at all, which would explain the cause pretty clearly, but there are a small number of their boards with BMCs where manual fan speed control is disabled by Supermicro on purpose (I have no idea why).
For optimal and standard settings, the default is the same but for Heavy I/O the threshold is split?
Depends on the individual motherboard, but for the most part no. As mentioned earlier, those settings are usually different. You might find some board controllers where the Standard and Heavy I/O, or Optimal and Standard settings start out at the same base level fan speed, and then ramp up more aggressively for one mode versus the other. However, you're always going to find that Optimal is the least aggressive and Heavy I/O is the most aggressive in terms of fan speed ramp-up.
The mother board I am having the most trouble with is a X9DRH-7F. It has passive CPU coolers in a 2U 12 drive chassis, The 3 chassis fans run at 4500 RPM at idle. it is very loud. I have it in Optimal setting. I am looking at some hardware solutions since I cannot get the motherboard to respond to the IPMI command and the syntax of the IPMI command structure is not clear.
According to the manual for the X9DRH-7F board, it has 2 distinct fan zones (FAN1-6 and FANA/B). It doesn't matter what Supermicro calls the fan zones. In this board's case they call them "CPU/System" and "Slot I/O" but that is irrelevant for our purposes. They are all 4-pin PWM fan headers. The question is, are they addressable from the BMC via manual fan control?
This board has the Nuvoton WPCM450R BMC controller. You may find this to be a marginally useful resource on that chipset:
Nuvoton WPCM450R IPMI Chip with ATEN-Software - Thomas-Krenn-Wiki
The Thomas Krenn article does not refer to your board specifically, but that page will set your expectations on what you likely can or cannot do with IPMI on that board.
There's no way to know for sure if you'll be able to manually control the fans on your board or not until you try poking around with direct commands using IPMI from a command line terminal. More on that below. If that isn't an option for you, I would suggest you investigate swapping out your fans. There are a few posts on this forum from folks who have done it. I have personally experimented with a 4U Supermicro chassis and had some success with substituting Noctua fans for the front mounted jet engine fans in that particular case. However, it's not all gravy. There are some downsides to this approach, including the fact you may need to run after-market quiet fans at full speed all the time, and they may not provide sufficient cooling, depending on your use case.
Now, to probe whether or not you are able to manually control your fans. Supermicro X9 boards - when they allow manual fan control - only allow it at the fan Zone level, and not the individual fan level. My suggestion is to try setting your fan zones to 50% power, and listen and/or monitor your fan sensors to determine if the changes take hold or not.
You must first set the fan mode to FULL. It's an odd quirk of Supermicro.
Fan zone 0 should be the CPU/System fan zone for this board (FAN header IDs FAN1-FAN6), and fan zone 1 should be the other zone (FANA/FANB). To avoid potentially damaging your CPU and/or triggering the fan too-low thresholds in the BMC, I would recommend you start by trying to force the fans to 50% speed and see if you can detect a change. You'll know right away whether it works: since the first step places the fans in FULL mode, they will be at 100% power and will stay there if the commands to control them are unsuccessful.
- Set IPMI fan mode to FULL: raw 0x30 0x45 0x01 0x01
- Set fan zone 0 to duty cycle 7F (50%): raw 0x32 0x91 0x00 0x7f
- Set fan zone 1 to duty cycle 7F (50%): raw 0x32 0x91 0x01 0x7f
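If you'd rather script those three steps, here is a small Python sketch that only builds and prints the command lines (it does not run them). It assumes you're using the `ipmitool` CLI and its `raw` subcommand, and that the duty byte scales linearly from 0x00 to 0xff, which is what 0x7f = 50% implies. The raw byte values are exactly the ones listed above; whether your board accepts them is the thing you're testing.

```python
IPMITOOL = "ipmitool"  # assumption: ipmitool is the CLI being used


def percent_to_duty_byte(percent):
    """Map 0-100% onto 0x00-0xff (linear scale is an assumption)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return int(percent * 255 // 100)


def full_mode_command():
    """Step 1: put the BMC fan mode in FULL."""
    return [IPMITOOL, "raw", "0x30", "0x45", "0x01", "0x01"]


def zone_duty_command(zone, percent):
    """Steps 2 and 3: set one fan zone (0 or 1) to a duty cycle."""
    if zone not in (0, 1):
        raise ValueError("this board is described as having zones 0 and 1")
    duty = percent_to_duty_byte(percent)
    return [IPMITOOL, "raw", "0x32", "0x91", f"0x{zone:02x}", f"0x{duty:02x}"]


if __name__ == "__main__":
    steps = [full_mode_command(), zone_duty_command(0, 50), zone_duty_command(1, 50)]
    for cmd in steps:
        # Print only; paste into a terminal once you're happy with them.
        print(" ".join(cmd))
```

Printing instead of executing is deliberate: you can eyeball each command first, then run them one at a time and listen for the fans to drop from full speed.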
Hopefully, all this info is helpful!