EPYC 9654QS unknown source of TDP/PPT power limit on T2SEEP motherboard


Chiliben

New Member
Nov 3, 2024
2
0
1
Hello everyone,

not sure if this is the correct spot to post this, as I honestly am unsure what angle to properly approach this problem from:

I have an AMD EPYC 9654QS CPU (Model No. 100-000000894-04), identical to the CPU being used in this thread, where a fellow user was/is attempting to overclock the very same CPU.

I run this CPU on a T2SEEP motherboard, which seems to be a slightly modified clone of the Supermicro H13SSL-N.

My problem is that SOMETHING is limiting the total power draw of my CPU to 230W, no matter what I do.

I currently run Proxmox on the system but have also tried ESXi and even a straight bare-metal Win10 install, but the issue shows up in every single OS/Hypervisor, and no matter which tools I use, I end up with the same performance limiter, even when I write data directly to the relevant registers.

The weird thing is that the AMD e-smi-tool reports that the CPU itself is not locked, as can be seen in its output
Code:
====================== EPYC System Management Interface ======================

--------------------------------------
| CPU Family            | 0x19 (25 ) |
| CPU Model             | 0x11 (17 ) |
| NR_CPUS               | 192        |
| NR_SOCKETS            | 1          |
| THREADS PER CORE      | 2 (SMT ON) |
--------------------------------------

-----------------------------------------------------
| Sensor Name                    | Socket 0         |
-----------------------------------------------------
| Energy (K Joules)              | NA (Err: 1 )     |
| Power (Watts)                  | 122.919          |
| PowerLimit (Watts)             | 360.000          |
| PowerLimitMax (Watts)          | 400.000          |
| C0 Residency (%)               | 0                |
| DDR Bandwidth                  |                  |
|       DDR Max BW (GB/s)        | 154              |
|       DDR Utilized BW (GB/s)   | 0                |
|       DDR Utilized Percent(%)  | 0                |
| Current Active Freq limit      |                  |
|        Freq limit (MHz)        | 3700             |
|        Freq limit source       | Refer below[*0]  |
| Socket frequency range         |                  |
|        Fmax (MHz)              | 3700             |
|        Fmin (MHz)              | 400              |
-----------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
Failed: to get CPU energies, Err[1]: Energy driver not present
-----------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| CPU boostlimit in MHz:                                                             |
| cpu [  0] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 16] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 32] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 48] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 64] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 80] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
-----------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| CPU core clock in MHz:                                                             |
| cpu [  0] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 16] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 32] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 48] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 64] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 80] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
-----------------------------------------------------------------------------------------------------------------
*0 Frequency limit source names:
 OPN Max


Err[1]: Energy driver not present

============================= End of EPYC SMI Log ============================
when the system is not under load.

The relevant things to note from this output:

PowerLimit is set to 360W
PowerLimitMax is 400W
Current active frequency limit is 3700 MHz and the source of said limit is OPN Max

However, if I run stress --cpu 192, the output from the e-smi-tool changes:
Code:
====================== EPYC System Management Interface ======================

--------------------------------------
| CPU Family            | 0x19 (25 ) |
| CPU Model             | 0x11 (17 ) |
| NR_CPUS               | 192        |
| NR_SOCKETS            | 1          |
| THREADS PER CORE      | 2 (SMT ON) |
--------------------------------------

-----------------------------------------------------
| Sensor Name                    | Socket 0         |
-----------------------------------------------------
| Energy (K Joules)              | NA (Err: 1 )     |
| Power (Watts)                  | 226.282          |
| PowerLimit (Watts)             | 360.000          |
| PowerLimitMax (Watts)          | 400.000          |
| C0 Residency (%)               | 100              |
| DDR Bandwidth                  |                  |
|       DDR Max BW (GB/s)        | 154              |
|       DDR Utilized BW (GB/s)   | 0                |
|       DDR Utilized Percent(%)  | 0                |
| Current Active Freq limit      |                  |
|        Freq limit (MHz)        | 2276             |
|        Freq limit source       | Refer below[*0]  |
| Socket frequency range         |                  |
|        Fmax (MHz)              | 3700             |
|        Fmin (MHz)              | 400              |
-----------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
Failed: to get CPU energies, Err[1]: Energy driver not present
-----------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| CPU boostlimit in MHz:                                                             |
| cpu [  0] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 16] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 32] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 48] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 64] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
| cpu [ 80] : 3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500  3500    |
-----------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| CPU core clock in MHz:                                                             |
| cpu [  0] : 2250  2250  2275  2275  2275  2275  2275  2275  2275  2275  2275  2250  2250  2250  2250  2250    |
| cpu [ 16] : 2275  2275  2275  2275  2275  2275  2275  2275  2275  2250  2250  2250  2250  2275  2275  2275    |
| cpu [ 32] : 2275  2275  2275  2275  2275  2275  2275  2250  2250  2250  2250  2275  2275  2275  2275  2275    |
| cpu [ 48] : 2250  2250  2250  2250  2250  2275  2275  2275  2275  2275  2275  2275  2275  2275  2275  2250    |
| cpu [ 64] : 2250  2250  2250  2250  2275  2275  2275  2275  2275  2275  2275  2275  2275  2250  2250  2250    |
| cpu [ 80] : 2250  2250  2275  2275  2275  2275  2275  2275  2275  2275  2275  2275  2250  2250  2250  2250    |
-----------------------------------------------------------------------------------------------------------------
*0 Frequency limit source names:
 PPT Limit


Err[1]: Energy driver not present

============================= End of EPYC SMI Log ============================


Relevant changes:
The Frequency limit is now 2276 MHz and the source is cited as "PPT Limit"
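
To put a number on the gap between the reported draw and the configured limit, the socket table can be parsed mechanically. The following is only a minimal sketch: the function names `parse_esmi` and `headroom` are made up for illustration, and the sample rows are copied from the under-load output above.

```python
import re

# Rows copied from the under-load e-smi output quoted in this post.
SAMPLE = """
| Power (Watts)                  | 226.282          |
| PowerLimit (Watts)             | 360.000          |
| PowerLimitMax (Watts)          | 400.000          |
| DDR Bandwidth                  |                  |
|        Freq limit (MHz)        | 2276             |
"""

def parse_esmi(text):
    """Collect 'name | value' rows from an e-smi style table (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Two-column rows only; header rows with an empty value are skipped.
        m = re.match(r"\|\s*(.+?)\s*\|\s*(.*?)\s*\|\s*$", line)
        if m and m.group(2):
            fields[m.group(1)] = m.group(2)
    return fields

def headroom(fields):
    """Return (unused watts, fraction of the limit actually drawn)."""
    power = float(fields["Power (Watts)"])
    limit = float(fields["PowerLimit (Watts)"])
    return limit - power, power / limit

gap_w, used = headroom(parse_esmi(SAMPLE))
print(f"{gap_w:.1f} W below PowerLimit, using {used:.0%} of it")
```

With the numbers above, the socket sits roughly 134 W under its own reported PowerLimit while still citing "PPT Limit" as the throttle source, which is the contradiction this post is about.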

The motherboard I'm using is rated for 400W TDP, and online benchmarks of the EXACT SAME CPU on the EXACT SAME motherboard, such as this, lead me to believe that it should be possible to reach the full 400W TDP this CPU is designed for on my setup. During all of these tests, the CPU does not exceed 45°C, so thermal throttling is not the issue.

I have tried poking around in the BIOS settings, because it really seems to me like the motherboard is somehow limiting the CPU, but the available menus are extremely limited:

AMD CBS.png
CPU common.png
Performance.png

On my current setup, I have not been able to unlock any additional options or to find anything related to eTDP, TDP, PPT or similar.

I have determined the UEFI to be an AMI APTIO V UEFI running the GenoaPI-SP5 1.0.0.A AGESA.

I have also poked around a bit inside the UEFI files to see if there is perhaps a hidden option which I could somehow unlock or which is blocking me. I've tried using the UEFITool, iptrextractor and UEFIEditor, but as far as I can tell, all the options related to PPT, PBO and TDP are hidden, and their defaults are either set to "auto" or impose no real limits, and certainly none that would set the PPT to 230W.

Bear in mind, though, that I have never modified or worked on a BIOS/UEFI and am very much stumbling in the dark here, so it is entirely possible that I overlooked something really obvious.

The one thing I am unsure about is my PSU. I am feeding the system with a Supermicro PWS-2k04A-1R 2000W PSU, which DOES support PMBus 1.2 and which I have connected, but it is not being properly detected by the BMC and is, weirdly, being mapped to PSU2 instead of PSU1:

Discrete_Sensors.png
PSU1_Warning.png
Power data.png

However, the documentation of the motherboard is REALLY sparse to non-existent.

There is a port specifically labeled PMBUS on the motherboard, which I have confirmed the pinout of, and which is properly connected to the PSU. Beyond that, I do not know what I could reasonably do to address this problem, and from the BMC page it seems that the software limit is 4000W anyway, so I do not even know with certainty whether fixing this issue is worth pursuing or whether I am again barking up the wrong tree. I will admit that it is suspicious as hell, though, but I have not yet found a way to verify or modify what is going on here.

The board supports Redfish and IPMI, but from what I have managed to figure out so far, neither of these can be used to configure CPU parameters.

I am a long-time PC enthusiast and know my way around computers, but servers are relatively new to me, so it bears repeating that it is entirely possible that I am overlooking something really obvious here.

I'm sure I have forgotten to mention a bunch of other things I have tried thus far, as I have been banging my head against this for quite a while now already, but I have reached the point where I have to admit that I am in way over my head and I do not know which of the many available paths I should possibly pursue next.

Any and all input about what to try and/or which lines of troubleshooting to pursue would be greatly appreciated.

Kind Regards,

Chiliben
 

Zhang

Member
Sep 18, 2018
52
48
18
It might be the BMC that limits the total power. You can try resetting the BMC while running `stress` (e.g., `sudo ipmitool bmc reset cold`) and see if the power numbers reported by `e-smi` will temporarily recover to 360W during the BMC reboot.
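
A test like this is easier to judge if the power reading is logged at a fixed interval while the BMC reboots. Below is a rough sketch of such a logger. It assumes the tool is on the PATH as `e_smi_tool` and prints the same "Power (Watts)" row shown earlier in the thread; the helper names are hypothetical.

```python
import re
import subprocess
import time

POWER_ROW = re.compile(r"\|\s*Power \(Watts\)\s*\|\s*([0-9.]+)\s*\|")

def power_from_text(text):
    """Extract the socket power in watts from e-smi style output, or None."""
    m = POWER_ROW.search(text)
    return float(m.group(1)) if m else None

def read_power(cmd=("e_smi_tool",)):
    # Assumption: this binary name and its default table output match the thread.
    out = subprocess.run(list(cmd), capture_output=True, text=True, check=False).stdout
    return power_from_text(out)

def watch(reader, samples, interval_s, sleep=time.sleep):
    """Call reader() `samples` times, pausing between calls; keep valid readings."""
    readings = []
    for i in range(samples):
        value = reader()
        if value is not None:
            readings.append(value)
        if i + 1 < samples:
            sleep(interval_s)
    return readings

# Example (run under load, then issue the BMC reset in another shell):
#   log = watch(read_power, samples=120, interval_s=1.0)
#   print(max(log), log[-1])
```

Comparing the peak against the last sample shows whether the cap lifts while the BMC is down and returns once it is back.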
 

sam55todd

Active Member
May 11, 2023
186
55
28
There is a port specifically labeled PMBUS on the motherboard, which I have confirmed the pinout of, and which is properly connected to the PSU.
Some manufacturers design/program PMBus so that it detects only specific headers and response IDs/addresses during host-slave protocol negotiation.
I have a Supermicro board and a Corsair AX1200i PSU (which has an I2C bus).
The board will only detect a limited number of PSUs connected via I2C/PMBus (those closely matching its SDA/SCL bus),
and the Supermicro IPMI will not show the Corsair's PSU details because its voltage IDs differ from what the BIOS is programmed to read
(so it cannot assign the readings to the right lines in the BMC web interface).

P.S. BTW, the layout of some T2SEEP motherboards looks suspiciously similar to SM
(not only colour, although there are some limitations on how you design it in an optimal way on a restricted eATX MB size with airflow
and predefined back-panel and PCIe slot locations, so most brands start to look the same in many aspects,
hence component position convergence and layout similarities), possibly having slightly cheaper components and minor redesigns.

There's a possibility that they either use warehouse leftovers from SM, or SM changed its OEM factory
and T2SEEP just took over the same facility/subcontractor without re-purposing the assembly lines much
(or even that former architect employees changed workplace while carrying a USB stick in a pocket).
Just my random guess.
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
Maybe the platform only supports (or expects) redundant PSU mode; a missing PSU would then result in throttling until the second one is working or has been reinstalled.
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
I run this CPU on a T2SEEP motherboard, which seems to be a slightly modified clone of the Supermicro H13SSL-N.
The BMC firmware is universal to all AST2600 motherboards. It reads the motherboard model (e.g. from SMBIOS/DMI)
and applies the behaviour intended for that model (from a database in the BMC firmware).

T2SEEP vs H13SSL a.jpg
T2SEEP vs H13SSL.jpg
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
Shenzhen TongTaiYi information technology co., LTD
Address
2nd Floor, Building B2, Shenzhen Digital Technology Park, Gaoxin South 7th Road, Nanshan District
Shenzhen, Guangdong Province, 518000
China
T2SEEP
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
T2SEEP = OEM TU229V2
server TU329V3 uses T2SEEP


 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113

BIOS

Stable, reliable, and intelligently managed
The key system components are redundant and hot-swappable and support tool-free disassembly and assembly, which improves fault-maintenance efficiency and system availability.
An integrated intelligent management chip provides an open management platform that supports multiple management protocols such as IPMI 2.0, Redfish, and SNMP.
It supports management functions such as remote KVM, virtual media, key component status monitoring, and abnormality alarms.
 

sam55todd

Active Member
May 11, 2023
186
55
28
Well, there's that too. T2SEEP/T3DE could be just one of the OEM subcontractors for other brands (like Foxconn);
those umbrella companies may even pass factories to each other from time to time via internal agreements.
Sorry, I started this speculation, and it's getting way too extensive.
I also mixed up the naming: way back I was looking into a different board, the T3DE rather than the T2SEEP, and the two look similar to each other in some ways.
1738673699479.png
 

RageBone

Active Member
Jul 11, 2017
680
175
43
Let's be realistic: just because boards look similar really does not mean they are the same, or even a similar enough design.

AMD and Intel of course have "reference designs" and platforms that can be used as bases for OEM designs.

Also don't ignore that the requirements and limitations are pretty similar for them all.
With DRAM and PCIe, where else to put BMCs, VRs, and all the other stuff?

Still, nothing keeps designers from changing anything in significant ways while still having it look like the other board.

@RolloZ170 Would you please help me with a source for "all AST2600 firmwares are universal"?

The dual PSU thing to me sounds like something to investigate.
I am also wondering about the Core VR controllers.
Though that would be weird with the board rated for 400W.
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
Would you please help me with a source for "all AST2600 firmwares are universal"?
E.g. Gigabyte's BMC firmware, which covers all AST2600 models:
GBAST2600src.jpg
GBAST2600srcA.jpg

Code:
<AST2600 version="1705457057" model="MS33-AR0-000">
    <I3C chnl="0">
        <ADJUST_VOLTAGE addr="0x90">
            <SENSOR no="0x102" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x92">
            <SENSOR no="0x103" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x94">
            <SENSOR no="0x104" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x96">
            <SENSOR no="0x105" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x98">
            <SENSOR no="0x106" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x9A">
            <SENSOR no="0x107" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x9C">
            <SENSOR no="0x10A" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
    </I3C>
    <I3C chnl="1">
        <ADJUST_VOLTAGE addr="0x90">
            <SENSOR no="0x122" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x92">
            <SENSOR no="0x123" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x94">
            <SENSOR no="0x124" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x96">
            <SENSOR no="0x125" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x98">
            <SENSOR no="0x126" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x9A">
            <SENSOR no="0x117" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
        <ADJUST_VOLTAGE addr="0x9C">
            <SENSOR no="0x120" sdr="I3C_DDR5" scan="ON,5" />
        </ADJUST_VOLTAGE>
    </I3C>
    <ME>
        <ME_MEM_TEMP __link__="1" cpu="0" mem="0" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="1" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="2" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="3" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="4" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="5" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="6" />
        <ME_MEM_TEMP __link__="1" cpu="0" mem="7" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="8" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="9" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="10" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="11" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="12" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="13" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="14" />
        <ME_MEM_TEMP __link__="2" cpu="0" mem="15" />
        <ME_UPDATER>
            <UPDATER scan="ON,3" />
        </ME_UPDATER>
        <MAX __taker__="1">
            <SENSOR name="DIMMG0_TEMP" no="0x04" scan="ON,1" sdr="DIMM_TEMP_DDR5" />
        </MAX>
        <MAX __taker__="2">
            <SENSOR name="DIMMG1_TEMP" no="0x05" scan="ON,1" sdr="DIMM_TEMP_DDR5" />
        </MAX>
        <ME_PCH_TEMP>
            <SENSOR no="0x03" scan="ON,1" sdr="PCH_TEMP" />
        </ME_PCH_TEMP>
        <ME_STATUS>
            <SENSOR no="0x203" scan="ON,3" sdr="ME_STATUS" />
        </ME_STATUS>
    </ME>
    <PECI chnl="0">
        <PECI_CPU_DTS __link__="5" cpu="0">
            <SENSOR no="0x0c" sdr="DTS_TEMP" name="CPU0_DTS" scan="ON,3" />
        </PECI_CPU_DTS>
        <PECI_CPU_INFO __link__="5">
            <ACCESS_PT no="0x100" sdr="CPU_INFO" />
        </PECI_CPU_INFO>
        <MINUS __taker__="5">
            <SENSOR no="0x01" sdr="CPU_TEMP" name="CPU0_TEMP" scan="ON,3" />
        </MINUS>
    </PECI>
    <ADC chnl="0">
        <SENSOR name="P_12V" no="0x40" scan="ON,2" sdr="P_12V_2600" />
    </ADC>
    <ADC chnl="1">
        <SENSOR name="P_5V" no="0x41" scan="ON,2" sdr="P_5V" />
    </ADC>
    <ADC chnl="2">
        <SENSOR name="P_3V3" no="0x42" scan="ON,2" sdr="P_3V3_2600" />
    </ADC>
    <ADC chnl="3">
        <SENSOR name="P_5V_STBY" no="0x43" scan="ON,2" sdr="P_5V_2600" />
    </ADC>
    <ADC chnl="4">
        <SENSOR name="P_VNN_PCH_AUX" no="0x4b" scan="ON,2" sdr="V0.97_1923" />
    </ADC>
    <ADC chnl="5">
        <SENSOR name="P_1V05_PCH" no="0x44" scan="ON,2" sdr="P_1V05_2600" />
    </ADC>
    <ADC chnl="6">
        <SENSOR name="P_1V8_PCH_AUX" no="0x4d" scan="ON,2" sdr="V1.8_1015" />
    </ADC>
    <ADC chnl="7">
        <GPIO_SWITCH chnl="141">
            <SENSOR name="P_VBAT" no="0x45" scan="ON,2" sdr="P_VBAT_2600" />
        </GPIO_SWITCH>
    </ADC>
    <ADC chnl="8">
        <SENSOR name="P_VCCIN_CPU0" no="0x46" scan="ON,2" sdr="V1.74_1923" />
    </ADC>
    <ADC chnl="10">
        <SENSOR no="0x4a" sdr="P_1V08_2600" name="P_VCCINFAON_CPU0" scan="ON,2" />
    </ADC>
    <ADC chnl="12">
        <SENSOR no="0x4c" sdr="P_1V8_2600" name="P_VCCFA_EHV_CPU0" scan="ON,2" />
    </ADC>
    <ADC chnl="14">
        <SENSOR no="0x4e" sdr="P_1V1_EGS" name="P_VCCD_HV_CPU0" scan="ON,2" />
    </ADC>
    <ADC chnl="15">
        <SENSOR no="0x4f" sdr="P_0V92_2600" name="P_0V92_PHY" scan="ON,2" />
    </ADC>
    <I2C chnl="1">
        <PCA9545 addr="0xE6">
            <PCA9545_BUS chnl="5">
                <RAID_INFO addr="0x02">
                    <SENSOR no="0x240" scan="ON,60" sdr="RAID_INFO" />
                </RAID_INFO>
            </PCA9545_BUS>
        </PCA9545>
    </I2C>
    <I2C chnl="5">
        <I2C_DEVICE addr="0xB0">
            <PMBUS_STATUS redun="1">
                <SENSOR no="0xE6" sdr="PSU_STATUS" name="PS1_Status" scan="ON,1" />
            </PMBUS_STATUS>
            <PMBUS_ANALOG cmd="0x8F">
                <SENSOR no="0x30" sdr="PSU_TEMP" name="PSU1_HOTSPOT" scan="ON,1" />
            </PMBUS_ANALOG>
            <PMBUS_ANALOG cmd="0x97" __link__="14">
                <SENSOR no="0x108" sdr="PSU_POWER" scan="ON,1" />
            </PMBUS_ANALOG>
        </I2C_DEVICE>
        <I2C_DEVICE addr="0xB2">
            <PMBUS_STATUS redun="2">
                <SENSOR no="0xE7" sdr="PSU_STATUS" name="PS2_Status" scan="ON,1" />
            </PMBUS_STATUS>
            <PMBUS_ANALOG cmd="0x8F">
                <SENSOR no="0x31" sdr="PSU_TEMP" name="PSU2_HOTSPOT" scan="ON,1" />
            </PMBUS_ANALOG>
            <PMBUS_ANALOG cmd="0x97" __link__="14">
                <SENSOR no="0x109" sdr="PSU_POWER" scan="ON,1" />
            </PMBUS_ANALOG>
        </I2C_DEVICE>
        <SUM __taker__="14">
            <DIVIDE divisor="25">
                <SENSOR no="0xE9" sdr="SYS_POWER" scan="ON,1" />
            </DIVIDE>
        </SUM>
        <PMBUS_COLD_REDUNDANT def="1">
            <ACCESS_PT no="0x1B9" sdr="PSU_REDUNDANT" />
        </PMBUS_COLD_REDUNDANT>
    </I2C>
    <I2C chnl="4">
        <NCT7802Y addr="0x50">
            <NCT7802Y_TEMP chnl="0">
                <SENSOR name="M2_G0_TEMP" no="0x1C" scan="ON,1" sdr="GPU_TEMP" />
            </NCT7802Y_TEMP>
        </NCT7802Y>
        <PCA9545 addr="0xE2">
            <PCA9545_BUS chnl="0">
                <CPLD_CPU addr="0x2E">
                    <CPLD_CPU_EVENT cpu="0">
                        <SENSOR name="CPU0_Status" no="0xE2" scan="ANY,5" sdr="CPU_STATUS" />
                    </CPLD_CPU_EVENT>
                    <CPLD_MEM_EVENT cpu="0">
                        <SENSOR name="CPU0_MEMTRIP" no="0xE4" scan="ANY,5" sdr="CPU_STATUS" />
                    </CPLD_MEM_EVENT>
                </CPLD_CPU>
                <TMP75 addr="0x9A">
                    <SENSOR no="0x09" sdr="MB_TEMP" name="X710_TEMP" scan="ON,1" />
                </TMP75>
                <TMP75 addr="0x9C">
                    <SENSOR no="0x0A" sdr="MB_TEMP" name="REAR_TEMP" scan="ON,1" />
                </TMP75>
                <TMP75 addr="0x2E">
                    <SENSOR no="0x194" sdr="MB_CPLD_VER" scan="ON,1" />
                </TMP75>
            </PCA9545_BUS>
            <PCA9545_BUS chnl="1">
                <I2C_DEVICE addr="0xC0">
                    <PMBUS_PAGE page="0">
                        <PMBUS_ANALOG cmd="0x88">
                            <SENSOR no="0x50" sdr="VR_VIN_CPU" name="VCCIN_P0_VIN" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8C">
                            <SENSOR no="0x80" sdr="VR_IOUT_VCCIN" name="VCCIN_P0_Io" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8D">
                            <SENSOR no="0x0E" sdr="VR_TEMP_CPU" name="VCCIN_P0_TMP" scan="ON,2" />
                        </PMBUS_ANALOG>
                    </PMBUS_PAGE>
                </I2C_DEVICE>
                <I2C_DEVICE addr="0xCC">
                    <PMBUS_PAGE page="0">
                        <PMBUS_ANALOG cmd="0x88">
                            <SENSOR no="0x52" sdr="VR_VIN_CPU" name="VCCD_HV_P0_Vin" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8C">
                            <SENSOR no="0x82" sdr="VR_IOUT_VCCIO" name="VCCD_HV_P0_Io" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8D">
                            <SENSOR no="0x0F" sdr="VR_TEMP_CPU" name="VCCD_HV_P0_TMP" scan="ON,2" />
                        </PMBUS_ANALOG>
                    </PMBUS_PAGE>
                </I2C_DEVICE>
                <I2C_DEVICE addr="0xCE">
                    <PMBUS_PAGE page="0">
                        <PMBUS_ANALOG cmd="0x88">
                            <SENSOR no="0x56" sdr="VR_VIN_CPU" name="VCCINFAON_P0_Vin" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8C">
                            <SENSOR no="0x87" sdr="VR_IOUT_1V8" name="VCCINFAON_P0_Io" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8D">
                            <SENSOR no="0x12" sdr="VR_TEMP_CPU" name="VCCINFAON_P0_TMP" scan="ON,2" />
                        </PMBUS_ANALOG>
                    </PMBUS_PAGE>
                    <PMBUS_PAGE page="1">
                        <PMBUS_ANALOG cmd="0x8C">
                            <SENSOR no="0x88" sdr="VR_IOUT_VCCANA" name="VCCFA_EHV_P0_Io" scan="ON,2" hide="1" />
                        </PMBUS_ANALOG>
                        <PMBUS_ANALOG cmd="0x8D">
                            <SENSOR no="0x13" sdr="VR_TEMP_CPU" name="VCCANA_P0_TMP" scan="ON,2" />
                        </PMBUS_ANALOG>
                    </PMBUS_PAGE>
                </I2C_DEVICE>
            </PCA9545_BUS>
        </PCA9545>
    </I2C>
    <I2C chnl="7">
        <ECFW addr="0x24">
            <SENSOR no="0x210" sdr="EC_VERSION" scan="ANY,1" />
        </ECFW>
    </I2C>
    <TACH chnl="0">
        <SENSOR no="0xb8" sdr="OB_FAN" name="CPU0_FAN1" scan="ON,2" />
    </TACH>
    <TACH chnl="1">
        <SENSOR no="0xb9" sdr="OB_FAN" name="CPU0_FAN2" scan="ON,2" />
    </TACH>
    <TACH chnl="2">
        <SENSOR no="0xba" sdr="OB_FAN" name="SYS_FAN1" scan="ON,2" />
    </TACH>
    <TACH chnl="3">
        <SENSOR no="0xbb" sdr="OB_FAN" name="SYS_FAN2" scan="ON,2" />
    </TACH>
    <TACH chnl="4">
        <SENSOR no="0xbc" sdr="OB_FAN" name="SYS_FAN3" scan="ON,2" />
    </TACH>
    <TACH chnl="5">
        <SENSOR no="0xbd" sdr="OB_FAN" name="SYS_FAN4" scan="ON,2" />
    </TACH>
    <TACH chnl="6">
        <SENSOR no="0xbe" sdr="OB_FAN" name="SYS_FAN5" scan="ON,2" />
    </TACH>
    <PWM chnl="0">
        <ACCESS_PT no="0x110" sdr="PWM" />
    </PWM>
    <PWM chnl="1">
        <ACCESS_PT no="0x111" sdr="PWM" />
    </PWM>
    <PWM chnl="2">
        <ACCESS_PT no="0x112" sdr="PWM" />
    </PWM>
    <PWM chnl="3">
        <ACCESS_PT no="0x113" sdr="PWM" />
    </PWM>
    <PWM chnl="4">
        <ACCESS_PT no="0x114" sdr="PWM" />
    </PWM>
    <PWM chnl="5">
        <ACCESS_PT no="0x115" sdr="PWM" />
    </PWM>
    <PWM chnl="6">
        <ACCESS_PT no="0x116" sdr="PWM" />
    </PWM>
</AST2600>
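
As a rough illustration of what such a per-model table implies for PSU detection, the PMBus PSU entries can be pulled out of the XML with a few lines of Python. This is only a sketch: the excerpt is trimmed from the Gigabyte config above (not from the T2SEEP firmware), the function name `psu_slots` is made up, and two interpretations are assumptions: that `addr` is an 8-bit I2C address (so the 7-bit address is `addr >> 1`) and that `redun` is the PSU slot index.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the config above, keeping only the PSU status entries.
EXCERPT = """
<AST2600 version="1705457057" model="MS33-AR0-000">
    <I2C chnl="5">
        <I2C_DEVICE addr="0xB0">
            <PMBUS_STATUS redun="1">
                <SENSOR no="0xE6" sdr="PSU_STATUS" name="PS1_Status" scan="ON,1" />
            </PMBUS_STATUS>
        </I2C_DEVICE>
        <I2C_DEVICE addr="0xB2">
            <PMBUS_STATUS redun="2">
                <SENSOR no="0xE7" sdr="PSU_STATUS" name="PS2_Status" scan="ON,1" />
            </PMBUS_STATUS>
        </I2C_DEVICE>
    </I2C>
</AST2600>
"""

def psu_slots(xml_text):
    """List the I2C bus, slot and address the BMC config expects for each PSU."""
    root = ET.fromstring(xml_text.strip())
    slots = []
    for bus in root.iter("I2C"):
        for dev in bus.findall("I2C_DEVICE"):
            status = dev.find("PMBUS_STATUS")
            if status is None:
                continue  # not a PSU entry
            addr8 = int(dev.get("addr"), 16)
            slots.append({
                "bus": int(bus.get("chnl")),
                "slot": int(status.get("redun")),   # assumed to be the PSU index
                "addr_8bit": addr8,
                "addr_7bit": addr8 >> 1,            # assumes 8-bit notation in the file
            })
    return slots

for entry in psu_slots(EXCERPT):
    print(entry)
```

Under those assumptions, this config expects PSU1 at 7-bit address 0x58 and PSU2 at 0x59 on I2C channel 5, so a PSU strapped to the second address would show up as PSU2.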
 

sam55todd

Active Member
May 11, 2023
186
55
28
...AMD and Intel of course have "reference designs" and platforms that can be used as bases for OEM designs.

Also don't ignore that the requirements and limitations are pretty similar for them all.
With DRAM and PCIe, where else to put BMCs, VRs, and all the other stuff?

Still, nothing keeps designers from changing anything in significant ways, but still have it look like the other board...
Haven't I mentioned the same?
...although there are some limitations on how you design it in an optimal way on a restricted eATX MB size with airflow
and predefined back-panel and PCIe slot locations, so most brands start to look the same in many aspects,
hence component position convergence and layout similarities...
 

Chiliben

New Member
Nov 3, 2024
2
0
1
It might be the BMC that limits the total power. You can try resetting the BMC while running `stress` (e.g., `sudo ipmitool bmc reset cold`) and see if the power numbers reported by `e-smi` will temporarily recover to 360W during the BMC reboot.

You seem to be correct. Resetting the BMC while under load lets the power consumption of the CPU slowly creep up to 270W before quickly dropping back to 250W once the BMC reboots.


Small update since my last post was a while ago:

I've since inquired with multiple resellers of the board what PSU they recommend to run this hardware with, and they usually just refer me to generic 1250W ATX-PSUs.


Some manufacturers design/program PMBus so that it detects only specific headers and response IDs/addresses during host-slave protocol negotiation.
I have a Supermicro board and a Corsair AX1200i PSU (which has an I2C bus).
The board will only detect a limited number of PSUs connected via I2C/PMBus (those closely matching its SDA/SCL bus),
and the Supermicro IPMI will not show the Corsair's PSU details because its voltage IDs differ from what the BIOS is programmed to read
(so it cannot assign the readings to the right lines in the BMC web interface).
The existing single Supermicro PSU is not being detected on the PMBus. I tried talking to it, but I'm getting nothing, so you're likely correct. I have no idea what other PSUs/brands I could try, though. Besides the Corsair you mentioned, it seems REALLY hard to find any PSU that advertises PMBus support, and even for the Corsair one I have not managed to confirm definitively whether it even speaks PMBus. From what I read online, they are using their own protocol?
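
If a PSU ever does answer on the bus, the raw words still need decoding. In the PMBus specification, command 0x97 is READ_PIN and 0x8F is READ_TEMPERATURE_3, and such readings are commonly returned in the LINEAR11 format (a 5-bit signed exponent and an 11-bit signed mantissa). The decoder below is a minimal sketch; whether a given PSU uses LINEAR11 for a given command is an assumption to check against its datasheet, and the example word is constructed by hand, not read from real hardware.

```python
def linear11_to_float(word):
    """Decode a 16-bit PMBus LINEAR11 word: value = mantissa * 2**exponent."""
    exponent = (word >> 11) & 0x1F   # upper 5 bits
    mantissa = word & 0x7FF          # lower 11 bits
    if exponent > 0x0F:              # 5-bit two's complement
        exponent -= 0x20
    if mantissa > 0x3FF:             # 11-bit two's complement
        mantissa -= 0x800
    return mantissa * 2.0 ** exponent

# Hand-built example: exponent -2, mantissa 905.
print(linear11_to_float(0xF389))
```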


@RolloZ170 Would you please help me with a source for "all AST2600 firmwares are universal"?
Aspeed sells an EVB for the AST2600 which has an OpenBMC implementation. I don't want to put words into his mouth here, but I'm pretty sure that is what he is referring to.


I'm not all that familiar with OpenBMC, maybe someone who is more experienced with it can chime in here.

I've poked around a bit in there, but running ipmitool dcmi power get_limit returns a limit of 500W.
I've raised that to 800W, thinking that maybe the GPU and other peripherals might be eating up some of the headroom, but I've seen no change, so I'm assuming there is a different safeguard in the BMC firmware overriding that setting.

I'm already really glad to have a promising new lead on this, so I guess I'll start looking into the ipmitool documentation, the OpenBMC documentation and maybe the Redfish API.
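
For the Redfish route, the standard Power resource of a chassis carries a `PowerControl` array whose entries can include `PowerConsumedWatts` and `PowerLimit.LimitInWatts`. The sketch below shows how those fields could be read. Whether this particular BMC exposes that resource is unknown, the chassis ID is board-specific, and the sample payload is hypothetical; at best this would reveal a chassis-level cap, which may well be unrelated to the CPU's PPT.

```python
import base64
import json
import ssl
import urllib.request

def power_limits(power_resource):
    """Return (name, consumed watts, limit watts) for each PowerControl entry."""
    rows = []
    for ctrl in power_resource.get("PowerControl", []):
        limit = (ctrl.get("PowerLimit") or {}).get("LimitInWatts")
        rows.append((ctrl.get("Name"), ctrl.get("PowerConsumedWatts"), limit))
    return rows

def fetch_power(host, user, password, chassis_id):
    """GET /redfish/v1/Chassis/<chassis_id>/Power using HTTP basic auth."""
    url = f"https://{host}/redfish/v1/Chassis/{chassis_id}/Power"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # BMCs usually ship self-signed certificates
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
        return json.load(resp)

# Hypothetical payload, shaped like the Redfish Power schema.
SAMPLE = {
    "PowerControl": [
        {
            "Name": "System Power Control",
            "PowerConsumedWatts": 310,
            "PowerLimit": {"LimitInWatts": 500},
        }
    ]
}

print(power_limits(SAMPLE))
# Real use (host, credentials and chassis ID are placeholders):
#   print(power_limits(fetch_power("bmc.example", "admin", "secret", "1")))
```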

Does anyone know which of these three is the most likely to have control over this?

Cheers, and thanks to all for the already phenomenal input.
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
Aspeed sells an EVB for the AST2600 which has an OpenBMC implementation. I don't want to put words into his mouth here, but I'm pretty sure that is what he is referring to.
I meant that Gigabyte has only one BMC firmware binary file for all of its motherboards that use the AST2600.
 

RolloZ170

Well-Known Member
Apr 24, 2016
7,185
2,240
113
And finally, there is something similar in the Supermicro BMC firmware:
Zwischenablage_02-04-2025_02.jpg
Zwischenablage_02-04-2025_01.jpg
 