For those with an error in dmesg like this:
Code:
ACPI FADT declares the system doesn't support PCIe ASPM, so disable
I wrote a small script to deploy on multiple systems to "patch" it.
I'm also trying to write a small script to debug ASPM, since I keep getting stuck left and right and I'd like to keep some notes on how to troubleshoot properly as I learn on this journey.
Note that right now it only patches the FACP table: it increments the OEM Revision number (needed to make sure that the patched ACPI tables are loaded instead of the default ones!) and sets the "PCIe ASPM Not Supported (V4) : 0" attribute.
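For reference, the two edits described above can be sketched in shell on the decompiled table. This is not the actual script, only a minimal illustration: patch_facp_dsl is a made-up name, and it assumes the "Field : Value" text layout that iasl -d writes for the FACP. The commented steps around it follow my understanding of the kernel's initrd table override procedure and are assumptions as well.

```shell
#!/bin/sh
# Hypothetical helper (NOT the script from this post): edit a decompiled
# FACP table in place. Assumes the "Field : Value" layout written by `iasl -d`.
patch_facp_dsl() {
    dsl="$1"

    # Read the current OEM revision (hex digits without 0x prefix) and add one.
    old_rev=$(sed -n 's/^.*Oem Revision : \([0-9A-Fa-f]*\).*$/\1/p' "$dsl" | head -n 1)
    [ -n "$old_rev" ] || { echo "Oem Revision not found in $dsl" >&2; return 1; }
    new_rev=$(printf '%08X' $(( 0x$old_rev + 1 )))

    # Write the bumped revision and clear the "ASPM not supported" flag.
    sed -e "s/Oem Revision : [0-9A-Fa-f]*/Oem Revision : $new_rev/" \
        -e 's/PCIe ASPM Not Supported (V4) : [01]/PCIe ASPM Not Supported (V4) : 0/' \
        "$dsl" > "$dsl.tmp" && mv "$dsl.tmp" "$dsl"
}

# Assumed surrounding steps (run as root, shown as comments only):
#   cat /sys/firmware/acpi/tables/FACP > facp.dat
#   iasl -d facp.dat          # decompile, writes facp.dsl
#   patch_facp_dsl facp.dsl
#   iasl facp.dsl             # recompile, writes facp.aml
```

The resulting facp.aml still has to be packed into an initrd override (kernel/firmware/acpi/ inside an uncompressed cpio archive prepended to the initrd) before the kernel will pick it up.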
Check that everything is working with:
Code:
# Check that ASPM is not doing something weird
dmesg | grep -i aspm
And:
Code:
# List PCIe Devices with ASPM Disabled
lspci -vvv | grep --color -B40 -A40 -i "ASPM Disabled"
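The grep above prints a lot of context. If you only want one line per device, something like the following might be handier. It is a rough sketch: aspm_summary is a made-up name, and it assumes the usual lspci -vvv layout where device headers are not indented and the link state sits on a "LnkCtl:" line.

```shell
#!/bin/sh
# Hypothetical helper: print "<device><TAB><ASPM state>" for every PCIe link
# found in `lspci -vvv` output read from stdin.
aspm_summary() {
    awk '
        # Device header lines start in column 1 with a hex bus address.
        /^[0-9a-fA-F]/ { dev = $0 }
        # "LnkCtl:" carries the current ASPM state ("LnkCtl2:" does not match).
        /LnkCtl:/ && /ASPM/ {
            state = $0
            sub(/^.*LnkCtl:[ \t]*/, "", state)   # drop everything up to the field name
            sub(/;.*$/, "", state)               # keep only the part before the first ";"
            printf "%s\t%s\n", dev, state
        }
    '
}

# Usage (as root, otherwise lspci hides the capabilities):
#   lspci -vvv | aspm_summary
```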
And:
Code:
# Display Statistics using turbostat every 0.5 Seconds (poll even every 0.05 Seconds if you want, polling every 5 Seconds might always "fall" into Spots where PC6/PC7 are 0%)
turbostat --show Avg_MHz,Busy%,Bzy_MHz,TSC_MHz,POLL,POLL%,C1%,C1E%,C3%,C6%,C7s%,CPU%c1,CPU%c3,CPU%c6,CPU%c7,Pkg%pc2,Pkg%pc3,Pkg%pc6,Pkg%pc7,PkgWatt,CorWatt,IPC,IRQ,SMI --interval 0.5
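Since single samples jump around a lot, it can help to average a column over many short intervals instead of eyeballing it. A minimal sketch, assuming turbostat's whitespace-separated output with a header row naming the columns; avg_column is a made-up name, and the turbostat invocation in the comment is just one possible way to feed it.

```shell
#!/bin/sh
# Hypothetical helper: average one turbostat column over all samples on stdin.
# Header rows (which may repeat) are recognised by the column name itself.
avg_column() {
    awk -v col="$1" '
        {
            is_header = 0
            for (i = 1; i <= NF; i++)
                if ($i == col) { idx = i; is_header = 1 }
        }
        # Only count numeric values on non-header rows once the column is known.
        !is_header && idx && $idx ~ /^[0-9.]+$/ { sum += $idx; n++ }
        END { if (n) printf "%.2f\n", sum / n; else print "no samples" }
    '
}

# Usage (as root), one minute of 0.5 s samples:
#   turbostat --quiet --Summary --show Pkg%pc6,Pkg%pc7 \
#       --interval 0.5 --num_iterations 120 | avg_column Pkg%pc6
```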
Also make sure to ACTIVATE ASPM in Linux after boot:
Code:
# PowerSave Mode
powertop --auto-tune
cpupower frequency-set --governor ondemand
# Activate ASPM
echo -n powersave > /sys/module/pcie_aspm/parameters/policy
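To confirm that the write actually took effect, read the policy file back: the kernel lists all policies and puts the active one in [brackets]. A small sketch that extracts it; active_aspm_policy is a made-up name, and the optional path argument is only there so the function can be tried on a copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: print the active ASPM policy, i.e. the entry shown in
# [brackets] in the policy file.
active_aspm_policy() {
    policy_file="${1:-/sys/module/pcie_aspm/parameters/policy}"
    sed -n 's/^.*\[\([^]]*\)\].*$/\1/p' "$policy_file"
}

# Usage:
#   active_aspm_policy        # should print "powersave" after the echo above
```

If the echo fails with a permission error even as root, the kernel is most likely refusing to change the policy at all, which would match the FADT message at the top of this post.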
Important Caveats:
- Sometimes, no matter what you do, everything seems right (ASPM forcefully enabled, nothing weird in lspci / dmesg / etc.), but the CPU does NOT enter package state PC6/PC7 at all. This happened on a Supermicro X10SLL-F (Intel Xeon E3-1275 v3; graphics is not enabled/powered anyway, so it's NOT that!) with a Mellanox ConnectX-4 NIC in one of the 2 CPU-connected PCIe slots. The PCH-connected PCIe slot (PCIe 2.0 x4 in x8) allowed ASPM to work somewhat: not hugely, but maybe 10% PC6 and 10% PC7. This is weird, since brief testing on a Supermicro X11SSL-F (Intel Xeon E3-1230 v5) showed PC6/PC7 with the card in either the CPU-connected PCIe slot or the PCH-connected PCIe slot.
- Do NOT set the governor to "powersave" unless this is a really lightly used system, as it might cause the load to go up since the frequency is always low. The ondemand and conservative governors yield similar results and seemed OK in my brief testing.
- You might need to DISABLE the ACPI tables temporarily (see this), boot the system, then regenerate the ACPI tables, should you change any setting in the BIOS or maybe just move a PCIe card from one slot to another (or install a new PCIe card, etc.).
EDIT 1: On Proxmox VE systems, wait at least 2-5 minutes after boot (as reported by uptime), since the CPU package is unlikely to even reach PC2 in the early phases while all VMs are spinning up (let alone PC6 or PC7).
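If you measure from a script, the wait can be automated. A minimal sketch, assuming the standard /proc/uptime format; wait_for_uptime is a made-up name, and the 300 seconds in the usage line are just the upper end of the 2-5 minutes mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: block until the system has been up for at least
# <min_seconds>. The optional second argument overrides /proc/uptime.
wait_for_uptime() {
    min_seconds="$1"
    uptime_file="${2:-/proc/uptime}"
    while :; do
        # /proc/uptime holds "<seconds up> <seconds idle>"; keep the integer part.
        up=$(cut -d' ' -f1 "$uptime_file" | cut -d. -f1)
        [ "$up" -ge "$min_seconds" ] && break
        sleep 5
    done
    echo "uptime is ${up}s, starting measurement"
}

# Usage:
#   wait_for_uptime 300 && turbostat --show Pkg%pc2,Pkg%pc6,Pkg%pc7 --interval 0.5
```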
EDIT 2: On Supermicro X10SLL-F, it seems that setting the following helped re-enable ASPM with an Intel X710-DA2 in the CPU-connected PCIe slot:
- C1 Auto Demotion -> Disabled
- C3 Auto Demotion -> Disabled
- C-State Pre-Wake -> Disabled
- Package C-State limit -> C7
- LakeTiny Feature -> Enabled
I suspect it's more the "Demotion" and "Pre-Wake" Settings that did the Trick.
Furthermore, in a very short test the conservative governor now seemed to work better than the ondemand governor with regard to PC6 (PC7 is disabled as soon as the Intel X710-DA2 NIC is in one of the CPU-connected PCIe slots).
EDIT 3: For more advanced troubleshooting and possibilities, my other repository might be of help. It outlines a "checklist" to check the most common issues, as well as more advanced stuff such as disassembling the BIOS and setting hidden BIOS options from the UEFI shell.