Dual Xeon Platinum 8570 ES Cinebench 2026 Performance Drop / Frequency Stuck Around 1.9–2.8 GHz


nodpc

New Member
May 13, 2026
7
1
3
Hi everyone,


I am testing a dual-socket Intel Xeon Platinum 8570 ES system and would like to share my Cinebench 2026 results and ask for advice.


System configuration:


  • Motherboard: Gigabyte MS73-HB2
  • BIOS: R22, dated 03/19/2026
  • CPU: 2 × Intel Xeon Platinum 8570 / Emerald Rapids-SP / ES
  • Total cores/threads: 112 cores / 224 threads
  • Memory: DDR5 RDIMM, currently 48 GB installed
  • GPU: NVIDIA GeForce RTX 5090
  • OS: Windows 11 Pro for Workstations, 64-bit, build 22000

Cinebench 2026 results:


  • GPU score: 141,197 pts
  • CPU Multi Thread score: around 18,929 pts
  • CPU Single Thread test was running around 186 pts

The GPU score looks normal, but the CPU score seems much lower than expected for this dual Xeon 8570 setup.


Previously, the same system scored around 22,000+ in CPU benchmarks, but now the score is much lower even after restoring BIOS defaults. With Hyper-Threading disabled, the CPU score actually improves, which is strange.


During the Cinebench CPU test, Task Manager shows most cores running at around 1.9 GHz, and HWiNFO and CPU-Z report a maximum observed frequency that is often only around 2.8 GHz. The CPUs do not seem to boost properly under full load.


Current observations:


  • Turbo Mode is enabled.
  • Hardware P-States set to Out of Band Mode improved performance somewhat.
  • CPU C-states / C1E / Package C-State settings were tested.
  • Windows 11 power plan was changed using powercfg, but the CPU still appears limited.
  • The system often stays around x19 under load.
  • I cannot find an obvious AVX Ratio Offset setting in the Gigabyte MS73-HB2 BIOS.
  • Power supply is 1600W, but I am not sure if power delivery or motherboard power limits could still be involved.
  • CPU-Z reports:
    • Base frequency: 1.9 GHz
    • Max turbo frequency: 4.0 GHz
    • Current observed frequency: often around 1.9–2.8 GHz

My main question is:


What BIOS or Windows settings should I check to make sure both Xeon 8570 CPUs can boost correctly under full load?


I am especially interested in:


  1. Turbo Power Limit / Package Power Limit settings
  2. TDP Configuration
  3. Intel Speed Shift / SST-CP / Hardware P-States
  4. Energy Efficient Turbo
  5. AVX frequency limits
  6. NUMA / Windows processor group behavior
  7. Any known Gigabyte MS73-HB2 BIOS setting that may limit dual CPU boost

I have attached screenshots of Cinebench and HWiNFO taken during the test.


Any suggestions would be appreciated. Thank you!
 


CyklonDX

Well-Known Member
Nov 8, 2022
1,820
659
113
It's not strange; the behavior is correct.

Each CPU has a TDP limit, and that is the main factor. The next factor is the number of cores it can keep boosted under load, which depends on the instruction set in use (base, AVX2, AVX-512): the more cores are loaded with a given instruction set, the lower the clocks they can sustain. (You can typically only tune this down, if temps become a problem.)

Example: full load on X cores
base = 1-24 = 4GHz, 25-30 = 3.8GHz, 31-35 = 3.6GHz, 36-40 = 3.4GHz ... and so on
avx2 = 1-12 = 4GHz, 13-16 = 3.6GHz, 17-24 = 3.2GHz ... and so on
avx512 = 1-8 = 4GHz, 9-12 = 3.5GHz, 13-16 = 3.0GHz ... and so on
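
The example bins above can be read as a lookup from (instruction set, number of loaded cores) to a frequency ceiling. A minimal Python sketch of that idea, using only the illustrative numbers from the example; they are not the real Xeon 8570 turbo bins, and the names `EXAMPLE_BINS` and `turbo_ceiling_ghz` are made up for illustration:

```python
# Illustrative turbo bins copied from the example above: (max loaded cores, GHz).
# These are NOT real Xeon 8570 values; the example stops at "... and so on".
EXAMPLE_BINS = {
    "base":   [(24, 4.0), (30, 3.8), (35, 3.6), (40, 3.4)],
    "avx2":   [(12, 4.0), (16, 3.6), (24, 3.2)],
    "avx512": [(8, 4.0), (12, 3.5), (16, 3.0)],
}


def turbo_ceiling_ghz(instruction_set, loaded_cores):
    """Return the frequency ceiling in GHz, or None past the listed bins."""
    if loaded_cores < 1:
        raise ValueError("loaded_cores must be at least 1")
    for max_cores, ghz in EXAMPLE_BINS[instruction_set]:
        if loaded_cores <= max_cores:
            return ghz
    return None  # the example does not list bins this far out


if __name__ == "__main__":
    for name in EXAMPLE_BINS:
        print(name, [turbo_ceiling_ghz(name, n) for n in (4, 14, 28)])
```

The point of the sketch is only that the ceiling drops as more cores are loaded, and drops faster for AVX2 and AVX-512 than for non-AVX code.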


Allowing P-states (native is typically the best option) and running the Balanced power plan on Windows typically gets you the highest clocks when a load comes in that doesn't use the whole CPU, i.e. idle cores are allowed to downclock so that more power goes to the few cores in use.
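
To confirm which power plan is actually active, the output of `powercfg /getactivescheme` can be read from Python. This is a rough sketch that assumes the English-language output format (`Power Scheme GUID: <guid>  (<name>)`); the function names are made up for illustration, and localized Windows installs may print something different:

```python
import re
import subprocess

# Matches lines such as:
# Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
SCHEME_RE = re.compile(r"GUID:\s*([0-9a-fA-F-]{36})\s*\((.+)\)")


def parse_active_scheme(powercfg_output):
    """Extract (guid, name) from `powercfg /getactivescheme` output."""
    match = SCHEME_RE.search(powercfg_output)
    if match is None:
        raise ValueError("unrecognized powercfg output")
    return match.group(1).lower(), match.group(2).strip()


def get_active_scheme():
    """Run powercfg (Windows only) and return the active (guid, name)."""
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return parse_active_scheme(result.stdout)
```

On Windows, `get_active_scheme()` should return a pair such as the GUID and the name `Balanced`; the parser is kept separate so it can be checked against a saved output string.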

Sometimes CPUs have different TDP configs that you can change in the BIOS. If they are not described, try them all: the default is sometimes pre-configured lower for countries with "green" compliance rules, and switching to a different config may give better or worse TDP limits and boost scaling.

The Windows scheduler is bad once you go over 8 cores; I recommend trying Process Lasso for anything over 8 cores on Windows.

Next:

If you are not using the system for virtualization, disable SR-IOV, VT-d, etc., and turn off the Windows memory protections (they are meant for keeping VMs secure and can take a significant hit on CPU performance).

Enable Above 4G Decoding so that BAR addresses can sit above 32 bits (i.e. you can assign more memory to your PCIe devices, and they can potentially perform better).

All the CPU settings depend on what you are trying to do, so it's something you need to play around with.
If you use Windows, you should disable the sub-NUMA clustering function unless you use Process Lasso or Linux (in that case, look at the latency between cores and decide whether to split by 2 or 4; this can sometimes yield benefits).
(To figure out whether you need sub-NUMA clustering on Windows, get Sysinternals; CoreinfoEx64.exe will show you the distance between threads/cores. If latency gets big within the same NUMA node, you might benefit from sub-NUMA clustering. It tells the Windows scheduler to treat each cluster as 'almost' another CPU.)
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,062
3,227
113
germany
nodpc said:
"What BIOS or Windows settings should I check to make sure both Xeon 8570 CPUs can boost correctly under full load?"
Try the CPU-Z CPU stress test and check the clocks in the HWiNFO summary window and the package power in the sensors window.
Non-AVX all-core should be x28.
AVX2 all-core is x27.
But the long-duration TDP limit is 330 W per CPU in any case.
You can set the PL1 Time Window to, e.g., 256 seconds; that allows the CPU to exceed TDP for that time, and the fused clock multiplier is then the limit.
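
The ratios quoted in this thread map directly onto the reported clocks. A small Python sketch of the conversion, assuming a 100 MHz reference clock (an assumption, though it is consistent with the x19 ratio and the 1.9 GHz base frequency reported earlier in the thread); the function names are made up for illustration:

```python
BCLK_MHZ = 100.0  # assumption: 100 MHz reference clock


def ratio_to_ghz(ratio, bclk_mhz=BCLK_MHZ):
    """Convert a core ratio (multiplier) to a frequency in GHz."""
    return ratio * bclk_mhz / 1000.0


def ghz_to_ratio(ghz, bclk_mhz=BCLK_MHZ):
    """Convert a frequency in GHz to the nearest core ratio."""
    return round(ghz * 1000.0 / bclk_mhz)


if __name__ == "__main__":
    ratios = [
        ("observed under load", 19),
        ("AVX2 all-core", 27),
        ("non-AVX all-core", 28),
        ("max turbo", 40),
    ]
    for label, ratio in ratios:
        print(f"{label}: x{ratio} = {ratio_to_ghz(ratio):.1f} GHz")
```

Under that assumption, the 1.9 to 2.8 GHz range seen in HWiNFO corresponds to ratios x19 through x28, i.e. anywhere from base clock to the non-AVX all-core ratio.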
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,062
3,227
113
germany
nodpc said:
"What BIOS or Windows settings should I check to make sure both Xeon 8570 CPUs can boost correctly under full load?"
CBR23 had a problem with this and CBR24 was better; it seems the issue has returned.
I recommend tweaking your system for your real workload, not Cinebench.
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,062
3,227
113
germany
nodpc said:
"With Hyper-Threading disabled, the CPU score actually improves, which is strange."
That tells me there is an issue with too many threads. If you tweak something in the BIOS (NUMA, sub-NUMA), a reset of the Windows scheduler is required; I don't know of any way to do that other than a fresh Windows install.
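
One reason the thread count can matter on Windows: a processor group holds at most 64 logical processors, so 224 threads cannot fit in fewer than four groups, while 112 threads (Hyper-Threading off) can fit in two. A rough Python sketch of that arithmetic; the function name is made up, and this is only a lower bound, since how Windows actually lays out groups across sockets and NUMA nodes may differ:

```python
import math

MAX_LOGICAL_PER_GROUP = 64  # Windows processor group size limit


def min_processor_groups(logical_processors):
    """Lower bound on the number of Windows processor groups needed."""
    if logical_processors < 1:
        raise ValueError("logical_processors must be at least 1")
    return math.ceil(logical_processors / MAX_LOGICAL_PER_GROUP)


if __name__ == "__main__":
    print("HT on :", min_processor_groups(224))
    print("HT off:", min_processor_groups(112))
```

Whether this is what hurts the Cinebench score here is not established in the thread; the sketch only shows that the two configurations differ in how many groups the scheduler has to deal with.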
 

nodpc

New Member
May 13, 2026
7
1
3
Thank you! I'm just using this system for AI workloads and Python, and I've already disabled VT-d. I saw a slight performance improvement after that.
 

CyklonDX

Well-Known Member
Nov 8, 2022
1,820
659
113
If that's the case, I would also recommend moving over to Linux and disabling most CPU security mitigations (Windows doesn't really offer a way to turn off the security mitigations that impact performance), since you don't really need to protect your memory or cross-core communication here.

For the Linux boot parameters, I'd recommend the following:
pti=off spectre_v2=off l1tf=off mds=off nospec_store_bypass_disable pci=big_io_size_mem pci=nocrs pci=assign-busses pci=big_io pci=realloc

This would turn the mitigations off and also create a favorable environment for assigning enough BAR address space to your GPU for AI. (Though you don't have a lot of RAM in the first place, certainly not enough to cover the whole range allowed for the 5090.)
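
After rebooting with parameters like these, it is worth checking what the kernel actually reports. A minimal Python sketch that reads the per-vulnerability status files Linux exposes under `/sys/devices/system/cpu/vulnerabilities`; the function names are made up for illustration, and the sketch returns an empty result on systems without that directory:

```python
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir=VULN_DIR):
    """Return {vulnerability name: status line} from sysfs (Linux only)."""
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}  # not Linux, or sysfs is not mounted
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


def still_mitigated(status):
    """Names whose status line starts with 'Mitigation'."""
    return [name for name, text in status.items() if text.startswith("Mitigation")]
```

Printing `read_mitigation_status()` before and after changing the boot parameters shows which entries moved from `Mitigation: ...` to `Vulnerable`; entries reported as `Not affected` are unchanged either way.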
 

CyklonDX

Well-Known Member
Nov 8, 2022
1,820
659
113
its like to run a 16core Gaming PC with single channel DDR5-2400
It's kind of worse: it's just 3 sticks of 16 GB.
For performance, I'd advise making sure the 3rd stick is on the first CPU.


OP, you would be better off with 2 × 32 GB dual-rank sticks instead (but right now, getting a 4th 16 GB stick would be the better move).
 

nodpc

New Member
May 13, 2026
7
1
3
In the end, memory turned out to be the key factor. After adding two more 16 GB memory sticks, performance improved significantly.


After three weeks of trial and error, my final recommendation is to pick a motherboard with a built-in 10G Ethernet port: the graphics card is so large that it blocks three PCIe slots as well as the SFF connector.


To make sure all of the case's front-panel USB ports, as well as Wi-Fi and Bluetooth, work properly, I bought several useful adapter boards:


  • PCI-E X4/X8/X16 to USB 3.2 Gen2 Adapter — must be installed in Slot 1
  • USB 3.0 19/20-pin 1-to-2 USB 3.0 19/20-pin hub
  • Motherboard 19-pin USB 3.0 header to dual USB 2.0 9-pin female adapter

To add more NVMe drives:

  • PCIe 5.0 x16 expansion card / PCIe bifurcation riser card, which splits one x16 slot into x8 + x4 + x4.

For cooling, I used one air cooler and one Arctic Cooling Liquid Freezer WS360-4710. With dual air coolers, CPU1 is often about 10°C hotter than CPU0.
 


nodpc

New Member
May 13, 2026
7
1
3
Thanks. The performance is much better after adding more memory.
 

TrevorH

Active Member
Oct 25, 2024
208
90
28
A Xeon Platinum 8570 has 8 memory channels and you have two of them. I suspect that means to get the maximum memory bandwidth you should be installing at least 8 DIMMs per processor socket, 16 total.
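
To put rough numbers on that, theoretical peak bandwidth scales with the number of populated memory channels. A back-of-the-envelope Python sketch, assuming one DIMM per channel, a 64-bit (8-byte) data path per channel, and DDR5-5600; the transfer rate is an assumption, since the thread does not state the DIMM speed, and the function name is made up for illustration:

```python
BYTES_PER_TRANSFER = 8  # 64-bit data path per channel (ECC bits excluded)


def peak_bandwidth_gbs(populated_channels, mega_transfers_per_s=5600):
    """Theoretical peak memory bandwidth in GB/s for the populated channels."""
    return populated_channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


if __name__ == "__main__":
    # 3 DIMMs (original setup), 5 DIMMs (after the upgrade), 16 DIMMs (fully populated)
    for dimms in (3, 5, 16):
        print(f"{dimms:2d} DIMMs: {peak_bandwidth_gbs(dimms):.1f} GB/s")
```

These are ceilings for the whole system, not measurements; real throughput also depends on how the DIMMs are split between the two sockets and on where the threads run relative to the memory.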
 