AMD EPYC 7551 low boost clocks in ESXi

mirol

New Member
Jan 6, 2018
19
1
3
34
Hi, I just put together a server with an AMD EPYC 7551 CPU and an ASRock ROMED8-2T motherboard, plus 4x 8GB M393A1G40EB1-CPB 2133P (just for tests, I will get better memory later). My issue: this CPU model should boost to 3 GHz on up to 12 cores and 2.55 GHz on all cores, but for me it always maxes out at 2550 MHz. It doesn't matter whether 1 or 8 cores are loaded, it never goes above 2550. I tried setting the power settings in the BIOS to max performance and disabling C-states, and of course set Max performance on the ESXi host, still the same. CPU temp never goes above 50 degrees. I checked on Windows Server 2019 and it easily boosts to almost 3 GHz on 8 cores. Is there some issue with first gen on ESXi, or am I missing something?
 

superempie

Member
Sep 25, 2015
58
8
8
The Netherlands
A bit like what I have here. On ESXi 7.0.2 in an Ubuntu VM I only get just about 2 GHz. In ESXi I haven't found out how to read the boost clock.
A Windows 10 Pro test from a USB stick got me all-core to around 2.55 GHz, dropping when temps got higher (>50C). This was with 4 DIMMs.
I now have all DIMM slots populated on ESXi. The speed of the VMs in BOINC improved dramatically when I upgraded from 4 to 8 DIMMs. This was without testing Windows. YMMV, this is my experience...
Maybe with better cooling you can get higher all-core clocks. Didn't try it yet, still experimenting with it. I did not adjust any speed settings, only set ESXi to high power mode.
I read somewhere that ESXi 7.0.2 seems to have a new scheduler for EPYCs.
 

mirol

I don't think it's a temp issue: even when I load only 4 cores and temps stay below 40, it still maxes out at 2550. I forgot to mention that I did all ESXi tests on VMware ESXi 7.0.2, build 17630552.
 

superempie

Got the same ESXi version. Didn't see the 3 GHz yet, only 2.5 GHz, but I am pushing it more all-core. Speed drops above 50C.
Didn't try with fewer cores yet. I use SMT to get 64 vCores. Still playing with it.
What is your use-case?
 

mirol

I just have a little home lab for some stuff and to play with, tbh. When I use this motherboard with my other EPYC, a 7502, it boosts up perfectly fine to 3.3 GHz, no problem. I think the issue is mainly the way ESXi works with 1st gen EPYC.
 

superempie

Could be that it is an ESXi issue with 1st gen EPYC. I haven't had this setup that long, but I have read a lot about NUMA and 7001 series EPYC to get a bit of a feel for it. Maybe newer ESXi 7 releases will make things better.
I also use mine in my homelab. When 7002 series EPYC gets a bit cheaper second hand, I might upgrade. I use Samsung 2666MHz memory for now, 8x 32GB: 4x M393A4K40CB2-CTD6Q and 4x M393A4K40CB2-CTD7Q.
 

superempie

Some thoughts:
1) Did you try enabling/disabling SMT?
2) Did you check whether the VM spans NUMA nodes or runs on a single one?
3) Is the VM running on a NUMA node that has memory attached to it? You only run 4 DIMMs, and the 7551 has no I/O die: the memory controllers are on the dies themselves. Maybe something is slowing down there.

I have been playing around with it today and it looks like my VMs are performing better according to the CPU graph in vSphere for the ESXi host. I am running with SMT. Did not check esxtop etc. yet. I still have to test it some more for a couple of days, and when I have the time I will look into it further. This is what I did:
1) Changed the advanced VM parameter numa.autosize.vcpu.maxPerVirtualNode to double its value. If a VM has 16 vCPU with SMT, this defaults to 8. It looks like it detects the physical CPUs on the die. Changed it to 16 to match the vCPU count assigned to the VM. CPU setting is 1 CPU, 16 sockets.
2) Pinned the VM to a NUMA node through the advanced VM parameter numa.nodeAffinity.
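
For reference, a minimal Python sketch of how those two advanced parameters could be written out as key = "value" lines, the format used in a VM's .vmx file. The parameter names come from the steps above; the helper names numa_advanced_settings and as_vmx_lines are made up for illustration, and whether these values help on a given host is not something the sketch can tell you.

```python
def numa_advanced_settings(vcpus: int, numa_node: int) -> dict[str, str]:
    """Build the two advanced VM parameters as key/value strings.

    vcpus:     vCPU count assigned to the VM (used for maxPerVirtualNode)
    numa_node: host NUMA node the VM should be pinned to
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if numa_node < 0:
        raise ValueError("numa_node must be 0 or higher")
    return {
        "numa.autosize.vcpu.maxPerVirtualNode": str(vcpus),
        "numa.nodeAffinity": str(numa_node),
    }


def as_vmx_lines(settings: dict[str, str]) -> list[str]:
    """Format settings as key = "value" lines."""
    return [f'{key} = "{value}"' for key, value in settings.items()]


if __name__ == "__main__":
    # Example: a 16 vCPU VM pinned to NUMA node 1.
    for line in as_vmx_lines(numa_advanced_settings(vcpus=16, numa_node=1)):
        print(line)
```

The same keys and values can be entered by hand under the VM's advanced configuration parameters; the script only saves typing when preparing several VMs.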

I am currently running 4 Ubuntu 20.04 command-line-only VMs with BOINC. The VM on NUMA node 0 has 12 vCPU (left some for the OS just in case) and the other 3 have 16 vCPU each and are pinned to NUMA nodes 1, 2 and 3.
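
A quick sanity check of that layout, as a Python sketch. It assumes the 7551 topology of 4 dies with 8 cores each, one NUMA node per die, and SMT enabled (2 threads per core); check_layout is a hypothetical helper, not anything from ESXi.

```python
CORES_PER_DIE = 8      # EPYC 7551: 32 cores spread over 4 dies
THREADS_PER_CORE = 2   # SMT enabled
NUMA_NODES = 4         # assumption: one NUMA node per die


def threads_per_node() -> int:
    """Logical CPUs available on one NUMA node."""
    return CORES_PER_DIE * THREADS_PER_CORE


def check_layout(layout: dict[int, int]) -> dict[int, int]:
    """Map each NUMA node to its spare logical CPUs.

    layout maps NUMA node -> vCPU count of the VM pinned there.
    Raises ValueError if a node does not exist or a VM does not fit.
    """
    capacity = threads_per_node()
    spare = {}
    for node, vcpus in layout.items():
        if not 0 <= node < NUMA_NODES:
            raise ValueError(f"NUMA node {node} does not exist")
        if vcpus > capacity:
            raise ValueError(f"{vcpus} vCPU do not fit on node {node}")
        spare[node] = capacity - vcpus
    return spare


if __name__ == "__main__":
    # The layout described above: 12 vCPU on node 0, 16 on nodes 1 to 3.
    print(check_layout({0: 12, 1: 16, 2: 16, 3: 16}))
```

With these assumptions each 16 vCPU VM fills exactly one die's 16 threads, and node 0 keeps 4 logical CPUs free.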

This is how the vSphere CPU graph of the host looked before and after:
VMware_ESXi_4_VMs_on_seperate_NUMA_node_with_affinity_-_exported2021-04-05 18.41.png
To me this looked way better.

I also suspect vSphere is not showing boost clocks, only stock clocks. I found a post somewhere (not here) explaining that you should check on the command line of the host. Will have to find that post again.

This might suit my current use case better than yours, but maybe it helps.

Edit: By the way, how did you measure the speed of the CPU in ESXi?
 

superempie

Out of curiosity I tried to replicate your problem.

Details:
- Windows 2019 Eval VM fully updated, with VMware tools installed.
- Host has SMT still enabled. Did not try to disable it.
- Selected 'Enable virtualized CPU performance counters' on the CPU. Didn't try without.
- In order of testing: Tried with 1, 2, 4 and 3 vCPU.
- 1 core per socket.
- No NUMA node affinity testing.
- One VM running on host, no more.
- Software used to get the CPU up to speed: CineBench R23
- Results in vSphere monitor graphs:
  - 1 vCPU got me around 3 GHz, CPU temps around 31C
  - 2 vCPU got me around 6 GHz, CPU temps around 31C
  - 4 vCPU got me max 10.005 GHz, CPU temps around 37C
  - 3 vCPU got me max 8.459 GHz, forgot to look at CPU temps (6 + 2.5?)
- Not recorded: Tried 8 vCPU, but it looked to me like all cores were at 2.5-ish GHz.
- Didn't check esxtop.
- Did check the host graphs (host web interface), but they looked the same. Didn't check all the time, focussed on the vSphere graphs.
- Windows Task Manager showed 2.00 GHz for both Speed and Base speed. This did not change in any of the tests.
- Testing started just after 7:30 PM. Anything before that: ignore.
- The spike just before 7:45 PM is me hitting the wrong button in the 2 vCPU test in CineBench R23 and trying again around 7:45 PM.
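
To make those totals easier to compare, here is a small Python sketch that divides each figure by the vCPU count. It assumes the vSphere graph shows the sum over all vCPUs of the VM, which is a guess based on the numbers, and it uses 3.0 and 6.0 for the "around 3 GHz" and "around 6 GHz" readings.

```python
# Total GHz read from the vSphere graph, keyed by vCPU count (in test order).
results_ghz = {1: 3.0, 2: 6.0, 4: 10.005, 3: 8.459}


def per_vcpu_ghz(total_ghz: float, vcpus: int) -> float:
    """Average clock per vCPU, assuming the graph sums over all vCPUs."""
    return total_ghz / vcpus


if __name__ == "__main__":
    for vcpus, total in results_ghz.items():
        avg = per_vcpu_ghz(total, vcpus)
        print(f"{vcpus} vCPU: {total:.3f} GHz total, {avg:.2f} GHz per vCPU")

    # The "6 + 2.5" guess for the 3 vCPU run: two vCPUs boosting, one not.
    print(f"2 x 3.0 + 1 x 2.5 = {2 * 3.0 + 2.5:.1f} GHz (measured: 8.459)")
```

Under that assumption the 4 vCPU run averages about 2.50 GHz per vCPU and the 3 vCPU run about 2.82 GHz, which fits the picture of only one or two vCPUs boosting to 3 GHz.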

Host graph - look at dark pink/purple-ish line for MHz:
Vmware ESXi 7.0.2 - Host - CPU GHz test 1,2,4,3.png

VM graph:
Vmware ESXi 7.0.2 - Win2019VM - CPU GHz test 1,2,4,3.png


As far as I can see, vSphere can detect 1 and 2 vCPU going to 3 GHz, but this info is not communicated to the guest VM.
What could be tested next is building 4 Win2019 VMs and setting the NUMA node affinity for each. Then test again and see what speeds you get with 1, 2, 4 and 3 vCPU, though that might be harder to filter out. Maybe the higher-speed cores will be spread out over the different dies.

Just curious if someone with another brand motherboard and 7551 experiences the same.

I'll leave it for what it is for now.
 