Finally: Overclocking EPYC Rome ES

Epyc

Member
May 1, 2020
56
5
8
I'd believe 1.07V@2.4GHz, I think HWINFO64 and IPMI were reading just under a volt on my 64C ZS1406 @2.5. 2S is one stepping earlier than ZS so a slight increase in voltages to hit similar clock speeds seems plausible.

The best way to resolve the confusion is with a wattmeter - if it's actually 1.34V you'll be seeing some incredibly high power consumption at the wall, probably 600W+.
I have checked with a wall meter: idle on the desktop it draws about 280 W, and under full compute load it goes a bit over 400 W.
So the lower voltage reading is indeed probably the more realistic one.
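
A rough sanity check of the wattmeter argument, as a minimal sketch. It assumes the common first-order model that dynamic CPU power scales with V^2 * f, and it ignores leakage, VRM losses and the rest of the system. The function name is made up for illustration.

```python
def dynamic_power_ratio(v_old: float, v_new: float,
                        f_old: float = 1.0, f_new: float = 1.0) -> float:
    """Ratio of dynamic power under the first-order model P ~ V^2 * f."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# The two voltages discussed above, with the same clock assumed for both.
ratio = dynamic_power_ratio(v_old=1.07, v_new=1.34)
print(f"1.34 V vs 1.07 V at the same clock: about {ratio:.2f}x the dynamic power")
```

Under that assumption 1.34 V would cost roughly 1.6x the dynamic power of 1.07 V, so a reading of a bit over 400 W at the wall fits the lower voltage better.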
 

bayleyw

Member
Jan 8, 2014
84
21
8
Be super careful - my understanding is that this overclock is more akin to a PBO type tweak than a standard overclock; the SMU is still alive and working in the background to adjust voltages and frequencies on the fly based on various system statuses. The voltage you set might not be the voltage you see and vice versa, likewise for package power (mine was reading something absurdly low, like 70W) and frequency (which doesn't correspond to performance).

I was using CB20 and a wattmeter to tweak mine - CB20 gives you a good idea of the frequency you're actually hitting and the wattmeter lets you estimate the volts accurately. For example if you could hit 30K CB20 at 700W you'd be doing pretty well since you'd have matched a PBO 3990X.

2x 32C has potential as a workstation though, apparently the 32C chips are way less temperamental than the 64C ones and you get twice the VRM capacity on the board to feed the cores. You could even make an argument that it is "better than a 3990X", since you have twice the package area to dissipate power and RDIMM support.

That power profile is interesting: I see 140 W idle and 360 W under load, so while our load powers are similar, your system is idling way higher than mine. I'd also try dropping to a manual 2.4GHz + lock frequencies overclock; it looks like that sometimes drops the volts compared with setting it high and letting the board throttle.
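
To put the two power profiles side by side, here is a small sketch using the wall readings quoted in this thread: about 280 W idle and a bit over 400 W load (taken here as 400 W), versus 140 W idle and 360 W load. The dictionary layout and names are only illustrative.

```python
# Wall-meter readings in watts as quoted in the thread.
# "A bit over 400w" is rounded down to 400 here, which is an assumption.
systems = {
    "Epyc": {"idle": 280, "load": 400},
    "bayleyw": {"idle": 140, "load": 360},
}

# Load minus idle is a rough proxy for what the CPUs add under load.
deltas = {name: r["load"] - r["idle"] for name, r in systems.items()}

for name, r in systems.items():
    print(f"{name}: idle {r['idle']} W, load {r['load']} W, delta {deltas[name]} W")
```

The load figures are close, but the idle-to-load delta differs a lot, which is what makes the high idle draw stand out.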
 

Epyc

Member
May 1, 2020
56
5
8
It does seem that the samples are already close to their maximum performance: increasing the voltage to 1.2 V in the utility only lets me go up to 3.20 GHz in software, which is about 2.6 GHz in real life.
So I fear the remaining gain has to come from the extra DIMMs and some more fine-tuning.
 

bayleyw

Member
Jan 8, 2014
84
21
8
If you leave the voltage at default (reboot, don't touch the voltage box) and try to set 3.0 GHz what happens?
 

Epyc

Member
May 1, 2020
56
5
8
As soon as I start CB the screen instantly goes black and the system reboots.
I really have the feeling the wall is right in front of me o_O
But that's okay, because the rig is just my mini render farm to offload projects to, so it can buzz along.
 

yesoos

Member
Mar 10, 2020
38
3
8
PL
Hi, for the ZS1406 sample I got the best results (performance per watt) with the PBO hack: voltage set to 0.9 V and frequency to 3.2 GHz, with a full-load power consumption of about 270 W and a CB20 score of 19600. Hi-pot testing right now for 7 days.
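
For comparing results like this on a performance-per-watt basis, a minimal sketch follows. It assumes the "C20 score" is a Cinebench R20 score and that the power figures are comparable measurements, which the thread does not confirm. The helper name is made up.

```python
def points_per_watt(cb20_score: float, watts: float) -> float:
    """Cinebench R20 points per watt of measured power."""
    return cb20_score / watts


# (score, watts) pairs taken from figures mentioned in this thread.
results = {
    "ZS1406 at 0.9 V, 3.2 GHz set": (19600, 270),
    "30K CB20 at 700 W target": (30000, 700),
}

for label, (score, watts) in results.items():
    print(f"{label}: {points_per_watt(score, watts):.1f} points/W")
```

The low-voltage setting trades absolute score for efficiency, which is the point of tuning for performance per watt.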
 


Epyc

Member
May 1, 2020
56
5
8
That's not too far from my CB score either.
Also, the seller of the chips told me my samples were able to boost a fair bit higher on single-socket boards. He had the same experience on the dual-socket Supermicro.
 

Epyc

Member
May 1, 2020
56
5
8
Have you monitored the VRM temperature? Have you tried actively cooling it?
Yes, I already had a small fan on the middle heatsink and on the heatsink at the PCIe slot. After hours of heavy work I saw in the readings that the SoC VRM was the only thing that became a little hot, around 55-60 degrees, which is not even that warm.
I did check and verify with a thermal imaging camera that the reading was indeed correct, and just added a fan on both SoC VRMs... just because I have them lying around.
Only the bottom one is visible in the photo; the other one is behind the rear socket.
 


Epyc

Member
May 1, 2020
56
5
8
It's also a real shocker to me just how totally crap this mobo is.
A little rant follows, forgive me.

I know a server is no gaming machine, so I don't mean RGB and fancy bells and whistles.
But it also works like a 10 dollar OEM mobo purchased on a Chinese market, combined with the feel of "we will never ever provide proper support and updates".
I get the need for the server-specific things around it, but a simple VGA output with the BMC is not worth hundreds of dollars.
And even then, come on, I know again it's a purely functional thing, but it doesn't cost any more time or effort to make the interface look like it's not from 1989.

If I had any choice in mobos... unfortunately not, because this is the only dual-socket board available for separate purchase.
I would have returned this piece of crap in a heartbeat.
I find it funny because Supermicro is trying to get some headway in the high-end desktop and consumer space, but if this board is representative of their overall quality I can tell them now it's a lost cause, because even the average TRX40 workstation board is so insanely more stable, robust and loaded with a ton of useful stuff.
And you have things like the EPYC chips, which are the best-binned dies around, and GPUs like the V100 that are also very complex and binned, and it all has to connect to this piece of garbage.
It's like the memory timings of ECC memory. I fully understand it has to be error-proof. But a lot of the dies from Samsung are already good for 3200 MHz, it's just yeah... no... let's just run it at 2133, nobody needs speed.
Yeah, you know what, let's loosen up the timings to way beyond what anyone would normally accept and after that we will charge a premium for it. Server powerrrrrrrr.

Okay, that was probably not very nuanced.
So let me know of the pitfalls I did not consider while looking through my high-end workstation glasses.

On a side note, I can confirm they won't boot with normal DDR4, I just had to try.

Also, where in the world can I find cables for those (to me) insanely stupid, proprietary and unknown NVMe and SATA combo ports? :eek: :rolleyes:
They completely block any possibility of the long GPU I had originally intended for the system.
Again, Supermicro probably never looked at just putting the connectors at the edge, rotated 90 degrees, so they would not stick out and block stuff. :mad: Just general disappointment about everything about this mobo; to be clear, I love the dual EPYC.
 


mirrormax

Member
Apr 10, 2020
72
30
18
The mini-SAS position is pretty annoying, but the OCuLink port should be fine. Maybe move the GPU down so it doesn't cover the mini-SAS.
 

Epyc

Member
May 1, 2020
56
5
8
Hey, after some good thinking about the fact that the dual-socket setup can't clock as high as the single-socket one does, it made me think it has to do with the socket-to-socket connection.
I realized I had tested all the different speed settings of the inter-socket link. 18 Gbps was an almost instant crash, and 16 Gbps was stable and was my current setting, to maximize bandwidth and get latency as low as possible.
But I know from the 18 Gbps result that this link can cause instability.
So I tuned it down to 13 Gbps and, what do you know, I can go up 200 MHz without any issue.
Reducing it further to 10.666-something does not seem to give any more improvement, so the best balance is at 13 Gbps for this setup. It sucks, because according to the AMD reference documentation we should be able to configure a lot more on the inter-socket link, like making it 8 links wide. But Supermicro probably cheaped out on that.

So with some new fine-tuning, clocks are probably at 2.6-2.7 GHz all-core stable.
Has anyone experimented with memory interleaving size and chip select interleaving?
 

ExecutableFix

Active Member
Nov 25, 2019
114
46
28
Awesome to see everyone sharing their results :D

I'll make the software open-source soon enough so everyone can adjust their presets if they want to.
 

MrCake117

Member
Feb 28, 2019
32
22
8
21
Japan
I unlocked some settings in AMD CBS concerning Infinity Fabric. Changing one of these settings makes my H11DSi get stuck at POST code 94 or D2, which seems to be related to PCIe enumeration.

Here's my custom BIOS:

H11DSi 2.0 XFR
H11DSi 2.0 XFR Fixed

- The board automatically boots on an offboard GPU; IOMMU is enabled by default.
- AMD CBS and XFR are unlocked (XFR settings are located in NBIO in a blank submenu).

I hope this will help
 


bayleyw

Member
Jan 8, 2014
84
21
8
That's pretty neat - do the XFR and PBO settings do anything to the processor clock or are they inactive on Epyc?