ES Xeon Discussion


mtg

Active Member
Feb 12, 2019
111
71
28
Be nice if more people running HEDT SPR complain more about it...
I think most of us are using them with Linux (+ Nvidia GPUs) for ML stuff. Why else pay the huge tax to get 112 lanes. You see similar issues with Epyc chips too. Of course, there are exceptions but then the argument is, how many people want Windows, SPR, and care enough about power they want to use efficient settings.
 

Cythisia

Member
Jul 25, 2024
42
10
8
My issue is mostly pulling 700 W at the plug on the CPUs alone for a dual-socket board, and pairing that with four 4090s with no rules for LLM work is a 2.5 kW draw. Even if I offshoot to V100s, which are the same price with worse tokens/s.

But 3 systems drawing 7.0 kW on average (different workloads) comes to 150+ kWh in a 24-hour period... Even on Linux, changing the performance pref from PS to balance drops generation significantly, because the scheduler gets wakelocked on backend/frontend tasks that are not the LLM or at the forefront, even with affinity loaded.

Container/VM performance is abysmal as well, since the scheduler on the main OS does not handle this well at all on SPR either. I am not able to replicate this on Rome/Genoa anywhere near as badly.

SPR is not good right now unless you CAN afford the energy for the workload.
 

mtg

Active Member
Feb 12, 2019
111
71
28
Yeah, but if you can afford a dual 4677 motherboard AND 4x 4090s (and the RAM for that!), do you really care about saving power? At that point you are starting to creep close to the scale where I'd say just use runpod/another cloud! That's what, a $10K-20K system? So at $2 per hour for an H100, that's ~1 year of 24/7 use. Then you also don't have to cool 7 kW in your house. Cooling 1.5 kW in the summer means spending 1.8 kW on AC... Maybe 50% less if you have a really efficient heat pump?
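As a rough check on the "~1 year of 24/7 use" figure, here is a minimal sketch that divides the system price by the hourly rental rate. The $10K and $20K prices and the $2/hour H100 rate are the numbers from the post; the function name is made up for illustration, and the sketch ignores the electricity and resale value of the owned system.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def rental_hours(system_cost_usd: float, rate_usd_per_hour: float) -> float:
    """Hours of cloud rental that the purchase price of a system would buy."""
    return system_cost_usd / rate_usd_per_hour


# System prices and the hourly rate are the ballpark figures from the post.
for cost in (10_000, 20_000):
    hours = rental_hours(cost, 2.0)
    years = hours / HOURS_PER_YEAR
    print(f"${cost}: {hours:.0f} h of rental, about {years:.2f} years of 24/7 use")
```

Under these assumptions the lower price buys a bit over half a year of continuous rental and the higher price a bit over one year.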

These extreme HEDT really are not tuned for home use. Dual socket+ never worked great on windows, ever? Surprisingly when used in the home, they behave worse than consumer systems designed for home use.

And for Epyc/Milan, the issues have been worked out on Linux/Windows. Try using FreeBSD with a Milan chip... weird issues at every turn. Basically the cores sit in C1 and never downclock: ~20C temp difference using Linux vs. FreeBSD. Perhaps I misconfigured something, but in general, the fewer eyes on something, the less easy your path will be...
 

DHamov

Active Member
Jan 12, 2024
121
31
28
Yeah, but if you can afford a dual 4677 motherboard AND 4x 4090s (and the ram for that!), do you really care about saving power?
For professional systems and servers that are on 24/7, the cost of the power used over about 4 years can be as high as the total value of the system. Let's take this case as an example: 2.5 kW in Germany, where the price depending on location can be 40 cents per kWh, so 1 Euro per hour. Times 24 times 365 = 8760 Euro per year. Say we have a load factor of 0.7, and we still get about 6100 Euro in electricity cost per year (without considering air conditioning power). My personal load factor is closer to 0.3-0.4 with a wall power draw of about 1.2 kW, but electricity costs are still really significant and generally also a reason to upgrade to more modern hardware, and to use ES CPUs. But honestly, a dual-CPU setup with these SPR (ES) Xeons is really power hungry if the CPUs are only used to feed GPUs. For many load cases a single CPU with multiple GPUs is probably better. Unfortunately I also need the CPU power, and yes, for me dual-socket performance is also a bit disappointing.
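The arithmetic above can be sketched in a few lines, assuming a flat price per kWh and a constant load factor (both simplifications). The 2.5 kW, 0.40 Euro/kWh, 0.7, 1.2 kW and 0.3-0.4 values come from the post; the function name is only illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def annual_cost_eur(power_kw: float, price_eur_per_kwh: float,
                    load_factor: float = 1.0) -> float:
    """Yearly electricity cost for a machine that is powered on 24/7.

    load_factor scales the wall power to an average draw (1.0 = always
    at the stated power). Air conditioning is not included.
    """
    return power_kw * price_eur_per_kwh * HOURS_PER_YEAR * load_factor


# The 2.5 kW example at 0.40 Euro/kWh, flat out and at a 0.7 load factor.
print(f"2.5 kW, full load:       {annual_cost_eur(2.5, 0.40):.0f} Euro/year")
print(f"2.5 kW, load factor 0.7: {annual_cost_eur(2.5, 0.40, 0.7):.0f} Euro/year")

# The 1.2 kW system at the 0.3-0.4 load factors mentioned above.
for lf in (0.3, 0.4):
    print(f"1.2 kW, load factor {lf}: {annual_cost_eur(1.2, 0.40, lf):.0f} Euro/year")
```

With a load factor of 0.7 this gives 6132 Euro per year, in line with the "about 6100" quoted above.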
 

H4te

New Member
Sep 1, 2024
1
1
3
the problem is you can only upgrade to Xeon W QS/PRD, QS/PRD scalables don't work.
there is currently a MS33-CPA(OEM) board from Netherlands available, i can say more about BIOS etc. once mine arrives:
edit:
MS33-CPA-V10 BIOS F01 07/12/2023 (similar to F17 of MS33-AR0)
NO warranty, NO support from gigabyte(no BIOS update etc)
you can use BMC FW from other MS33 board.

note that you need RDIMM, UDIMM don't work, UDIMM don't fit.
Thanks for the link to the MS33-CPA. I got my hands on a QYFP, and the price fits my budget :)
 

Cythisia

Member
Jul 25, 2024
42
10
8
@RolloZ170

Unable to find an option on the MS73-HB0 to disable cores, other than performance levels 1-4, with level 4 going from 54 cores to 48 cores to get a higher all-core clock.

Looking to get a larger boost clock out of disabling cores, but I am not finding the option in the Gigabyte manual or in the BIOS. Not supported?
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,079
3,221
113
germany
Per-Socket Configuration (Press [Enter] to configure advanced items)
CPU Socket 0 Configuration – Core Disable Bitmap(Hex) • Number of Cores to enable. 0 means all cores. FFFFFFF means to disable all cores. The maximum value depends on the number of CPUs available. Press the numeric keys to adjust desired values.
 

Cythisia

Member
Jul 25, 2024
42
10
8
your sockets have different bitmaps, do you run not identical CPUs ?
I noticed this, they are both QYFRs, and did not look much into it since they have identical processor ID... Could be the issue? I can try disabling socket 1 and see if anything adjusts with per-socket config.
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,079
3,221
113
germany
I noticed this, they are both QYFRs, and did not look much into it since they have identical processor ID... Could be the issue?
It can be a different position of the deactivated cores, of course. All SPR XCC have 60C (4x 15), but not all are activated.
With the bitmap it should work; try an older BIOS.
 

Cythisia

Member
Jul 25, 2024
42
10
8
Okay, figured out it was Intel SST-PP and Dynamic Intel SST-PP.

Reset defaults, changed bitmap. BIOS reports "16" Fs, adjusted bitmap from 16 to 11, but does not register disabled cores until 11, and registers 71 cores...

However weirdest thing, socket 1 registers 16 detected cores but socket 0 displaying 55 with same applied bitmap????

Base clock and boost remain


played around more, does not seem disabling cores in socket 0 bitmap works o_O

Discovered that entering 16 '0's in place of the F's on socket 0 makes the base clock 1.7 :eek:
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,079
3,221
113
germany
However weirdest thing, socket 1 registers 16 detected cores but socket 0 displaying 55 with same applied bitmap????
If you use the bitmap shown, all cores are disabled.
There must be 60 bits = 4x 15 bits representing the XCC cores.
It's a good idea to disable the same count of cores in each tile.
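To make the 60-bit layout concrete, here is a rough sketch that builds a disable bitmap with the same number of cores switched off in each tile. It rests on assumptions that are NOT confirmed in this thread: that a set bit means "core disabled", that tile t occupies bits 15*t to 15*t+14, and that the highest-numbered cores of each tile are the ones to drop. The real bit-to-core mapping is up to the firmware, and cores that are already fused off on a given chip still occupy bits, so treat the output as an illustration and not as a value to type into the BIOS. Note that 16 F's would set every one of the 60 bits, which fits the "all cores are disabled" reading.

```python
TILES = 4            # SPR XCC: 4 tiles
CORES_PER_TILE = 15  # 15 core positions per tile, 60 in total


def core_disable_bitmap(disable_per_tile: int) -> int:
    """Build a 60-bit mask that disables the same number of cores per tile.

    ASSUMPTIONS (hypothetical layout): bit set = core disabled, tile t
    uses bits 15*t .. 15*t+14, and the top cores of each tile are dropped.
    """
    if not 0 <= disable_per_tile <= CORES_PER_TILE:
        raise ValueError("disable_per_tile must be between 0 and 15")
    # Mask for one tile: the highest `disable_per_tile` of its 15 bits.
    tile_mask = ((1 << disable_per_tile) - 1) << (CORES_PER_TILE - disable_per_tile)
    bitmap = 0
    for tile in range(TILES):
        bitmap |= tile_mask << (tile * CORES_PER_TILE)
    return bitmap


def as_bios_hex(bitmap: int) -> str:
    """Format the mask as 15 hex digits (60 bits / 4 bits per digit)."""
    return f"{bitmap:015X}"


if __name__ == "__main__":
    for n in (0, 3, 15):
        mask = core_disable_bitmap(n)
        print(n, "per tile ->", as_bios_hex(mask), "bits set:", bin(mask).count("1"))
```

Counting the set bits is a quick sanity check before comparing against the per-socket core count that HWiNFO reports.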
 

Cythisia

Member
Jul 25, 2024
42
10
8
Bitmap not applying for socket 0, but can assign OS to detected socket 1, just.. base clock, all core, and boost clock remain the same as if all 56 were enabled...
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,079
3,221
113
germany
I don't understand. You need to edit both socket bitmaps.
HWiNFO will show you fewer cores per socket. If not, it doesn't work or you did it wrong.