AMD Genoa


i386

Well-Known Member
Mar 18, 2016
4,241
1,546
113
34
Germany
Some German IT news sites [1, 2] are reporting that information about the 4th-gen EPYC CPUs has leaked.
They quote this Twitter post:
The Genoa CPUs will support
- up to 96 cores per CPU
- 12-channel DDR5-5200
- 128 PCIe 5.0 lanes (up to 160 in 2P configurations)
- up to 320 W TDP (some sources report 400 W)
- a new socket (SP5 instead of SP3)
- 5 nm process
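
To put the memory figures in the list above into perspective, here is a rough back-of-the-envelope sketch of theoretical peak bandwidth. It assumes a 64-bit (8-byte) data path per channel and uses 8-channel DDR4-3200 with 64 cores as the current-generation EPYC baseline; the function name is made up for illustration, and real sustained bandwidth will be lower than these peak numbers.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s (decimal), assuming a 64-bit data path per channel."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# Leaked Genoa figures from the list above: 12 channels of DDR5-5200, up to 96 cores.
genoa = peak_bandwidth_gbs(12, 5200)
# Baseline assumption: current EPYC with 8 channels of DDR4-3200 and 64 cores.
current = peak_bandwidth_gbs(8, 3200)

print(f"Genoa (leaked): {genoa:.1f} GB/s total, {genoa / 96:.2f} GB/s per core")
print(f"Current EPYC:   {current:.1f} GB/s total, {current / 64:.2f} GB/s per core")
```

If the leak is accurate, the per-core share of peak bandwidth would go up even at the full 96 cores.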

Other sources: AMD EPYC 7004 Genoa Zen 4 CPU Allegedly Sports 12-Channel DDR5, Massive LGA-6096 Socket

German IT news sites/sources:
[1]: Golem.de: IT-News für Profis
[2]: Genoa: Supercomputer-Plan leakt erste Daten zu AMDs Epyc mit Zen 4
 

Syr

Member
Sep 10, 2017
55
20
8
12-channel, not sure whether that means 6 sticks or 12 sticks? DDR5 has two 32-bit channels on each stick.
They actually mean 12 sticks. The socket is a 6096-pin LGA (2002 pins more than LGA 4094/SP3), which makes perfect sense for 12, especially given that the PCIe lane count is the same, and that the number of pins consumed by a stick of DDR5 and by a PCIe gen5 lane is identical to what was needed for a stick of DDR4 and a PCIe gen4 lane respectively. Based on pinout diagrams of the SP3 socket, AMD would need at least 1200 more pins to achieve this memory configuration (accounting for the common pins within the memory data and control pins required for signal integrity), plus additional pins for signal integrity between each memory bus, and additional power pins to support processors running at 400 W. This also lines up with Mark Papermaster's comments about how their core design is very bandwidth constrained, and given the IPC increases that Zen 4 and Zen 5 are rumored to have, it makes sense that they would go to 12 to provide enough bandwidth to reasonably run 96 Zen 5 cores mostly unconstrained.

Similarly, Sapphire Rapids is using "8 channels" (even Intel refers to it as 8) of DDR5, for 8 sticks and 16 actual channels. Sapphire Rapids uses a 4677-pin LGA socket, with most of the additional pin count being for an increase in PCIe lanes over Skylake, and some being for power.
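
The counting in this post can be written out as a small sketch. It only restates the numbers given above (two 32-bit channels per DDR5 stick, the two socket pin counts, and the estimated minimum of 1200 extra memory pins); the variable and function names are made up for illustration.

```python
DDR5_CHANNELS_PER_STICK = 2  # each DDR5 stick exposes two 32-bit channels


def actual_channels(sticks):
    """Number of 32-bit DDR5 channels behind a given number of sticks."""
    return sticks * DDR5_CHANNELS_PER_STICK


# "12-channel" Genoa read as 12 sticks, "8-channel" Sapphire Rapids as 8 sticks.
genoa_channels = actual_channels(12)
sapphire_rapids_channels = actual_channels(8)

# Socket pin counts quoted in the post.
SP5_PINS = 6096
SP3_PINS = 4094
extra_pins = SP5_PINS - SP3_PINS

# The post estimates at least 1200 additional pins for the memory configuration;
# whatever is left over would go to signal integrity and power.
MIN_EXTRA_MEMORY_PINS = 1200
pins_left_for_other_uses = extra_pins - MIN_EXTRA_MEMORY_PINS

print(genoa_channels, sapphire_rapids_channels, extra_pins, pins_left_for_other_uses)
```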
 

111alan

Active Member
Mar 11, 2019
291
109
43
Haerbing Institution of Technology
Then it will ease the memory-bound situation a lot if they manage to reduce the latency as well. Right now that is the biggest problem in real-world use cases for AMD, especially their EPYCs, since their zero-bubble prediction constantly pulls far more data than any other algorithm through a memory subsystem that already has to cross dies. They're possibly bringing in more FP units or something like AVX-512. Making a single CPU with 2S-class specs is always good to see; I just hope it's a more balanced design.

There's a problem here: those 12 sticks will never fit in current 2S E-ATX (or EEB) motherboard designs, and 1S motherboards may be forced to go larger than standard ATX too (or the memory slots will squeeze out the space for PCIe slots). It may be quite interesting to see how motherboard manufacturers tackle this.

The rumored 50% IPC uplift is a joke, unless people keep using the same selected few applications they use now. That is the number you would get if they doubled the amount of every single unit in the current architecture, since branch prediction is very close to its limit, and the memory subsystem or even the applications themselves are more likely to be the bottleneck now. Processor designs nowadays have 5 pipelines, but usually the CPI is more than 1 (or the IPC is less than 1) in most situations, meaning that however good the core itself is, the programs aren't actually using it, and in most cases you can only improve the time when there are more instructions coming in. Also, the front end can't be easily enlarged, as the decoder units aren't fully parallel. There isn't much headroom left.
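
A toy model of the CPI argument above, as a sketch only. It assumes CPI can be split into an ideal part (1 / issue width) plus a stall part that the core width does not affect; the stall value of 1.0 cycles per instruction is a made-up illustrative number, not a measurement of any real CPU.

```python
def cpi(issue_width, stall_cycles_per_instr):
    """Toy model: ideal CPI (1 / width) plus stall cycles the core cannot hide."""
    return 1.0 / issue_width + stall_cycles_per_instr


def ipc(issue_width, stall_cycles_per_instr):
    """IPC is simply the reciprocal of CPI."""
    return 1.0 / cpi(issue_width, stall_cycles_per_instr)


STALL = 1.0  # hypothetical memory/branch stall cycles per instruction

base = ipc(5, STALL)      # 5-wide core, as in the post
doubled = ipc(10, STALL)  # every unit doubled
uplift = doubled / base - 1.0

print(f"5-wide IPC:  {base:.3f}")
print(f"10-wide IPC: {doubled:.3f}")
print(f"IPC uplift from doubling the width: {uplift:.1%}")
```

In this model, once stalls dominate, doubling the core width moves IPC by only about 9%, which is the point being made: the gain has to come from the memory side or from the programs themselves.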
 