Threadripper 3990X, 64 cores, 4 memory channels?

diogin

Member
Mar 28, 2018
32
5
8
Beijing, China
AMD has announced and confirmed the Threadripper 3990X, which will launch in January 2020 with 64 cores and 256 MB of L3 cache (like the Epyc 7742), but it hasn't revealed the platform or the price.

As the name "Threadripper" indicates, it should belong to the sTRX4 socket product line, which has 4 memory channels. So, a 64-core CPU working with 4 memory channels? This sounds crazy: 16 cores competing for 1 memory channel? What impact would this design have on performance?
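
For a rough sense of scale, the cores-per-channel question can be turned into theoretical peak bandwidth per core. The sketch below assumes every platform runs DDR4-3200 (an assumption, not a confirmed spec for the 3990X), that the 3990X really does sit on a 4-channel platform as guessed above, and uses the 8 bytes per transfer of a 64-bit DDR4 channel. Sustained bandwidth in practice will be lower than these peaks.

```python
# Back-of-the-envelope peak DRAM bandwidth per core.
# Assumptions (not from AMD): DDR4-3200 everywhere, 4 channels for the 3990X.

BYTES_PER_TRANSFER = 8  # one DDR4 channel is 64 bits wide


def peak_bandwidth_gbs(mt_per_s, channels):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mt_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


def per_core_gbs(mt_per_s, channels, cores):
    """Peak bandwidth divided evenly across all cores, in GB/s."""
    return peak_bandwidth_gbs(mt_per_s, channels) / cores


# (label, cores, memory channels, assumed speed in MT/s)
CONFIGS = [
    ("Threadripper 3990X, 4 channels (assumed)", 64, 4, 3200),
    ("Epyc 7742, 8 channels", 64, 8, 3200),
    ("Ryzen 9 3900X, 2 channels", 12, 2, 3200),
]

for label, cores, channels, speed in CONFIGS:
    total = peak_bandwidth_gbs(speed, channels)
    share = per_core_gbs(speed, channels, cores)
    print(f"{label}: {cores / channels:.1f} cores/channel, "
          f"{total:.1f} GB/s peak, {share:.2f} GB/s per core")
```

Under these assumptions the 4-channel part gets about 1.6 GB/s of peak bandwidth per core, half of the 3.2 GB/s per core of the 8-channel Epyc.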
 

llowrey

Member
Feb 26, 2018
82
52
18
So, here's a review of RAM speed scaling on a Ryzen 9 3900X (12 cores). Single channel performance is shockingly not terrible at 3200. Perhaps RAM at 3600 or faster could do it.

AMD Zen 2 Memory Performance Scaling with Ryzen 9 3900X

I have not seen single-channel benchmarked anywhere else for Zen 2 and have not attempted to replicate it myself. Still, I consider TechPowerUp to be credible.
 

blinkenlights

Active Member
May 24, 2019
132
48
28
diogin said:
As the name "Threadripper" indicates, it should belong to the sTRX4 socket product line, which has 4 memory channels. So, a 64-core CPU working with 4 memory channels? This sounds crazy: 16 cores competing for 1 memory channel? What impact would this design have on performance?
I agree with your comment, but suspect these processors will be moving data in/out of main memory so quickly that contention does not become a performance issue. Either that or the target workload has a high cache-hit rate, so most of your cycles will be spent in the large L2 and L3.

What I would rather see is system integrators using higher-density memory to reduce load on the IMC(s) and decrease power consumption: 2 DIMMs for dual-channel, 4 DIMMs for quad-channel, etc.
 

diogin

llowrey said:
So, here's a review of RAM speed scaling on a Ryzen 9 3900X (12 cores). Single channel performance is shockingly not terrible at 3200. Perhaps RAM at 3600 or faster could do it.

AMD Zen 2 Memory Performance Scaling with Ryzen 9 3900X

I have not seen single-channel benchmarked anywhere else for Zen 2 and have not attempted to replicate it myself. Still, I consider TechPowerUp to be credible.
Thanks for the link; the comparisons are really meaningful. The article concludes:

"Just for kicks this time around, we threw in a single-channel DDR4-3200 configuration. This is what you'd end up with if you're only using one module or didn't install your two modules in the proper slots. Much to our surprise, the performance hit is much less than expected. One possible explanation for this could be the "unganged" memory controller topology of AMD processors, which favors physically independent 64-bit wide paths to each memory channel instead of blindly interleaving the two channels like Intel does. We would still definitely recommend you to stick to dual-channel configurations. There's no shame in reading your motherboard's manual to find out which memory slots to use for dual channel. Another demerit of choosing one 16 GB module over two 8 GB modules in dual channel would be that dual-rank modules continue to be a problem area for AMD."

It seems 4 memory channels may not be a problem even for 64 cores. Still, I'm looking forward to a real-application comparison between a 4-channel Threadripper 3990X and an 8-channel Epyc 7742.
 

diogin

blinkenlights said:
I agree with your comment, but suspect these processors will be moving data in/out of main memory so quickly that contention does not become a performance issue. Either that or the target workload has a high cache-hit rate, so most of your cycles will be spent in the large L2 and L3.
Yeah, if the application's working set is not large, it would fit in the large L3 cache and rarely access RAM.
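
One caveat worth sketching: on Zen 2 the L3 is not one shared pool. Each 4-core CCX has its own 16 MB slice, so 256 MB in total is 16 slices of 16 MB, and a thread mainly benefits from the slice of the CCX it runs on. The check below is a rough sketch under that model. The function names and the example working-set sizes are made up for illustration, and it ignores L2, code footprint, and data shared between threads.

```python
# Rough check: do the threads on one CCX fit in that CCX's L3 slice?
# Zen 2 layout: 4 cores per CCX, 16 MiB of L3 per CCX.

CORES_PER_CCX = 4
L3_PER_CCX_MIB = 16


def total_l3_mib(cores):
    """Total L3 across all CCXs for a given core count."""
    return (cores // CORES_PER_CCX) * L3_PER_CCX_MIB


def fits_in_l3_slice(per_thread_mib, threads_per_ccx=CORES_PER_CCX):
    """True if the threads sharing one CCX fit in that CCX's L3 slice.

    Assumes each thread touches only its own private data.
    """
    return per_thread_mib * threads_per_ccx <= L3_PER_CCX_MIB


print(total_l3_mib(64))      # total L3 for a 64-core part, in MiB
print(fits_in_l3_slice(2))   # 4 threads x 2 MiB = 8 MiB per CCX
print(fits_in_l3_slice(8))   # 4 threads x 8 MiB = 32 MiB per CCX
```

So "fits in 256 MB" is the optimistic reading; with one thread per core and private data, the practical budget is closer to 4 MB per thread before RAM traffic starts.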

blinkenlights said:
What I would rather see is system integrators using higher-density memory to reduce load on the IMC(s) and decrease power consumption: 2 DIMMs for dual-channel, 4 DIMMs for quad-channel, etc.
A note from the article linked above:

"Another demerit of choosing one 16 GB module over two 8 GB modules in dual channel would be that dual-rank modules continue to be a problem area for AMD."

I don't know what the problem with dual-rank DIMMs is here, but if it is true, then perhaps populating dual-channel with 4 DIMMs (with 1 rank per DIMM) may be a better choice?
 

alex_stief

Active Member
May 31, 2016
610
186
43
35
That problem went away with Zen 2. IMO, it was never really a problem, but that is beside the point.
Now it's pretty much the opposite. Since memory speeds on Zen 2 are effectively limited to DDR4-3800 for a 1:1 RAM/IF ratio, having dual-rank DIMMs actually helps with memory-bound workloads.
On the topic of "too many cores per memory channel": you need to be aware of the kind of software you are running. The applications usually tested with consumer-grade CPUs run fine with low bandwidth per core. But at 16 cores per memory channel, you will definitely discover memory bottlenecks with software that is usually considered compute-bound.
It also depends on the application. In CFD/FEA, a 16-core Epyc Rome CPU can be faster than a 32-core Threadripper CPU.
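
A simple way to see how fewer cores can win is the roofline model, where attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity (FLOPs per byte moved). This is a sketch with made-up numbers: the 30 GFLOP/s per core figure and the intensity values are hypothetical, and the bandwidths assume DDR4-3200 at theoretical peak.

```python
# Roofline sketch: attainable GFLOP/s = min(peak compute, bandwidth * intensity).
# All hardware numbers below are illustrative assumptions, not measurements.

def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound for a kernel with the given arithmetic intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


PER_CORE_GFLOPS = 30.0   # hypothetical per-core peak
CHANNEL_GBS = 25.6       # DDR4-3200: 3200 MT/s * 8 bytes

# (label, cores, memory channels)
CPUS = [
    ("16-core, 8 channels", 16, 8),
    ("32-core, 4 channels", 32, 4),
]

# Low intensity ~ bandwidth-hungry solver; high intensity ~ dense compute.
for intensity in (0.25, 10.0):
    for label, cores, channels in CPUS:
        bound = attainable_gflops(cores * PER_CORE_GFLOPS,
                                  channels * CHANNEL_GBS,
                                  intensity)
        print(f"intensity {intensity:>5} FLOP/byte, {label}: {bound:.1f} GFLOP/s")
```

At the low intensity the 16-core, 8-channel configuration has twice the bound of the 32-core, 4-channel one; at the high intensity the ordering flips and the extra cores pay off.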
 

Scott Laird

Active Member
Aug 30, 2014
251
100
43
There have also been rumors of a set of "WRX80" (or similar) chipset+socket that supports 8 channels, presumably for "W"-series Threadrippers. It didn't show up with the launch of the 3960X/3970X earlier this month, but there's a chance that it will show up next year along with the 3990X (3990WX?).
 

diogin

alex_stief said:
That problem went away with Zen 2. IMO, it was never really a problem, but that is beside the point.
Now it's pretty much the opposite. Since memory speeds on Zen 2 are effectively limited to DDR4-3800 for a 1:1 RAM/IF ratio, having dual-rank DIMMs actually helps with memory-bound workloads.
On the topic of "too many cores per memory channel": you need to be aware of the kind of software you are running. The applications usually tested with consumer-grade CPUs run fine with low bandwidth per core. But at 16 cores per memory channel, you will definitely discover memory bottlenecks with software that is usually considered compute-bound.
It also depends on the application. In CFD/FEA, a 16-core Epyc Rome CPU can be faster than a 32-core Threadripper CPU.
Thank you. Intel CPUs often provide plenty of memory channels even when they have fewer cores (for example, a six-core 6800K on X99 with 4 memory channels), so I got the impression that "2 cores paired with 1 memory channel works best". That's why "16 cores paired with 1 memory channel" really shocked me.
 

diogin

Scott Laird said:
There have also been rumors of a set of "WRX80" (or similar) chipset+socket that supports 8 channels, presumably for "W"-series Threadrippers. It didn't show up with the launch of the 3960X/3970X earlier this month, but there's a chance that it will show up next year along with the 3990X (3990WX?).
If it were called "3990WX", there would be a chance :D
With the name "Threadripper 3990X", I believe it should run on the TRX40 platform.
 

blinkenlights

diogin said:
I don't know what the problem with dual-rank DIMMs is here, but if it is true, then perhaps populating dual-channel with 4 DIMMs (with 1 rank per DIMM) may be a better choice?
They may be referencing this issue: AMD Ryzen - Single-Rank Versus Dual-Rank DDR4 Memory Performance - Legit Reviews

I doubt there is a tangible performance difference between 2x1R per channel and 1x2R per channel. Your performance might suffer with 2x2R per channel, if supported. It's been a while since I dabbled in desktop processor architecture, but the best answer has always been "it depends" - figure out which memory configuration performs the best and is most stable for your target workload. For my home systems (Xeon-W, Xeon-SP, E5 v4, and E3 v5 mobile) that happens to be one DIMM per channel. YMMV :)
 

alex_stief

Worth noting here is that official memory support for Threadripper 3000, just as for other Zen-based CPUs, depends on the memory configuration.
Can't find the source right now, but with 8x64GB, DDR4-2666 seems to be the official limit.
So as usual, better stick to one DIMM per channel for maximum memory frequency while overclocking.
 

maes

Member
Nov 11, 2018
64
30
18
Scott Laird said:
There have also been rumors of a set of "WRX80" (or similar) chipset+socket that supports 8 channels, presumably for "W"-series Threadrippers. It didn't show up with the launch of the 3960X/3970X earlier this month, but there's a chance that it will show up next year along with the 3990X (3990WX?).
Considering Epyc uses the same socket (mechanically speaking) and sports 8 channels, it wouldn't be surprising. If I remember correctly, there was talk of both TRX80 and WRX80 platforms. No idea what the difference between them would be.

It'd be fun if it turns out both the 3960X and 3970X also have 8-channel memory controllers, sharing an IO die with the higher-core-count chips, and AMD was keeping that tidbit of information in reserve for the big 'flagship product' reveal (or because the 8-channel motherboards weren't ready yet) and to kick Intel when they're down.