Upgrade from a Xeon Gold 6138


nk215

Active Member
Oct 6, 2015
412
143
43
50
Hello everyone

I am looking for an upgrade from a Xeon Gold 6138 CPU. If possible, 2x the performance would be great. The new computer will mostly be used for finite element analysis work (Nastran, etc.).

Please give me your recommendation.

Thanks
 

Rand__

Well-Known Member
Mar 6, 2014
6,634
1,767
113
Two 6138s? Sorry :p



So as always, not enough information (budget, same board, new board/generation/memory...).

The 6138 has 20 cores @ 2 GHz base, so 40 GHz of total compute, more depending on the all-core turbo for whatever operations (normal, AVX2, AVX-512) you run. Let's say 2.9 GHz all-core, so 58 GHz.

Now you want double that, so 116 GHz in a single CPU...
So maybe an 8380 with 40 cores... it might do 2.9 GHz all-core turbo if you're lucky.

Else an AMD model if your work can be optimized appropriately?
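
The back-of-the-envelope arithmetic above can be written out as a short sketch. The "cores x clock" figure is only a crude proxy for throughput, and the 2.9 GHz all-core clock is the guess from this post, not a measured value; the function and variable names are made up for illustration.

```python
def aggregate_ghz(cores: int, clock_ghz: float) -> float:
    """Crude throughput proxy: core count times per-core clock."""
    return cores * clock_ghz

# Xeon Gold 6138: 20 cores at 2.0 GHz base.
base_6138 = aggregate_ghz(20, 2.0)
# Same CPU at the all-core clock guessed in the post (assumption: 2.9 GHz).
turbo_6138 = aggregate_ghz(20, 2.9)
# The requested 2x performance target.
target = 2 * turbo_6138
# A 40-core part sustaining the same assumed 2.9 GHz on all cores.
candidate_40c = aggregate_ghz(40, 2.9)

print(f"6138 base:    {base_6138:.0f} GHz")
print(f"6138 turbo:   {turbo_6138:.0f} GHz")
print(f"2x target:    {target:.0f} GHz")
print(f"40-core part: {candidate_40c:.0f} GHz")
```

This only says that a 40-core CPU would match the target on paper; it says nothing about whether the workload actually scales with cores.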
 

nk215

Hi Rand__,

More cores didn't help FEA as much as I would like. For Nastran, anything above 8 cores doesn't really scale anymore for the kind of models I am running. Another FEA code I run is even worse: it makes no difference whether it gets 8 or 400 cores in SMP mode.

Yes, I am open to AMD. I don't have a fixed budget, but anything under 10k is possible. This will be a new build, so I am not looking to reuse anything. I have the case and power supply and that's it.
 

Stephan

Well-Known Member
Apr 21, 2017
945
714
93
Germany
How did you test 400 cores?

Your statement that anything above 8 cores doesn't pay dividends smells like RAM is the bottleneck here. If that is the case, you have to look into GPU acceleration if the software supports it (like 4x 3090 in the system), or build a compute cluster with more than one machine.
 

alex_stief

Well-Known Member
May 31, 2016
884
312
63
38
FEA can/should be bottlenecked by memory bandwidth on modern CPUs. So the first order of business is checking the memory configuration of your current machine.
If that's already optimal, a second Xeon 6138 is not actually your worst option. It doubles the shared CPU resources, most importantly memory bandwidth.
The prerequisite for that is that the unspecified codes you run have a distributed memory mode. In my experience, FEA codes that don't cannot take full advantage of two NUMA nodes even in one shared-memory system. Latency kills performance in that case.
All of that assumes you are running everything in-memory. For out-of-core, the CPU itself has a minor performance impact compared to the storage solution.
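
To make the bandwidth argument concrete, here is a minimal toy model of a purely bandwidth-bound solver. The per-core demand (16 GB/s) and per-socket supply (128 GB/s) are hypothetical numbers, picked only so that scaling flattens at 8 cores as described earlier in the thread; they are not measurements of any real code or CPU.

```python
def bandwidth_capped_speedup(cores: int, per_core_gbs: float, node_gbs: float) -> float:
    """Speedup over one core for a purely bandwidth-bound solver.

    Each core is assumed to need per_core_gbs of memory traffic to run at
    full speed; the node can deliver at most node_gbs in total.
    """
    demanded = cores * per_core_gbs
    delivered = min(demanded, node_gbs)
    return delivered / per_core_gbs

# Hypothetical numbers, chosen only so that saturation lands at 8 cores.
PER_CORE_GBS = 16.0
ONE_SOCKET_GBS = 128.0
TWO_SOCKET_GBS = 2 * ONE_SOCKET_GBS

for cores in (1, 4, 8, 20):
    print(cores, bandwidth_capped_speedup(cores, PER_CORE_GBS, ONE_SOCKET_GBS))

# A second socket doubles the available bandwidth, so the ceiling doubles too.
print(40, bandwidth_capped_speedup(40, PER_CORE_GBS, TWO_SOCKET_GBS))
```

In this model, going from 8 to 20 cores on one socket changes nothing, while a second socket raises the ceiling. That only holds if the code can actually spread its memory traffic across both NUMA nodes.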

If we just stick with the assumption that the FEA codes you use are well-behaved, and thus limited by memory bandwidth: no single CPU available right now will give you a proper 2X performance. You would have to wait for Epyc Genoa for that.
GPU acceleration might be an option in some cases, but that is a whole new can of worms. You would need to do some research first whether your codes support it, which GPUs they support, whether the simulation types you run are ready for GPU acceleration, if your models are small enough to fit...
 

NablaSquaredG

Layer 1 Magician
Aug 17, 2020
1,353
821
113
2x EPYC 73F3 (16C) + all channels populated with dual-rank DDR4-3200 DIMMs should give you a nice performance boost!

I wouldn't go with the 72F3 (8C); that might actually not be enough cores.

Higher clock, more modern µArch, 33% more memory channels (8 instead of 6), much higher memory clock (3200 instead of 2666), two CPUs instead of one.

If you want to stay with a single socket, you might try Threadripper Pro (but wait for the 5000 line to be released), because with TR Pro you can overclock memory. If you're willing to work without ECC and put some work into the overclocking, you might be able to get 8 channels @ 3600 MHz with much lower timings than ECC DIMMs (CL14 or CL16 instead of CL22).
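
For a rough sense of scale, the theoretical peak bandwidth of these configurations can be estimated as channels x transfer rate x 8 bytes (a DDR4 channel is 64 bits wide). This is a sketch of paper numbers only; sustained bandwidth in real solvers is lower, and the helper name is made up for illustration.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, sockets: int = 1,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return sockets * channels * mt_per_s * bytes_per_transfer / 1000

xeon_6138 = peak_bandwidth_gbs(channels=6, mt_per_s=2666)
epyc_single = peak_bandwidth_gbs(channels=8, mt_per_s=3200)
epyc_dual = peak_bandwidth_gbs(channels=8, mt_per_s=3200, sockets=2)
tr_pro_oc = peak_bandwidth_gbs(channels=8, mt_per_s=3600)  # overclocked, non-ECC

print(f"1x Xeon 6138, 6ch DDR4-2666: {xeon_6138:.1f} GB/s")
print(f"1x EPYC,      8ch DDR4-3200: {epyc_single:.1f} GB/s ({epyc_single / xeon_6138:.2f}x)")
print(f"2x EPYC,      8ch DDR4-3200: {epyc_dual:.1f} GB/s ({epyc_dual / xeon_6138:.2f}x)")
print(f"1x TR Pro,    8ch DDR4-3600: {tr_pro_oc:.1f} GB/s ({tr_pro_oc / xeon_6138:.2f}x)")
```

On paper, a single 8-channel DDR4-3200 socket gives about 1.6x the bandwidth of the 6138, and only the dual-socket setup clears 2x.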
 

alex_stief

Since the 16-core Epyc 73F3 has a higher list and street price than the otherwise similar 24-core Epyc 74F3, I would reserve the former for some ultra-niche applications.
 

NablaSquaredG

alex_stief said:
"Since the 16-core Epyc 73F3 has a higher list and street price than the otherwise similar 24-core Epyc 74F3, I would reserve the former for some ultra-niche applications."

Probably because it's (much) better binned than the 74F3.
 

nk215

Stephan said:
"How did you test 400 cores?

Your statement of anything above 8 cores not paying back dividends smells like RAM is the bottleneck here. If that is the case, you have to look into acceleration with GPUs if software supports it (like 4x 3090 in the system), or, implement a compute cluster with more than one machine."

What do you mean? We have 1536 machines with Xeon Gold 6248 (61,440 cores) and 576x2 machines with Xeon Gold 6138 (23,040x2 cores). When I run models, I can ask Nastran to use up to 256 machines (40 cores each), and the solver uses DMP and SMP to split the workload. There's a lot of overhead in splitting the model over multiple machines (nodes), so it has never been worth it so far.

None of the solvers that I use can utilize a GPU, unfortunately.
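
As a quick sanity check of the core counts quoted above, assuming 40 cores per machine as stated in the post and reading "576x2" as two groups of 576 machines (variable names are illustrative):

```python
CORES_PER_MACHINE = 40  # as stated in the post

cores_6248 = 1536 * CORES_PER_MACHINE         # Xeon Gold 6248 machines
cores_6138_each = 576 * CORES_PER_MACHINE     # one group of Xeon Gold 6138 machines
cores_6138_total = 2 * cores_6138_each        # both groups (assumed reading of "576x2")
max_job_cores = 256 * CORES_PER_MACHINE       # largest single Nastran job

print(f"6248 machines:   {cores_6248:,} cores")
print(f"6138 machines:   {cores_6138_each:,} x 2 = {cores_6138_total:,} cores")
print(f"Largest job:     {max_job_cores:,} cores")
```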
 

nk215

alex_stief said:
"Since the 16-core Epyc 73F3 has a higher list and street price than the otherwise similar 24-core Epyc 74F3, I would reserve the former for some ultra-niche applications."

Where do you see the street price? I just searched Newegg and Amazon and came up empty.
 

nk215

NablaSquaredG said:
"2x EPYC 73F3 (16C) + all channels populated with Dual-Ranked DDR4-3200 DIMMs should give you a nice performance boost!

I wouldn't go with the 72F3 (8C), that might actually be not enough cores.

Higher clock, more modern µArch, 33% more memory channels (8 instead of 6), much higher memory clock (3200 instead of 2666), two CPUs instead of one

If you want to stay with single socket, you might try Threadripper Pro (but wait for the 5000 line to be released), because with TR Pro you can overclock memory. If you're willing to work without ECC and do some work on the overclocking, you might be able to get 8 Channels @ 3600 MHz with much lower timings than ECC DIMMs (CL14 or CL16 instead of CL22)"

The current TR Pro is the 39xx series. Are you saying that I should skip 2 generations (wait for the 5xxx chip)?
 

alex_stief

nk215 said:
"Where do you see the street price from? I just searched newegg and amazon and came up empty."
"The current TR pro is the 39xx series. Are you saying that I should skip 2 generations (wait for the 5xxx chip)?"

AMD is skipping a thousand in their naming scheme. TR 5000 (Zen 3 architecture) will be the direct successor to TR 3000 (Zen 2).
 