EU Suggestions for fp64 sys: cpu+mobo+ram


fp64

Member
Jun 29, 2019
71
21
8
Hello,

I would be interested in a stopgap system (mobo+cpu+ram) while I am waiting for amd to complete its zen2 rollout. I do not see getting anything new until sometime next year. Also, I have never dabbled with server parts, hence I am looking for suggestions if such can be made.

Background: I am running 24/7 fluid simulation research codes of my own writing that do turbulent flow computations with AMD GPU acceleration, i.e. Fortran/C/OpenCL. This means loads of double precision floating point calculations (fp64).

The current setup is a 3930K with 32GB and 2x Radeon VII. This CPU has to go. I am looking for a second-hand (refurbished?) CPU with more cores and speed but with at least all the 3930K fp extensions, 40 PCIe 3.0 lanes, a mobo with at least 2 full x16 PCIe 3.0 slots (a PLX chip with more slots would be nice) and at least 64GB DDR3 (or DDR4?); non-ECC is ok. An X99-based system looks plausible. Btw, this is about EU-priced systems. What can be had for 500 euros?

Thanks.
--
 

Samir

Post Liker and Deal Hunter Extraordinaire!
Jul 21, 2017
3,257
1,446
113
49
HSV and SFO
Neat. For your simulation code, do more cores or faster thread performance matter more? Or do the gpus do the heavy lifting?
 

ari2asem

Active Member
Dec 26, 2018
745
128
43
The Netherlands, Groningen
regardless of the budget and taking pcie lanes into account, i would choose the threadripper platform. all threadripper mainboards have 2 electrically x16 slots and 2 electrically x8 slots. the x8 slots are mostly x16-wide physically, but electrically x8.

if you want a lot of x16 slots (the maximum is 7 slots), you can go for the x79 (old), x99 or x299 intel chipsets.

for example the asus x299 sage or the asus x99-e ws. these 2 boards each have 2 PLX PEX 8747 pcie switches, enabling x16-x8-x8-x8-x8-x8-x8.

don't ask me about cpus and prices, but you can imagine that this kind of system is more expensive than a 500 euro budget.

i bought used parts (an x79 mainboard, 32gb ram and an e5-2690 v2 cpu) for 700 euro.

used (second-hand) mainboards with 6 or 7 pcie x16 slots are very rare.
 

fp64

you have asked the right questions. about 3/4 of the computational load is handled by the radeon vii cards. they rip through it, so the cpu is slowing everything down (and it overheats and self-throttles because of the high summer ambient temperature). of the elapsed time for each computational step, 10-15% is spent on the gpus and the rest evidently on the cpu. the cpu part multithreads quite well, so i stand to gain from having more cpu cores going faster.
 

ari2asem

if the number of cpu cores is important, you can choose a dual-socket intel platform with the c602 or c612 chipset and xeon cpus. for example, the e5-2690 v2 has 10 cores, the v3 has 12 cores and the v4 has 14 cores.

once again... these are not cheap options... you need to raise your budget
 

fp64

some months ago, someone on the reddit/amd forum tested the cpu-only version of my code on his 1950x, and at best it was about 10% faster than my 6-core 3930k. a terrible result. the tester blamed the numa internals of the first generation threadripper. so this model is a non-starter. let's see how the zen2 version does; in any case, zen2 threadripper will not be a buying option until well into next year.

i was under the impression that the asus ws mobos with plx chips offer 4 pcie 3.0 slots that are electrically x16/x16/x16/x16, but with a transfer speed penalty of several percent.

my msi x79 mobo, 3930k and 32gb ddr3 @1600 cost me 1.1K euros in 2014 with vat @ 23%. the prices you quote above are surprising.

i must have at minimum 2 electrically x16 pcie 3.0 slots, which is not a big deal; my current mobo has these. any extra slots will be a bonus that i can exploit, as i have 3 radeon vii with one sitting in the box.
--
 

Samir

Thank you for the details on the utilization. What is the penalty of running a GPU at x8 PCIe 3.0? Is it just 50%, or is it more, or less?

You mentioned that more CPU threads do help, but how much does each additional thread help? Is each thread a full 100% gain or only a partial one? There is a balance between all-out thread/core speed and multiple cores, especially when it comes to price. If 2x the cores at a slower speed are more beneficial than all-out core speed, that is good, because it is typically the cheaper route as well.

You mention CPU throttling because of heat being an issue. This is where some strong fans without PWM will help, as this really shouldn't be an issue. I've noticed that a lot of times PWM modes do not cool enough or quickly enough, so it's best to go 'old school' and just have the fan floored from the get-go if you know that the utilization will be there. I've run systems in 100F heat this way without ill effects as long as the fans are floored.

One thing I forgot to ask is if the codes you are running would have the capability to be run on multiple systems working together. Then your speed is not tied to just one system and you can quickly and cheaply scale out as much processing power as you need.
 

fp64

my current msi mobo allows 2 gpu cards at x16/x16 or 3 at x16/x8/x8. the second arrangement results in a small overall slowdown, even though there is an extra card carrying out computations in parallel. at the moment the x16/x16 arrangement is optimal.
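for a rough sense of what dropping a card from x16 to x8 costs in raw link bandwidth, here is a small sketch. it assumes pcie 3.0 signalling (8 GT/s per lane with 128b/130b encoding) and ignores protocol overhead; the function name is made up for illustration. how much of this shows up as an actual slowdown depends on how often the code moves data across the bus, which the numbers below say nothing about.

```python
def pcie3_bandwidth_gbs(lanes):
    """Theoretical one-way PCIe 3.0 bandwidth in GB/s for a given lane count."""
    gt_per_s = 8.0            # PCIe 3.0 raw rate per lane, in GT/s
    encoding = 128.0 / 130.0  # 128b/130b line encoding efficiency
    # Divide by 8 to convert gigabits to gigabytes.
    return lanes * gt_per_s * encoding / 8.0


if __name__ == "__main__":
    for lanes in (16, 8):
        print(f"x{lanes}: {pcie3_bandwidth_gbs(lanes):.2f} GB/s")
```

under these assumptions an x8 slot has exactly half the theoretical bandwidth of an x16 slot.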

with linear scaling, adding two more cores would give a gain on the cpu side of at best 0.85*2/6, i.e. the cpu share of the elapsed time (about 0.85) times the relative increase in core count (2/6). inefficiencies will reduce this assumed gain somewhat. a worthwhile aim for me now is to achieve a 50% speedup on the cpu side.
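the same kind of estimate can be written as an amdahl-style calculation. this is only a sketch under stated assumptions: the cpu share of each step is taken as 0.85 (from the 10-15% gpu figure earlier in the thread), the new cores run at the same per-core speed as the old ones, the cpu part scales perfectly linearly, and the gpu part is unchanged. the function name is hypothetical.

```python
def overall_speedup(cpu_fraction, old_cores, new_cores):
    """Whole-step speedup when only the CPU part scales with core count."""
    gpu_fraction = 1.0 - cpu_fraction
    # New elapsed time relative to the old one (old time normalised to 1.0).
    new_time = gpu_fraction + cpu_fraction * old_cores / new_cores
    return 1.0 / new_time


if __name__ == "__main__":
    for cores in (8, 12, 16):
        print(f"{cores} cores: {overall_speedup(0.85, 6, cores):.2f}x")
```

real scaling will be worse than this, since it ignores memory bandwidth limits and clock speed differences between cpus.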

my current cpu is water cooled. i have also removed the side panel and i have a domestic fan blowing into the computer innards in order to minimize the presence of stagnant pockets inside the casing.

the cluster solution is rather inefficient for a small-scale setup. as an example, in a two-node cluster with 64GB per node, every node will have to transmit and receive ~20GB of data every ~15 cpu seconds. add to that that it entails code reprogramming and debugging, and that i have zero experience in setting up and maintaining such a setup, and things look bad.

had a look at the geekbench scores of an E5-2678v3 (8c) versus my current 3930k (6c). the single-core and multithreaded numbers of the 3930k for single precision GEMM and FFT outdistance the 8-core xeon by a wide margin. that is why these xeons are dirt cheap; they are not any bleeding good. btw they both have the same fp extensions.

just want to mention that my original x79 mobo was a gigabyte which went 'up in smoke', leaving a few blackened capacitors after four months of operation. the replacement gigabyte lasted even less: 6 weeks. i had to buy an msi, which turned out to be sterling stuff. you cannot skimp on mobos.
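to put a number on that traffic, here is a back-of-the-envelope sketch. it assumes decimal gigabytes, a transfer spread over the whole ~15 second step, and no protocol overhead; the function name and the list of link speeds are only illustrative.

```python
def required_gbit_per_s(gigabytes, seconds):
    """Sustained one-way link rate (Gbit/s) needed to move the data in time."""
    return gigabytes * 8.0 / seconds  # 8 bits per byte


if __name__ == "__main__":
    data_gb, step_s = 20.0, 15.0
    need = required_gbit_per_s(data_gb, step_s)
    print(f"needed: {need:.1f} Gbit/s each way")
    for link_gbit in (1, 10, 40):
        # Time one exchange would take if the link ran at its nominal rate.
        transfer_s = data_gb * 8.0 / link_gbit
        print(f"{link_gbit} Gbit/s link: {transfer_s:.0f} s per exchange")
```

if the transfer cannot be overlapped with computation, the exchange time adds directly to every step, which makes the requirement stricter than this estimate suggests.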
--
 

fp64

this is strange. it will not allow me to edit the above post to correct typos.

now i have been able to edit it.
 

Samir

That's some intense computing! I've never heard of a water cooled setup overheating. :eek: Especially with a fan blowing on it as well. :eek:

That's a lot of data, and 10Gb networks manage about 1GB/sec at best, so you would need 40Gb or more just to move the data you need to, much less process it.

After a certain time, editing a post is no longer possible.