...Can you detail your build in terms of chassis (desktop or rack mount), cooling, why no ECC, etc.
DDR5 ECC UDIMM chips are not readily available yet, it seems.
Yeah, a non-Pro TR5000 would be nice. Good to hear the VRM temps were fine; it's actually surprising they stay that cool with the tiny heatsink. What case/rack do you have it in? Mine should be here any day now. Hopefully the KS works even if it's out of spec.
Agreed - no ECC because it's not available. My (limited) understanding is that DDR5 already has some form of on-die ECC built in, which should cover most bases other than reporting of errors.
The existing solvers, now 4 years old, do not have ECC and it has not caused any known problems.
Here's the BOM:
| BOM item | Description / P/N | Approx $ |
| --- | --- | --- |
| RAM (2x32 GB) | Crucial CT2K32G48C40U5 | |
| Case | Fractal Meshify 2 XL | |
| CPU cooler, thermal compound | Arctic Liquid Freezer II 420, Noctua NT-H2 | |
| HD | KXG60ZNV512G (picked because it's on the SM approved list) | |
| Case fans (6) | Noctua NF-A14 PWM | |
| PSU | Seasonic Focus SSR-750PX | |
This build was really all about the CPU. As mentioned upthread, the workload here is a niche CFD software; this software uses networked solvers, so the engineers configure the simulation using a frontend that can run on a laptop, then push it to a solver to do the number crunching.
It's all CPU, no GPU. It needs 2 channels of DDR4 (or DDR5) per ~6 cores to scale well, so 8 cores with 2-channel RAM or 12-16 cores with 4-channel RAM are the sweet spots. With more cores than that, CPU frequency drops off faster than the extra RAM channels help. Scaling is non-linear, so more than 16 cores doesn't really help regardless.
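As a rough sanity check on the channels-per-core rule of thumb above, the two sweet-spot configs land at similar memory bandwidth per core. This is a sketch using the peak transfer rate of DDR5-4800 (38.4 GB/s per channel); the function name and the specific core counts are illustrative, not from the original post.

```python
# Bandwidth-per-core arithmetic behind the "2 channels per ~6 cores" rule of
# thumb. DDR5-4800 moves 4800 MT/s x 8 bytes = 38.4 GB/s per channel (peak).

def gbps_per_core(channels: int, cores: int, mtps: int = 4800) -> float:
    """Peak DRAM bandwidth per core in GB/s (hypothetical helper)."""
    channel_gbps = mtps * 8 / 1000  # 8 bytes transferred per cycle
    return channels * channel_gbps / cores

# The two "sweet spot" configs land in the same neighborhood:
print(gbps_per_core(2, 8))   # 2-channel, 8 cores   -> 9.6 GB/s per core
print(gbps_per_core(4, 14))  # 4-channel, 14 cores  -> ~11.0 GB/s per core
```

Past the sweet spot the ratio drops (e.g. 2 channels over 16 cores halves it to 4.8 GB/s per core), which is consistent with frequency and bandwidth starvation beating out extra cores.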
The minimum hardware list indicates a GPU is necessary, but it runs fine with the iGPU, so I'm sticking with that. Part of the PSU sizing was so I could add a discrete GPU if needed.
The solutions generally take 1.5-4 hours to compute, though if an optimization problem is run (i.e. a DOE), solutions are iterated back to back, which can take days of solid running. So the thermal solution needs to be designed to handle that.
I've done in-house benchmarking on this software off and on over the last 4 years (since we first got it), and based on that I suspected Alder Lake's P cores would be perfect (i.e. better than anything else on the market) if I could get 8 fast ones. Over and over I read that thermals were the primary constraint and that water cooling was needed to prevent thermal throttling under several types of heavy load.
The case was picked for its ability to fit the 420mm AIO CPU cooler without compromise, to house enough case fans to allow slight positive pressure, and for its built-in dust filters. I manually masked off selected openings in the case to control the airflow, preventing short-circuiting and directing the exhaust air where I want it (e.g. I blocked off most of the PCI slots, openings in the sheet metal, etc.).
If this AIO had not been able to keep up, my fallback was to get a 2U case, do custom water cooling, and hook the loops into our plant's chilled water (60 deg F). Though it would have been a fun project, it has proven unnecessary to run this CPU flat out on its 8 P cores, and it would have added complexity (the BOM cost would be about a wash, but in-house labor would be a lot higher). I'll be building 3 of these total; we have licensing for 3 simultaneous solvers, so another variable was "one big solver" vs. 3 smaller ones.
Here's a HWiNFO screen cap after running P95 Blend for about 55 minutes. I reset the counters about 4 minutes into the run. Temps are lower on the actual CFD software. In previous testing I've found P95 Blend to be the best proxy for the CPU frequency the actual CFD application will be able to achieve.
As you can see, VRM temps are healthy with this setup and the CPU is not thermally throttling; it is limited by the 241W power budget. Fan config is 5 NF-A14s as intake, 1 as CPU exhaust, and the AIO as exhaust out the top of the case (the stock fans are on the AIO, though I had higher-pressure Noctuas on hand in case a swap was deemed appropriate after testing).
Resulting noise at full speed is just above background in our office environment. It was more noticeable at home (I did most of the build and initial testing there), but with the large-diameter fans the frequency was not objectionable.
IPMI is currently set to "Full" fan mode. Since these will be sitting in a server room I may just leave them that way; I'm currently playing with IPMI fan minimum thresholds to see if I can do otherwise. Putting them on Optimal (variable speed) would help with long-term dust collection, as these are not active 24/7. If I do that, I'll move them all to the CPU fan headers so the PWM on the case fans scales with CPU temps (as that's the only load in the system).
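Assuming this is a Supermicro board (the "SM approved list" mention above suggests so), the fan mode and the lower fan thresholds can be poked from the OS with `ipmitool`. This is a sketch only: the raw fan-mode codes and sensor names (`FAN1`, etc.) vary by board generation and the threshold RPMs below are placeholder values, so check them against your board's docs before use.

```shell
# Read the current fan mode (Supermicro-specific raw command;
# typical modes: 0=Standard, 1=Full, 2=Optimal, 4=Heavy IO)
ipmitool raw 0x30 0x45 0x00

# Switch from Full to Optimal (variable speed)
ipmitool raw 0x30 0x45 0x01 0x02

# Lower the minimum-RPM thresholds so slow-spinning Noctuas don't trip the
# BMC into full-speed recovery. Order is non-recoverable, critical,
# non-critical; 200/300/400 RPM are placeholder values.
ipmitool sensor thresh FAN1 lower 200 300 400
```

The threshold change matters because low-RPM fans like the NF-A14 can idle below Supermicro's default lower-critical threshold, which makes the BMC ramp all fans to full in a repeating surge.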