Code Compilation, Web Server, Game Hosting... Epyc, Threadripper, or Consumer CPU?

yoneech

New Member
May 24, 2023
7
0
1
I've got a general-purpose home server with an FX-8350 that's running multiple VMs, each responsible for miscellaneous things: Webservers, game servers, Discord bots, SVN repos, etc.

Me and a friend have been developing a game in Unreal Engine 4 over the past few years. I tried creating a Jenkins server that'd allow us to offload the build process to my home server. Unfortunately I seem to have choked-out all the performance I can get from my trusty ~11 year-old CPU. The performance was abysmal, taking around 8 hours to compile and package a UE4 project.

I've been trying to research an upgrade path. I found a good deal on some older, used Epyc parts on eBay (7551P was primarily what I was looking at), but I'm concerned about whether this will serve my use-case. I can't seem to find data on UE4 build times on Epyc.

Threadripper also came to mind, but TR Pro is outside my price range and the 3960x seems to be as far as I can go on my budget.

Lastly, I could just upgrade to a high-end, consumer-grade CPU like an i9 or R9 X3D, but I'm not sure if I'd find those lacking for the workload I want to achieve.

As for why AMD and not Intel: No particular reason. The prices I found AMD parts for just seemed more within budget than the Xeons I was looking at.

I wanted to see if anyone had any suggestions! Server hardware is a bit of an unknown for me.

Thank you!
 

Syr

Member
Sep 10, 2017
55
20
8
Personally, I went the route of having more focused builds per class of task, but I get that is not always within the monetary, spatial, electrical, and/or spousal/partner/roommate approval budgets of someone's living situation.

When it comes to my production/non-lab servers:
  • I run my game servers off of a consumer-grade R7 7800X3D right now, because a lot of them seem to run at their best (especially with a lot of player automation/factories in games like modded Minecraft) with more cache. If that's not an option, then chips with high ST perf are often the next best thing, though it depends on the specific game server & whether you are using any mods on it.
  • Most of my containerized services, including my local devops infrastructure, are running off of an old E5 2699 v4 build. Generally any chip with a bunch of cores does fine in this role as long as the per-core/per-thread perf isn't abjectly terrible (basically just avoid the Xeon Phi chips, though it's practically impossible to get reasonably priced standard form factor boards for them anyway), so a cheap Epyc is a great choice. Plus you can use ECC RDIMMs (or even LRDIMMs on some platforms) for more capacity, which is not available on consumer platforms. Mine is pretty much at capacity though and can't take further component upgrades, so I'll have to migrate some things (likely starting with the heavyweight services that are consuming the most resources) over to a new system.
 

unwind-protect

Active Member
Mar 7, 2016
540
216
43
Boston
How long does an incremental build take? Change one line, how long until a program runs.

The affordable EPYC and TR aren't great for single-core performance which you need for tasks such as linking. You don't need the many PCIe lanes or the large RAM capacity.

I'd seriously think about a 16-core Ryzen system or a 13900K.
 

yoneech

New Member
May 24, 2023
7
0
1
How long does an incremental build take? Change one line, how long until a program runs.

The affordable EPYC and TR aren't great for single-core performance which you need for tasks such as linking. You don't need the many PCIe lanes or the large RAM capacity.

I'd seriously think about a 16-core Ryzen system or a 13900K.
UE4 can be a bit of a wild beast from my experience. I've been packaging builds on my development machine for awhile now (Formerly an i5 8400, currently an i7 13700k), and if UE4 decides it can do an incremental build it can take 5-10 minutes. Other times UE4 decides it needs to force a full rebuild, which can take upwards of 3~4 hours. Code compilation is only part of a build: there's also packaging and shader compilation, the latter of which appears very dependent on core/thread count. Shaders are usually cached but when a minor change is made to a single shader it can sometimes instigate a full shader recompile.

My biggest concern with consumer-grade parts would be whether it's got enough power to run that along with everything else (along with a lower max RAM capacity, meaning less headroom for future upgrades). But if you don't think that'd be an issue I'm certainly not opposed to that route!

Personally, I went the route of having more focused builds per class of task, but I get that is not always within the monetary, spatial, electrical, and/or spousal/partner/roommate approval budgets of someone's living situation.
I could potentially move my less demanding services to a compact, lower-power device! Space and electrical budget are definitely considerations of mine.

Thank you both for the input! :)
 

Syr

Member
Sep 10, 2017
55
20
8
[...]

I could potentially separate out my less demanding services into a smaller-form device! Space and electrical budget are definitely considerations of mine, but I could potentially move my less demanding services to a compact, lower-power device.

Thank you both for the input! :)
If you want to use small nodes for lighter work, I would recommend checking out STH's tiny mini micro series & general SFF PC reviews here https://www.servethehome.com/tag/mini-pc/. Patrick & the STH crew have reviewed quite a number of small form factor PCs.

My actual homelab (not the production 'homecloud' I was mentioning earlier) is almost entirely composed of low power SFF PCs and generally they are quite suitable for these low-intensity applications.

---

[Also, as a general note to anyone confused about my last message: if 2 servers didn't seem like I was going with specialized servers, that's because I omitted a bunch of single-purpose servers in my 'homecloud', since I didn't think listing them was relevant to helping yoneech. Those were all IO-constrained special-purpose builds (NAS, Nextcloud, security cameras, AI).]
 

unwind-protect

Active Member
Mar 7, 2016
540
216
43
Boston
UE4 can be a bit of a wild beast from my experience. I've been packaging builds on my development machine for awhile now (Formerly an i5 8400, currently an i7 13700k), and if UE4 decides it can do an incremental build it can take 5-10 minutes.
That's kind of insane. Most experts say that focus in humans is retained only for about 7 seconds, before you have to "reboot". Compile-link-start-debug times should be under that timeline for 1-line changes.

I would definitely throw quite a bit of money at a situation like that, but your problem is that you don't know how much parallelism there is in these builds. It is no use getting a 64 core dual EPYC if most of those incremental builds are only running a dozen jobs in parallel.

RAM-wise the situation with the civilian platforms is quite clear. 128 GB should be enough for 32 parallel compile jobs, even if there are memory hog linking jobs in there.
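
To put rough numbers on the two points above, here is a minimal sketch. It assumes a build that never runs more than a fixed number of jobs at once (the "dozen jobs" figure) and splits RAM evenly across jobs; the function names are made up for illustration and the even split is a simplification, since linking jobs will want more than compile jobs.

```python
def usable_cores(cores: int, max_parallel_jobs: int) -> int:
    """Cores that can actually be kept busy when the build caps out at max_parallel_jobs."""
    return min(cores, max_parallel_jobs)


def ram_per_job_gb(total_ram_gb: float, parallel_jobs: int) -> float:
    """Average RAM budget per job if memory were split evenly (a simplification)."""
    return total_ram_gb / parallel_jobs


if __name__ == "__main__":
    # 64 cores, but the build only ever runs a dozen jobs at once.
    busy = usable_cores(cores=64, max_parallel_jobs=12)
    print(f"{busy} of 64 cores busy ({busy / 64:.0%} of the chip)")

    # 128 GB shared by 32 parallel compile jobs.
    print(f"{ram_per_job_gb(128, 32):.1f} GB available per job on average")
```

If the real job count turns out to be much higher than a dozen, the picture changes, which is why measuring it first matters.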
 

mikegrok

New Member
Feb 26, 2023
16
4
3
Alabama, USA
Here is the result of running geek bench on my epyc 9124 with 96GB of ram in 3 sticks from windows:

The single threaded performance on that is
100% faster than the 7xx2 processors and
60% faster than the 7xx3 processors
While the 7xx2 processors are fairly cheap, the 7xx3 processors cost about the same as a Genoa, so why get the last generation?
The fastest ryzen (and intel) are 50% faster than my genoa epyc single threaded, and about the same multi threaded.

An interesting alternative is the Epyc 8xx4 line that was announced a few weeks ago. It has about 20% lower CPU clocks than Genoa and uses Zen 4c cores (which are a bit slower), on a cut-down Epyc 9xx4 IO die.
You get 6 ecc ddr5 channels (up to 12 slots)
96 pcie lanes
8-64 cores
8 cores for under $450
16 cores for under $700
Motherboards and CPUs hit retail this week.

I am currently using 44 PCIe lanes. Ryzen can't touch that. I usually end up simulating complex situations, including a dozen VMs performing complex tasks. When that happens I can add more RAM to my computer than Ryzen can touch. One of the last complex tasks I personally optimized (a 1-person job) let my client bring 300 billion dollars of hardware under paid management. On another, I completed a job in 6 months that was apparently identical to another task that was halfway done with 36 man-years invested. I just wanted a box set up and ready to go do whatever I find a need for. My last computers had patched Intel firmware, such that when I was under a time crunch I was not able to spin up the VMs I needed to complete the project on time.

This computer is identical to one I configured on the dell site for 20k without a gpu.
$800 motherboard Supermicro h13ssl-nt
$1200 Epyc 9124
$150 96GB ram
$369 Intel u.2 SSD d7-p5600 6.4TB, it was listed as used, but according to the manufacturer's tool had zero hours on time, rated 5 years at 3dwpd.
$600 pny nvidia rtx 4070 , exactly 2 slots wide
2 $58 118GB optane
I already had a PSU, case, some hard drives, and an Nvidia GTX 750 Ti, which I have my monitor connected to so that more VRAM is available for compute on the 4070.
 

yoneech

New Member
May 24, 2023
7
0
1
If you want to use small nodes for lighter work, I would recommend checking out STH's tiny mini micro series & general SFF PC reviews here https://www.servethehome.com/tag/mini-pc/.
Oh, awesome! Thank you so much! I may very well end up going this route to help more evenly distribute the overall workload!

That's kind of insane. Most experts say that focus in humans is retained only for about 7 seconds, before you have to "reboot". Compile-link-start-debug times should be under that timeline for 1-line changes.
Haha yeah, it's really strange. I can definitely confirm that on my 13700k, one-line code compile times for UE4 are about 30 seconds (and that's also using the "Live Coding" feature). Certainly not the worst, but Unreal does some funky stuff behind the scenes.

Here is the result of running geek bench on my epyc 9124 with 96GB of ram in 3 sticks from windows:
Unfortunately, that build seems to be a bit outside my price range! But I do see your point, and appreciate the feedback! :)
 

jamesdwi

New Member
Oct 8, 2023
8
2
3
UE4 can be a bit of a wild beast from my experience. I've been packaging builds on my development machine for awhile now (Formerly an i5 8400, currently an i7 13700k), and if UE4 decides it can do an incremental build it can take 5-10 minutes. Other times UE4 decides it needs to force a full rebuild, which can take upwards of 3~4 hours. Code compilation is only part of a build: there's also packaging and shader compilation, the latter of which appears very dependent on core/thread count. Shaders are usually cached but when a minor change is made to a single shader it can sometimes instigate a full shader recompile.

My biggest concern with consumer-grade parts would be whether it's got enough power to run that along with everything else (along with a lower max RAM capacity, meaning less headroom for future upgrades). But if you don't think that'd be an issue I'm certainly not opposed to that route!



I could potentially separate out my less demanding services into a smaller-form device! Space and electrical budget are definitely considerations of mine, but I could potentially move my less demanding services to a compact, lower-power device.

Thank you both for the input! :)
These are the reasons I went with EPYC and AMD as well: 8 DIMM slots that can take DIMMs as large as any homelabber can afford. None of this 32GB RAM total on board. Along with expansion slots. I researched Intel's latest CPUs and systems, and couldn't believe the lack of PCIe lanes. Some appeared to have 2x PCIe x16 slots, but only one was Gen5, and in some cases, if you plug cards into both, the lanes get split between them. Other systems had 1x PCIe x16 Gen5 and 1x PCIe x1 Gen3 slot. On motherboards with more slots, most were running through the on-board chipset, killing performance.

I didn't need the fastest, but I wanted to be able to expand the system with more new toys later, as more cool used stuff comes up on eBay. 100 gigabit networking is getting cheaper by the month.
 

Syr

Member
Sep 10, 2017
55
20
8
Actually, that is something that might be impacting your UE4 build times in addition to CPU capacity: did you find yourself running out of RAM and hitting swap/pagefile? Not sure if the FX-8350 can even take more than 16GB, since it's an old DDR3-era platform. I know some DDR3 desktop platforms could go up to 32GB of RAM (I had an Ivy Bridge PC back in an office that maxed out at 32GB), but I'm not sure if AM3 could do that.

I don't know how far along in development your game is, but you should probably make sure that the new system can handle the projected memory footprint of your final game's build job with some good margin to spare, or at least has the ability to easily add more RAM (in case your estimate is off).
 

mikegrok

New Member
Feb 26, 2023
16
4
3
Alabama, USA
Actually that is something that might be impacting your UE4 build times in addition to CPU capacity - did you find yourself running out of ram and hitting swap/pagefile? Not sure if the FX-8350 can even take more than 16GB, since its an old DDR3 era platform. I know some DDR3 desktop platforms could go up to 32GB of ram (I had an ivybridge PC back in an office that maxed out at 32GB), but not sure if AM3 could do that.

I dont know how far along in development your game is, but you should probably make sure that it can handle the projected memory footprint of your final game's build job with some good margin to spare, or the ability to easily add more ram (in case your estimate is off).
Having enough RAM makes a HUGE difference. SSD speed is often about 1/20th of RAM speed, and hard disks are about 1/1000th of RAM speed.

If you are using either to supplement insufficient ram capacity, that will make an enormous difference.

Another thing to look at is bring up process monitor. When you click build, there will be a small cpu spike, then a delay then a larger sustained cpu spike. The delay is the computer loading the required resources off of media. If it is a short delay, great, otherwise you may want to change your storage media.

When I had built my computer I had stable diffusion on a hard drive. It would take 6 minutes to load the files off of disk, then 20 seconds to process them. Moving the files to a fast ssd dropped that 6 minute delay to 2 seconds. So 6 minutes and 20 seconds to 22 seconds. I think that a slow NVME ssd would take around 20 seconds to read the content so a 40 second iteration instead of a 22 second iteration.
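
A rough way to check whether storage is the culprit is to time a plain sequential read of one of the big files the job loads. The sketch below is only an illustration: the function name and chunk size are arbitrary choices, and a second run on the same file will mostly measure the OS file cache rather than the disk, so use a file that has not been read recently.

```python
import time


def read_throughput_mb_s(path, chunk_size=8 * 1024 * 1024):
    """Read the file once, front to back, and return (bytes_read, MB per second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so huge files do not need to fit in RAM.
        while chunk := f.read(chunk_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, mb_s


# Example use (the path is a placeholder, not a real file):
#   size, speed = read_throughput_mb_s("some_large_asset.bin")
#   print(f"read {size / 1e9:.2f} GB at {speed:.0f} MB/s")
```

Dividing the total size of the files a job needs by the measured MB/s gives a ballpark for the loading delay described above.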

Someone is dumping these SSDs on the market. Mine was $369 for 6.4TB. Intel D7-p5600. with zero hours of use. According to dell their retail price should be $7,800.
 

yoneech

New Member
May 24, 2023
7
0
1
I was definitely hitting a RAM cap on my build server (other services started getting starved of resources), but I seem to recall looking for bottlenecks on my development machine and not finding any obvious ones.

HDD vs SSD speeds were severely impacting my code compile times on my development machine initially: It originally took about 10~11 minutes to compile my code. I noticed my HDD resources were being maxed out, and picked up an M.2 NVMe to move everything onto. That brought my compile times down to 20~30 seconds.

Build times, however, are a different story... Shader compilation appears to take the longest out of all of the steps. On both my build server and my development machine, I would say this takes up 60-75% of the total build time. I can't seem to find where the bottleneck is happening, and researching the topic results in many other confused developers asking the same question with the occasional smug reply of, "It works on my machine, get better hardware!"

My best guess is that it's heavily dependent on multicore processing, since Unreal Engine spins up separate worker processes to handle compiling the shaders in parallel. Shaders do get cached, which would be storage speed based, but as mentioned: it still takes awhile with an NVMe SSD.

Ultimately, though, I'd like to hope that doing nightly builds would keep the cache up to date frequently enough that UE4 wouldn't decide to do a full rebuild. If the average build takes 2-3 hours, that's much better than 8. A bit longer than I'd like (I'd prefer something around 30 minutes-1 hour for the sake of quicker iteration), but in that sense it may just be wiser for me to go with consumer-grade parts seeing as I can definitely achieve that with my 13700k.
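
As a back-of-the-envelope check on what more cores could buy, Amdahl's law can be applied to the numbers above. This is only a sketch under loud assumptions: that the shader step (60-75% of the build) scales perfectly with thread count, that nothing else in the build scales at all, and that per-core speed stays the same. None of that has been measured for UE4 here, and the 4x thread ratio is a made-up example.

```python
def amdahl_speedup(parallel_fraction: float, core_ratio: float) -> float:
    """Overall speedup when only parallel_fraction of the work gets core_ratio times more cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / core_ratio)


def estimated_hours(baseline_hours: float, parallel_fraction: float, core_ratio: float) -> float:
    """Projected build time after scaling the parallel part across more cores."""
    return baseline_hours / amdahl_speedup(parallel_fraction, core_ratio)


if __name__ == "__main__":
    # 8-hour build, shader compilation at 60% and 75% of the total,
    # hypothetical 4x increase in threads (e.g. 8 -> 32).
    for shader_fraction in (0.60, 0.75):
        hours = estimated_hours(8.0, shader_fraction, core_ratio=4)
        print(f"shader share {shader_fraction:.0%}: about {hours:.1f} h")
```

Under those assumptions the 8-hour build lands somewhere around 3.5 to 4.4 hours, which shows how quickly the non-scaling part of the build starts to dominate.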
 

unwind-protect

Active Member
Mar 7, 2016
540
216
43
Boston
I thought that Unreal Engine always allows for runtime compilation of shaders (at the expense of stutter of course, but that should be acceptable during development).
 

Syr

Member
Sep 10, 2017
55
20
8
From what I've seen, shader compilation is unfortunately just slow in general if you have a lot of shaders unless you have a ton of cores. Some games precompile all of their shaders on the loading screen on first-run after installation (or after a patch that changes the shaders) instead of doing it at runtime, and even for more modern high-end systems it can still take a considerable amount of time unless you are running a high corecount workstation or server chip.
 

unwind-protect

Active Member
Mar 7, 2016
540
216
43
Boston
From what I've seen, shader compilation is unfortunately just slow in general if you have a lot of shaders unless you have a ton of cores. Some games precompile all of their shaders on the loading screen on first-run after installation (or after a patch that changes the shaders) instead of doing it at runtime, and even for more modern high-end systems it can still take a considerable amount of time unless you are running a high corecount workstation or server chip.
Well, yes, I sat through this just yesterday.

But as the developer of that game, don't you get to choose between precompilation and runtime compilation?
 

CyklonDX

Well-Known Member
Nov 8, 2022
1,177
404
83
While I have not read any of the posts, here's my response from my experience.
You need to look at it like this:

More threads = More load capacity
Higher clocks = Faster execution time


Most compilation happens on a single core per compiler process (GCC itself is not really multithreaded; parallelism comes from running many compile jobs at once).
Most games / game servers prefer higher clocks, since multithreading was written in as an afterthought.
Most webservers like IIS/Apache are built to use as many cores as possible, as it increases the load capacity (this is true for SQL too, though high clocks are also good on SQL).


I think your best choice is the TR 3960X as a middle ground. It has quite a few cores (24), but you do not suffer in clocks as badly as you do on Epycs.
Thus for KVM/VMs you have plenty of capacity to set them up, and decent performance on them all. It also has plenty of cache, a RAM limit of 256GB on the TRX40 platform (the 2TB figure applies to TR Pro), and plenty of PCIe lanes (64 4.0 lanes in total). Just make sure to keep it cool, under 80°C under load: they are known for dying quickly.

Intel CPUs are quite behind, and typically 2x as expensive for a similar tier of performance in the server space.

Desktop CPUs from both are quite good, but you don't want the weak threads on the Intel side, unless you can assign them to tasks like, let's say, Apache or something that doesn't need to be that fast and is limited by feature-set requirements.
 

rtech

Active Member
Jun 2, 2021
343
125
43
TR 3960X vs 7950X

Buying Zen 2 TR does not make any sense if you do not need the PCIe lanes, large RAM capacity, or ECC.
TR is generally not that great a purchase: AMD releases them too long after the desktop CPUs, and they are quite overpriced.

Compilation typically scales well with threads, so I personally would be looking at a 64- or 48-core EPYC (maybe even dual socket) if cheaper than TR, or a 7950X if you don't know.
 

unwind-protect

Active Member
Mar 7, 2016
540
216
43
Boston
Compilation typically scales well with threads so i personally would be looking at 64/48 EPYC (mayber even dual socket) if cheaper than TR or 7950X if you dont know.
But we don't know what the maximum level of parallelism is that the UE build achieves. Without that buying a low-single-core-speed platform is risky no matter how many cores it has.
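
One way to get at that number is to record when each compile or shader job starts and ends, then count how many overlap at the busiest moment. The sketch below assumes such (start, end) timestamps are already available, for example scraped from a build log or a process monitor; how to extract them from an Unreal build is not covered here, and the sample data is invented.

```python
def peak_concurrency(intervals):
    """Return the largest number of jobs running at the same time.

    intervals: iterable of (start, end) timestamps, one pair per job.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # job begins
        events.append((end, -1))    # job finishes
    # At equal timestamps, process finishes (-1) before starts (+1) so that
    # back-to-back jobs are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))

    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


if __name__ == "__main__":
    # Invented sample: four jobs, in seconds since the build started.
    sample = [(0, 10), (2, 5), (4, 8), (10, 12)]
    print("peak parallel jobs:", peak_concurrency(sample))
```

If the peak stays well below the core count of the machine being considered, the extra cores would mostly sit idle during that build.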
 

rtech

Active Member
Jun 2, 2021
343
125
43
It's good to remember that Epic refers to compilation performance alone, which is completely different from actual core utilization in real gameplay. For example, Unreal Engine 4 can already utilize far more than 8-12 cores for compilation tasks, but most games running on Unreal Engine 4 can only use eight cores when it comes to actual gameplay performance.
Source

The source is dated 2021, though. UE5 might mix this up in future, and that is where a 64-core, 2x32, or even bigger dual-socket setup would shine.
It could be that Epic will port these compilation optimizations to UE4, though I do not know enough about their corporate culture to say whether they would do that.
 

unwind-protect

Active Member
Mar 7, 2016
540
216
43
Boston
Source

The source is dated 2021, though. UE5 might mix this up in future, and that is where a 64-core, 2x32, or even bigger dual-socket setup would shine.
It could be that Epic will port these compilation optimizations to UE4, though I do not know enough about their corporate culture to say whether they would do that.
So up to and including UE4 they weren't "optimized for Threadripper" at all? Does that mean too few simultaneous processes? The article is a little light on technicals.