EU [WTB] Supermicro H12SSL or ASRock ROMED8-2T | 1x EPYC 7xx3 with 32C/64T or better | 256GB ECC RAM


redblood

New Member
Oct 2, 2025
25
3
3
I'm trying to put together a server, but I'm starting from scratch.

This will mainly be used for SVT-AV1 ffmpeg transcoding and LLM inference, so it won't be on all the time.
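To give a sense of the workload, the transcode jobs are roughly of this shape; this is only a minimal sketch, and the directory names, preset and CRF are illustrative placeholders rather than tuned settings:

```python
# Rough sketch of a batch SVT-AV1 transcode pass with ffmpeg.
# Directory names, preset and CRF are illustrative placeholders.
import subprocess
from pathlib import Path

SRC = Path("incoming")   # hypothetical source folder
DST = Path("av1")        # hypothetical output folder
DST.mkdir(exist_ok=True)

for video in sorted(SRC.glob("*.mkv")):
    out = DST / video.name
    subprocess.run(
        ["ffmpeg", "-i", str(video),
         "-c:v", "libsvtav1",            # SVT-AV1 encoder
         "-preset", "6", "-crf", "30",   # speed/quality trade-off, adjust to taste
         "-c:a", "copy",                 # keep the original audio
         str(out)],
        check=True,
    )
```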

What I currently have in mind is:

- Supermicro H12SSL or ASRock ROMED8-2T. I'd prefer the ROMED8 for future GPU upgrades
- 1x EPYC 7xx3 with at least 32C/64T
- 8x 32GB or 64GB ECC RAM

I don't know what I can get this for, but my budget for this plus the case, PSU and storage is 1700€ plus shipping.

I am based in France.

Also, if anybody has tips for a beginner, they're most welcome.
 
Last edited:

drdepasquale

Active Member
Dec 1, 2022
131
47
28
I'm trying to put together a server, but I'm starting from scratch.

This will mainly be used for SVT-AV1 ffmpeg transcoding and LLM inference, so it won't be on all the time.

What I currently have in mind is:

- Supermicro H12DSi
- 2x EPYC 7551P or better
- 256GB ECC RAM

I don't know what I can get this for, but my budget for this plus the case, PSU and storage is 1200€ plus shipping.

I am based in France.

Also, if anybody has tips for a beginner, they're most welcome.
The budget is going to be tough here for a dual socket system such as this because the motherboard alone often costs as much as your entire budget. The single socket H12 motherboards are more common and cost less. A dual socket system is only beneficial here if your workload needs more than 64 cores, 8 channels of memory, and 128 PCIe lanes.
 

redblood

New Member
Oct 2, 2025
25
3
3
Thanks, but I don't only have an inference workload. I also run a Kubernetes cluster with many microservices, including a couple of websites with ~200 concurrent users, plus AV1 transcoding, video surveillance with plate and face recognition, a Plex media library, etc.
Maybe I can get away with a single-socket board but with a high core count on a 2nd or 3rd gen EPYC. The board has to have enough PCIe bandwidth, though, so I can eventually add MI50 or RTX 3090 GPUs; that will depend on the budget and how the market shifts.
 

iraqigeek

Member
Sep 17, 2018
97
65
18
I have an H12DSi and I can tell you it's useless for running LLMs. Even if you don't want to use both CPUs for inference, it's a hassle to get inference working properly on dual CPUs with the current open-source tools because it's hard to control NUMA memory allocation. It's so frustrating that the board is basically sitting there collecting dust.
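(For context, the workaround usually amounts to pinning the whole inference process to one socket so allocations stay in local memory. A minimal sketch of what I mean; the server binary and model file below are just placeholders, and numactl has to be installed:)

```python
# Minimal sketch: bind an inference server to a single NUMA node with numactl
# so threads and memory allocations stay on one socket.
# "llama-server" and "model.gguf" are placeholders.
import subprocess

NODE = "0"  # NUMA node / socket to pin everything to

subprocess.run(
    ["numactl",
     f"--cpunodebind={NODE}",  # run only on this node's cores
     f"--membind={NODE}",      # allocate memory only from this node
     "llama-server",           # placeholder inference binary
     "-m", "model.gguf"],      # placeholder model file
    check=True,
)
```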

You'll be much better off with a single-socket EPYC: a high-core-count Milan or Rome with 256MB of L3 cache. I also think a single 64-core Milan will have much higher throughput for non-AI workloads than a dual Naples system, simply because of the architecture and the big improvements in Zen 3 cores versus Zen 1.

I'm running a six-MI50 rig now on a dual LGA3647 board, but that's because I got the motherboard very cheap and I won't be running any models that don't fit in VRAM; not a single byte of the model or context will spill into system RAM. A single EPYC can easily handle four MI50s if you have the right board (e.g. the ASRock EPYCD8). I'm planning such a build because the MI50s are just great for the price.

Obligatory pic of the 6x MI50 rig. Cables are much tidier now. The top two MI50s are now connected via a cheap Chinese bifurcation adapter. The CPUs are cooled with Asetek 570LC coolers. The top two MI50s are mounted to the radiator via a fabricated aluminum plate and a Lian Li O11DXL-1 upright GPU mount, with a high-density foam "pillar" supporting the other end. The cards are cooled with a 3D-printed shroud and Arctic S8038-7K fans connected to the motherboard fan headers. Fun fact: Supermicro X11 boards (at least the ones I tried) detect the GPUs in the BMC and adjust fan speed based on GPU temperature automatically!

 
Mar 12, 2020
78
30
18
I have an H12DSi and I can tell you it's useless for running LLMs. Even if you don't want to use both CPUs for inference, it's a hassle to get inference working properly on dual CPUs with the current open-source tools because it's hard to control NUMA memory allocation. It's so frustrating that the board is basically sitting there collecting dust.

You'll be much better off with a single-socket EPYC: a high-core-count Milan or Rome with 256MB of L3 cache. I also think a single 64-core Milan will have much higher throughput for non-AI workloads than a dual Naples system, simply because of the architecture and the big improvements in Zen 3 cores versus Zen 1.

I'm running a six-MI50 rig now on a dual LGA3647 board, but that's because I got the motherboard very cheap and I won't be running any models that don't fit in VRAM; not a single byte of the model or context will spill into system RAM. A single EPYC can easily handle four MI50s if you have the right board (e.g. the ASRock EPYCD8). I'm planning such a build because the MI50s are just great for the price.

Obligatory pic of the 6x MI50 rig. Cables are much tidier now. The top two MI50s are now connected via a cheap Chinese bifurcation adapter. The CPUs are cooled with Asetek 570LC coolers. The top two MI50s are mounted to the radiator via a fabricated aluminum plate and a Lian Li O11DXL-1 upright GPU mount, with a high-density foam "pillar" supporting the other end. The cards are cooled with a 3D-printed shroud and Arctic S8038-7K fans connected to the motherboard fan headers. Fun fact: Supermicro X11 boards (at least the ones I tried) detect the GPUs in the BMC and adjust fan speed based on GPU temperature automatically!
First, how do I acquire the MI50s? I tend to like to skate uphill, but this really piqued my interest when others mentioned they can be acquired for around $150.
Second, I have a couple of Epyc builds:
  • No. 1 is a single-processor Milan, 64 cores, with a TB of memory. This was my first build, with an Ada RTX 2000 16GB and a 3090 24GB.
  • No. 2 is a dual-processor Milan, 64 cores, that I got stupid cheap. I was going to rebuild the single Milan and sell it, but I'll take note of what you just wrote about dual CPUs and inference.
  • No. 3 is a dual-processor Genoa, 32 cores; I have the processors and motherboard but haven't built it yet.
Third, I have my 56-core Xeon Sapphire Rapids with an ASUS Pro WS W790E-SAGE SE (both of which I also got cheap); that's my daily desktop driver.
I wanted to get your opinion on what I should be using for AI and what I should be using as my daily driver. I water-cool mostly everything; the 3090 is the only GPU that I water-cool. But I've been so focused on completing my solar build that I haven't set aside any time for my AI builds. My friends say I may have a problem (hoarding).
 

iraqigeek

Member
Sep 17, 2018
97
65
18
First, how do I acquire the MI50s? I tend to like to skate uphill, but this really piqued my interest when others mentioned they can be acquired for around $150.
Alibaba, that's where I got all 17 of my MI50s. Just search, sort by sold, and message a few sellers with whatever questions you have. The first one I messaged was also the first to reply, and they answered all my questions clearly, so I went with them.

On price, it's like Jensen said: the more you buy, the more you save. The price per card doesn't have a lot of wiggle room, but shipping doesn't scale linearly; it's something like 50 for one card, 120 for five cards, and 200 for twelve cards. Ask for DDP shipping (Delivered Duty Paid); it's more expensive upfront but saves you a lot on fees. I live in Germany, so things like import taxes are still predictable, YMMV.

Where the MI50 shines is large MoE models. Three MI50s run gpt-oss-120b at around half the token generation speed of three 3090s, but at around 1/4 of the prompt processing speed. The larger the model, the more attractive MI50s become. I can run Qwen3 Coder 380B at Q4 with room to spare for context, all in VRAM, and still get above 20 t/s generation, or I can run gpt-oss-120b plus Gemma 27B at Q8 plus Devstral 24B at Q8 plus a TTS model at Q8 or even FP16, all at the same time!

Second, I have a couple of Epyc builds:
  • No. 1 is a single-processor Milan, 64 cores, with a TB of memory. This was my first build, with an Ada RTX 2000 16GB and a 3090 24GB.
  • No. 2 is a dual-processor Milan, 64 cores, that I got stupid cheap. I was going to rebuild the single Milan and sell it, but I'll take note of what you just wrote about dual CPUs and inference.
  • No. 3 is a dual-processor Genoa, 32 cores; I have the processors and motherboard but haven't built it yet.
Third, I have my 56-core Xeon Sapphire Rapids with an ASUS Pro WS W790E-SAGE SE (both of which I also got cheap); that's my daily desktop driver.
I wanted to get your opinion on what I should be using for AI and what I should be using as my daily driver. I water-cool mostly everything; the 3090 is the only GPU that I water-cool. But I've been so focused on completing my solar build that I haven't set aside any time for my AI builds. My friends say I may have a problem (hoarding).
Go with the cheapest, IMO. The MI50 has plenty of memory and great memory bandwidth, but not enough compute to take advantage of that bandwidth in most scenarios. I put six MI50s in an X11DPG-QT because I got that board for $125 and two ES Cascade Lake Xeons (QQ89) for $90 each. I'm prepping a second build around either an EPC621D8A or an H11SSL because I got both of those very cheap as well, ditto for the CPUs.

IMO, DDR5 is not worth it for LLMs. The platforms are a lot more expensive but provide very little benefit; that said, if you already have it, use it. A single socket will perform better than dual socket if you offload to system RAM; otherwise it doesn't make much of a difference.
 

iraqigeek

Member
Sep 17, 2018
97
65
18
@iraqigeek I couldn't find a seller offering an MI50 below $200, could you share yours?
My seller ran out of stock over a month ago. I just checked on Alibaba and it seems prices have jumped a lot since the last time I looked about two weeks ago. It seems stocks are running out.
 

redblood

New Member
Oct 2, 2025
25
3
3
My seller ran out of stock over a month ago. I just checked on Alibaba and it seems prices have jumped a lot since the last time I looked about two weeks ago. It seems stocks are running out.
This is probably due to MI50 support being added back to ROCm.
 

iraqigeek

Member
Sep 17, 2018
97
65
18
This is probably due to MI50 support being added back to ROCm.
Where did you get that? I haven't heard anything about AMD changing the MI50's support status. The only thing that changed is that the knowledge of how to make recent versions of ROCm work with the MI50 (i.e. copying the tensorfiles) has become widespread, whereas before it required a Google search.
 

redblood

New Member
Oct 2, 2025
25
3
3
Where did you get that? I haven't heard anything about AMD changing the MI50's support status. The only thing that changed is that the knowledge of how to make recent versions of ROCm work with the MI50 (i.e. copying the tensorfiles) has become widespread, whereas before it required a Google search.
 

iraqigeek

Member
Sep 17, 2018
97
65
18
It was never removed from that file. AMD hasn't ended support yet; the cards are just marked as deprecated. ROCm 7 doesn't support anything before RDNA, and even though 6.4 is still supposed to, the hipBLAS build provided by AMD doesn't contain the tensorfiles for gfx906.

This has been the situation for months and still is. To get the MI50 to work with ROCm, you need to either build hipBLAS yourself for gfx906 or extract the gfx906 tensorfiles from a build that includes them, like Arch Linux's.
 

redblood

New Member
Oct 2, 2025
25
3
3
It was never removed from that file. AMD hasn't ended support yet; the cards are just marked as deprecated. ROCm 7 doesn't support anything before RDNA, and even though 6.4 is still supposed to, the hipBLAS build provided by AMD doesn't contain the tensorfiles for gfx906.

This has been the situation for months and still is. To get the MI50 to work with ROCm, you need to either build hipBLAS yourself for gfx906 or extract the gfx906 tensorfiles from a build that includes them, like Arch Linux's.
I installed it on my Linux box with 7.0.2, but I had to rebuild and obtain the proper Tensile files. I thought that with TheRock it would no longer be necessary to do what I had to do.
 

iraqigeek

Member
Sep 17, 2018
97
65
18
Yep, you need to either build hipBLAS yourself or copy them from a downloaded binary that has them (e.g. Arch Linux's ROCm 6.4.x packages).

But to be clear, it's not much of a hassle given how cheap the cards were, and once you know what to do, it's easy to script the process so it's repeatable across builds or when updating to a newer version.
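Something along these lines is all the scripting amounts to. Treat both paths as assumptions from my own setup: the donor directory is wherever you extracted a build that still ships the gfx906 files (e.g. the Arch Linux package), and the destination depends on where your ROCm is installed:

```python
# Hedged sketch: copy the gfx906 Tensile code objects from a donor ROCm build
# into the installed tree. Both paths are assumptions and will differ per setup.
import shutil
from pathlib import Path

SRC = Path("/tmp/arch-rocm/opt/rocm/lib/rocblas/library")  # donor build (assumption)
DST = Path("/opt/rocm/lib/rocblas/library")                # installed ROCm (assumption)

copied = 0
for f in SRC.glob("*gfx906*"):
    shutil.copy2(f, DST / f.name)  # add/overwrite the gfx906 kernel files
    copied += 1
print(f"copied {copied} gfx906 files into {DST}")
```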
 

redblood

New Member
Oct 2, 2025
25
3
3
Yep, you need to either build hipBLAS yourself or copy them from a downloaded binary that has them (e.g. Arch Linux's ROCm 6.4.x packages).

But to be clear, it's not much of a hassle given how cheap the cards were, and once you know what to do, it's easy to script the process so it's repeatable across builds or when updating to a newer version.
I'm not saying it was hard, it was very easy; I just thought TheRock would bring back support so it would be even easier. What I did have trouble with, though, was adding the proper files to a container image so it uses the right ROCm. For example, I want Frigate to use my AMD card, and I can't put the proper files in the container without it ending up 40GB in size...
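The only idea I've come up with so far is to prune every arch-specific kernel file that isn't for gfx906 before the final image layer; an untested sketch, where the library path is an assumption taken from my host install:

```python
# Untested sketch: shrink a ROCm layer by deleting arch-specific kernel files
# for every GPU target except gfx906. The library path is an assumption.
from pathlib import Path

LIB = Path("/opt/rocm/lib/rocblas/library")  # assumption: Tensile library dir
KEEP = "gfx906"

removed = 0
for f in LIB.iterdir():
    # arch-specific files carry a gfx tag in their name; keep gfx906
    # and anything arch-agnostic
    if f.is_file() and "gfx" in f.name and KEEP not in f.name:
        f.unlink()
        removed += 1
print(f"removed {removed} non-{KEEP} files from {LIB}")
```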
 

redblood

New Member
Oct 2, 2025
25
3
3
Is this your first experience with AMD?
Yes... first AMD GPU. I had NVIDIA before, and professionally we only use NVIDIA.

But it wasn't as hard as I expected; they get a truly bad rap. It takes less than an hour to sort out, even on pre-RDNA hardware.
 

iraqigeek

Member
Sep 17, 2018
97
65
18
But it wasn't as hard as I expected; they get a truly bad rap. It takes less than an hour to sort out, even on pre-RDNA hardware.
It's not as bad now, but it was way worse about a year ago.

AMD's attitude used to be: here's the hardware and basic driver, knock yourself out getting it to work. ROCm's performance was a shit show for everything until about six months ago. ROCm 6.4 had some 30% performance improvements in some scenarios regardless of whether your hardware was new or old. That tells you how bad things were.

The MI50 flooded the market at the perfect time, just as AMD got their software stack into decent shape.