First, how do I acquire the MI50s? I tend to like to skate uphill, but this really piqued my interest when others mentioned they can be acquired for around $150.
Alibaba, that's where I got all 17 of my Mi50s. Just search, sort by sold, and message a few sellers with whatever questions you have. The first one I messaged replied promptly and answered all my questions clearly, so I went with them.
For price, it's like Jensen said: the more you buy, the more you save. Price per card doesn't have a lot of wiggle room, but shipping doesn't scale linearly. It's like $50 for one card, $120 for five cards, and $200 for 12 cards. Ask for DDP shipping (Delivered Duty Paid); it's more expensive upfront but saves you a lot on fees. I live in Germany, so things like import taxes are still predictable, YMMV.
Where the Mi50 shines is large MoE models. Three Mi50s run gpt-oss-120B at around half the token generation speed of three 3090s, but at around 1/4 the prompt processing speed. The larger the model, the more attractive Mi50s become. I can run Qwen3 Coder 480B at Q4 with room to spare for context, all in VRAM, and still get above 20 t/s generation. Or I can run gpt-oss-120B plus Gemma 27B at Q8 plus Devstral 24B at Q8 plus a TTS model at Q8 or even FP16, all at the same time!
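For anyone wanting to reproduce this kind of multi-card setup, here's a hedged sketch of what serving a big MoE across three Mi50s can look like with a ROCm build of llama.cpp. The model filename, context size, and port are placeholders, not the author's exact setup:

```shell
# Sketch: gpt-oss-120B split across three Mi50s via llama.cpp (ROCm build).
# Filenames and numbers below are illustrative assumptions.
./llama-server \
  -m gpt-oss-120b-Q4_K_M.gguf \
  -ngl 999 \
  --split-mode layer \
  --tensor-split 1,1,1 \
  -c 32768 \
  --port 8080
# -ngl 999         offload all layers to the GPUs
# --split-mode     "layer" places whole layers per card, minimizing inter-GPU traffic
# --tensor-split   relative share of weights per GPU (even across the 3 cards)
# -c               context window; shrink or grow it to fit the leftover VRAM
```

Running several smaller models at once is just multiple server instances, each pinned to its own cards via `HIP_VISIBLE_DEVICES` and its own port.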
Second, I have a couple of Epyc builds:
- No.1 is a Milan single processor/64 cores with a TB of memory. This was my first build, with an RTX 2000 Ada 16GB and a 3090 24GB.
- No.2 is a Milan dual processor/64 cores that I got for stupid cheap. I was going to rebuild the single Milan and sell it, but I'll take note of what you just wrote about dual CPUs and inference.
- No.3 is a Genoa dual processor/32 cores. I have the processors and motherboard but haven't built it yet.
Third, I have my Xeon Sapphire Rapids 56 core with an ASUS Pro WS W790E-SAGE SE (both of which I also got cheap); that's my daily desktop driver.
I wanted to get your opinion on what I should be using for AI and what I should be using as my daily driver. I water cool mostly everything; the 3090 is the only GPU that I water cool. But I've been so focused on completing my solar build that I haven't set aside any time for my AI builds. My friends say I may have a problem (hoarder).
Go with the cheapest, IMO. The Mi50 has plenty of memory and great memory bandwidth, but not enough compute to take advantage of that bandwidth in most scenarios. I put six Mi50s in an X11DPG-QT because I got that board for $125 and two ES Cascade Lake Xeons (QQ89) for $90 each. I'm prepping a second build around either an EPC621D8A or an H11SSL because I got both of those very cheap too, ditto for the CPUs.
IMO, DDR5 is not worth it for LLMs. The platforms are a lot more expensive but provide very little benefit; if you already have it, use it. Single socket will perform better than dual socket if you offload to system RAM; otherwise, it doesn't make much of a difference.