Cheapest way to get a temporary Epyc 4004/4005?

altano

Active Member
Sep 3, 2011
359
227
43
Los Angeles, CA
I pulled a Supermicro H13SAE-MF motherboard out of production because of a failing NVMe slot, and I now need a temporary CPU to test it.

Unnecessary backstory: I RMA'd the board with the note that it takes about 2-40 hours after boot for the drive slot to drop the drive and hang the system, but that it's NOT transient and happens every time within 40 hours. Supermicro sent the board back unrepaired after running it for 10 hours and not observing an issue. I asked them to re-test for longer, and they told me I could call tech support to work through the issue if I wanted, but that I couldn't send the board back.

So my two options are: (1) keep a very expensive paperweight, or (2) run the motherboard outside production as cheaply as possible, reproduce the issue, and call tech support in the hope that it leads to a successful RMA. For #2 I need a cheap EPYC 4004/4005 CPU, and I can't find one for under $400, which isn't worth the risk.

Does anyone have any recommendations?
 

Karp

New Member
Jan 14, 2019
14
7
3
That board supports Ryzen 7000 & 9000 series. Do you have to use EPYC for testing?
 

altano

Oooh I forgot about that option. Thanks!

I found a refurbished 7900X for $250 on Newegg. eBay’s more than that. That’s much better, but I’m gonna look for something cheaper.
 

Karp

Not sure if you can get by with fewer cores, but the 9600X is $189 new on the egg.
 

altano

If anyone sees this in the future, I thought I'd follow up:

My original and replacement H13SAE-MF both developed a failing NVMe slot (after 6 months with the original, and 2 months with the replacement).

Fortunately, after updating the replacement's BIOS to 2.6 the problem went away completely. I've been running problem-free for 5 weeks.

So if you have NVMe issues with this motherboard, try updating the BIOS.
 

RolloZ170

Well-Known Member
Apr 24, 2016
10,062
3,227
113
germany
altano said: "Fortunately, after updating the replacement's BIOS to 2.6 the problem went away completely."
Strange that it worked for some time. Maybe it was just the BIOS update (the reflash itself) that solved the issue, and even with BIOS 2.6 you'll get the same issue again in a few weeks?
 

altano

RolloZ170 said: "Strange that it worked for some time. Maybe it was just the BIOS update (the reflash itself) that solved the issue, and even with BIOS 2.6 you'll get the same issue again in a few weeks?"
Yeah for sure that’s possible! That would be really weird but this whole saga is weird.

I asked SM support if 2.6 had any updates that might fix an NVMe issue, and the CSR said “There is an AMD Agesa code update on R 2.6 that may relate to this issue; however, there is no information from AMD.”
 

mr44er

Active Member
Feb 22, 2020
167
50
28
That also sounds somewhat like an overly deep ASPM state. Perhaps implemented incorrectly or simply unsupported, either in the NVMe firmware or the motherboard's firmware. Flashing the NVMe firmware—provided a newer version is available—is always a recommended step. If the error recurs, try completely disabling ASPM for now; if you manage to make it past the 40-hour mark, try another round with, for instance, only L0/L0s enabled.
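
For anyone following along on Linux, here is a rough sketch for checking which ASPM policy the kernel is currently applying. It assumes the standard sysfs file `/sys/module/pcie_aspm/parameters/policy`, where the active policy is shown in square brackets; the function names are made up for illustration. It only reads the policy. Whether the kernel is allowed to change ASPM at all depends on the firmware, so treat the result as a hint rather than proof.

```python
from pathlib import Path

# Standard Linux sysfs location for the kernel's ASPM policy. The file is
# absent if the kernel was built without PCIe ASPM support.
POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def parse_aspm_policy(text):
    """Return (active, available) from text such as
    'default performance [powersave] powersupersave'."""
    active = None
    available = []
    for token in text.split():
        # The kernel marks the active policy with square brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def read_aspm_policy(path=POLICY_PATH):
    """Read the active policy name, or return None if the file is missing."""
    try:
        return parse_aspm_policy(Path(path).read_text())[0]
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print("active ASPM policy:", read_aspm_policy())
```

If this reports something other than `performance`, writing `performance` to the same file as root is the usual runtime way to ask the kernel to disable ASPM, though the write can fail when the firmware keeps control of ASPM.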
 

altano

I tried disabling ASPM in the BIOS but it didn’t help. Strangely enough, I think it wasn’t actually disabled in the OS. I created a one-shot systemd service that disabled it at boot, and that didn’t help either. I’m not sure I was doing it correctly, though.
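
In case it helps anyone verify the same thing: a minimal sketch that checks whether ASPM is really off per device by parsing `lspci -vv` output. It assumes the usual pciutils format, where each device's `LnkCtl:` line starts with `ASPM Disabled;` or something like `ASPM L1 Enabled;`. The function names are hypothetical, and `lspci` generally needs to run as root to show the link capabilities at all.

```python
import re
import subprocess

# Matches e.g. "LnkCtl: ASPM L1 Enabled; RCB 64 bytes, ..." but not "LnkCtl2:".
LNKCTL_RE = re.compile(r"LnkCtl:\s+ASPM ([^;]+);")


def aspm_by_device(lspci_text):
    """Map each device header line to the ASPM setting on its LnkCtl line."""
    result = {}
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device section.
            device = line.strip()
        else:
            match = LNKCTL_RE.search(line)
            if match and device is not None:
                result[device] = match.group(1).strip()
    return result


def devices_with_aspm_enabled(lspci_text):
    """Return device headers whose LnkCtl ASPM setting is not 'Disabled'."""
    states = aspm_by_device(lspci_text)
    return [dev for dev, state in states.items() if state != "Disabled"]


if __name__ == "__main__":
    # Run as root, otherwise the capability lines may be hidden.
    output = subprocess.run(
        ["lspci", "-vv"], capture_output=True, text=True, check=True
    ).stdout
    for dev in devices_with_aspm_enabled(output):
        print("ASPM still enabled:", dev)
```

If the NVMe controller or its upstream port shows up in that list after the BIOS setting and the boot-time service have both run, ASPM was not actually disabled for that link.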
 

altano

Seems unlikely, but then again, how could a BIOS update fix something that only started after 6 and 2 months of use? That also seems unlikely.
 

RolloZ170

altano said: "Seems unlikely, but then again, how could a BIOS update fix something that only started after 6 and 2 months of use? That also seems unlikely."
BIOS content corruption caused by a bug.
A BIOS update may reset the BIOS to defaults, and the flash re-initializes internal flash areas (NVAR, LOG).
 

mr44er

"And that issue happens only after 6/2 months of use?"
Just guessing here, but that was my experience with my 7950X when the B650 platform was new. It wasn't only NVMe-related; it also affected my quad NIC, and I knew both had run OK in my previous build.
 

RolloZ170

altano said: "Unnecessary backstory: I RMA'd the board with the note that it takes about 2-40 hours after boot for the drive slot to drop the drive and hang the system, but that it's NOT transient and happens every time within 40 hours. Supermicro sent the board back unrepaired after running it for 10 hours and not observing an issue."
Does the issue occur only with some NVMe drives? Which SSD do you have?
For example, the drive may draw too much power; the M.2 slot has its own VRM.
Could the M.2 drive be getting heated by hot air from the CPU fan?