Hi mates. I think my EPYCD8-2T has failed on me (at the worst moment, of course).
My plight started 2 days ago, when I added the fourth GPU. The board did not boot properly (no screen output), and after the usual 30-40 seconds the power draw skyrocketed to almost 300W. The SSH server never came up, so the OS did not boot. The BMC was unreachable. All the fans were at flank speed.
I couldn't see the Dr.Debug display, since they intelligently placed it under the fourth GPU, so I detached that GPU. No success, but at least now I could see the display. It was pretty useless, since it cycled through a huge number of codes and ended with a countdown from 90 to 01 (see the video, it starts at 0:19).
I detached all the GPUs save one, then all of them. No success. The power consumption was lower but still high (>200W) 30-45 seconds after power-on.
I noticed another strange thing, which makes me suspect that the board is indeed broken. While powered off, it always drew 7-8W, probably because of the BMC (a normal PC draws 1-2W on average). Now it draws 20-22W, which is almost impossible even considering the BMC. Furthermore, the heatsink over the 10Gb NICs is so hot that you can only hold your finger on it for an instant. It was NOT like that before.
Of course I cleared the CMOS and even replaced the CMOS battery. No success.
Any clue?? :-/