AMD EPYC 3451 Benchmarks and Review A 16 Core Xeon D Competitor

  • Thread starter Patrick Kennedy

WANg

Well-Known Member
Jun 10, 2018
1,308
971
113
46
New York, NY
Or you can just refer to it as "the CPU series that should've been in the HPE Microserver Gen10/Gen10+".

Seriously, there needs to be a ton of these EPYC Embedded boards sold to Kubernetes devs. If not, they should be in the next generation of 10/40/100GbE switches/routers from Arista/Juniper/Cisco as their supervisory engines.

That being said, yeah, AMD probably needs to buy a 10/40/100GbE player in the market if it wants more inroads here (Chelsio, maybe?) - their embedded 10Gbit NIC is missing SR-IOV, while Intel's embedded 10Gbit adapters on the Xeon-D line have had it since the 1541 days. Imagine vendors actually having to embed an Intel X5/7 series NIC just to make the AMD part "feature complete"...
 
Last edited:

zir_blazer

Active Member
Dec 5, 2016
356
128
43
The next step is perhaps the more important one. Some OEMs and systems vendors have started to adopt these higher-core count parts, but we need more platforms. I am saying this selfishly as I personally want a platform that exposes the 8x 10GbE NICs, and all 64 PCIe lanes (no need for SATA in 2020) with the AMD EPYC 3451.
It took you, like, TWO YEARS to say that part out loud! I have been saying it since your first EPYC Embedded article, when I noticed just how much Socket AM4 crippled Zen's built-in I/O. It is even more absurd when you consider that right now Intel is pushing 2.5G NICs as part of its next generation consumer platform, when AMD could have offered 10G as an option years ago across all its lines if it had wanted to.
In the case of the dual die parts, RDIMM support and the 10G MACs aside, I don't find them as interesting as the single die parts, because first generation Threadripper is closer to them than AM4 Ryzen is to the single die ones. It is however impressive how much punch you can fit into such a small package.

Note that you can technically have PCIe and SATA simultaneously, since the protocol for the muxed lanes is software configurable. If you have an OCuLink port like the one on the ASRock Rack EPYC3251D4I-2T, you can use either an OCuLink-to-U.2 or an OCuLink-to-4xSATA breakout cable and pretty much cover both the PCIe NVMe and SATA use cases, no need to choose one over the other. It may be less straightforward, but I find it far more flexible, at least on paper.
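At least on paper; here is a rough Python sketch of how I would check what actually enumerated behind such a breakout cable on Linux. It is just a generic sysfs walk, nothing board or EPYC specific:

Code:
#!/usr/bin/env python3
"""List block devices and whether they came up over NVMe (PCIe) or SATA."""
import os

SYS_BLOCK = "/sys/block"

for dev in sorted(os.listdir(SYS_BLOCK)):
    # Resolve the sysfs symlink to see which bus the device hangs off.
    real = os.path.realpath(os.path.join(SYS_BLOCK, dev))
    if "/nvme/" in real:
        bus = "NVMe (PCIe)"
    elif "/ata" in real:
        bus = "SATA (AHCI)"
    else:
        bus = "other (USB, virtio, loop, ...)"
    print(f"{dev:12s} {bus:22s} {real}")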


If I were to ask questions about the current generation EPYC Embedded (although I'm far more interested in what I expect to be a Matisse based one, due to the number of USB ports), what I would like to know is whether they expose the HD Audio/Azalia bus, so that you can wire up an audio codec like the Realtek ones everyone knows, or whether it isn't exposed at all and you have to use some onboard USB sound card like third generation Threadripper motherboards do, since it seems that Rome does not have an HDA controller at all.
I'm also interested in where the 10G MACs are muxed. Are they totally independent, so that you can have 32/64 PCIe lanes + 4/8 10G MACs? Are they muxed onto some of the PCIe lanes, which would mean you can do at most 28/56 lanes + 4/8 10G MACs? And if so, which ones: the pure PCIe lanes, or those that also support SATA? I even recall having read somewhere that you had to team up two PCIe lanes per 10G MAC. Since AMD is not releasing public technical documentation for Zen based processors and they didn't really go into detail in any presentation, I don't know if you lose something else if you want to put the 10G MACs to use.




I actually had to watch the video a bit to see how the article ballooned into a 26 minute monster. What caught my attention is that you mentioned that vendors like Supermicro didn't use the AMD 10G MACs because they are rather dull, lacking SR-IOV and other features you expect in server-grade NICs. While I actually agree (I weighed exactly that when deciding whether my dream motherboard would use AMD's own 10G MACs or spend 4 PCIe lanes on an Intel X550), ironically it is not a problem in the consumer segment, where most people are still using 1G Realtek or Intel i220 NICs that are just as dull as the AMD 10G MAC, so it is a better fit there. Heck, if price is the main issue, then you could just hook up a 1G PHY instead of a 10G one.

Actually, I had this in my notes:


NETWORKING

The Zen-based Zeppelin die integrates four 10G MACs (confirmed to support 10GBASE-KR, and possibly an alternate 1000BASE-KX mode for 1G), which are exposed only in the EPYC Embedded series. I would love to see them actually being used, but there are a lot of considerations:

DRIVER SUPPORT: Because the integrated 10G MACs are so rarely seen in use, I'm not even aware of the state of OS driver support for them. The Linux kernel has built-in drivers, so they work there, but there is no information regarding Windows support. If AMD doesn't provide Windows drivers for these 10G MACs, they are automatically disqualified from any consumer oriented motherboard. Period. In comparison, Intel NICs seem to be widely supported just about everywhere.
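For what it's worth, here is a rough sketch of how I would check what is actually bound on a Linux box. It just walks sysfs, and it assumes (I may be wrong here) that the in-tree driver for the integrated MAC is the amd-xgbe one:

Code:
#!/usr/bin/env python3
"""Show which kernel driver each network interface is bound to."""
import os

NET = "/sys/class/net"

for iface in sorted(os.listdir(NET)):
    drv_link = os.path.join(NET, iface, "device", "driver")
    if not os.path.islink(drv_link):
        continue  # virtual interfaces (lo, bridges, ...) have no device/driver
    driver = os.path.basename(os.path.realpath(drv_link))
    tag = "  <-- integrated MAC?" if "xgbe" in driver else ""
    print(f"{iface:12s} driver={driver}{tag}")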

FEATURES, CAPABILITIES AND PERFORMANCE: Due to the lack of public documentation, the feature set and other capabilities of the Zeppelin integrated 10G MACs are unknown. While I would expect them to have overall lower latency than any PCIe NIC (integrated MAC -> 10GBASE-KR -> PHY, versus integrated PCIe controller -> PCIe bus -> PCIe NIC with integrated MAC and PHY), there are other features to consider, like SR-IOV for PCI passthrough in virtualization scenarios, network processing offloads, overall CPU usage, and anything else that matters to someone spending money on 10G networking gear. So far, everything points to the Zeppelin integrated 10G MACs offering only basic connectivity. When it comes to features, Intel NICs seem to be the premier solution (except in some Remote DMA scenarios, which only their highest end NICs support).
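If someone actually gets their hands on one of these boards, a quick way to settle at least the SR-IOV question would be something like this (a generic Linux sysfs check, nothing vendor specific):

Code:
#!/usr/bin/env python3
"""Check which NICs advertise SR-IOV capability via sysfs."""
import os

NET = "/sys/class/net"

for iface in sorted(os.listdir(NET)):
    # SR-IOV capable PCI devices expose a sriov_totalvfs attribute.
    vf_attr = os.path.join(NET, iface, "device", "sriov_totalvfs")
    if os.path.isfile(vf_attr):
        with open(vf_attr) as f:
            total_vfs = int(f.read().strip())
        print(f"{iface}: SR-IOV capable, up to {total_vfs} VFs")
    else:
        print(f"{iface}: no SR-IOV (or a virtual/non-PCI interface)")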

LANE COST: In some of the publicly available presentations, AMD supposedly said that each 10GBASE-KR lane requires teaming up two PCIe lanes from the Zeppelin SoC, so having all 4 10GBASE-KR lanes would cost 8 PCIe lanes. However, based on the block diagrams of the available EPYC Embedded motherboards that use the 10G MACs, like those based on the COM Express Type 7 form factor, each 10GBASE-KR lane seems to take only one PCIe lane instead of two, since otherwise the totals would exceed Zeppelin's known 32 PCIe lanes. This makes the integrated 10G MACs better than expected.
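Just to show the arithmetic behind that conclusion, using only the numbers above (32 lanes per die, 4 KR lanes, and the two competing claims about the cost per KR lane):

Code:
#!/usr/bin/env python3
"""Back-of-the-envelope lane budget for the two interpretations."""

TOTAL_LANES = 32   # known Zeppelin PCIe lane count (single die)
KR_LANES = 4       # integrated 10GBASE-KR lanes on one die

for pcie_per_kr in (1, 2):   # 1:1 mux vs. two PCIe lanes teamed per KR lane
    spent = KR_LANES * pcie_per_kr
    print(f"{pcie_per_kr} PCIe lane(s) per KR lane -> {spent} lanes spent, "
          f"{TOTAL_LANES - spent} PCIe lanes left over")
# If vendor block diagrams show more usable PCIe lanes than the 2:1 case
# would allow, the 1:1 interpretation is the one that adds up.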

LANE MULTIPLEXING: Again, due to the lack of public documentation, I have no idea whether the four 10GBASE-KR lanes of the 10G MACs are multiplexed onto a SERDES controller that otherwise only does PCIe, or whether they are entangled with the 8 lanes that are known to do both PCIe and SATA. Basically, the difference is whether the Zeppelin die's second x16 integrated PCIe controller can be configured as 4x 10GBASE-KR + 8x SATA/PCIe + 4x pure PCIe, or only as 4x 10GBASE-KR + 4x SATA/PCIe + 8x pure PCIe. If the first option is not possible, I would say the Intel NICs hold an advantage, because being able to use two OCuLink ports for two 4x NVMe drives or 8 SATA drives via breakout cables is better than having just one OCuLink port. Since the Zeppelin SoC's integrated SATA controller should have nothing to envy in discrete SATA HBAs (anything SATA is considered low end and has few optional features that aren't already baseline, so there is no way the integrated SATA is significantly worse than any discrete one), if the integrated 10G MACs' feature set ends up being mediocre, it could be preferable to use those 4 multiplexed lanes as SATA and instead throw a PCIe NIC on the pure PCIe lanes.
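To put the two hypothetical layouts side by side (pure speculation on my part, given the lack of documentation, but both carve up the same second x16 controller):

Code:
#!/usr/bin/env python3
"""The two speculative ways the second x16 controller could be carved up."""

options = {
    "KR taken from the pure PCIe lanes": {"10GBASE-KR": 4, "SATA/PCIe": 8, "pure PCIe": 4},
    "KR taken from the SATA/PCIe lanes": {"10GBASE-KR": 4, "SATA/PCIe": 4, "pure PCIe": 8},
}

for name, layout in options.items():
    detail = " + ".join(f"{count}x {kind}" for kind, count in layout.items())
    print(f"{name}: {detail} = {sum(layout.values())} lanes total")
    # With 8 SATA/PCIe lanes you can wire two x4 OCuLink ports; with only 4,
    # you get a single one, which is the trade-off described above.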

BANDWIDTH BOTTLENECK: Assuming the 4 10GBASE-KR lanes are fully capable of running at the expected 1.25 GB/s each, so that they can saturate all four 10G links, they provide higher effective bandwidth than going through a PCIe NIC, since each PCIe 3.0 lane provides only about 1 GB/s. This means that with a quad port PCIe 3.0 x4 NIC like the Intel X710-TM4, only up to three 10G links can run at full speed simultaneously, while all four together would face a bottleneck of about 1 GB/s (four 10G links need 5 GB/s, while 4 PCIe 3.0 lanes provide only about 4 GB/s). Going with a NIC on an x8 connection is absolute overkill, and for a mere 10G quad port it even looks bad, since 8 PCIe 3.0 lanes provide about 8 GB/s of bandwidth, which can comfortably feed six 10G links.
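The same math as a tiny script, approximate and ignoring PCIe and Ethernet protocol overhead:

Code:
#!/usr/bin/env python3
"""Supply vs. demand for a quad-port 10G NIC on x4 and x8 PCIe 3.0 links."""

PCIE3_GBPS_PER_LANE = 1.0   # ~1 GB/s usable per PCIe 3.0 lane
TENG_GBPS_PER_PORT = 1.25   # 10 Gb/s = 1.25 GB/s per 10GbE port

def ports_fed(pcie_lanes: int, ports: int = 4) -> None:
    supply = pcie_lanes * PCIE3_GBPS_PER_LANE
    demand = ports * TENG_GBPS_PER_PORT
    full_speed = min(ports, int(supply // TENG_GBPS_PER_PORT))
    print(f"x{pcie_lanes}: {supply:.1f} GB/s available vs {demand:.1f} GB/s needed "
          f"-> {full_speed} of {ports} ports at line rate")

ports_fed(4)   # quad-port 10G NIC on a x4 link: bottlenecked at 3 ports
ports_fed(8)   # same NIC on x8: plenty of headroom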
 
Last edited:
  • Like
Reactions: Aquatechie

SRussell

Active Member
Oct 7, 2019
327
152
43
US
Any information on how VMware products would license this with 2 NUMA nodes?
 

xeonguy

New Member
Aug 29, 2020
23
7
3
Good review, and these look really good for small low-power ITX systems, like with this ASRock Rack board: EPYC3451D4I2-2T

Is it released yet? Can't find anyone selling these new or 2nd hand.

Might be a dumb question, but how does one go about buying one of these processors?