AMD EPYC 7000 Series Architecture Overview for Non-CE or EE Majors


cactus

Moderator
@Patrick "Using for small die per package..." "...4TB of RAM in adual-sockett design."

So, where is the architecture review for the CEs and EEs? :)

It will be interesting to see if AMD pushes the Infinity Fabric out to their GPUs, like NVIDIA is doing with NVLink on the IBM POWER CPUs. It would also be interesting to know whether all the high-speed serial lanes on the die are capable of running it or only specific ones. Each of the socket-to-socket links is quoted at more bandwidth than a PCIe 4.0 x16 link.
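
For reference, here is a rough sketch of what a PCIe 4.0 x16 link works out to on paper. It assumes the published 16 GT/s per-lane rate and 128b/130b line encoding, and it ignores packet and protocol overhead, so real throughput is lower. The function name `pcie_raw_gbytes_per_s` is made up for illustration, and I'm not plugging in any Infinity Fabric figure here, only the PCIe side of the comparison.

```python
def pcie_raw_gbytes_per_s(gt_per_s: float, lanes: int,
                          payload_bits: int = 128, total_bits: int = 130) -> float:
    """One-direction bandwidth in GB/s after line encoding, before protocol overhead.

    Hypothetical helper for illustration only.
    gt_per_s: per-lane transfer rate in GT/s (16 for PCIe 4.0, 8 for PCIe 3.0).
    payload_bits / total_bits: line-encoding ratio (128b/130b for PCIe 3.0 and 4.0).
    """
    bits_per_s = gt_per_s * 1e9 * lanes * payload_bits / total_bits
    return bits_per_s / 8 / 1e9  # bits -> bytes, then bytes -> GB


pcie4_x16 = pcie_raw_gbytes_per_s(16.0, 16)
print(f"PCIe 4.0 x16: {pcie4_x16:.2f} GB/s per direction")  # about 31.5 GB/s
```

So any socket-to-socket link that beats a PCIe 4.0 x16 link would need to clear roughly 31.5 GB/s in each direction on paper.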
 

DWSimmons

Member
@cactus I've been chewing on this for a day, mostly because I dig this kind of thing.

My Intel take first:
The Department of Energy pushed NVIDIA and IBM to bypass PCIe congestion in the NUMA cluster for a supercomputer. Intel is not naive: they have Knights Landing and have worked with NVIDIA before. They've seen the DGX-1. Intel has Optane, SSDs, and NICs, as well as alternative hardware stack solutions outside the main product line. It's not like they don't have tons of experience with sharing memory and interconnects. Maybe NVIDIA has the real leverage, with NVLink as the competitor to PCIe 4.0. They have the DGX-1 to develop a community of programmers, testers, and engineers. IBM has provided access to the chip for NVLink. Intel would need to provide NVLink access to Xeon. I feel Intel can do little except approach NVIDIA about offering NVLink when the business case presents itself.

Back to your original point, AMD pushing Infinity Fabric out to Vega. AMD does not have Intel's problems and does not have to partner the way IBM did. In other words, hell yeah, let's get some IF out to the NUMA cluster. It's not all roses though. AMD, as far as I know, does not have a strong cluster and/or NUMA community. (More on this later.) IBM created a middleware interface that is supposedly familiar from PCIe (why reinvent the wheel) so that programming to NVLink and all the shared memory issues would not be new. If AMD could pull off a stack coup, then IF to a Vega cluster would be uber-geeky. Dr. Su and AMD have come a long way and have shown with their Zen architecture the ability to share cores/sockets and memory effectively. Also, they don't have to look to NVIDIA for partnerships. AMD does not have a great track record in software. What they did do was donate Mantle to the Khronos Group, where it became the basis for Vulkan. Vulkan has gained considerable traction in the last couple of years for its hardware agnosticism. I don't know about the software difficulties for Vega clustering and/or NUMA, particularly vis-à-vis Vulkan, though it appears the ball started rolling in this direction several years ago. I'd love for someone with insight to speak to this.


Incidentally, what's (really) old is new. This dream was announced 10 years ago. It was the Spider platform with HyperTransport 3.0(!). Here's hoping we don't have to wait another 10 years to realize the stack.