You can certainly construct scenarios that target the architecture and make it look merely OK at best. On the other hand, if you think about virtualized servers, the cache layout makes quite a bit of sense. Most deployments are not going to see 60-core VMs, and if you do run one, it stays on a single socket rather than having to traverse socket-to-socket links.
If you are using more mainstream CPUs, say dual Xeon Silver 4214s per node, the AMD EPYC 7702P allows you to put everything on a single socket in one node, rather than spreading it across, say, three 2P nodes. Even if the CPUs were priced evenly, you are still at lower power and lower system cost with EPYC. You are then comparing the latency of a single socket to the latency of three dual-socket servers.
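To make that consolidation math concrete, here is a rough sketch using the published spec-sheet core counts and TDPs for those two SKUs. These are nominal figures, not measured numbers, and the sketch is illustrative rather than a sizing tool:

```python
# Rough consolidation math: 3x 2P Xeon Silver 4214 nodes vs. 1x 1P EPYC 7702P.
# Core counts and TDPs are nominal spec-sheet values.

XEON_4214_CORES = 12    # cores per Silver 4214
XEON_4214_TDP_W = 85    # TDP per Silver 4214 (watts)
EPYC_7702P_CORES = 64   # cores per EPYC 7702P
EPYC_7702P_TDP_W = 200  # TDP per EPYC 7702P (watts)

nodes = 3
sockets_per_node = 2

xeon_cores = nodes * sockets_per_node * XEON_4214_CORES  # total cores across 3 nodes
xeon_tdp_w = nodes * sockets_per_node * XEON_4214_TDP_W  # total CPU TDP across 3 nodes

print(f"3x 2P Silver 4214: {xeon_cores} cores, {xeon_tdp_w} W CPU TDP, {nodes} chassis")
print(f"1x 1P EPYC 7702P:  {EPYC_7702P_CORES} cores, {EPYC_7702P_TDP_W} W CPU TDP, 1 chassis")
# -> 72 cores / 510 W across three boxes versus 64 cores / 200 W in one box
```

The single-socket part lands close to the same nominal core count at well under half the CPU TDP, in one chassis instead of three, which is the lower-power and lower-system-cost point above.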
The question is pretty complex. If you focus on one part of the solution, you can expose some of Rome's architectural trade-offs. Still, by the time you are doing even 2:1 socket or node consolidation into a 1P Rome system, focusing on those smaller aspects is less interesting.
I fully expect Intel will have a set of benchmarks to highlight those workloads soon. The feedback I have been hearing outside of our own testing has been very positive as well.