Well, that's good! And that's not forced, right? It just works by default?
Which x4 link are you talking about?...
2666 doesn't seem possible. I'm keeping the x4 link at 10,xxx Gbps, because going to 13 makes it impossible to hold even 1200 IF.
...
The 4-link xGMI max speed.
View attachment 14138
After a full day of tinkering, this is my final and best result for memory latency; it sacrifices a bit of max bandwidth. Cinebench, though, does not seem to care about anything memory-related.
I might try some tighter memory timings, but for now it's fully stable and running a render job.
For reference, at defaults I had a memory latency of around 150 ns.
Rome should be able to get up to 204 GB/sec per socket, and that's 65-85 GB/sec for two sockets?
That's barely better than single Xeon V1 bandwidth.
I know you're only using 2400, not 3200, but if the IF clock issue can't be sorted out, and it's really going to be this bandwidth-starved, it looks like I might be selling my dual Epyc ES just as soon as I'm finished building it (because that's a lot of cores to feed with so little bandwidth, and real-world performance will probably be awful).
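For anyone following the numbers: the 204 GB/sec per-socket figure is just the theoretical DDR4 peak, i.e. channels x transfer rate x 8 bytes per transfer. A minimal sketch of that arithmetic, assuming 8 memory channels per Rome socket and a 64-bit (8-byte) data bus per channel; the function names are made up for illustration, and a real benchmark will land well below these peaks.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    channels  -- populated memory channels (assumed 8 for a full Rome socket)
    mt_per_s  -- DDR transfer rate in MT/s, e.g. 3200 for DDR4-3200
    bus_bytes -- bytes moved per transfer per channel (64-bit bus = 8 bytes)
    """
    # MT/s * bytes per transfer gives MB/s per channel; divide by 1000 for GB/s.
    return channels * mt_per_s * bus_bytes / 1000


def fraction_of_peak(measured_gbs: float, peak_gbs: float) -> float:
    """How much of the theoretical peak a measured result reaches (0.0 to 1.0)."""
    return measured_gbs / peak_gbs


if __name__ == "__main__":
    for speed in (2400, 3200):
        peak = peak_bandwidth_gbs(channels=8, mt_per_s=speed)
        print(f"8 channels at DDR4-{speed}: {peak:.1f} GB/s per socket")
    # 8 channels at DDR4-3200 -> 204.8 GB/s, at DDR4-2400 -> 153.6 GB/s
```

So at 2400 the per-socket ceiling is already about 153.6 GB/sec rather than 204.8, and with fewer than 8 channels populated it drops proportionally.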
I just ordered a Supermicro H11SSL. I hope that one is decent. I'll be using it for CPU rendering as well, but I'm slowly moving to GPU rendering (e.g. Arnold). Curious to know if there's a real-world impact. For example, if Cinebench doesn't seem to care, then maybe C4D renderings wouldn't be affected?
Well, as I said, I sacrificed bandwidth for latency optimisation. I'm running with 4 NUMA nodes and minimal interleaving.
With most settings at default and memory at 3200, I already reached above 120 GB/s bandwidth.
Also, have you ever run this cache and memory bandwidth tool? It's not really forgiving, and thinking you would even come close to the theoretical bandwidth is crazy.
Although I've realised from the beginning that this Supermicro board is just complete garbage. It's just utter trash, it can't do shit, it runs like crap. Really like a 25-year-old board; the CPUs are cool. Should have gone single socket with a real mobo.
BTW, you really got me thinking and comparing results. With the IF clock separate, memory latency is up but cache latency is low. When running 1:1, the memclock is way down; memory latency is down as expected, but cache latency is way up. It took a lot of tweaking to get it somewhat down again. Could it be that the cache runs at memclock, and IF is really only the interconnect fabric and not the on-die fabric? Otherwise the cache would have to behave differently.
That's a retail TRX sample, dude, not really comparable to an Epyc engineering sample.
Probably need to play with my timings and add 2 RAM sticks, but for those taking note of fabric speed:
View attachment 14141
View attachment 14142
View attachment 14143
Why does it identify as Castle Peak?
No, this is an Epyc ES chip, not TRX, but clocked higher.
Not sure why CPU-Z does what it does, but also note the hexa-channel RAM, which Threadripper would not show. For now I have only changed cTDP in the BIOS.