I don't know the specifics of x265 (although your results suggest it isn't scaling well), but x264 doesn't really scale past about 12 CPUs. It has a hard limit of 128 threads, but because of the way it splits up the work, the more threads you use, the worse the quality gets, and past about 16 threads the drop-off becomes really obvious IMHO. If you're after the highest quality, use as few threads as you're prepared to put up with (and yes, I've done plenty of single-thread encodes at 1/10th realtime, although I generally stick to two threads).
Many of the filter steps that you might put your video inputs through are also thread-limited, and this'll apply regardless of the codec. If your pre-encode frameserver and/or filter pipeline maxes out at 4x realtime, there's no benefit from using an encoder configuration that's faster than 4x realtime.
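To put that bottleneck argument in concrete terms, here's a tiny sketch. The stage names and the 10x encoder figure are made up for illustration; only the 4x figure comes from the example above. It assumes each stage has to process every frame, so the whole chain can't go faster than its slowest stage.

```python
def effective_speed(stage_speeds):
    """Return (bottleneck_name, speed) for a chain of pipeline stages.

    stage_speeds maps a stage name to its maximum throughput as a
    multiple of realtime. Every frame passes through every stage, so
    the chain as a whole runs at the speed of the slowest one.
    """
    name = min(stage_speeds, key=stage_speeds.get)
    return name, stage_speeds[name]


# Hypothetical numbers: filters top out at 4x, encoder could do 10x.
stages = {"frameserver/filters": 4.0, "encoder": 10.0}
name, speed = effective_speed(stages)
print(f"pipeline runs at {speed}x realtime, limited by {name}")
```

So with those numbers, the extra encoder threads buy you nothing; the encoder just spends more time waiting for frames.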
What's really great about the big fat multi-core beasts is being able to run many lightly-threaded encode jobs at once. Or, if you really want to give it something to sweat over, try an AV1 encode: Intel's SVT-AV1 encoder is pretty much designed for high-end Xeons (although last time I looked at it the quality kinda sucked), so it's a good way to see how scaling compares there.
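For the "lots of low-thread-count encodes side by side" approach, something along these lines is roughly what I mean. It's only a sketch: it assumes you're driving x264 through an ffmpeg build with libx264, the helper names and the output naming scheme are my own invention, and you'd add your own rate-control/preset options to the command.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_cmd(src, dst, threads=2):
    # Assumes an ffmpeg build with libx264; -threads caps the encoder threads.
    return ["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
            "-threads", str(threads), dst]


def max_parallel_jobs(cores, threads_per_job=2):
    # Run as many jobs as fit in the core count, but always at least one.
    return max(1, cores // threads_per_job)


def encode_all(sources, cores=None, threads_per_job=2, runner=subprocess.run):
    """Encode every source file, several at a time, each with few threads.

    runner is injectable so the scheduling can be tried without ffmpeg.
    Output names (foo.mkv -> foo.x264.mkv) are a hypothetical convention.
    """
    cores = cores or os.cpu_count() or 1
    workers = max_parallel_jobs(cores, threads_per_job)
    cmds = [
        build_cmd(s, os.path.splitext(s)[0] + ".x264.mkv", threads_per_job)
        for s in sources
    ]
    # Worker threads only wait on the ffmpeg processes, so a thread pool is enough.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: runner(c, check=True), cmds))
```

On a 32-core box, `encode_all(files, cores=32)` would keep 16 two-thread encodes going at once, which keeps the machine busy without paying the quality penalty of one heavily-threaded encode. Bear in mind that decoding and any filters use CPU too, so you may want to leave some headroom.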