Optimise 2P core/clock CPU combination on X10DRH-CT


Pakna

Member
May 7, 2019
I am running a 2P SAN for my homelab and am experimenting with mixing two different (but "close") Broadwell-EP CPUs on this Supermicro platform. I would like to hear some opinions or general suggestions on whether I actually stand to lose or gain anything with such an asymmetric setup.

Here are the combinations I've tried so far:

CPU1       | CPU1 cores/turbo clock | CPU2       | CPU2 cores/turbo clock | DRAM speed | Works?
E5-2630 v3 | 8/3.2 GHz              | E5-2640 v4 | 8/3.4 GHz              | 2133       | Yes
E5-2650 v4 | 12/2.9 GHz             | E5-2650 v4 | 12/2.9 GHz             | 2400       | Yes
E5-2623 v4 | 4/3.2 GHz              | E5-2623 v4 | 4/3.2 GHz              | 2133       | Yes
E5-2623 v4 | 4/3.2 GHz              | E5-2620 v4 | 8/3.0 GHz              | 2133       | ? (yes, probably?)
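To put the pairings side by side, here is a rough Python sketch that totals cores and a crude "core-GHz" figure per pairing. The core counts and turbo clocks are copied from the table above. Two things are assumptions, not measurements: that a mixed pair runs DRAM at the slower CPU's speed, and that the E5-2620 v4 tops out at 2133. The `core_ghz` figure uses maximum turbo clocks, so it is an optimistic upper bound rather than a benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    model: str
    cores: int
    turbo_ghz: float
    dram_mts: int  # DRAM speed seen with this CPU (per-CPU value is an assumption)


# Core counts and turbo clocks as listed in the table.
E5_2650_V4 = Cpu("E5-2650 v4", 12, 2.9, 2400)
E5_2623_V4 = Cpu("E5-2623 v4", 4, 3.2, 2133)
E5_2620_V4 = Cpu("E5-2620 v4", 8, 3.0, 2133)


def summarize(cpu1: Cpu, cpu2: Cpu) -> dict:
    """Crude summary of a 2P pairing; not a performance prediction."""
    return {
        "total_cores": cpu1.cores + cpu2.cores,
        # Sum of cores * max turbo: an optimistic upper bound.
        "core_ghz": round(cpu1.cores * cpu1.turbo_ghz + cpu2.cores * cpu2.turbo_ghz, 1),
        # Assumption: memory runs at the slower of the two CPUs' speeds.
        "dram_mts": min(cpu1.dram_mts, cpu2.dram_mts),
    }


matched = summarize(E5_2650_V4, E5_2650_V4)
mixed = summarize(E5_2623_V4, E5_2620_V4)
print(matched)
print(mixed)
```

By this crude measure the mixed pair gives up half the cores and roughly half the aggregate core-GHz in exchange for a 100 to 300 MHz higher peak turbo clock.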

This is a somewhat busy, multi-purpose SAN backed by TrueNAS 13-U3. It tries to be many things: it serves as a general-purpose filer, a Plex server, a personal cloud host, surveillance storage, and a VM / DB SAN (via iSCSI + zvol). The zvols holding the OS-installation iSCSI extents are deduplicated. To support this, I have carved out four pools that are a mix of 10/14 TB Exos drives in two-way mirrors and a handful of Samsung 970 PROs, Intel NVMe drives and mirrored Optane 900p drives (these are pulling double duty as SLOG and dedup vdev). The iSCSI extents are targeted by a couple of Proxmox hosts that host a number of Kubernetes cluster nodes. RAM is 128 GB. Network traffic is split so that less demanding traffic is shunted over copper NICs, whereas iSCSI traffic traverses a pair of 10G fibre links arranged in MPIO.

What I am having trouble understanding is whether this platform benefits more from cores or clocks. Right now, it is run by a pair of 12-core E5-2650 v4s and I am quite happy with how it's running (e.g., fio reports each VM as capable of > 10k IOPS).
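To compare CPU pairings with the same fio measurement, one option is to have fio emit JSON and total the per-job IOPS. This is a minimal sketch that assumes fio was run with `--output-format=json`. The `sample` report is made up purely to exercise the function and is not a measurement from this system. `total_iops` and `load_report` are illustrative names.

```python
import json


def total_iops(report: dict) -> float:
    """Sum read and write IOPS over all jobs in a fio JSON report."""
    total = 0.0
    for job in report.get("jobs", []):
        for direction in ("read", "write"):
            total += job.get(direction, {}).get("iops", 0.0)
    return total


def load_report(path: str) -> dict:
    """Load a report saved from a fio run that used JSON output."""
    with open(path) as fh:
        return json.load(fh)


# Made-up sample in the shape of fio's JSON output (not real data).
sample = {
    "jobs": [
        {"jobname": "randrw", "read": {"iops": 6000.0}, "write": {"iops": 4500.0}}
    ]
}
print(total_iops(sample))
```

Running the same job file against each CPU pairing and comparing the totals keeps the comparison to one number per configuration.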

That being said, I'd like to use up my stock of CPUs in an optimal fashion, so I was thinking of combining the higher-clocked 4-core E5-2623 v4 with an 8-core E5-2620 v4. In addition to the lost cores, I'd be taking a slight hit in RAM speed compared with the E5-2650 v4 (2400 -> 2133), but I am not running a lot of jails, or any VMs, that should care about that drop. NB: all these CPUs are HT-capable.

Any thoughts are appreciated!
 

name stolen

Member
Feb 20, 2018
Love the thought process, love the ingenuity, and I love comparison testing. But it seems like a lot of A/B testing would be needed to find real-world benefits in an odd (as in non-even or mismatched) CPU pairing, and to test all combinations of buffers, packet sizes, driver versions, RAM speeds and core speeds, I think you could be testing for a year straight. Unless there's a specific performance characteristic you're seeking from the mismatched setup, my gut feeling is that a matched dual setup with the higher RAM speed is optimal.
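To get a feel for how quickly such a test matrix grows, here is a small Python sketch. Every axis value below is hypothetical and chosen only for illustration, as are the repeat count and the per-run duration. The count is an upper bound, since some combinations (for example a 2133-only CPU at 2400) may not be realisable.

```python
from itertools import product

# Hypothetical test axes; the values are placeholders, not recommendations.
axes = {
    "cpu_pairing": ["2630v3+2640v4", "2650v4+2650v4", "2623v4+2623v4", "2623v4+2620v4"],
    "dram_mts": [2133, 2400],
    "mtu": [1500, 9000],
    "block_size": ["4k", "16k", "128k", "1m"],
    "nic_driver": ["driver-a", "driver-b"],
}


def build_matrix(axes: dict) -> list:
    """Return every combination of axis values as a list of dicts."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


def total_hours(n_configs: int, repeats: int, minutes_per_run: float) -> float:
    """Wall-clock hours of pure benchmark time, ignoring hardware swaps."""
    return n_configs * repeats * minutes_per_run / 60


matrix = build_matrix(axes)
print(len(matrix), "configurations")
print(total_hours(len(matrix), repeats=3, minutes_per_run=30), "hours of benchmark time")
```

Each CPU swap also means downtime and physical work on the server, which this estimate does not count.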

I think the data would be interesting, but I also think it would be an enormous undertaking to show us that the differences are minimal. :)
 

Pakna

Thanks for the insight - you raise absolutely valid points here. Talking offline to other people convinced me this is an exercise in excessive optimisation: it's great to min-max, but as you mention, thoroughly testing all these combinations would be a massive undertaking for a limited performance gain at best. Just looking at these CPUs side by side, it's clear we're talking about a couple of hundred MHz of turbo difference and a small difference in RAM speed, against a large loss in core count. People with more low-level knowledge brought up one additional thing I had not thought of: the OS scheduler typically assumes equal core speeds across sockets, so there will be sub-optimal scheduler behaviour. Lastly, it came as a bit of a surprise that QPI actually works with different core speeds, so there is bound to be some annoyance there as well.

You bring up the most salient point here: the gains that would justify suffering through all that are very unlikely to be there. I have since decided to stick with dual E5-2650 v4s and call it a day.

One last result from this is the power consumption: I found it surprising that the idle system power delta between the E5-2623 v4 and the E5-2650 v4 is on the order of 10 W (!). Yes, you read that correctly - 4-core vs 12-core Broadwell-E(P) parts from 2016 consume almost the same amount of power at idle. This is reported by the BMC and measured directly at the power supply, so I am comfortable taking this as true. There is really no reason to use the E5-2623 v4 in this situation.
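For scale, a roughly 10 W idle delta can be turned into energy per year with a few lines of Python. This is a back-of-the-envelope sketch that assumes the machine idles around the clock. The electricity price is a hypothetical placeholder, not a figure from this thread.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_kwh(delta_watts: float) -> float:
    """Energy per year for a constant power delta, in kWh."""
    return delta_watts * HOURS_PER_YEAR / 1000


def annual_cost(delta_watts: float, price_per_kwh: float) -> float:
    """Yearly cost of that delta at a given (assumed) price per kWh."""
    return round(annual_kwh(delta_watts) * price_per_kwh, 2)


idle_delta_w = 10.0            # idle delta reported above
assumed_price_per_kwh = 0.30   # hypothetical price, adjust to your tariff

print(annual_kwh(idle_delta_w), "kWh per year")
print(annual_cost(idle_delta_w, assumed_price_per_kwh), "per year at the assumed price")
```

At 10 W the difference comes to 87.6 kWh per year, which is a small saving to weigh against eight fewer cores per socket.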
 

i386

Well-Known Member
Mar 18, 2016
Pakna said: "What I am having trouble understanding is whether this platform benefits more from cores or clocks"
Storage servers and their software benefit from high clocks. The "big" storage servers have dual CPUs for more memory (RAM is still orders of magnitude faster than any PCIe-connected storage).
Pakna said: "One last result from this is the power consumption: I found it surprising that the idle system power delta between the E5-2623 v4 and the E5-2650 v4 is on the order of 10 W (!). Yes, you read that correctly - 4-core vs 12-core Broadwell-E(P) parts from 2016 consume almost the same amount of power at idle."
Yes, the same architecture will result in similar power consumption at idle. For even better results, let the server "heat soak" for a few hours and then compare the power consumption (STH does that in its server reviews by putting the test server between other servers).
 