Network throttling and performance issues with an ESXi host


Socrates

Member
Dec 28, 2016
Hi all,


My home server is built around a SuperMicro MBD-X10DRI with dual Xeon E5-2630 v4 CPUs. Memory is currently just 32GB (gosh, I wish memory prices would come down).
So currently I have a few VMs running on this server, five to be precise.
None of them are resource intensive.

However, recently I created a Windows 10 VM, primarily for P2P torrenting (pretty much legal stuff), and it lags like crazy. The OS freezes every few seconds, recovers, lags again, and this keeps repeating. I have assigned 16 vCPUs and 16GB of memory to this VM, plus PCIe passthrough of an LSI card connected to a 24-port SAS2 backplane. The VMs themselves live on Intel enterprise SSDs.
Let me go into a bit more detail.

So I have 1Gbps bandwidth at home, shared with this ESXi server.
If I download on my standalone PC I get the expected 112-113 MB/s, but if I try the same thing on this Windows VM I never get over 25 MB/s. On top of that, the torrent client freezes, and so does every other open application or window on the VM.
I'm not sure if this is due to the HDDs or a misconfigured CPU allocation (is that even a thing?).
Let me explain.

The two CPUs combined give me 20 cores.
I have been generously allocating vCPUs to the various VMs. Could this be the issue?
For example:
I have assigned 16 vCPUs to FreeNAS.
Since all 16 aren't in use all the time (I guess), I assigned another 16 vCPUs to the SnapRAID server (a media server VM on Windows 10).
Again, generously, I assigned 10 vCPUs to the AD, DHCP and DNS server.
Do you think this over-allocation of CPUs might be causing all this? My thinking was: if a single VM doesn't fully use all its CPUs, why not assign the spare ones when building the next VM? I can shut the VMs down and edit the allocations if that's what is making them lag.
The network switch is a gigabit switch. Please note, the standalone PC is connected to the same switch and has no bandwidth problems; only the VMs do. Also, I think the main issue is the lag inside the VM itself, which in turn hurts the bandwidth.

Are there settings one has to edit after installing ESXi to support a 1Gbps internet connection? Or does the CPU, memory, etc. need tuning on the ESXi server?
Please guide me. Why is the VM lagging, the HDD slow, and the network throttled?
The chassis is a 4U SuperMicro with a SAS2 backplane.
Please note, the network card is built into the MBD-X10DRI motherboard (Intel i350 dual GbE LAN).

Or is it some driver issue?

Oh, and by the way, I have two disks attached to it, 4TB each, 8TB in total; that's where the PCIe passthrough comes in.
 

Evan

Well-Known Member
Jan 6, 2016
Way too many vCPUs.
You will cross NUMA boundaries and you will have scheduling issues: on ESXi, all of a VM's vCPUs need to be available in the same cycle for it to process. (So VMs run best with the smallest CPU count needed to do the job.)

Look at your CPU ready time metric and you should see what I mean.
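
If you'd rather pull that metric with a script than stare at esxtop, here is a minimal pyVmomi sketch. The hostname, credentials and VM name below are placeholders I made up, and it assumes the real-time 20-second sample interval:

```python
# Minimal pyVmomi sketch: pull cpu.ready.summation for one VM and turn it
# into a %RDY figure. Hostname, credentials and the VM name are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab host with a self-signed cert
si = SmartConnect(host="esxi.local", user="root", pwd="password",
                  sslContext=ctx)
content = si.RetrieveContent()

# Find the VM by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "win10-torrent")
view.Destroy()

# Resolve the counter id for cpu.ready.summation.
perf = content.perfManager
counters = {f"{c.groupInfo.key}.{c.nameInfo.key}.{c.rollupType}": c.key
            for c in perf.perfCounter}
ready_id = counters["cpu.ready.summation"]

# Real-time stats arrive in 20-second samples; ready time is milliseconds
# summed across all of the VM's vCPUs (instance="" is the aggregate).
spec = vim.PerformanceManager.QuerySpec(
    entity=vm,
    metricId=[vim.PerformanceManager.MetricId(counterId=ready_id,
                                              instance="")],
    intervalId=20,
    maxSample=15)

for entity_metric in perf.QueryPerf(querySpec=[spec]):
    for series in entity_metric.value:
        for ms in series.value:
            # A 20 s window is 20,000 ms, so %RDY = ms / 200.
            print(f"ready {ms} ms over 20 s -> %RDY {ms / 200:.1f}")

Disconnect(si)
```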
 

whitey

Moderator
Jun 30, 2014
LOL, your CPU %RDY must be through the roof. Yeah, back off the vCPU assignments to sane levels. Try not to overcommit by a factor of more than 2x, really: you have two sockets with 10 cores each (and don't be thrown off by HT saying you have 40 logical processors, it ain't gonna give ya THAT much more gain). Back to the story: with 20 'REAL' cores, don't assign more than 40 vCPUs total across your host and things will be MUCH happier. FreeNAS will CHURN along w/ 2-4 vCPUs and 12-16GB memory, more if ya got it, but ya are stuck for now at 32GB, so I personally would shoot for a 2 vCPU / 12GB memory config there w/ what ya got.

You're causing a 'hurry up and wait' state, a 'get in line' scenario for the CPU scheduler w/in the vmkernel to get your VMs their cycles, due to how you have sub-optimally configured your VMs... more is NOT always better, as you can see. Still have to use reason and a bit o' restraint. :-D

'RIGHT-SIZE' your shizzle! heh (based on required resources, start small, work up to what the workload 'actually requires')
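
To put numbers on the 2x rule of thumb, here is the arithmetic from this thread's own allocations (the fifth VM's vCPU count wasn't given, so it's left out):

```python
# Back-of-the-envelope check of the 2x overcommit rule, using the vCPU
# counts mentioned in this thread (fifth VM omitted, size unknown).
PHYSICAL_CORES = 20   # 2 x E5-2630 v4, 10 cores each; HT ignored
vcpus = {"FreeNAS": 16, "SnapRAID": 16, "AD/DHCP/DNS": 10,
         "Win10 torrent": 16}

total = sum(vcpus.values())
print(f"{total} vCPUs on {PHYSICAL_CORES} cores -> "
      f"{total / PHYSICAL_CORES:.1f}x overcommit")
# -> 58 vCPUs on 20 cores -> 2.9x overcommit, well past the ~2x comfort zone
```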
 

markarr

Active Member
Oct 31, 2013
Here is a decent article on how VMware's CPU scheduler works.

Virtual CPUs – The Overprovisioning Penalty of vCPU to pCPU ratios | ZDNet

As Evan said, you have 20 cores, and you created three VMs with 16, 16, and 10 vCPUs. Each time one of those VMs wants to run a CPU cycle, VMware has to schedule all of its vCPUs at once. It does not matter if the guest OS is only using a core or two; ESXi has to schedule all of the assigned cores to execute the cycle. The CPU ready metric in ESXi tells you how long a VM sits waiting to run its CPU cycles. I am constantly fighting this at work when people say "it's slow, just add more CPUs."

Try not to assign more vCPUs per VM than one physical CPU can provide (you can if you configure some additional settings to pass the NUMA topology into the OS and pin the CPUs, but it's still not recommended). If you do, the VM will be slowed down by having to wait for data to move back and forth between the two CPUs, plus memory will start being split between the two, which also causes slowdowns. Your AD, DNS and DHCP server would get by just fine with one or two vCPUs.
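
As a quick way to spot VMs that already span a socket, a pyVmomi sketch along these lines would work. It reuses the connection boilerplate from the %RDY sketch above, and the function name is just one I picked:

```python
# pyVmomi sketch: flag VMs whose vCPU count exceeds the cores in one
# physical socket (i.e. one NUMA node, ignoring hyper-threading).
from pyVmomi import vim

def flag_wide_vms(content):
    hosts = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in hosts.view:
        cpu = host.hardware.cpuInfo
        cores_per_socket = cpu.numCpuCores // cpu.numCpuPackages
        for vm in host.vm:
            if vm.config is None:        # skip inaccessible VMs
                continue
            ncpu = vm.config.hardware.numCPU
            if ncpu > cores_per_socket:
                print(f"{vm.name}: {ncpu} vCPUs > {cores_per_socket} "
                      f"cores per socket on {host.name} (spans NUMA nodes)")
    hosts.Destroy()
```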
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
Yes, you've allocated way more CPUs than you should. The policy for virtual machines should always be to assign the right number of CPUs to each VM: no matter what product vendors say about "more CPUs will make everything go faster!", allocate only the virtual hardware you need to get the job done.

But check whether your memory is ballooning in the VMs as well: you say you've got 32GB total and at least 16GB of that is allocated to a single VM, so how much memory is allocated in total? Use any more than about 30GB and you'll start swapping VMs out to disc. From your description of the symptoms (frequent freezes rather than just general slowness), ballooning sounds like more of a culprit than CPU ready stalls, and it'll utterly destroy performance if you're thrashing memory in and out of vswap.
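
vSphere exposes exactly those numbers through each VM's quickStats; here is a small pyVmomi sketch, again assuming the connection boilerplate from the first example (the function name is made up):

```python
# pyVmomi sketch: report ballooned and swapped memory per VM via quickStats.
from pyVmomi import vim

def memory_pressure_report(content):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        qs = vm.summary.quickStats
        if qs.balloonedMemory or qs.swappedMemory:
            # Both figures are reported in MB.
            print(f"{vm.name}: ballooned {qs.balloonedMemory} MB, "
                  f"swapped {qs.swappedMemory} MB")
    view.Destroy()
```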
 

vrod

Active Member
Jan 18, 2015
As everyone else says, way too many vCPUs...

In terms of storage performance, if you are running 6.5, upgrade to the latest 6.5 U1. There's a known issue with the TRIM function on various 6.5 builds that makes even the best SSDs run close to useless.
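
If you're unsure which build you're on, the host's About info is enough to check before patching. A tiny pyVmomi snippet, assuming the `si` service instance from the first sketch; compare the printed build number against VMware's 6.5 U1 release notes:

```python
# Print the host's version and build so it can be compared against the
# 6.5 U1 release notes. Reuses `si` from the first sketch.
about = si.content.about
print(about.fullName)             # e.g. "VMware ESXi 6.5.0 build-XXXXX"
print(about.version, about.build)
```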