Best Use of CPU Resource: Scenarios


Widdily

New Member
Mar 19, 2015
Hello all!

Perhaps my Google-fu is lacking, but I wanted to see how you all manage processor time between multiple VMs. With HVM, I have a newfound fascination with the controls available for dedicating time to VMs. To get a better understanding, here is a quick summary and a couple of scenarios I'm curious how you would configure. Aside from how many vCPUs we are able to assign, we basically have three controls available: minimum, maximum, and priority (reserve, limit, weight). I have not been able to find a good philosophy for their use, but I am experimenting with the following approach:

  • Set a modest reserve based on how many VMs I'd like to run at any given time, and use this as a basis for prioritization later (more on this in the scenarios below).
  • Cap the maximum at just a hair below 100%, also based on how many VMs I'd like to run.
  • Most interestingly, prioritize via weight which VM should take precedence.
Now, what I like is how HVM shows overall system resource usage for the given settings. It makes it a lot easier to visualize and calculate how the system will behave. However, because of the dynamic nature of resource consumption, I am wondering if there might be a better way to look at these settings. Here are a couple of examples; I won't get into RAM usage, as that is not the focus:

  1. Dual Xeon 4C/8T (8C/16T total), running four VMs. I want to have what I call "contested space" for dynamic CPU usage; the line I have loosely drawn in the sand is above 60% and below 98% overall usage (OAU). We'll say two VMs get 4 vCPUs and two VMs get 2 vCPUs. The two former VMs account for 40% OAU reserve and the other two account for 20% OAU reserve, and likewise for the limits. That means in the "contested space" weighting is the control over which VM gets priority, yet in times of little demand all VMs have access to as much speed as possible.
  2. Dual Xeon 2C/2T (4C/4T total), running two, perhaps three, VMs; that "contested space" is 50% in anticipation of the third VM. Question time: would it be better to have the always-on VMs assigned 4 vCPUs each and the third assigned one or two, OR should the two VMs be assigned just two vCPUs each? Let's say those two VMs are assigned 4 vCPUs with 25% reserve each (arriving at 50% OAU), leaving 20% reserve (10% OAU) for the third VM.
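The reserve arithmetic in scenario 1 can be sanity-checked with a quick sketch (a hypothetical model; the 80% per-VM reserve figure is an assumption chosen to reproduce the 40%/20% OAU split, not a Hyper-V default):

```python
# Overall host usage (OAU) pinned down by each VM's reserve, assuming
# a VM's reserved share of the host is (vCPUs / logical CPUs) * reserve%.

HOST_LOGICAL_CPUS = 16  # dual Xeon 4C/8T

def oau_reserve(vcpus, reserve_pct):
    """Fraction of total host CPU this VM's reserve claims."""
    return vcpus / HOST_LOGICAL_CPUS * reserve_pct / 100

# Scenario 1: two 4-vCPU VMs and two 2-vCPU VMs, each reserved at 80%.
vms = [(4, 80), (4, 80), (2, 80), (2, 80)]
total = sum(oau_reserve(v, r) for v, r in vms)
print(f"Total reserved OAU: {total:.0%}")  # 60%, leaving 40% contested
```

With those numbers the 4-vCPU VMs contribute 20% OAU each and the 2-vCPU VMs 10% each, matching the 60% floor of the "contested space".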
Am I approaching this well? What is a concise way to think about CPU usage in a multithreaded environment? How much CPU does the host itself require to remain responsive in the face of VM thrashing?

Thank you very much for reading and any suggestions!
 

Jeggs101

Well-Known Member
Dec 29, 2010
What kind of VM thrashing? Have you googled "Noisy neighbor"? That's a problem the cloud guys like AWS are trying to solve but Intel isn't there yet.
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
The following is ESX-speak, but I believe the same principles hold true for Hyper-V as well; please feel free to ignore/berate me if I'm wrong :)

At work, we basically run everything unlimited. We have thirty or so "problem" VMs running crappy noddy apps that will frequently get into a tizzy at the drop of a hat (and also some that were written with an infinite wait loop, so they run at 100% 24/7 whilst doing absolutely nothing); on those we enforce a hard MHz cap, but other than that we just leave DRS to do the dirty work of shunting VMs around to make the best use of available resources. As we've overprovisioned, we don't bother setting shares/priorities, as that just adds management overhead.

That said, we have considerably more CPUs at our disposal than your scenarios do. But you omit what I consider to be the crucial detail: which is your most important VM? Pick that, pick the optimal number of vCPUs it should have, pick the worst time you're happy for $task to take to complete, and then set a resource reservation accordingly.

As an example, let's say VM1 is my important VM. It's running $application, but as much as the project manager yells at us to add more CPU jiggawatts, it doesn't scale beyond 4 vCPUs, so we allocate it 4 vCPUs. It's running on a host with X*3GHz cores; since we always want this VM to have as much access to the CPU as possible, we set a MHz reservation of 12000 (4 x 3000MHz) to assure it unfettered access to all the cycles available. Even if other VMs get piled on top of the same host, VM1 should always get first dibs whenever a spare cycle appears.
Let's also say VM2 is running on the same host, also with a 4-vCPU configuration/application. We don't bother setting a reservation or a resource limit, so that a) in the event of CPU contention it won't tread on the toes of VM1, and b) if it *does* need the CPU, and the host has cycles to spare that VM1 doesn't need, it still has access to the full 4 vCPUs.
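That reservation-first, shares-afterwards behaviour can be sketched as a toy allocator (a deliberate simplification of what the ESX scheduler actually does; all of the numbers and field names here are illustrative):

```python
def allocate(host_mhz, vms):
    """Toy CPU allocator: satisfy reservations first, then split the
    remaining cycles among VMs that still want more, in proportion to
    their shares (weight). Each VM is a dict with reservation_mhz,
    demand_mhz, and shares."""
    # Pass 1: every VM gets its reservation (capped at its demand).
    alloc = [min(vm["reservation_mhz"], vm["demand_mhz"]) for vm in vms]
    spare = host_mhz - sum(alloc)
    # Pass 2: distribute spare cycles by shares among still-hungry VMs.
    while spare > 0:
        hungry = [i for i, vm in enumerate(vms) if alloc[i] < vm["demand_mhz"]]
        if not hungry:
            break
        total_shares = sum(vms[i]["shares"] for i in hungry)
        progress = 0
        for i in hungry:
            give = min(spare * vms[i]["shares"] / total_shares,
                       vms[i]["demand_mhz"] - alloc[i])
            alloc[i] += give
            progress += give
        spare -= progress
        if progress < 1e-6:
            break
    return alloc

# VM1 reserved at 12000 MHz; VM2 unreserved, same demand, equal shares.
vms = [
    {"reservation_mhz": 12000, "demand_mhz": 12000, "shares": 1000},
    {"reservation_mhz": 0,     "demand_mhz": 12000, "shares": 1000},
]
print(allocate(24000, vms))  # no contention: both get their full 12000
print(allocate(18000, vms))  # VM1 keeps its 12000; VM2 squeezed to 6000
```

At 24 GHz of host capacity both VMs get their full demand; at 18 GHz, VM1's reservation is honoured first and VM2 only gets whatever is left over.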

Trying too hard to carve out explicit "slices" of the host goes against one of the biggest advantages of virtualisation, IMHO; it increases up-front management costs and ensures ongoing micromanagement for the serviceable life of the VM. Chuck them all in a big pot and let them fight over the CPU; identify the important bits and the performance outliers and tweak those accordingly, whilst leaving the rest of the VMs alone.

Edit: Just saw Jeggs' response as well, and yup, you need to identify what sort of "thrashing" you mean. If it's cache-thrashing you're basically out of luck and should probably be running such an application on either a dedicated physical host or at the very least an uncontested hypervisor. We have a couple of applications like that ourselves; after soak-testing in our clusters we basically found that we couldn't run them in a shared environment without serious performance problems. The dev stuff lives on VMs, but the project eventually stumped up for dedicated hardware for the prod kit.
 

mrkrad

Well-Known Member
Oct 13, 2012
Also, throwing 2x 4-vCPU VMs in along with 2x 2-vCPU VMs will cause imbalance on an 8-core CPU: 4 idle cores are required to execute a 4-vCPU VM, whilst only 2 empty slots are needed for a 2-vCPU VM, so scheduling will naturally favour the 2-vCPU VMs since more openings are available for them to run concurrently.

I'd be more concerned with disk IOPS: 90% of my VM hosts are constrained by RAM and disk IOPS whilst the CPU runs at anywhere from 9% to 49%, resulting in the share system being ignored!
 

NetWise

Active Member
Jun 29, 2012
Edmonton, AB, Canada
+1 to both of the above posts. Good general vSphere design is to avoid hard-coding and reserving where possible. Inevitably, humans are very bad at assessing what the systems need 24/7 compared to the system itself, so let the system do its work. Now, if you have only one host, DRS won't help you much, but that's the tool that should be handling this. Once you start micromanaging, you get into a never-ending spiral of chasing problems like slot sizes, NUMA balancing, etc. Unless you have a really good NEED, don't touch the dials all the time just to be fancy.

The only place I typically set reservations is on virtualized voice applications. Limits go on VMs where, as above, they appear 'broken' and pin the CPU (often a VMware Tools issue, or Tools missing entirely, or some virtual appliance), or where I'm pretty sure the app admin is a jackass and is pinning it, or where the application will run 24/7 with everything you give it; like a dog, it needs you to leash it up and tell it you only need it to do X in a given day.

I suspect you're overthinking it.

The right answer is to deploy, monitor, adjust, revise, repeat. Right now you THINK you know the workload, but you can only make a best guess; you really need to see it happen first.
 

markarr

Active Member
Oct 31, 2013
2x 4-vCPU VMs along with 2x 2-vCPU VMs will cause imbalance on an 8-core CPU, since 4 idle cores are required to execute a 4-vCPU VM
This. ESXi does not do out-of-order CPU scheduling. So if you give a VM 4 vCPUs and it is running a single-threaded application (you wouldn't, but for argument's sake), ESXi has to wait until it has 4 physical CPUs available for that VM before it can execute anything, even though it only needs one core. So limit the number of multi-vCPU VMs you have running on your host, let the hypervisor do its thing, and limit the amount of manual intervention on the CPU side.
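The scheduling penalty described above can be illustrated with a toy model (a deliberate simplification: it assumes strict gang scheduling, whereas recent ESXi versions use relaxed co-scheduling, so treat the numbers as directional only):

```python
import random

def can_schedule(free_cores, vcpus):
    """Strict gang-scheduling toy: the VM runs only when one free
    physical core per vCPU is available simultaneously."""
    return free_cores >= vcpus

# 8 physical cores, with a random number busy each scheduler tick;
# count how often a 2-vCPU VM vs a 4-vCPU VM could be dispatched.
random.seed(42)
hits = {2: 0, 4: 0}
TICKS = 100_000
for _ in range(TICKS):
    free = random.randint(0, 8)  # cores left free by other VMs this tick
    for vcpus in (2, 4):
        if can_schedule(free, vcpus):
            hits[vcpus] += 1

for vcpus, n in hits.items():
    print(f"{vcpus}-vCPU VM schedulable {n / TICKS:.0%} of ticks")
```

Under this model the 2-vCPU VM finds a scheduling opportunity noticeably more often than the 4-vCPU VM, which is the imbalance mrkrad describes.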

One way to see how efficiently ESXi is using the CPU when you have multiple multi-vCPU VMs is to look at the CPU Ready time on the VMs. That figure is the amount of time a VM spends waiting to execute its next CPU cycle.
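For reference, the CPU Ready counter vCenter exposes is a summation in milliseconds per sampling interval; a sketch of the usual conversion to a percentage (assuming the 20-second real-time sampling interval, and dividing by vCPU count for a per-vCPU figure):

```python
def cpu_ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU Ready summation value (ms of ready time per
    sampling interval) into a percentage. Real-time charts sample
    every 20 s; pass vcpus to get a per-vCPU figure."""
    return ready_ms / (interval_s * 1000 * vcpus) * 100

# e.g. a 4-vCPU VM showing 2000 ms of ready time in a 20 s sample:
print(cpu_ready_percent(2000, vcpus=4))  # 2.5 (% per vCPU)
```

As a rough rule of thumb, sustained ready time in the mid single digits per vCPU is where contention starts to become noticeable.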
 

TeeJayHoward

Active Member
Feb 12, 2013
This. ESXi does not do out-of-order CPU scheduling. So if you give a VM 4 vCPUs and it is running a single-threaded application (you wouldn't, but for argument's sake), ESXi has to wait until it has 4 physical CPUs available for that VM before it can execute anything, even though it only needs one core. So limit the number of multi-vCPU VMs you have running on your host, let the hypervisor do its thing, and limit the amount of manual intervention on the CPU side.

One way to see how efficiently ESXi is using the CPU when you have multiple multi-vCPU VMs is to look at the CPU Ready time on the VMs. That figure is the amount of time a VM spends waiting to execute its next CPU cycle.
Sorry for the hijack, but this was the best explanation of "why you don't give a VM over 9000 vCPUs" that I've ever read. Could you explain how ESXi handles the other direction, though? If you give a VM 1 vCPU and it needs to execute a multithreaded application, will it use the other physical cores, or will it just use the one and serialize the threads?
 

markarr

Active Member
Oct 31, 2013
Sorry for the hijack, but this was the best explanation of "why you don't give a VM over 9000 vCPUs" that I've ever read. Could you explain how ESXi handles the other direction, though? If you give a VM 1 vCPU and it needs to execute a multithreaded application, will it use the other physical cores, or will it just use the one and serialize the threads?
It will serialize the threads. Depending on the load on the host it may hop between cores, but it will never use more than one physical core at a time.
 

mrkrad

Well-Known Member
Oct 13, 2012
ESXi will idle any cores not used by VMs or by the system reserve! It doesn't put them to use at all!