What performance is sacrificed by running VMware vSphere?


snakyjake

Member
Jan 22, 2014
What performance do I sacrifice by running VMware vSphere (or another virtualization solution)?

I don't know much about "bare metal" and all the offerings. It sounds ideal to have one super machine hosting other machines. I want one of the virtual machines to be my main workhorse, and several other VMs to experiment with. I need to know what compromises/sacrifices I'll need to consider, and ultimately decide whether virtualization is something I want to do versus separate physical machines.

I use the web, do video encoding (HandBrake, or something better if it exists), video/photo editing, and a lot of file checksumming (via SnapRAID).

Thanks.
 

i386

Well-Known Member
Mar 18, 2016
All modern CPUs have instruction sets for virtualization and features that allow hardware to be passed through to virtual machines.
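For example, on a Linux box you can check for the Intel VT-x (vmx) / AMD-V (svm) flags with a quick script (a minimal sketch, assuming Linux and /proc/cpuinfo; for PCIe passthrough you also want the IOMMU, i.e. VT-d / AMD-Vi, enabled in the firmware):

[CODE]
# Minimal check for the hardware virtualization CPU flags (Linux only).
with open("/proc/cpuinfo") as f:
    flags = set()
    for line in f:
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())

print("Intel VT-x (vmx):", "vmx" in flags)
print("AMD-V (svm):    ", "svm" in flags)
[/CODE]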
 

ecosse

Active Member
Jul 2, 2013
i386 said:
All modern CPUs have instruction sets for virtualization and features that allow hardware to be passed through to virtual machines.
tbh if I was starting fresh, with no dependence on existing technologies, I'd look at containers first. VMware / Hyper-V is probably easier to get into, though.

Admittedly it was a few years ago (so my memory is hazy!), but when I was discussing with a vendor and VMware whether to host a retail merchandising service on VMware versus bare metal, database I/O to the server - latency and throughput - was the main difference, something like a 7%-15% overhead. VMware stated that some technologies could perform better than on bare metal, Hadoop being one of them.

https://arxiv.org/pdf/1708.01388.pdf is interesting.
 

Rand__

Well-Known Member
Mar 6, 2014
It depends on what you are looking for - in my experience you lose very little CPU performance and a bit of memory performance, but you tend to lose a lot of storage performance (unless the VM is actually running on a datastore on a local drive, which counteracts many of the advantages of running a VM). A crude way to check this yourself is sketched below.

So for your use case - yes, you can do it.
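If you want to see the storage overhead for yourself, a crude sync-write latency probe run once on bare metal and once inside the VM (against the same class of storage) already shows the difference. Rough sketch only; the file name is made up, and for serious numbers use fio:

[CODE]
# Crude QD1 write+fsync latency probe (Python 3). Run it on bare metal and
# inside the VM, then compare the medians.
import os, time, statistics

PATH = "latency_probe.bin"   # hypothetical test file on the storage under test
BLOCK = b"\0" * 4096         # 4 KiB writes, roughly a QD1 random-write pattern
SAMPLES = 1000

lat = []
fd = os.open(PATH, os.O_CREAT | os.O_WRONLY, 0o600)
try:
    for _ in range(SAMPLES):
        t0 = time.perf_counter()
        os.write(fd, BLOCK)
        os.fsync(fd)                  # push it through the page cache
        lat.append(time.perf_counter() - t0)
        os.lseek(fd, 0, os.SEEK_SET)  # overwrite the same block each time
finally:
    os.close(fd)
    os.remove(PATH)

print(f"median write+fsync latency: {statistics.median(lat) * 1e6:.0f} us")
[/CODE]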
 

nk215

Active Member
Oct 6, 2015
The only thing that I noticed is that you won’t get single-core turbo boost in a hypervisor environment. It may have something to do with the fact that I always have a handful of VMs running. Many of them are idle but never 100% idle.
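If the hypervisor host runs Linux (KVM/Proxmox and the like), you can watch this directly via sysfs; on ESXi you'd look at esxtop instead. A rough sketch, assuming the standard cpufreq sysfs paths are exposed:

[CODE]
# Spin one busy thread, then print each core's current clock so you can see
# whether the single-core turbo bin is actually reached.
import glob, threading, time

def burn():
    while True:
        pass

threading.Thread(target=burn, daemon=True).start()
time.sleep(2)  # give the frequency governor a moment to ramp up

for path in sorted(glob.glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq")):
    with open(path) as f:
        mhz = int(f.read()) / 1000
    print(f"{path.split('/')[5]}: {mhz:.0f} MHz")
[/CODE]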
 

Docop

Member
Jul 19, 2016
I agree - perhaps 2% less for video-intensive applications. Take gaming as a direct comparison: running a game in a VM with passthrough versus on the main machine costs about 1 to 2 fps at most. Overall it's fully transparent; I run 2 boxes and 2 USB cards with all ports passed through to VMs, all good. The more RAM you give a VM the better, but any VM with 4 GB is OK; for gaming I give it around 9 GB.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
Rand__ said:
It depends on what you are looking for - ... but you tend to lose a lot of storage performance (unless the VM is actually running on a datastore on a local drive, which counteracts many of the advantages of running a VM).

So for your use case - yes, you can do it.
CPU and memory performance efficiency is highly dependent on the hypervisor. VMware is the thinnest/most efficient one, and you'd only lose low single digits performance-wise.
I disagree with you on storage. While IO is the typical performance bottleneck of most virtual environments, in most cases it's hard to measure/estimate or to correctly locate the root cause.
What I'm trying to say is that correctly designed shared storage which can sustain sufficient IOPS under full load will perform in an extremely similar fashion, virtualized or not. SSDs are relatively cheap, and dedup and compression make them even "cheaper". There is no reason to run VMs from spinning disks in production. Hybrid SANs can be OK, but SSDs should be at least 20% of total capacity, depending on needs.
 

Rand__

Well-Known Member
Mar 6, 2014
Well, I guess it depends on whether we are talking about semi-virtualized storage (as in a file backed by passed-through devices) or fully virtualized storage.

IOPS are quite latency-sensitive, as we all know, so adding even a minuscule bit of latency to each and every I/O can cause quite an overhead (rough numbers below).
Sure, not on a single SSD or NVMe drive, but when we are talking about large amounts of IOPS I never managed to achieve the expected performance.
Of course you might be correct and I measured incorrectly, or didn't have the skill (or support contracts) to actually locate the root cause, but that's my experience ;)
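To put rough numbers on that "minuscule bit" (hypothetical round figures, not measurements): at queue depth 1, IOPS is just one second divided by the per-I/O latency, so a fixed per-I/O cost hurts fast devices far more than slow ones.

[CODE]
# Back-of-the-envelope: effect of a fixed per-I/O overhead on QD1 IOPS.
def qd1_iops(latency_us):
    return 1_000_000 / latency_us

for base_us in (15, 100, 500):        # e.g. NVMe, SATA SSD, busy shared array
    for overhead_us in (5, 20):       # hypothetical added latency per I/O
        loss = 1 - qd1_iops(base_us + overhead_us) / qd1_iops(base_us)
        print(f"{base_us:>3} us base + {overhead_us:>2} us overhead -> {loss:5.1%} fewer IOPS")
[/CODE]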
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
In my experience many storage arrays overpromise, under-deliver, are under-bought, and do not scale out easily [or at all].
One example: in the VMware world, the paravirtual (PVSCSI) disk controller should provide the best performance. You can have up to 4 disk controllers, and it's recommended to spread your disks across as many controllers as possible (rough sketch below).
VMFS v6 is a must as well.
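For what it's worth, here's a rough (untested) pyVmomi sketch of adding a second PVSCSI controller to a VM so that disks can then be spread across controllers. It assumes `vm` is a vim.VirtualMachine object you've already looked up via a service instance, and the key/bus number values are just illustrative:

[CODE]
# Rough pyVmomi sketch: add a second ParaVirtual SCSI controller to a VM.
# Assumes `vm` is an existing vim.VirtualMachine object.
from pyVmomi import vim

ctrl = vim.vm.device.ParaVirtualSCSIController()
ctrl.key = -101          # temporary negative key; vCenter assigns the real one
ctrl.busNumber = 1       # 0 is usually taken by the boot controller
ctrl.sharedBus = vim.vm.device.VirtualSCSIController.Sharing.noSharing

dev_change = vim.vm.device.VirtualDeviceSpec()
dev_change.operation = vim.vm.device.VirtualDeviceSpec.Operation.add
dev_change.device = ctrl

spec = vim.vm.ConfigSpec(deviceChange=[dev_change])
task = vm.ReconfigVM_Task(spec=spec)
# Existing virtual disks can then be moved onto the new controller by
# changing their controllerKey/unitNumber in a later ConfigSpec.
[/CODE]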
In general, each VMworld has a few sessions specifically on best-practice designs for monster VMs, and most of these session recordings are available online. I could find some if you're interested.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
VMworld | On-Demand Video Library | VMware
check out these videos:
Extreme Performance Series: Performance Best Practices (HBI2526BU)
Extreme Performance Series: DRS 2.0 Performance Deep Dive (HBI2880BU)
Extreme Performance Series: SQL Server, Oracle, and SAP Monster DB VMs (BCA1482BU) [my fav is this one]
Extreme Performance Series: vSphere + Intel Optane DC PMEM=Max Performance (HBI2090BU)
Extreme Performance Series: vSphere Compute and Memory Schedulers (HBI2494BU)

A VMworld account is required, but it's free to get.
 

thulle

Member
Apr 11, 2019
For your use cases it's probably negligible.
My worst cases recently using KVM are:
Factorio headless server on an E5-2690 v1: 20% performance drop.
Optane 900p QD1 IOPS on an X5650: 93% performance drop.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
thulle said:
For your use cases it's probably negligible.
My worst cases recently using KVM are:
Factorio headless server on an E5-2690 v1: 20% performance drop.
Optane 900p QD1 IOPS on an X5650: 93% performance drop.
This isn't first-hand info, but from what I've heard, while VMware is the lightest, other hypervisors like XenServer (QEMU/KVM based) could definitely steal 20-30% of performance.
I'd be very curious to see other actual benchmarks supporting or disagreeing with my view.
 

thulle

Member
Apr 11, 2019
BoredSysadmin said:
This isn't first-hand info, but from what I've heard, while VMware is the lightest, other hypervisors like XenServer (QEMU/KVM based) could definitely steal 20-30% of performance.
I'd be very curious to see other actual benchmarks supporting or disagreeing with my view.
That matches what I heard a couple of years back too, though I have no current information.
Xen and KVM are different hypervisors as far as I know.

Factorio is very taxing on memory; the number of memory channels on the underlying hardware was the primary criterion when looking for VM hosting for a recent Clusterio multi-server Factorio event. It would be interesting to see a comparison of different hypervisors.
I don't think it would matter in the case of the Optane though. I did a trace to see what the CPUs were up to, and it was all vmexit/vmentry to handle the interrupts. I fiddled around a bit and managed to lower the latency marginally before posting about it on Reddit. I got a reply from Alex Williamson @ Red Hat, the VFIO maintainer, saying newer hardware with posted interrupts was needed.
That comment led me here:
https://www.anandtech.com/show/10158/the-intel-xeon-e5-v4-review/5

Posted interrupts allow the interrupt to be injected directly into the VM, avoiding the exit/entry latency.

On my X5650 I get ~200 μs latency, or 5,000 IOPS.
Xeon E5 v2 seems to be able to bring this down to 50 μs, or 20,000 IOPS.
Xeon E5 v4, where posted interrupts were introduced, brings the added latency down to 5 μs, which would take the 900p from 15 μs / 66k IOPS on bare metal to 20 μs / 50k IOPS. A more decent 25% performance reduction, but it still seems to be the worst-case scenario for virtualization.
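Sanity-checking those figures is easy, since QD1 IOPS is just one second divided by the per-I/O latency:

[CODE]
# QD1 IOPS from the latencies quoted above.
for label, us in [("X5650, ~200 us", 200),
                  ("E5 v2, ~50 us", 50),
                  ("900p bare metal, ~15 us", 15),
                  ("E5 v4 posted interrupts, ~20 us", 20)]:
    print(f"{label:32s} -> {1_000_000 // us:>6} IOPS")
[/CODE]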
 

Docop

Member
Jul 19, 2016
But back to the main question: for a home system with 1-3 VMs, or even 5, vSphere (vCenter) doesn't give you much and eats memory; a plain ESXi install gives you most of the features you'll actually use. Quick note on how it works: you install ESXi, then you create a VM and install vCenter in it for advanced management and so on - but that's not needed.