Troubleshooting GPU passthrough ESXi 6.5

Ch33rios

Member
Nov 29, 2016
102
6
18
39
You can try dropping the memory down to 2 GB to see if that resolves the issue with the driver install. If it does, then you need to expand the PCI hole size.
The hole needs to be big enough for all the PCI devices, so you might need to experiment and try numbers like 1200 - 4800 etc. I don't have an exact means of figuring it out, but I read somewhere that the values might need to change based on video memory.

pciHole.start = "1200"
pciHole.end = "2200"
It's very strange that the aforementioned config (pciHole.start/end) seems to be required for some but not all. I wonder what determines it? I know for my setup, your steps worked perfectly, and I've read of others having success without seemingly setting any pciHole config.
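For anyone experimenting with different hole sizes, here is a minimal Python sketch of how the two pciHole entries could be added to (or updated in) the text of a .vmx file. The function name set_vmx_options is made up for illustration, the 1200/2200 values are simply the ones quoted above, and it assumes the VM is powered off while the file is edited.

```python
import re

def set_vmx_options(vmx_text, options):
    """Return vmx_text with every key in `options` set to the given value.

    Existing keys are rewritten in place; missing keys are appended.
    Hypothetical helper for illustration, not a VMware tool.
    """
    remaining = dict(options)
    out = []
    for line in vmx_text.splitlines():
        # A .vmx line looks like: key = "value"
        match = re.match(r'\s*([^\s=#]+)\s*=', line)
        key = match.group(1) if match else None
        if key in remaining:
            out.append(f'{key} = "{remaining.pop(key)}"')
        else:
            out.append(line)
    # Anything not already present goes at the end of the file.
    for key, value in remaining.items():
        out.append(f'{key} = "{value}"')
    return "\n".join(out) + "\n"

if __name__ == "__main__":
    original = 'memSize = "2048"\n'
    print(set_vmx_options(original, {"pciHole.start": "1200",
                                     "pciHole.end": "2200"}))
```

Working on the text rather than the file itself makes it easy to keep a backup of the original .vmx and to try several start/end pairs.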
 

RyC

Active Member
Oct 17, 2013
357
89
28
ESXi 6.5 (and IIRC 6.0) is supposed to set the pciHole for you automatically
 

TLN

Active Member
Feb 26, 2016
410
55
28
31
I've checked my setup and it says "PCI.dynhole" or something like that.

I'm running into another issue: I have multiple video cards installed. When I pass through the older 7750 to my VM, it works fine: sleep, daisy-chained monitors, etc. When I pass through my R9 280X to the same VM (and remove the 7750, obviously), it works, but I see some flickering from time to time and the VM fails within several hours.
When I pass that video card through to a different VM (same OS, Windows 7), it works fine. I haven't tested it for days, though. The difference is that the new VM was created in ESXi 6.5, so it's "version 13", while my old VM is "version 11". Can this cause problems?
Also, the old VM is pretty bloated with different software, so this might be an issue too.

Any ideas on how I can troubleshoot it?
 

Ch33rios

Member
Nov 29, 2016
102
6
18
39
I'm by no means an expert in ESXi troubleshooting, but I'd wager the best place to start looking is the logs within ESXi, to see if there are any notable errors, warnings, or interesting entries around the time that the VM fails. Additionally, you can check the VM's own event logs to see if there is some additional info at the VM level.
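As a rough illustration of that kind of log check, here is a small Python sketch that pulls out lines mentioning errors, warnings, or failures. The keyword list and the function name find_suspicious_lines are my own assumptions, and the sample lines are invented for the demo, not real ESXi output.

```python
def find_suspicious_lines(log_lines, keywords=("error", "warning", "fail")):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is case-insensitive. The default keywords are a guess at
    what is worth a look, not an official list.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits

if __name__ == "__main__":
    # Invented sample lines, only to show the output shape.
    sample = [
        "vmx started\n",
        "PCI passthrough device ERROR example\n",
        "all good\n",
    ]
    for number, text in find_suspicious_lines(sample):
        print(f"{number}: {text}")
```

You would feed it the lines of the VM's vmware.log (in the VM's folder on the datastore) or of a host log, then look at the entries around the time of the crash.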

The fact that the 280x works fine on a separate VM but not when you switch it back and forth on another VM just shows me that you probably shouldn't be trying to switch it in/out :) I've accepted that while GPU passthrough is indeed a very workable and useful solution for the home setup (on consumer-grade cards anyway), it is not by any means perfect, and when you start to swap passthrough hardware in/out of a VM... things can get unstable.

If it's no big issue, just build a fresh VM if you want to use the 280x as the primary GPU for a Win7 install. That seems like the best option, with the least head-scratching when troubleshooting.
 

TLN

Active Member
Feb 26, 2016
410
55
28
31
Actually, it was pretty stable with my setup. I haven't rebooted the host, but I was swapping the video card between VMs, including Mac OS X, and it was working just fine.
Not to mention that I was running that VM on a different host with an HD8490 connected to it.
 

Mymlan

Clean, Friendly, and In Stock.
Oct 1, 2013
19
38
13
Yes, it's always been an issue for on-board GPUs, but this is the first time I have been able to make use of the Intel graphics driver after the pass-through and gain access to Quick Sync. This was definitely not available in 6.0. Whatever the case, we are getting closer.
Hey fellas,

I'd like to throw my experiences into the mix. I've been finding much better results with ESXi and the Skylake/Kaby Lake iGPUs for passthrough. I haven't found a VM that couldn't accept the integrated graphics and boot properly with the drivers.

Unfortunately, as Crhendo pointed out, it seems to be reserved for compute only (like QuickSync) and cannot display to any monitors. I've found supposed proof here:

Intel NUC-6i5SYH – ESXi – Wifi – Iris Graphics-Pass-through in Windows | My Virtual Blog

And plenty of similar interest here:

Interesting little note, ESXi 6.5 now supports pass-through of Intel HD graphics. • r/vmware


I'm not sure if anyone has had any luck on getting legitimate video output via passed-through iGPUs except for the one guy with the SkullTrail NUC, but I figured I'd reach out and see if anyone has any more info.
 

epicurean

Active Member
Sep 29, 2014
676
42
28
After many months of frequent "purple screens of death" on my ESXi 6.0 U3 setups, I have done the GPU passthroughs for my server (SM X9DR3-LN4F+) very differently for the past 2 weeks, and no more "purple screens of death" to date. Fingers crossed.

4 Windows VMs: 2 x Win 8.1, 2 x Win 10 workstations
1 AMD and 3 different Nvidia GPUs (which traditionally do not pass through well in ESXi).

I added only this to passthru.map:

# NVIDIA
10de ffff bridge false

For the 1 Windows 8.1 and 2 Windows 10 VMs that use an Nvidia GPU, I added these entries to the vmx file:

hypervisor.cpuid.v0 = "FALSE"
pciHole.start = "1200"
pciHole.end = "2200"

I did not make any changes to the single Windows 8.1 VM that uses an AMD GPU (FirePro V5700, which has no audio). The 3 Nvidia GPUs are a Quadro 2000, a GT 710, and a GT 630.

All 4 Windows VMs also have a USB 3.0 host controller passed through to them (ASMedia 1042A).
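To make the passthru.map entry above a bit less cryptic, here is a hedged Python sketch that parses one such line. As far as I recall, the header comment in that file describes the four fields as vendor ID, device ID, reset method, and a shareable flag, with ffff acting as a wildcard device ID; treat that reading as an assumption and check the comments in your own copy of the file. The names PassthruRule, parse_passthru_line, and rule_matches are invented for the example.

```python
from typing import NamedTuple, Optional

class PassthruRule(NamedTuple):
    vendor_id: str      # PCI vendor ID in hex, e.g. 10de for NVIDIA
    device_id: str      # PCI device ID in hex; ffff assumed to be a wildcard
    reset_method: str   # e.g. bridge, as in the entry quoted above
    fpt_shareable: str  # the final true/false/default column

def parse_passthru_line(line: str) -> Optional[PassthruRule]:
    """Parse one passthru.map line; return None for comments and blanks."""
    text = line.split("#", 1)[0].strip()
    if not text:
        return None
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return PassthruRule(*fields)

def rule_matches(rule: PassthruRule, vendor_id: str, device_id: str) -> bool:
    """True if the rule applies to the given vendor/device pair."""
    return (rule.vendor_id.lower() == vendor_id.lower()
            and rule.device_id.lower() in ("ffff", device_id.lower()))

if __name__ == "__main__":
    rule = parse_passthru_line("10de ffff bridge false")
    print(rule)
    # 1234 is an arbitrary device ID used only for the demo.
    print(rule_matches(rule, "10de", "1234"))
```

Read this way, the single line applies the bridge reset method to every NVIDIA (10de) device in the host, which would explain why one entry covered all three cards.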
 

Rand__

Well-Known Member
Mar 6, 2014
4,610
918
113
@epicurean Now that's interesting. Given that you used consumer cards that makes me wonder whether this would work with a GTX1060 or so ...
 

epicurean

Active Member
Sep 29, 2014
676
42
28
@Rand_ only one way to find out :)

I gave up on AMD cards, even though they were supposed to be "unlocked", as their driver installation was trickier and frankly messy as well.
Now I have a bunch of AMD 6450 cards lying around.
 

TLN

Active Member
Feb 26, 2016
410
55
28
31
I added only this in the passthrough.map

# NVIDIA
10de ffff bridge false
What is passthrough.map?

For the 1 windows 8.1 and 2 windows 10 VMs that uses an Nvidia GPU, I added these entries in the vmx file

hypervisor.cpuid.v0 = "FALSE"
pciHole.start = "1200"
pciHole.end = "2200"
Don't 6.0 and 6.5 add the PCI hole automatically?

hypervisor.cpuid.v0 = "FALSE" is required for Nvidia.

All 4 windows VMs also has a USB 3.0 host controller passthrough to it (ASmedia 1042A)
Wait, so you can share a controller with all the VMs? o_O I've run into problems adding a mouse/keyboard to each VM with just two VMs, as I have only two controllers available on board.
 

Ch33rios

Member
Nov 29, 2016
102
6
18
39
Yeah, I had that same reaction. It could be specific to the mobo being used, but for me, I had to buy a PCIe USB card to pass through to my Win10 VM for the mouse/keyboard. My only other option was to pass through the USB 3.1 controller to the VM, as it was the only other separate controller on my board (and I use a USB stick to boot ESXi, so I didn't want to lose that functionality).
 

epicurean

Active Member
Sep 29, 2014
676
42
28
@TLN
Look inside your host for this file: /etc/vmware/passthru.map. Sorry about the typo earlier.

Yes, 6.0 and beyond are supposed to add the PCI hole automatically, but since my workstations are up 24/7, I have not had the chance to remove those entries to check if things still work. I am also certain there is no harm in leaving them as they are.

My PCIe USB 3.0 card has 4 individual chips, so it shows up as 4 PCI devices that I can pass through to 4 VMs. It's this card:
Amazon.com: HighPoint 4-Port USB 3.0 PCI-Express 2.0 x 4 HBA RocketU 1144D: Computers & Accessories
 

vinay

New Member
Mar 31, 2017
17
1
3
34
Just thought I'd give my update. After numerous trials and errors, nothing worked for me. I gave it my all for a week, but for some reason the actual monitor never got a signal and the VM kept crashing when I tried to install the AMD driver. Since then I thought I'd give it a shot on KVM (Proxmox, to be exact), and it worked. Not on the first try, but at least when I managed to pass through the GPU, the signal went to the monitor and so on. Until ESXi comes with a simpler, more concrete solution, I don't think I can return, and it's such a shame because I really like the ESXi network stack.
 

Paul Braren

New Member
Nov 10, 2012
22
6
3
Wethersfield, CT
TinkerTry.com
Now that's interesting! I was thinking about that, but never found such card.
Yep, that PCIe USB 3.0 card has worked great for ESXi passthrough since the original HighPoint 1144A card back in 2011. Allowed me to pass all sorts of fun stuff through to VMs, like >2TB arrays before 2TB virtual disks were possible. Told HighPoint booth folks at CES 2012, they seemed happy, and a bit confused by this unusual use case. Based on this post, maybe not so unusual!
 

TLN

Active Member
Feb 26, 2016
410
55
28
31
Yep, that PCIe USB 3.0 card has worked great for ESXi passthrough since the original HighPoint 1144A card back in 2011.
I see many cards like 1144A, 1144B, 1144C and so on. What's the difference?
I doubt that I'll need one; two "desktop VMs" are enough for me and I have two controllers, but who knows.
 

MACE529

New Member
May 1, 2017
2
2
3
25
Hey guys. You have no idea how happy I am that I've found this post. This may be my safe haven. Here's my story:
I've got a build and a half (literally) on my hands, and reckon you'll be interested.
I've just purchased a 1080Ti, and thought it'd do well to replace one of my 780Ti's. All reference btw.
I lashed out and went M-ATX LGA2011-3 (which sort of locked me to a single board, whoops) paired with a 6800K @ 4.2GHz.
Everything is nice and stable in Windows, as you'd expect. Even with both 1080Ti & 780Ti in at once, able to happily 'swap' 3D applications between GPUs across 3 screens. Photos can be provided (happily) upon request.
Was originally intrigued with Linus' unRaid videos (as we all were) and coincidentally found this VMVisor around the same time of new GPU purchase.
Without any Googling assistance, gave building a dualrig machine a shot, for shits. Have plenty of experience with VMWare Workstation, but never actually understood ESXi (let alone on bare metal) up until 3-4 weeks ago now. And holy shite, was I surprised with what I could do.
Built two Windows images, 1 - (6 CPU, 16GB RAM, 1x 512GB 850 Pro, 1080Ti) & 2 - (4 CPU, 10GB RAM, 1x 512GB 850 Pro, 780Ti), and booted, not expecting things to work. But they did, sort of. Had some of the issues listed in this post, but fixed them from sporadic searches on random sites; mostly the same fixes as well.
In the end, I had two perfectly gaming-capable systems running. Synergy was connecting them both for I/O, I sorted out a means for both systems' audio to work with my AVR, and I had the most powerful virtual machine on Day 2 of the 1080Ti release. Goodtimes.avi.
Seriously, 3DMark (all tests work flawlessly) running on both systems was a dream. Literally double the performance of the 780Ti, and I could just sit there and watch the numbers rise. Temps are actually stable as well, all crammed in a Corsair Air 540.
Had an issue the other day: ESXi died on boot. This system goes off when I'm not home, so AutoStart is configured, yadeya.
Forgot that I had the NZXT USB hub bridge thingy (two Type-A ports) passed through by mistake, which left all of the ESXi config running from memory.
So that broke, and I had to redo my ESXi entirely. Easier than fixing. Did that, no problems. Fixed my issue.
I had an issue bringing back the 1080VM, so removed the PCI passthrough and attached it to a new 'replacement' VM.
I'm in the middle of setting that one up (for SysPrep; attempting to build WDS image for my server) as I type this now, and am hitting a snag. Code 43 for the 1080Ti. To be honest, I know this is a virtual issue and nothing physical, but it scares the life out of me. I do believe I killed my 'replaced' 780Ti in an attempted SLI config, which for some reason didn't work out. Haven't been able to test that yet, but back to the point.
I've disabled and uninstalled the SVGA adapter, and have that false hypervisor flag set in the VMX already. In theory, the same as my currently running 780 VM.
It's currently 12:46AM +10 where I am, and unfortunately I was hoping to do some crisp CSGO tonight. That may not be happening, so I'm gonna end this here. If anyone's got any advice, or would like some more info on what this system is doing, lemme know and I'll do things.

Please and thank you all <3
 

MACE529

New Member
May 1, 2017
2
2
3
25
Excuse the camera, but boom! Tw0 systems! 1080 powering blue monitors, 780's handling the top Ultrawide.
Why do things not work while the sun is up? Ugh. Time for bed.
Solution to my issue: surprisingly, the 'hypervisor.cpuid.v0 = FALSE' flag I thought I'd applied when I recreated the VM either didn't apply or it slipped my mind. So I reapplied that, started from scratch, deleted all traces of NVIDIA, reinstalled the SVGA adapter just to uninstall it (thank you, VMware Tools) and redid everything from the start. Allofasudden things! Had to reconfigure to get the DisplayPort to precede the HDMI port I was using for testing, but it all works. Screens overclock as they usually do without hassle, and a quick gaming test shows everything stable, so let's back some shit up (Y)
Anyone need/want any info, lemme know. More than happy to share this build. Need to write up a proper story for this rig for the NVIDIA forums lel.
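Since the missing flag was the culprit here, a quick sanity check on the .vmx text can save a rebuild. Below is a minimal Python sketch under a couple of assumptions: the helper names parse_vmx and hypervisor_flag_is_set are made up, and keys are compared case-insensitively, which I believe matches how .vmx files are read but have not verified.

```python
def parse_vmx(vmx_text):
    """Parse key = "value" lines of a .vmx file into a dict.

    Keys are lower-cased (an assumption about case-insensitivity);
    blank lines, comments, and lines without '=' are skipped.
    """
    settings = {}
    for line in vmx_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings

def hypervisor_flag_is_set(vmx_text):
    """True if hypervisor.cpuid.v0 is present and set to FALSE."""
    value = parse_vmx(vmx_text).get("hypervisor.cpuid.v0", "")
    return value.upper() == "FALSE"

if __name__ == "__main__":
    example = 'memSize = "2048"\nhypervisor.cpuid.v0 = "FALSE"\n'
    print(hypervisor_flag_is_set(example))
```

Running this against a freshly recreated VM's .vmx before installing the NVIDIA driver would catch the case where the entry never made it into the file.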
 


Ch33rios

Member
Nov 29, 2016
102
6
18
39
Congrats! It's always the little things, isn't it :) Setup looks nice and I'm a little jealous. I sort of wish I had bought a slightly more powerful CPU in retrospect. Ah well... wait a few more months and I'll be back to spending money.