Troubleshooting GPU passthrough ESXi 6.5

AveryFreeman

consummate homelabber
Mar 17, 2017
370
45
28
41
Near Seattle
averyfreeman.com
Yeah, you've gotta tell us what you're working with. If it's a GeForce, I'm out of ideas. Try to find some resources from other people who've gotten it working, but AFAIK it's a really hard nut to crack. That's why I bought a Quadro.
 

AveryFreeman

consummate homelabber
Mar 17, 2017
370
45
28
41
Near Seattle
averyfreeman.com
@AveryFreeman I set hypervisor.cpuid.v0=false before even adding the GPU to the VM. I started the system with it, then shut it down right away. After that I added the PCI device and set pciPassthru.use64bitMMIO = "TRUE", pciPassthru1.msiEnabled = "FALSE" and pciPassthru0.msiEnabled = "FALSE".

Then I started the VM again and installed the device drivers. If I keep the SVGA adapter, I only get bluescreens at Windows startup. I have now uninstalled the Nvidia drivers and reinstalled them; they work for a short period (no Code 43), but after one reboot Code 43 appears again.

I even tried the ACS check, but that also didn't help. I am on the newest Windows version.
Did you see this post? It might be helpful. If you've already seen or tried it, I apologize. Looks like there's hope for GeForce people after all, though.

 

superempie

Member
Sep 25, 2015
78
10
8
The Netherlands
Two GeForces are working fine here in my setup, and I've been happy with it for multiple months: an EVGA 1070 for Linux Mint and an ASUS 1080 Ti for gaming. No passthru.map adjustments needed. It depends on your hardware.
I'm typing this on it. I may have only set the ACS parameter, and that was for USB issues.
 

Iamfreaky

New Member
Mar 25, 2021
3
0
1
@AveryFreeman Yeah, I tried that.

My current status is that it kind of works with the new GPU drivers from NVIDIA, which don't generate Code 43 anymore. There is no longer any need to edit the VMX, at least on 6.5.0.

Restarting the VM works without any problem. Restarting the host still gives a BSOD if the GPU wasn't disabled before the host reboots.

I can start the scripts manually, and that works, but they don't run automatically at boot or shutdown via Windows Group Policy.
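
The disable/enable scripts themselves are not shown in this post. As a rough sketch of what such a script might do, assuming the Windows PnpDevice PowerShell cmdlets are available in the guest, the Python helper below builds the command line. The function names and the `*NVIDIA*` name pattern are hypothetical and would need adjusting to the actual device.

```python
import subprocess


def gpu_toggle_command(action, name_pattern="*NVIDIA*"):
    """Build a PowerShell command line that enables or disables matching display devices."""
    cmdlets = {"enable": "Enable-PnpDevice", "disable": "Disable-PnpDevice"}
    if action not in cmdlets:
        raise ValueError("action must be 'enable' or 'disable'")
    # Get-PnpDevice selects the device(s); -Confirm:$false suppresses the prompt.
    # The friendly-name pattern is an assumption, not taken from the thread.
    script = (
        f'Get-PnpDevice -Class Display -FriendlyName "{name_pattern}" '
        f"| {cmdlets[action]} -Confirm:$false"
    )
    return ["powershell.exe", "-NoProfile", "-Command", script]


def toggle_gpu(action):
    """Run the command; this needs an elevated session inside the Windows guest."""
    subprocess.run(gpu_toggle_command(action), check=True)
```

Calling `toggle_gpu("disable")` from a shutdown script and `toggle_gpu("enable")` from a startup script would mirror the manual workflow described above; whether Group Policy runs it at the right moment is a separate question.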
 

pro_info

New Member
Jun 27, 2021
2
0
1
France
Hello guys,
I've been following this topic for a few years now without an account on this forum, but I came looking for news to see whether any of you have gotten passthrough working with the R465 and later drivers, which are supposed to allow virtualization. For my part, I don't see any difference in my tests.
Historically I was on ESXi 6.7. A reboot gave error 43, and I had put together a script to disable/re-enable the card, but that is not viable when the VM crashes: you have to restart the whole hypervisor...

I switched to Unraid a year ago, which lets me inject a dump of my vBIOS so the card works without any problem. But in terms of stability and reliability ESXi is far ahead, not to mention snapshots, dynamic VMDKs...
For this reason, and for the Aquantia network chip support needed by my QNA-UC5G1T 5GbE adapter, I am looking to switch back to ESXi. (I currently have to manage the adapter via passthrough to a VM and reroute UPnP between two subnets. It works, but it is not ideal.)

i7-9700K
EVGA RTX 2070
...
I'm still trying variations of post #205, but without much success. I'll see whether version 6.5 works and whether it can satisfy me as a last resort.
FYI, with a pre-R465 driver, the method from post #205 works right away with a GTX 750 Ti I had lying around.

So, as announced, drivers from R465 onward should work without problems. I believed it, but for my part nothing changed: I get the error after a reboot of the VM. As I said before, I have the impression that something changed with the RTX 2xxx generation that complicates the task.

I will post my configuration later today after some more tests.
(I'm going through a translator, sorry if not everything is understandable; it looks fairly good to me.)
 

pro_info

New Member
Jun 27, 2021
2
0
1
France
Indeed, under ESXi 6.5 with the R471 driver, my RTX 2070 is functional even after several restarts of the VM, and without any modification to passthru.map or the vmx.
I still need to see if I can get my 5GbE NIC to work; it looks like I can't install the drivers.
export lspci-v RTX2070 esxi.png

Edit :
Well, everything works now. I had to update 6.5 to the latest patch to be able to install the adapter drivers, and then downgrade the esx-cli, which crashes miserably when you want to edit a VM.

I see strange behaviour when I launch the VM with GPU passthrough: the graphics card's fans pulse 10-15 times and then the VM starts correctly. Same behaviour when I stop it.

Edit2 : Now the fans are no longer pulsing. I don't know what could have solved the problem...
 

Eds89

Member
Feb 21, 2016
62
0
6
33
I'm wondering if someone can give me some pointers, as I'm revisiting doing this myself but struggling a bit.
I have a custom-built server running ESXi 7.0 U2, with a Quadro P400 installed.
I have the Quadro passed through to my VM, but even with the R471 driver I got Error 43, so I had to add hypervisor.cpuid.v0 = FALSE.
After a reboot that error disappears; however, I am not seeing any GPU usage when doing a 4K transcode.
I have a physical monitor connected to the card, and have tried a couple of configs with it as the primary display, to no avail.
I also tried svga.present = FALSE, but then I get no physical video output, nor anything via the console. Doing this also causes Error 43 to return on the device when I connect via RDP.

Not sure what else I need to try, but feels like I am very close to getting it working.

Appreciate any advice you may be able to offer.
Eds
 

Railgun

Member
Jul 28, 2018
30
10
8
I will look at my config, as I have/had a GTX 1660 Ti passed through to a Win10 VM with no issues. I don't recall reading about differences between Quadro and GTX, and if I do hazily recall something like that, it was the GTX line that had issues, not the Quadro line.

In my case, I'm running ESXi 6.7. I'm building a new ESXi 7 box to test with and have moved the card over. The initial config passed the card through OK, but I haven't spun up the VM yet to verify. I may be going a different route with the hypervisor, but I'll test it in the meantime.

...aaand just realized this is nearly a year old...so maybe not.
 

ARNiTECT

Member
Jan 14, 2020
92
7
8
Just popping in to see if there has been any update on the situation recently?

I'm still using the solution from post #245, with scripts for shutdown/start-up.

I'm using ESXi 7U2 and 3x Win10 VMs with GPUs, primarily for gaming:
1. Nvidia Quadro RTX 4000 with shutdown/startup script, drivers 516.25, streams to moonlight
2. Nvidia T1000 no script required, drivers 516.25, streams parsec/steam (no gamestream feature on this gpu)
3. Intel UHD P630 (Xeon E-2278G) no script required, for emby decoding

The VMs are usually left on, but I occasionally shut them down for tinkering, and this works fine.
Often weeks/months go by without issue, but occasionally VM 1 with the RTX4000 is troublesome and requires the host to be restarted.
 

sev

New Member
Jul 26, 2022
8
0
1
So I'm having a bear of a time getting passthrough to work, and after spending 4 weeks on it, I'm about to give up.

I have a Dell Precision 3620 i7-6700 desktop that I'm using as my ESXi 7.0u3 host. I run about 5 VMs on this machine and I'm doing passthrough with 3 devices, all of which work perfectly. I'm passing through the following:

1. iGPU (for decoding for my CCTV) and a PCIe x1 NIC to one VM for CCTV
2. LSI 9210i to another VM for NAS

I built a Windows 10 20H2 VM and want to add an Nvidia GTX 1050. I followed the steps found earlier in this thread and got the drivers installed, but the card comes up with error 43. If I disable/enable it, it comes up and appears to be working in Device Manager, but the Nvidia drivers aren't seeing the card: nvidia-smi errors out with 'not found', and GPU-Z shows the card driver but shows the GPU clock at 0 MHz.

I've tried the following settings, including reverting the VM to hardware version 6.5 while running on 7.0u3:

pciPassthru.use64bitMMIO=true
pciPassthru.64bitMMIOSizeGB=64
hypervisor.cpuid.v0=false
pciPassthru.msiEnabled=false

I've tried every which way to get this to work, but no matter what I try, I can't get this card to work. I'm at the point where I'm wondering whether I don't have enough PCIe lanes available for the card.

HELP!!!
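
For reference, entries in a .vmx file are normally written as `key = "value"` pairs. The sketch below is a minimal, hypothetical Python helper (the name `to_vmx_lines` is made up for illustration) that renders the settings discussed in this thread in that form. The per-device index in `pciPassthru0.msiEnabled` follows the form used earlier in the thread and is an assumption about which passthrough slot the GPU occupies.

```python
def to_vmx_lines(settings):
    """Render a dict of settings as .vmx-style lines: key = "value"."""
    lines = []
    for key, value in settings.items():
        # Check bool before anything else, because bool is a subclass of int.
        if isinstance(value, bool):
            text = "TRUE" if value else "FALSE"
        else:
            text = str(value)
        lines.append(f'{key} = "{text}"')
    return lines


# Values taken from the posts above; the device index 0 is an assumption.
GPU_SETTINGS = {
    "pciPassthru.use64bitMMIO": True,
    "pciPassthru.64bitMMIOSizeGB": 64,
    "hypervisor.cpuid.v0": False,
    "pciPassthru0.msiEnabled": False,
}

if __name__ == "__main__":
    print("\n".join(to_vmx_lines(GPU_SETTINGS)))
```

The output can be compared against the VM's advanced configuration parameters; it is not a claim that this particular combination fixes error 43.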
 

superempie

Member
Sep 25, 2015
78
10
8
The Netherlands
Did you try this:
- Setting this on the VM: svga.present = FALSE
- Or this on the Hypervisor in Configure - System - Advanced System Settings: VMkernel.Boot.disableACSCheck = true
- Try switching PCI-E slots if you put the GPU in the first one closest to the CPU. You might have to redo the passthrough on the other VMs, because PCI-E IDs can change.

You can also try re-installing the Win10 VM (or creating a second one), in this order:
1. Install Windows without passing through the GPU yet, and update it.
2. Install TightVNC.
3. Download the Nvidia driver, but do not install it yet.
4. Shut down the VM.
5. Add the GPU as a PCI device in the VM settings.
6. Set "hypervisor.cpuid.v0 = FALSE" and "svga.present = FALSE".
7. Boot up the VM and VNC into it.
8. Install the driver, reboot, and check if it works.
You might want to retry the setting "VMkernel.Boot.disableACSCheck = true" before that.

FWIW, I tried updating my primary GPU passthrough machine from ESXi 6.7 to 7.0u3 and had so many problems that I reverted back to 6.7. Glad I made a backup. No 7.0u3 for me on my GPU passthrough machine until VMware fixes things.
I still have another machine running on 7.0u3, but it doesn't use GPU passthrough and only passes through HBAs, which works fine.
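
To double-check that the two VM settings mentioned above actually ended up in the VM's configuration, something like the following could be used. It is a minimal sketch in Python that assumes the .vmx text has already been read into a string; `parse_vmx` and `check_settings` are hypothetical helper names, and comparing values case-insensitively is an assumption made for convenience.

```python
def parse_vmx(text):
    """Parse .vmx-style text into a dict, skipping blanks and comment lines."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip().strip('"')
    return entries


def check_settings(text, expected):
    """Return {key: found_value_or_None} for every expected setting that differs."""
    entries = parse_vmx(text)
    problems = {}
    for key, want in expected.items():
        have = entries.get(key)
        # Values are compared case-insensitively (an assumption of this sketch).
        if have is None or have.lower() != want.lower():
            problems[key] = have
    return problems


EXPECTED = {"hypervisor.cpuid.v0": "FALSE", "svga.present": "FALSE"}
```

An empty result from `check_settings(vmx_text, EXPECTED)` means both entries are present with the expected values; anything else lists the keys that are missing or set differently.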
 

sev

New Member
Jul 26, 2022
8
0
1
I'll give this a try, thanks! The one thing I can't do is swap the 1050 and the 9210i card, because the 1050 cannot physically fit in the bottom slot. I know doing so would limit it to 8 lanes, which is fine and might balance out any sort of PCIe lane exhaustion, but I did try this with a Quadro K420 (which does fit) and it didn't make a difference.

I have tried svga.present=false, and what happens is that disabling/enabling the 1050 leaves it stuck in error 43. With svga.present=true, disabling/enabling the GPU causes it to show as working (though it is not).
 

superempie

Member
Sep 25, 2015
78
10
8
The Netherlands
Try to avoid it, as it might interfere. The VMware remote desktop feature is just a bonus and not suitable for GPU work.
I don't know what your use case is, but if it's a Win10 VM for some GPU work, connect a monitor to it.

As for what you stated, "With svga.present=true, enabling/disabling the gpu causes it to show that it's working (though it is not)", it might even have worked if you had connected a monitor to it. You didn't mention whether you did.
 

sev

New Member
Jul 26, 2022
8
0
1
Ah, gotcha. And yes, I do have a monitor connected. I have also tried setting the BIOS of the host machine to start with the iGPU, with Auto, and with the Nvidia card (as it's currently configured), and it made no difference.

I tried the VMkernel.Boot.disableACSCheck = true setting in the ESXi settings and rebooted the host; it made no difference.

I'm honestly starting to think that with the iGPU, the RAID card and the NIC passed through, maybe there just aren't enough PCIe lanes. However, even if I turn off passthrough on my iGPU and try to get the Nvidia 1050 to work, it does the exact same thing.

I tried removing ALL my PCIe cards except the 1050 yesterday and using the onboard NIC, but for some reason ESXi boots and can't find a NIC. Not sure why this might be. It only seems to detect the NICs (both) if the RAID card is in place. Weird behavior.

My ESXi host has multiple functions. I use it for:

1. Active Directory home lab for all my authentication
2. NAS (passed-through LSI 9210i) and hosting files
3. Blue Iris for my CCTV, using the iGPU for video decoding (works great) and a passed-through NIC (PCIe x1) to physically separate the CCTV network from my home network

The Nvidia GPU is going to be used for computer vision, running DeepStack so it can identify objects.


BTW, I'm using Nvidia driver 512.15, if that matters.
 

superempie

Member
Sep 25, 2015
78
10
8
The Netherlands
The iGPU may interfere; I don't have experience with those. PCI-E lanes should not be the issue: with too few lanes, things would only perform slower.
I currently use Nvidia driver 512.77 for my gaming VM, so most probably you are good with 512.15.

Did you try the steps I mentioned for a new VM, following them in that exact order, and using TightVNC from another machine/VM? If not, I don't think I have any more tips for you.
 

sev

New Member
Jul 26, 2022
8
0
1
doing it now... hang tight!
 

Docop

New Member
Jul 19, 2016
24
0
1
43
Quite simple actually: on ESXi 6.7u2, tick "Reserve all guest memory" and add the PCI card to the VM. No settings to add anywhere, and it just works.