My setup is still 6.7 U3. I’m pleased to say the issue has not recurred since making the changes from my previous post.
Wow, I'm shocked
Reading back through this thread was an astounding blast from the past. Physical passthrough isn't really much of a thing anymore, is it? There are other solutions for sharing your IGD now, like Intel GVT-g. I may have cut my teeth on the Sandy and Ivy Bridge chipsets of 2011, but nothing I've run on Skylake or newer has had a problem sharing virtual GPU resources. All the components are many times faster than they used to be, too, so I'm not sure what the point is of tying an entire GPU up with one VM, unless you're a gamer or a masochist.
Reading the voodoo-witchcraft stuff we all used to try (because physical passthrough is so poorly documented) really brought me back to all the hours I wasted chasing one unsupported BS edge case or another. You're at the mercy of VMware when you use their software, unfortunately, and they're really only concerned with providing vGPU support for $4,000+ VDI systems - which makes sense, because they're a business that attracts customers practically made out of money. Especially now that Broadcom has purchased VMware and already fired something like half the staff, the only VMware "innovations" I've noticed recently revolve around making their products more expensive and harder to obtain, coincidentally pissing off and alienating the user base they never acknowledge - namely, non-paying users. Soon they'll provide little more than a deep, dark chasm for IT departments to back dumptrucks of money into.
But anyway, here's about the only stuff I remember that could be helpful:
One thing I do know: if you have trouble passing through _any_ PCIe device (not just a dGPU, though _especially_ a GPU, of course), make sure ACS checks are disabled in your host's advanced settings. You could also try blacklisting the GPU's driver so the host doesn't grab it first - some setups still have that issue even with a dedicated video card the host isn't using. The most effective way to do that is with a kernel flag in the bootloader, much like on Linux (ESXi's VMkernel isn't actually Linux, but the boot-option mechanism feels familiar). I remember doing kickstart/boot flags for other things on ESXi, like setting up macOS support, or rescuing hosts that wouldn't boot after an update, but I've never explicitly used any flags for PCI passthrough, so I don't have exact pearls of knowledge to oyster all over the place at the moment (other than disabling ACS checks - it's effective af).
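For reference, the ESXi-side knobs I mean look roughly like this - a sketch from memory, so double-check the setting names against your ESXi version's docs (the PCI address below is a made-up example, and the `pcipassthru` namespace may only exist on 7.0+; on 6.x you'd toggle passthrough in the host client UI instead):

```shell
# Disable ACS checks so multifunction PCIe devices end up in
# passthrough-friendly groupings (advanced setting VMkernel.Boot.disableACSCheck):
esxcli system settings kernel set -s disableACSCheck -v TRUE

# Confirm the setting took:
esxcli system settings kernel list -o disableACSCheck

# On newer ESXi builds you can also flag a device for passthrough by its
# PCI address (0000:03:00.0 here is hypothetical - substitute your GPU's address):
esxcli hardware pci pcipassthru set -d 0000:03:00.0 -e true

# Reboot the host afterwards; the kernel setting only applies at boot.
```

These run on the ESXi host shell (SSH or DCUI), not inside a guest, and a reboot is required before the ACS change does anything.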
I moved away years ago from the place where I lived when I wrote earlier in this thread - where my homelab was, and where I took part in these edge-case rituals with y'all. They're kind of fun, but there's no future without community, so I promised myself I wouldn't fill my next room with servers and would go all-in on open source instead. I gave my big-ass servers to a friend who wanted equipment for recording IP cameras, and my virtualization workloads are now almost all short-lived tests in VMs and containers on laptops and workstations.
Anyway, good luck with it - I just wanted to let you know there's a whole other ecosystem out there where you're encouraged to patch your software, away from that awful, soul-sucking closed ecosystem.
Oh, one last thing - I just came across NVIDIA's officially supported tooling for vDGA passthrough and shared vGPU resources for VDI; here's their repo - though they might be charging for licenses, too:
GitHub - NVIDIA/vgpu-device-manager: NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes