VM with passthrough "freezes" entire ESXi box when shutting down/rebooting guest

Discussion in 'VMware, VirtualBox, Citrix' started by thedotlair, Apr 11, 2016.

  1. thedotlair

    thedotlair New Member

    Hi all,

    I've come across a VERY strange issue that I can't seem to get my head around. So far I've spent nearly 12 hours straight diagnosing it without being able to get to the bottom of it.

    I'm running ESXi 6.0u2 build 3620759 (free license) on a custom-built Asus P9X79 PRO with an E5-2660 Xeon and 64GB of DDR3 RAM. The system has the following cards in it: an IBM/LSI M1015 passed through to a FileServer VM, an ATI X1300 boot graphics card, 2x Intel PT1000 dual NIC cards and an AMD HD6450 graphics card for passthrough. I have a Windows 10 VM configured to use the AMD HD6450 in passthrough (both audio and video) as a test bench for a new type of system that we're currently building. We don't have the actual hardware with us, so we're taking a shortcut to get the software development done.

    The VM has the latest VMware Tools installed (via the console) and runs absolutely perfectly ... except that when you either reboot or shut down the VM, the actual ESXi host becomes unresponsive. It stops responding to network traffic, and the ESXi thick and HTML consoles don't respond (naturally), but there are absolutely no errors on the screen, such as a PSOD. It still displays the normal yellow console, which is also unresponsive, so it's a very hard lock.

    I've looked through the knowledge base and found KB1030265, which I've followed, and I have now disabled interrupt remapping, but this hasn't made the slightest bit of difference.

    Can anybody point me in a direction to either get logs from this thing or suggest how to debug this? I appreciate that it's a tough call, especially since not everybody will be running this hardware, but any similar experiences and things I can change/tune would be appreciated.

    I'm tempted to drop back to ESXi 5.5 and see if that exhibits the same problem, which would indicate hardware faults, but I would have thought loading up the VM with 1080p graphics/sound would have caused a bigger issue than shutting down the VM.

    Thanks

    Dean
     
    #1
  2. RyC

    RyC Active Member

    Try leaving out the audio device when passing through to the VM if you don't use HDMI audio.
     
    #2
  3. thedotlair

    thedotlair New Member

    Thanks RyC, I'll give that a shot as it could be the audio causing an issue but I kinda need that as well :( I'm also going to try reverting back to 6.0GA to see if it's a Update2 issue.
     
    #3
  4. thedotlair

    thedotlair New Member

    Just spoke to a colleague about this who's doing the same thing on a SuperMicro board and is getting exactly the same behavior when passing through a GPU to a Windows VM. He's also running Update 2. The difference is that he's passing through a 290X.
     
    #4
  5. pricklypunter

    pricklypunter Well-Known Member

    Have you tried passing the card through to, say, a Windows 8.1 VM? Does it do the same thing? Could it be a Windows 10 driver issue?
     
    #5
  6. xienze

    xienze New Member

    I know that ATI cards have trouble starting back up when a proper PCIe bus reset hasn't occurred (like you would normally do running bare metal). Indeed, this is the issue I would see with my VM: the first time you start the VM after rebooting the host, no problems. If you restart the VM, the VM itself would lock up. I didn't see the exact issue you were seeing though (host locking up).

    Here's something quick you can try, and if this works for you I can give you a much more in-depth post about how to fix the issue in an automated manner. Reboot your host and start your VM with the card passed through. Then open the Windows device manager and disable the card. Restart the VM -- there should be no lockups. After the VM has started, re-enable the card via device manager. If all that works for you, I can help you out with an automated solution.
     
    #6
  7. thedotlair

    thedotlair New Member

    Hey Xienze, thanks for the suggestion. I went through and tried everything you said and ended up with a host lockup after the device was disabled in the VM and then the VM rebooted :(

    Up until that point, it was working perfectly even though it was the only VM turned on at that point.

    Actually tried it with two different VMs: a Windows 10 Pro and a Windows 8.1 Pro. Both gave the same behaviour of the host locking up :(
     
    #7
    Last edited: Apr 11, 2016
  8. whitey

    whitey Moderator

    Just happened to me as well w/ an NVIDIA quadro 4000. WTF
     
    #8
  9. thedotlair

    thedotlair New Member

    Well I got it solved! But not in the way that I wanted to :( Had to go back to 5.5 Update 3 but absolutely no freezing on startup/shutdown and works like a dream.

    Guess I'll be staying on this for a while especially as the HTML fling is available :)
     
    #9
  10. starkindler89

    starkindler89 New Member

    I've been having the same problem on my build: HP Z820 workstation, Xeon E5-2770, 64GB DDR3, Radeon R9-380 passthrough. I could get the card to pass through all right and display video, but after shutting down or rebooting the VM, the host would freeze and crash. After rebooting the host and booting the VM, I could get it to work again, but as soon as the VM rebooted, the host would hang again.
    I tried different versions of ESXi (5.5, 5.5u2, 5.5u3, 6.0, 6.0u2), various versions of VMware Tools, installing only part of the VMware Tools suite, different graphics card drivers, and Windows 7, 8.1 and 10, but nothing worked. I finally stumbled on a thread which mentioned the PCIe bus not getting fully reset with Radeon cards. I found that if I disabled the passthrough video card in Windows Device Manager before shutting down the VM, I was able to reboot the VM and enable the card again without crashing the host. With a little scripting added to the startup/shutdown section of the local GPO, I'm up and running now without any problems! Hope this helps!
     
    #10
  11. MKO

    MKO New Member

    I've also experienced the ESXi 6.0 u2 host freezing when shutting down or rebooting a Windows 8 VM with passthrough devices attached.

    After reading starkindler89's post I started disabling passthrough devices and found that passthrough of the USB 3 controller is the cause of the issues on my system. I have an Asus M5A99X EVO R2.0 board and passed through a Radeon 6870 and the ASMedia USB 3 ports to my primary VM.
    When I don't pass this controller to the VM or eject it in the guest prior to shutting down or rebooting everything is fine.
    I need to look into which device to disable in device manager, because ejecting the root hub also removes the passthrough.

    The ESXi build running on my machine is 3620759 from March 2016. Based on the VMware KB entries related to the latest build (3825889) updating probably won't solve the underlying issue but I might give it a try soon because some IOFilter issue has been solved.
    I will try updating the BIOS first; maybe something controller-related will change. It might also be driver-related, of course.
     
    #11
    Last edited: Jun 23, 2016
  12. xienze

    xienze New Member

    MKO, if you are able to disable the device in the guest prior to VM shutdown and that fixes your problem, what you can do is write a simple script that disables upon shutdown and enables upon startup. It's what I do for my graphics card. If you can verify that works I'll write up the process for you.
     
    #12
  13. MKO

    MKO New Member

    xienze, thanks but no need now :) I know how to disable and enable devices through PowerShell, but I was unable to disable the suspect USB hub; I could only eject it.
    But I have since found that the issue is caused by a USB composite device which was present in Device Manager.
    The VM thinks this device is connected to the passed-through USB controller, but this is no longer the case.
    After disabling this device everything appears to be working correctly; even the connected mouse and keyboard still work fine.
     
    #13
  14. F1ydave

    F1ydave Member

    There are some known IRQ problems with some cards with VMware. A lot of people are able to solve it by not using the main express slot.

    At least 5.5u3 worked out for you!
     
    #14
  15. Jacob Staub

    Jacob Staub New Member

    To all that have contributed to this thread before me: awesome work. I've been whacking my head against ESXi and have been dying to find anything of use in the giant information cesspool.

    My Build: HP Z620 workstation, Xeon E5-2670, 64GB DDR3, ESXi 6.0.0 3620759, NVIDIA Quadro 2000 passthrough, Windows 7 Professional VM. Entirely the same behavior was observed when rebooting the VM after successfully passing through a Quadro 2000. Entirely the same behavior was also observed with respect to disabling and enabling the Quadro 2000 driver. The scripts I used to automate the shutdown/reboot workaround were the following:

    1. To disable the Quadro 2000 driver (a bit cumbersome to implement as a "Schedule task"):
    "C:\Program Files\devmanview-x64\DevManView.exe" /disable "NVIDIA Quadro 2000"

    2. To enable the Quadro 2000 driver (simple to implement as a "Schedule task"):
    "C:\Program Files\devmanview-x64\DevManView.exe" /enable "NVIDIA Quadro 2000"

    Click here for instructions on DevManView which explains the scripts and how DevManView handles them.
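
    The two commands above could also be wrapped in one small script, so the same file serves both the startup and the shutdown task. What follows is only a minimal sketch in Python, assuming Python is installed in the Windows guest and DevManView sits at the path used above. The function names build_command and toggle_device are made up for illustration and are not part of DevManView; only the /enable and /disable switches come from the commands above.

```python
import subprocess

# Path and device name are copied from the two commands above;
# adjust both for your own VM.
DEVMANVIEW = r"C:\Program Files\devmanview-x64\DevManView.exe"
DEVICE_NAME = "NVIDIA Quadro 2000"


def build_command(action, device=DEVICE_NAME, exe=DEVMANVIEW):
    """Return the argument list for DevManView's /enable or /disable switch."""
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    return [exe, "/" + action, device]


def toggle_device(action):
    """Run DevManView (Windows guest, elevated) and return its exit code."""
    return subprocess.run(build_command(action)).returncode
```

    Calling toggle_device("disable") before shutdown and toggle_device("enable") after startup would mirror the two ".cmd" files. Passing the arguments as a list lets Python handle the quoting of the path and of the device name, both of which contain spaces.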

    To implement the scripts within a Windows 7 VM two "Schedule tasks" were created using the following method:
    1. Go to: Control Panel >> System and Security >> Administrative Tools >> Schedule tasks

    2. Set up the enable driver "Schedule task"
    A. Under "Actions" menu choose "Create Basic Task"
    B. Name = arbitrary
    C. Trigger = When the computer starts
    D. Action = Start a program
    E. Program/script = File with enable script saved with suffix ".cmd"

    3. Set up the disable driver "Schedule task" (play with the plethora of accessory settings within the "Schedule task" interface as required to yield the desired task execution behavior)
    A. Under "Actions" menu choose "Create Task"
    B. Name = arbitrary
    C. Trigger >> New >> Settings = Basic >> Begin the task: On an event >> Log: System >> Source: USER32 >> Event ID: 1074
    D. Actions >> New >> Action: Start a program >> Program/script = File with disable script saved with suffix ".cmd"
    E. Conditions >> None
    F. Settings >> As required
    Note: Click here for remarks on "Trigger" set up.
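
    The two tasks described above could probably also be created from the command line with schtasks instead of clicking through the GUI. The sketch below is in Python and only builds the schtasks argument lists; it does not run them. It is untested against this setup, and it makes a few assumptions: the task names and script paths are hypothetical placeholders, both tasks run as SYSTEM, and the event filter matches only Event ID 1074 in the System log, leaving out the USER32 source filter from step 3C for simplicity.

```python
def startup_task_command(task_name, script_path):
    """schtasks arguments for a task that runs script_path at system startup."""
    return ["schtasks", "/Create", "/TN", task_name, "/TR", script_path,
            "/SC", "ONSTART", "/RU", "SYSTEM"]


def shutdown_event_task_command(task_name, script_path, event_id=1074):
    """schtasks arguments for a task triggered by an event in the System log."""
    # XPath event query; filters on the event ID only (assumption, see lead-in).
    query = "*[System[(EventID={})]]".format(event_id)
    return ["schtasks", "/Create", "/TN", task_name, "/TR", script_path,
            "/SC", "ONEVENT", "/EC", "System", "/MO", query, "/RU", "SYSTEM"]


# Hypothetical names and paths, for illustration only.
enable_cmd = startup_task_command("EnableGPU", r"C:\Scripts\enable_gpu.cmd")
disable_cmd = shutdown_event_task_command("DisableGPU", r"C:\Scripts\disable_gpu.cmd")
```

    Each list could then be handed to subprocess.run from an elevated prompt inside the guest. Whether the disable task finishes before Windows actually shuts down still has to be checked by hand, just as with the GUI method.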
     
    #15
  16. Jacob Staub

    Jacob Staub New Member

    A quick follow up on my GPU passthrough experience:

    Quadro 2000 audio was successfully passed through to a Windows 7 VM. The process amounted to passing through Q2000 audio first. Once Q2000 audio was working Q2000 video was passed through. VM start time with Audio/Video passed through takes so long it seems like the system is frozen on startup. Audio quality was below average with scratchy delay that seemed to come and go based on VM activity levels. And despite scripting the enabling and disabling of the Q2000 audio and video devices, shutting down/restarting the VM reliably crashed the ESXi host.

    Out of curiosity an attempt was made to pass through an available, "Intel Corporation C600/X79 series chipset High Definition Audio Controller." The Intel Hi-Def device passed through successfully to the VM. Audio quality was still below average with a hint of scratchy delay based on VM activity level. However, for Lo-Fi purposes the audio quality suffices. No enable/disable script of the Intel Hi-Def device was required to produce stable shut down/restart behavior.

    So, it appears audio (at least built-in audio) can be passed through to a VM reliably so long as the audio is not part of the GPU. I have not tested an add-on PCIe audio card, but it ought to work since the flow of information is over the same bus. That being said, I have a feeling the audio quality wouldn't be all that great, but the only way to find out for sure is to try it.

    Happy hunting,
    Jake
     
    #16
  17. nk215

    nk215 Active Member

    ESXi 5.5 has none of these issues, right? I am on ESXi 5.5U1 and have no problem with PCI passthrough. In fact, I use PCI passthrough to log onto my guest at the ESXi console without any issue.
     
    #17
  18. epicurean

    epicurean Member

    Hi Jacob,
    How exactly did you do steps 1, 2 and 3? I have AMD 6450 cards in my Windows VMs, and I am having the same ESXi freeze every time I try to shut down or restart the VMs.
     
    #18
  19. Jacob Staub

    Jacob Staub New Member

    Hello epicurean,

    Before implementing my version of a solution I recommend following the instructions given by "mvrk" in post number four of the following link.

    mvrk's solution is more elegant and easier to implement. If for some reason mvrk's solution doesn't work for you I'll provide further assistance on my version.

    Please let the forum know how it goes.

    Click here if you'd like to read a little more about the development experience I had on my way to passing through a GPU. The capability does work even if the process is anything but smooth. My server is now passing through 2 x NVIDIA 2000 GPUs to two Windows 7 VMs without much perceptible trouble.

    Regards,
    Jake
     
    #19
  20. nk215

    nk215 Active Member

    I just tested a Quadro 2000 with a test ESXi 6U2 setup. GPU pass through works great with Win7 guest. NO issue.
     
    #20