Troubleshooting GPU passthrough ESXi 6.5

Discussion in 'VMware, VirtualBox, Citrix' started by Ch33rios, Jan 2, 2017.

  1. roswellian

    roswellian Member

    Joined:
    Oct 18, 2013
    Messages:
    67
    Likes Received:
    5
    I'm running Debian 9.1 x64 server.

    Linux Debian 4.12.3video #1 SMP Thu Jul 27 09:50:58 PDT 2017 x86_64 GNU/Linux

    The kernel is recompiled with the patch I suggested earlier, which is why I appended "video" to the version string.
     
    #121
  2. Nohbdy

    Nohbdy New Member

    Joined:
    Apr 25, 2017
    Messages:
    5
    Likes Received:
    0
    Anyone run into other cases of code 43 not related to passthrough? I have a 1050ti in a Dell R730 with ESXi that is giving a code 43 on passthrough after trying every passthrough configuration option listed in this thread (and then some).

    I tried booting Windows 10 natively, and after installing the NVIDIA drivers I'm still getting code 43, so it must not be passthrough related. But I'm at a loss for other configuration options to try. The R730 won't post PCIe GPUs, and I'm wondering if it is related to this.
     
    #122
  3. calvinz360

    calvinz360 New Member

    Joined:
    Jul 12, 2017
    Messages:
    8
    Likes Received:
    0
    Thanks for the info. Would you mind shedding some light on how I apply the patch to the kernel, as I'm not so proficient in the *nix world :( I was able to obtain the kernel source but I'm not sure how to proceed beyond that.

    I did notice there were multiple patch files within "26891 – Radeon KMS fails with inaccessible AtomBIOS on systems with (U)EFI boot".
     
    #123
  4. roswellian

    Here is a useful article detailing how to compile a kernel on the Ubuntu platform: How to Compile and Install Linux Kernel v4.9.11 Source On a Debian / Ubuntu Linux

    The patch I was referring to is the first one in the original link, named "Bios from firmware loading hack". Here is the direct link: https://bugs.freedesktop.org/attachment.cgi?id=33766

    This is the original link on the vmware website where I found this solution.
    VMDirectPath and ATI Radeon |VMware Communities
     
    #124
    calvinz360 likes this.
  5. calvinz360

    When I tried to apply the patch file, it gave me this output, asking me to provide a file to patch:

    Code:
    ~# patch -p1 < ~/radeon.patch
    can't find file to patch at input line 3
    Perhaps you used the wrong -p or --strip option?
    The text leading up to this was:
    --------------------------
    |--- drivers/gpu/drm/radeon/radeon_bios.c       2010-02-24 19:52:17.000000000 +0100
    |+++ /usr/src/linux/drivers/gpu/drm/radeon/radeon_bios.c        2010-02-27 18:38:09.000000000 +0100
    --------------------------
    File to patch:
     
    #125
  6. roswellian

    You need to download the kernel source code as my link above suggested, then go into the source folder /usr/src/linux/drivers/gpu/drm/radeon/ and apply the patch. However, the patch was based on an old radeon_bios.c and probably won't work with the latest version of radeon_bios.c. What I normally do is modify radeon_bios.c manually instead of applying the patch directly.
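
    A note on the "can't find file to patch" prompt in the output above: it usually points to a path mismatch rather than a broken patch. `patch -pN` strips N leading path components from the file names in the patch header, and the remaining path has to exist relative to the directory `patch` is run from. A minimal sketch of that rule, using the header path from the output above; `strip_components` is a hypothetical helper written only for illustration and is not part of `patch`:

```shell
#!/bin/sh
# Hypothetical helper: mimic how `patch -pN` strips N leading
# path components (everything up to and including the Nth slash).
strip_components() {
    path=$1
    n=$2
    while [ "$n" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;   # drop one leading component
            *)   break ;;             # nothing left to strip
        esac
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

hdr="drivers/gpu/drm/radeon/radeon_bios.c"   # the "---" line of the patch

strip_components "$hdr" 0   # name looked up with -p0 (run from the top of the kernel tree)
strip_components "$hdr" 1   # name looked up with -p1
strip_components "$hdr" 4   # name looked up with -p4 (run from drivers/gpu/drm/radeon/)
```

    The prompt in the output above is `~#`, so the command presumably ran from the home directory, where the `-p1` name gpu/drm/radeon/radeon_bios.c does not exist. Running from the top of the kernel tree with `-p0`, or from drivers/gpu/drm/radeon/ with `-p4`, should make the name line up with a real file. GNU patch also accepts `--dry-run` to test this without changing anything. Even with the right path, the hunks may still be rejected on a recent kernel, since the patch was written against a much older radeon_bios.c.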

    You should read the patch; it is very straightforward. The key part is to add a "radeon_read_bios_from_firmware" function to load a vbios file, and modify the "radeon_read_bios" function to call "radeon_read_bios_from_firmware" if the kernel cannot load the BIOS from the card directly.

    I can send you my modified one from 4.12.3 kernel if you are interested.
     
    #126
  7. calvinz360

    I'd definitely appreciate it if you could pass me the modified copy. At the same time I could learn about fiddling with the kernel. ;)
     
    #127
  8. roswellian

    There you go.

    [C] radeon_bios.c - Pastebin.com

    Compare it with the default radeon_bios.c in the kernel and you will see the difference.
     
    #128
  9. calvinz360

    Thanks! I managed to incorporate the file, understand the sections to be updated within it, and compile the kernel. However, I'm still hitting errors :(

    Code:
    0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340]
            DeviceName: pciPassthru0
            Subsystem: PC Partner Limited / Sapphire Technology Oland PRO [Radeon R7 240/340]
            Kernel modules: radeon
    0b:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
            Subsystem: PC Partner Limited / Sapphire Technology Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
            Kernel driver in use: snd_hda_intel
            Kernel modules: snd_hda_intel
    Code:
    [    2.465359] [drm] radeon kernel modesetting enabled.
    [    2.503779] radeon 0000:0b:00.0: enabling device (0000 -> 0003)
    [    2.504891] radeon 0000:0b:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
    [    2.505030] radeon 0000:0b:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
    [    2.505169] radeon 0000:0b:00.0: Direct firmware load for radeon/vbios.bin failed with error -2
    [    2.505336] [drm:radeon_get_bios [radeon]] *ERROR* No bios
    [    2.505426] [drm:radeon_get_bios [radeon]] *ERROR* Unable to locate a BIOS ROM
    [    2.505543] radeon 0000:0b:00.0: Fatal error during GPU init
    [    2.505618] [drm] radeon: finishing device.
    [    2.712345] radeon: probe of 0000:0b:00.0 failed with error -22
    vbios.bin was already placed at the required location as well.

    I tried passthrough to a Windows guest in an attempt to read the GPU BIOS through GPU-Z, but it showed an "Unknown" value instead :(
     
    #129
    Last edited: Aug 2, 2017
  10. roswellian

    I think the error message still shows that something went wrong when loading the vbios: either the driver cannot find the vbios file, or the file is bad. Please check two things:
    1. The BIOS from your video card.
    How did you extract the BIOS? Please keep in mind that you cannot do the extraction inside a VM; you should do it on a real machine. Or you can directly download the BIOS from this website based on your brand and model: TechPowerUp

    2. The BIOS(vbios) file location.
    Once you have the complete extracted bios from 1, you should rename it as "vbios.bin" and put it in /lib/firmware/radeon/

    The last step is to double check the modified radeon_bios.c and make sure that all the modifications are included in your file.
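
    Both checks can be scripted. Error -2 in the earlier log is -ENOENT, meaning the firmware loader did not find the file at all, and a valid PCI option ROM image starts with the bytes 0x55 0xAA. A minimal sketch, assuming the patched driver requests radeon/vbios.bin as the log suggests; `check_vbios` is a hypothetical helper, and a passing result only shows the file is plausible, not that it matches your card:

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a video BIOS dump.
# Returns 0 only if the file is readable and begins with the
# PCI option ROM signature bytes 55 AA.
check_vbios() {
    f=$1
    if [ ! -r "$f" ]; then
        echo "missing or unreadable: $f"
        return 1
    fi
    # First two bytes as lowercase hex with whitespace removed, e.g. "55aa"
    sig=$(head -c 2 "$f" | od -An -tx1 | tr -d ' \n')
    if [ "$sig" != "55aa" ]; then
        echo "bad signature ($sig): $f"
        return 1
    fi
    echo "looks like an option ROM: $f"
}

# Path taken from this thread; adjust it if your firmware directory differs.
check_vbios /lib/firmware/radeon/vbios.bin || true
```

    If the file passes but the log still reports error -2, the driver is looking somewhere the file is not, so the location is the thing to chase rather than the dump itself.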
     
    #130
    calvinz360 likes this.
  11. calvinz360

    On my second run, after reinstalling Debian 9.1 x64 and updating to the custom 4.12.3 kernel, it was able to load the BIOS and the GPU:

    Code:
    [    9.106729] [drm] radeon kernel modesetting enabled.
    [    9.309081] radeon 0000:0b:00.0: enabling device (0000 -> 0003)
    [    9.311513] radeon 0000:0b:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
    [    9.311522] radeon 0000:0b:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
    [    9.358058] radeon 0000:0b:00.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
    [    9.358060] radeon 0000:0b:00.0: GTT: 2048M 0x0000000040000000 - 0x00000000BFFFFFFF
    [    9.358087] [drm] radeon: 1024M of VRAM memory ready
    [    9.358088] [drm] radeon: 2048M of GTT memory ready.
    [    9.414552] [drm] radeon: dpm initialized
    [    9.422539] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
    [   10.211812] radeon 0000:0b:00.0: WB enabled
    [   10.211815] radeon 0000:0b:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff9be3f663fc00
    [   10.211816] radeon 0000:0b:00.0: fence driver on ring 1 use gpu addr 0x0000000040000c04 and cpu addr 0xffff9be3f663fc04
    [   10.211817] radeon 0000:0b:00.0: fence driver on ring 2 use gpu addr 0x0000000040000c08 and cpu addr 0xffff9be3f663fc08
    [   10.211818] radeon 0000:0b:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff9be3f663fc0c
    [   10.211819] radeon 0000:0b:00.0: fence driver on ring 4 use gpu addr 0x0000000040000c10 and cpu addr 0xffff9be3f663fc10
    [   10.212194] radeon 0000:0b:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffb8fe01a35a18
    [   10.511072] radeon 0000:0b:00.0: failed VCE resume (-110).
    [   10.511139] radeon 0000:0b:00.0: radeon: MSI limited to 32-bit
    [   10.511370] radeon 0000:0b:00.0: radeon: using MSI.
    [   10.511419] [drm] radeon: irq initialized.
    [   12.618015] radeon 0000:0b:00.0: fb1: radeondrmfb frame buffer device
    [   12.639333] [drm] Initialized radeon 2.50.0 20080528 for 0000:0b:00.0 on minor 1
    [   12.905369] radeon 0000:0b:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
    Code:
    0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340]
            Subsystem: PC Partner Limited / Sapphire Technology Oland PRO [Radeon R7 240/340]
            Kernel driver in use: radeon
            Kernel modules: radeon
    0b:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
            Subsystem: PC Partner Limited / Sapphire Technology Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
            Kernel driver in use: snd_hda_intel
            Kernel modules: snd_hda_intel
    
    I really appreciate your help and patience with me thus far :) However, right now I need to figure out why OpenCL is not detecting it as an available platform o_O Should I be installing the amdgpu/fglrx drivers to achieve that?

    Side note: does it matter if I don't have a physical monitor connected to it? My intention is for it to do processing only.
     
    #131
    Last edited: Aug 3, 2017
  12. roswellian

    That's great!! I installed the AMD/ATI open source drivers following the official Debian wiki. You should check with Ubuntu; they should have a similar guide.

    I always use a dummy plug instead of a physical monitor, and so far it works well. Here is a DVI one you can find on eBay: Headless server DVI-D EDID 1920x1200 Plug Linux Windows emulator dummy | eBay
     
    #132
  13. pingspike

    pingspike New Member

    Joined:
    Oct 8, 2017
    Messages:
    1
    Likes Received:
    0
    I've been following the steps in here with interest, and I'm in the same state you were in here:
    vbios failed with error -2.

    How did you resolve this part?
    I've double-checked that the file is at /lib/firmware/radeon/vbios.bin
    I even placed another file in /radeon/ but I didn't think that was necessary.

    I got my vbios.bin file by running this command:
    dd if=/dev/mem of=/boot/vbios.bin bs=65536 skip=12 count=1
    on a bare-metal Linux install.
    I also tried using a ROM file saved from GPU-Z on bare-metal Windows, renamed to vbios.bin.
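
    For reference, the arithmetic behind the `dd` call above: `skip` counts input blocks of `bs` bytes, so the read starts at 12 x 65536 bytes into /dev/mem and covers a single 64 KiB block. A small sketch of the calculation (the variable names are only illustrative):

```shell
#!/bin/sh
# Work out which physical address range the dd command reads.
bs=65536    # block size in bytes
skip=12     # input blocks skipped before reading
count=1     # blocks read

start=$((bs * skip))
end=$((start + bs * count - 1))
printf 'reads 0x%X-0x%X (%d bytes)\n' "$start" "$end" $((bs * count))
```

    That range starts at 0xC0000, the legacy video BIOS shadow area, so the dump holds whatever the firmware copied there for the boot display adapter, and no more than 64 KiB of it. If the card's ROM is larger than that, the dump would be truncated, so it may be worth comparing the file size against the ROM saved from GPU-Z.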

    Hopefully I won't have to recompile the kernel again, I was kind of surprised how long that took.
     
    #133
  14. Fahd

    Fahd New Member

    Joined:
    Aug 3, 2017
    Messages:
    16
    Likes Received:
    3
    Sorry for the bump. Just wanted to say thank you to everyone that posted solutions and suggestions in this thread. This worked without a hitch for the most part. Passed through a GTX1070 to Win10Pro on ESXi 6.5.

    I had an issue getting the NVIDIA driver working; it turned out my Windows version, build 10240, was too old. NVIDIA drivers need newer Windows builds. Updating Windows to the latest build got the NVIDIA driver to install and work well.
     
    #134
    cheezehead likes this.
  15. Thermir

    Thermir New Member

    Joined:
    Dec 4, 2017
    Messages:
    2
    Likes Received:
    1
    Finally! This was driving me nuts. Moving cards around to different PCIe slots, and a plethora of other things I’ve lost track of up to this point. It’s late now, I was about to throw in the towel, until one last Google search led me here, to this thread, and it was this that solved my issue. So, thanks!

    I’ll share my setup here, in case it helps others:

    Motherboard: EVGA SR-2 Classified (BIOS A58)
    CPU: 2 x Xeon X5650
    RAM: 48GB (12 x 4GB DDR3)
    Storage Controller: IBM ServeRaid M5015 (OEM branded Broadcom/Avago/LSI 9260-8i), with 7 SATA3 drives attached (details inconsequential to this thread)
    GPUs: For ESXi - NVIDIA GeForce GT210, For VM - EVGA GTX 670
    Host: ESXi 6.5.0 Update 1 (Build 6765664)
    VM (for GTX 670): Windows 10 x64 (Build 16299), [Virtual HW Version 13, 6 vCPU, 8GB vRAM (reserved), VMXNET3 NIC, & PVSCSI storage controller.]

    Here is the board layout. A1-A4 are on the same parent PCIe bridge, so anything plugged into those slots can only be passed through to the same VM. Likewise, for the slots labeled B1-B3, as they are on a different PCIe bridge.

    Since I need the first PCIe slot for a console GPU (A1), that left me with using slots B1-B3 for my VM. So, my GTX 670 is plugged into slot B1. (Odd numbered slots are all PCIe x16, while even numbered are PCIe x8.) The rest of the hardware, for the ESXi host, is plugged into slots A1-A4 (GT210 GPU, Intel i340-T4 NIC, & IBM ServeRaid M5015 card.)

    I have the following now successfully passed through to the Windows 10 VM:

    1. onboard 2port NEC USB 3.0 controller (uPD720200).
    2. EVGA GTX 670.

    Next, I’m going to try to attach a Vantec 3port USB 3 (2 x Type A, 1 x Type C) (UGT-PC331AC) to it. This card was also giving me a BSOD in Windows 10, but with the above change, I’m hopeful that it will also take care of that issue (fingers crossed).

    3403321E-77B8-408D-88FB-A92D08C6D7E4.jpeg
     
    #135
  16. nk215

    nk215 Active Member

    Joined:
    Oct 6, 2015
    Messages:
    222
    Likes Received:
    55
    Does the GPU work with Remote Desktop?
     
    #136
  17. Thermir

    Yes, RDP works.

    I have a small 1366x768 display plugged into the 1st HDMI port on it. So, that gives the VM 2 displays in Windows (1 - The VMware SVGA 3D adapter, and 2 - GTX 670.) I set the displays to extend the desktop, and told Windows to use display 2 (connected to GTX 670) as the main display.

    As for whether that setup will work for 3D over RDP, I doubt it. The Remote Desktop protocol wasn’t designed for those types of workloads, so while it “may” work, it won’t be the best. Also, I’m not using Horizon View, so I won’t be testing any of the other capabilities in that regard, although if I were looking to test 3D capabilities in a client-server scenario, that would probably be the setup I would aim for. Horizon View has some pretty strict requirements for a lot of the graphical capabilities, though. (One of them would require me to remove the passthrough configuration for the GTX 670 from the Windows VM and put the card directly back under the control of ESXi and its vmkernel, defeating the purpose of my setup.) So, honestly, I’d be surprised if a GTX 670 would work that way. If I wanted to do something like that, it would most likely be for some type of gaming, and for that I would rather just use Steam’s streaming function to display the games on one of the laptops in the house. For anything else that doesn’t run within the Steam platform, it would be the local GTX 670 display on the VM.

    At any rate, it’s out of scope for my intended usage. I just plan to use the VM for some ad-hoc video capture and streaming, with maybe some light gaming for guests when needed, and it works fine for that so far.


    I forgot to add: One thing I plan on getting is a display emulator adapter for the 1st HDMI port of the GTX 670, so I can test things that way. It would negate the need to have a monitor plugged in to get a second display enabled in Windows. It would just be for curiosity though, unless it works way better than I anticipate. However, right now, I think the only scenario that could benefit from it would be streaming the Steam games to a laptop in the house. Something like this: CompuLab 4K Display Emulator (fit-Headless 4K) https://www.amazon.com/dp/B00JKFTYA8/ref=cm_sw_r_cp_tai_-nakAb69F836B
     
    #137
    Last edited: Dec 6, 2017
    nk215 likes this.
  18. nk215

    Thank you. I'll keep this in mind on my next try with GPU passthrough. I currently use Quadro cards to avoid any headaches, but they are nowhere near the performance of their new consumer counterparts.
     
    #138