Nvidia GPU passthrough not working with libvirt (KVM-QEMU)

lpallard

Member
Aug 17, 2013
216
6
18
Hello,

I am testing KVM to see whether I can use a GPU inside a Windows guest VM for CAD/engineering applications on a future computer build. Before I commit to buying the parts (which will amount to perhaps over $3000 USD), I thought I would give it a go on my existing setup as a feasibility test, and so far it's been more or less a nightmare.

Using libvirt (KVM-QEMU) is nothing like VirtualBox in terms of ease of use and user friendliness... Pretty much every step of the way has been plagued with errors and problems, to the point where I am leaning towards ditching Linux in favor of Windows 10 on this future computer. Never thought I would say that after almost 12 years of using Linux, but...

The GPU I want to pass to the VM is NVIDIA GF108 [GeForce GT 440]

Current host (only info relevant to GPU issue):
Code:
System:     Host: workstation Kernel: 5.3.0-51-generic x86_64 bits: 64 compiler: gcc v: 7.5.0
                   Desktop: Xfce 4.14.1 tk: Gtk 3.22.30 wm: xfwm4 dm: LightDM
                   Distro: Linux Mint 19.3 Tricia base: Ubuntu 18.04 bionic
Machine:  Type: Desktop Mobo: ASUSTeK model: M5A97 v: Rev 1.xx serial: <filter>
                   BIOS: American Megatrends v: 1605 date: 10/25/2012
CPU:          Topology: Quad Core model: AMD Phenom II X4 965 bits: 64 type: MCP arch: K10 rev: 3
                   L2 cache: 2048 KiB
                   flags: lm nx pae sse sse2 sse3 sse4a svm bogomips: 27291
                   Speed: 3400 MHz min/max: 800/3400 MHz Core speeds (MHz): 1: 3400 2: 2200 3: 800 4: 800
Graphics:  Device-1: NVIDIA GP108 [GeForce GT 1030] vendor: ASUSTeK driver: nvidia v: 440.59
                   bus ID: 01:00.0 chip ID: 10de:1d01
                   Device-2: NVIDIA GF108 [GeForce GT 440] vendor: ZOTAC driver: vfio-pci v: 0.2
                   bus ID: 05:00.0 chip ID: 10de:0de0
                   Display: x11 server: X.Org 1.19.6 driver: nvidia
                   resolution: 1920x1080~60Hz, 3440x1440~60Hz
                   OpenGL: renderer: GeForce GT 1030/PCIe/SSE2 v: 4.6.0 NVIDIA 440.59 direct render: Yes
I followed the steps provided on this website: Beginner friendly guide to GPU passthrough on Ubuntu 18.04

AFAIK, my hardware supports virtualization and IOMMU and everything else required to pass the GPU through properly. The host kernel also supports the necessary extensions, and AMD-V and the IOMMU (AMD-Vi) are enabled in the BIOS.

dmesg |grep AMD-Vi
Code:
[    0.000000] AMD-Vi: Using IVHD type 0x10
[    0.000000] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[    0.000000] AMD-Vi:        mmio-addr: 00000000feb20000
[    0.000000] AMD-Vi:   DEV_SELECT_RANGE_START     devid: 00:00.0 flags: 00
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 00:00.2
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:02.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT_RANGE_START     devid: 01:00.0 flags: 00
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 01:00.1
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:04.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 02:00.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:07.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 03:00.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:11.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT_RANGE_START     devid: 00:12.0 flags: 00
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 00:12.2
[    0.000000] AMD-Vi:   DEV_SELECT_RANGE_START     devid: 00:13.0 flags: 00
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 00:13.2
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:14.0 flags: d7
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:14.2 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:14.3 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:14.4 flags: 00
[    0.000000] AMD-Vi:   DEV_ALIAS_RANGE         devid: 04:00.0 flags: 00 devid_to: 00:14.4
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 04:1f.7
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:14.5 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT             devid: 00:15.0 flags: 00
[    0.000000] AMD-Vi:   DEV_SELECT_RANGE_START     devid: 05:00.0 flags: 00
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 05:00.1
[    0.000000] AMD-Vi:   DEV_SELECT_RANGE_START     devid: 00:16.0 flags: 00
[    0.000000] AMD-Vi:   DEV_RANGE_END         devid: 00:16.2
[    0.000000] AMD-Vi:   DEV_SPECIAL(IOAPIC[0])        devid: 00:14.0
[    0.000000] AMD-Vi:   DEV_SPECIAL(HPET[0])        devid: 00:14.0
[    0.000000] AMD-Vi:   DEV_SPECIAL(IOAPIC[255])        devid: 00:00.1
[    1.907488] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    1.907489] AMD-Vi: Interrupt remapping enabled
[    1.907618] AMD-Vi: Lazy IO/TLB flushing enabled
IOMMU Device grouping
Code:
IOMMU Group 0 00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD/ATI] RD9x0/RX980 Host Bridge [1002:5a14] (rev 02)
IOMMU Group 10 00:14.4 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge [1002:4384] (rev 40)
IOMMU Group 11 00:14.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
IOMMU Group 12 00:15.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] SB700/SB800/SB900 PCI to PCI bridge (PCIE port 0) [1002:43a0]
IOMMU Group 12 05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 440] [10de:0de0] (rev a1)
IOMMU Group 12 05:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
IOMMU Group 13 00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
IOMMU Group 13 00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
IOMMU Group 14 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP108 [GeForce GT 1030] [10de:1d01] (rev a1)
IOMMU Group 14 01:00.1 Audio device [0403]: NVIDIA Corporation GP108 High Definition Audio Controller [10de:0fb8] (rev a1)
IOMMU Group 15 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 06)
IOMMU Group 16 03:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller [1b21:1042]
IOMMU Group 1 00:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GFX port 0) [1002:5a16]
IOMMU Group 2 00:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 0) [1002:5a18]
IOMMU Group 3 00:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] RD890/RD9x0/RX980 PCI to PCI bridge (PCI Express GPP Port 3) [1002:5a1b]
IOMMU Group 4 00:11.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] [1002:4391] (rev 40)
IOMMU Group 5 00:12.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
IOMMU Group 5 00:12.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
IOMMU Group 6 00:13.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
IOMMU Group 6 00:13.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
IOMMU Group 7 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller [1002:4385] (rev 42)
IOMMU Group 8 00:14.2 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) [1002:4383] (rev 40)
IOMMU Group 9 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller [1002:439d] (rev 40)
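
The listing above can be checked mechanically. The following is a minimal sketch, assuming the "IOMMU Group N bus:dev.fn description [class]: ..." line format shown above; the function names `parse_groups` and `group_mates` are made up for illustration. It reports any device that shares a group with the GPU and is neither another function of the same slot nor a PCI bridge (class 0604), since such a device would normally have to be passed through along with the GPU.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "IOMMU Group 12 05:00.0 VGA compatible controller [0300]: NVIDIA ... [10de:0de0]"
LINE_RE = re.compile(
    r"^IOMMU Group (?P<group>\d+) "
    r"(?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r".*?\[(?P<cls>[0-9a-f]{4})\]:"
)

PCI_BRIDGE_CLASS = "0604"

# Three lines copied from the listing above (descriptions shortened).
SAMPLE = """\
IOMMU Group 12 00:15.0 PCI bridge [0604]: AMD SB700/SB800/SB900 PCI to PCI bridge [1002:43a0]
IOMMU Group 12 05:00.0 VGA compatible controller [0300]: NVIDIA GF108 [10de:0de0] (rev a1)
IOMMU Group 12 05:00.1 Audio device [0403]: NVIDIA GF108 HD Audio [10de:0bea] (rev a1)
"""


def parse_groups(listing):
    """Map each IOMMU group number to a list of (address, class code) pairs."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            groups[int(match["group"])].append((match["addr"], match["cls"]))
    return dict(groups)


def group_mates(groups, addr):
    """Return (group number, unexpected devices sharing the group with addr).

    Functions of the same slot and PCI bridges are not counted as unexpected.
    """
    slot = addr.rsplit(".", 1)[0]
    for number, devices in groups.items():
        if any(a == addr for a, _ in devices):
            extras = [
                a for a, cls in devices
                if a.rsplit(".", 1)[0] != slot and cls != PCI_BRIDGE_CLASS
            ]
            return number, extras
    raise KeyError(addr)


if __name__ == "__main__":
    print(group_mates(parse_groups(SAMPLE), "05:00.0"))
```

For the GT 440 at 05:00.0, group 12 holds only the GPU, its HDMI audio function and the upstream PCI bridge, so the list of unexpected devices should come back empty.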
I think the GPU is properly isolated from being attached to a driver on the host OS:

lspci -nnv
Code:
05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 440] [10de:0de0] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: ZOTAC International (MCO) Ltd. GF108 [GeForce GT 440] [19da:3199]
    Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
    Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
    Memory at c8000000 (64-bit, prefetchable) [size=128M]
    Memory at d0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at c000 [size=128]
    Expansion ROM at f5000000 [disabled] [size=512K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

05:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
    Subsystem: ZOTAC International (MCO) Ltd. GF108 High Definition Audio Controller [19da:3199]
    Flags: bus master, fast devsel, latency 0, IRQ 17, NUMA node 0
    Memory at f5080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel
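
The same "Kernel driver in use" information can also be read from sysfs, where each PCI device directory has a `driver` symlink pointing at the bound driver. A minimal sketch, assuming the standard `/sys/bus/pci/devices/<domain:bus:dev.fn>/driver` layout; the helper name `bound_driver` and its `sysfs_root` parameter are made up here (the parameter exists only so the function can be tried against a fake tree):

```python
import os


def bound_driver(bdf, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    bdf is the full address including the domain, e.g. "0000:05:00.0".
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or no such device
    # The symlink target ends in the driver name, e.g. ".../drivers/vfio-pci".
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Both functions of the GT 440 should report vfio-pci on this host.
    for bdf in ("0000:05:00.0", "0000:05:00.1"):
        print(bdf, bound_driver(bdf))
```

If either function reports something other than vfio-pci (for example snd_hda_intel for the audio function), the host has claimed part of the card.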
Now on to the issue. Following all these steps brings me to a functional VM with the GPU detected (lspci in the guest OS), and the Linux Mint driver manager even recommends installing Nvidia driver 390, which I did. After rebooting, everything seems to work fine, but when I start the Nvidia control panel I get an empty popup...

Screenshot_2020-05-06_12-03-02.png

Launching the control panel from the CLI, I get an error:

Code:
ERROR: Unable to load info from any available system

(nvidia-settings:1794) GLib-GObject-CRITICAL **: 12:03:15.025: g_object_unref:
assertion 'G_IS_OBJECT (object)' failed
**Message: 12:03:15.030: PRIME: No offloading required.  Abort
**Message: 12:03:15.030: PRIME: is it supported? no
Googling this error made me believe this was similar to the Error 43 in Windows guests (the nvidia driver is borked when it detects it runs inside of a VM, thanks nvidia), but I cannot be sure. I tried the following:
  • Using nouveau. Things got exponentially worse.
  • Changing the video device in the VM settings. Cirrus gives the best result (I can see the desktop and the GUI works, but the nvidia driver produces the above error). QXL won't boot. VGA, Virtio and VMVGA will boot, but the image is all distorted. Also, I cannot remove (delete) the Video device in the VM details. If I do, it comes back as "Cirrus".
  • Adding/changing these in the VM XML file to spoof the driver:
Code:
<features>
      <!-- both elements go inside the domain's existing <features> element -->
      <hyperv>
            <vendor_id state='on' value='123456789abc'/>
      </hyperv>
      <kvm>
            <hidden state='on'/>
      </kvm>
</features>
  • Specifying the GPU PCI address in xorg.conf. The only effect was that the VM no longer boots to the desktop (it gets stuck at a blinking cursor before reaching the GUI, probably because Xorg is crashing).

Any ideas? Please let me know if I need to provide more information, which I will gladly do!
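
One detail that may matter for the xorg.conf attempt: lspci prints bus, device and function numbers in hexadecimal, while the BusID option in xorg.conf expects decimal values in the form PCI:bus:device:function. The address also has to be the one lspci reports inside the guest, which usually differs from the host's 05:00.0. A minimal sketch of the conversion (the helper name `xorg_bus_id` is made up for illustration, and PCI domain 0 is assumed):

```python
def xorg_bus_id(lspci_addr):
    """Convert an lspci address to an xorg.conf BusID string.

    Accepts "bus:dev.fn" or "domain:bus:dev.fn" (all fields hexadecimal)
    and returns "PCI:bus:dev:fn" with decimal fields.
    """
    parts = lspci_addr.strip().split(":")
    if len(parts) == 3:
        # Assumption: only PCI domain 0 is handled by this sketch.
        if int(parts[0], 16) != 0:
            raise ValueError("only PCI domain 0 is handled here")
        parts = parts[1:]
    if len(parts) != 2:
        raise ValueError(f"unrecognised PCI address: {lspci_addr!r}")
    bus_hex, dev_fn = parts
    dev_hex, fn_hex = dev_fn.split(".")
    return f"PCI:{int(bus_hex, 16)}:{int(dev_hex, 16)}:{int(fn_hex, 16)}"


if __name__ == "__main__":
    print(xorg_bus_id("05:00.0"))
    print(xorg_bus_id("0000:0a:1f.3"))
```

For single-digit values the hexadecimal and decimal forms are identical, so the difference only shows up for bus or device numbers of 0a and above.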
 


MiniKnight

Well-Known Member
Mar 30, 2012
2,998
907
113
NYC
I don't know the answer since I've never done this on Ryzen, but man, that guide starting with BIOS issues on the newest version is cringe-y.
 

lpallard

OK, I may have found the issue: the card I'm trying to pass through does not support UEFI, and AFAIK the VM is required to use UEFI, which in turn needs a card with UEFI support.

I guess if this is true, game over. What a house of cards.
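
Whether a card's option ROM carries a UEFI image can be checked from a ROM dump rather than guessed. A minimal sketch, assuming a dump that starts at the 0x55AA ROM signature and follows the standard PCI expansion ROM layout (a "PCIR" data structure per image, with code type 0x03 meaning EFI); the function names and the file name `gt440.rom` are made up for illustration:

```python
import struct

CODE_TYPES = {0x00: "x86 legacy BIOS", 0x01: "Open Firmware", 0x03: "EFI"}


def rom_image_types(rom):
    """Return the code type byte of each image found in a PCI option ROM dump."""
    types = []
    offset = 0
    while offset + 0x1A <= len(rom):
        if rom[offset:offset + 2] != b"\x55\xaa":
            break  # not a ROM image header
        # Offset 0x18 of the header holds a pointer to the PCIR structure.
        (pcir_ptr,) = struct.unpack_from("<H", rom, offset + 0x18)
        pcir = offset + pcir_ptr
        if rom[pcir:pcir + 4] != b"PCIR" or pcir + 0x16 > len(rom):
            break
        (image_len,) = struct.unpack_from("<H", rom, pcir + 0x10)  # 512-byte units
        types.append(rom[pcir + 0x14])  # code type
        last_image = rom[pcir + 0x15] & 0x80  # indicator bit 7
        if last_image or image_len == 0:
            break
        offset += image_len * 512
    return types


def has_efi_image(rom):
    """True if any image in the dump is marked as EFI (code type 0x03)."""
    return 0x03 in rom_image_types(rom)


if __name__ == "__main__":
    # Hypothetical dump file; how it was obtained is outside this sketch.
    with open("gt440.rom", "rb") as f:
        data = f.read()
    for code in rom_image_types(data):
        print(hex(code), CODE_TYPES.get(code, "other"))
    print("UEFI image present:", has_efi_image(data))
```

A ROM that lists only a 0x00 image would be consistent with a legacy-only card; vendor tools sometimes prepend their own header to dumps, in which case this sketch would report no images at all.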