When loading the NCT7904 driver on Proxmox VE 8.4.1, the driver for the ACPI watchdog may not load, causing the system to reboot if the watchdog is enabled in the UEFI setup.
The NCT7904 driver supports the watchdog feature present in the chip, but that watchdog isn't used in this machine. For some reason, loading it still prevents Proxmox from loading the ACPI watchdog driver.
To solve this problem (along with adding the appropriate driver wdat_wdt to /etc/modules and /etc/initramfs-tools/modules) we can add a dependency on the nct7904 driver to ensure it is loaded after wdat_wdt.
Add a file with the following line to /etc/modprobe.d:
Code:
softdep nct7904 pre: wdat_wdt
Then run update-initramfs -u -k all.
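As a rough sketch of the steps above: the helper below writes the softdep line into a drop-in file, and a second helper checks after a reboot whether wdat_wdt ended up loaded. The file name nct7904-wdat.conf and both function names are made up for illustration; modprobe only reads files ending in .conf, so the name should keep that suffix. The /sys/module check assumes wdat_wdt is built as a loadable module.

```shell
#!/bin/sh
# Sketch only: file name and function names are hypothetical.

write_softdep() {
    # $1: target directory (defaults to /etc/modprobe.d)
    dir="${1:-/etc/modprobe.d}"
    # Ask modprobe to load wdat_wdt before nct7904.
    printf '%s\n' 'softdep nct7904 pre: wdat_wdt' > "$dir/nct7904-wdat.conf"
}

wdat_loaded() {
    # $1: module directory in sysfs (defaults to /sys/module)
    # A loaded module shows up as a directory named after it.
    [ -d "${1:-/sys/module}/wdat_wdt" ]
}

# Typical use as root:
#   write_softdep && update-initramfs -u -k all
#   (after the reboot) wdat_loaded && echo "wdat_wdt is loaded"
```

Both helpers take an optional directory argument so they can be tried against a scratch directory before touching /etc.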
Dmesg output after activating a few VFs...
Code:
[ 202.228536] ixgbe 0000:06:00.0 eth2: SR-IOV enabled with 3 VFs
[ 202.388943] ixgbe 0000:06:00.0: Multiqueue Enabled: Rx Queue count = 4, Tx Queue count = 4 XDP Queue count = 0
[ 202.494754] pci 0000:06:10.0: [8086:15c5] type 00 class 0x020000
[ 202.495087] pci 0000:06:10.0: Adding to iommu group 23
[ 202.495662] pci 0000:06:10.2: [8086:15c5] type 00 class 0x020000
[ 202.495940] pci 0000:06:10.2: Adding to iommu group 24
[ 202.496938] pci 0000:06:10.4: [8086:15c5] type 00 class 0x020000
[ 202.497145] pci 0000:06:10.4: Adding to iommu group 25
[ 202.531943] ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver
[ 202.531950] ixgbevf: Copyright (c) 2009 - 2018 Intel Corporation.
[ 202.532086] ixgbevf 0000:06:10.0: enabling device (0000 -> 0002)
[ 202.533494] ixgbevf 0000:06:10.0: PF still in reset state. Is the PF interface up?
[ 202.533497] ixgbevf 0000:06:10.0: Assigning random MAC address
[ 202.534041] ixgbevf 0000:06:10.0: aa:b0:33:d1:b1:4a
[ 202.534046] ixgbevf 0000:06:10.0: MAC: 5
[ 202.534048] ixgbevf 0000:06:10.0: Intel(R) 82599 Virtual Function
[ 202.534073] ixgbevf 0000:06:10.2: enabling device (0000 -> 0002)
[ 202.535466] ixgbevf 0000:06:10.2: PF still in reset state. Is the PF interface up?
[ 202.535469] ixgbevf 0000:06:10.2: Assigning random MAC address
[ 202.549212] ixgbevf 0000:06:10.2: 9a:7e:52:25:26:c9
[ 202.549221] ixgbevf 0000:06:10.2: MAC: 5
[ 202.549223] ixgbevf 0000:06:10.2: Intel(R) 82599 Virtual Function
[ 202.549262] ixgbevf 0000:06:10.4: enabling device (0000 -> 0002)
[ 202.550663] ixgbevf 0000:06:10.4: PF still in reset state. Is the PF interface up?
[ 202.550666] ixgbevf 0000:06:10.4: Assigning random MAC address
[ 202.551042] ixgbevf 0000:06:10.4: c6:51:53:c4:40:22
[ 202.551046] ixgbevf 0000:06:10.4: MAC: 5
[ 202.551048] ixgbevf 0000:06:10.4: Intel(R) 82599 Virtual Function
[ 232.126333] ixgbe 0000:06:00.0: registered PHC device on eth2
[ 236.017461] ixgbe 0000:06:00.0 eth2: NIC Link is Up 1 Gbps, Flow Control: RX/TX
[ 363.644981] ixgbe 0000:06:00.0 eth2: VF Reset msg received from vf 0
[ 363.676640] ixgbevf 0000:06:10.0: NIC Link is Up 1 Gbps
Interestingly, SR-IOV works even with the IOMMU disabled in the kernel, such as by booting without intel_iommu=on.
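The post doesn't show how the VFs were activated. One common way is the sriov_numvfs attribute in sysfs, sketched below under that assumption. The function name set_vfs is made up; eth2 and the count of 3 are taken from the dmesg output above.

```shell
#!/bin/sh
# Sketch only: assumes the VFs are created through the sysfs sriov_numvfs attribute.

set_vfs() {
    # $1: PF interface name, $2: number of VFs,
    # $3: net class directory in sysfs (defaults to /sys/class/net)
    iface="$1"
    count="$2"
    root="${3:-/sys/class/net}"
    node="$root/$iface/device/sriov_numvfs"

    [ -w "$node" ] || { echo "no writable sriov_numvfs for $iface" >&2; return 1; }

    # The kernel refuses to go from one non-zero VF count to another,
    # so drop back to 0 before writing the new value.
    echo 0 > "$node"
    echo "$count" > "$node"
}

# Typical use as root:
#   set_vfs eth2 3
```

The setting does not survive a reboot, so it would have to be reapplied at boot by whatever mechanism the system already uses for network setup.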
IOMMU grouping after activating some VFs on the X553. Looking good for VFIO passthrough.
Code:
IOMMU Group 0 00:00.0 Host bridge [0600]: Intel Corporation Atom Processor C3000 Series System Agent [8086:1980] (rev 11)
IOMMU Group 1 00:04.0 Host bridge [0600]: Intel Corporation Atom Processor C3000 Series Error Registers [8086:19a1] (rev 11)
IOMMU Group 2 00:05.0 Generic system peripheral [0807]: Intel Corporation Atom Processor C3000 Series Root Complex Event Collector [8086:19a2] (rev 11)
IOMMU Group 3 00:06.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series Integrated QAT Root Port [8086:19a3] (rev 11)
IOMMU Group 4 00:0c.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #3 [8086:19a7] (rev 11)
IOMMU Group 5 00:0f.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #5 [8086:19a9] (rev 11)
IOMMU Group 6 00:10.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #6 [8086:19aa] (rev 11)
IOMMU Group 7 00:11.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #7 [8086:19ab] (rev 11)
IOMMU Group 8 00:12.0 System peripheral [0880]: Intel Corporation Atom Processor C3000 Series SMBus Contoller - Host [8086:19ac] (rev 11)
IOMMU Group 9 00:13.0 SATA controller [0106]: Intel Corporation Atom Processor C3000 Series SATA Controller 0 [8086:19b2] (rev 11)
IOMMU Group 10 00:15.0 USB controller [0c03]: Intel Corporation Atom Processor C3000 Series USB 3.0 xHCI Controller [8086:19d0] (rev 11)
IOMMU Group 11 00:16.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series Integrated LAN Root Port #0 [8086:19d1] (rev 11)
IOMMU Group 12 00:17.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series Integrated LAN Root Port #1 [8086:19d2] (rev 11)
IOMMU Group 13 00:18.0 Communication controller [0780]: Intel Corporation Atom Processor C3000 Series ME HECI 1 [8086:19d3] (rev 11)
IOMMU Group 14 00:1c.0 SD Host controller [0805]: Intel Corporation Device [8086:19db] (rev 11)
IOMMU Group 15 00:1f.0 ISA bridge [0601]: Intel Corporation Atom Processor C3000 Series LPC or eSPI [8086:19dc] (rev 11)
IOMMU Group 15 00:1f.2 Memory controller [0580]: Intel Corporation Atom Processor C3000 Series Power Management Controller [8086:19de] (rev 11)
IOMMU Group 15 00:1f.4 SMBus [0c05]: Intel Corporation Atom Processor C3000 Series SMBus controller [8086:19df] (rev 11)
IOMMU Group 15 00:1f.5 Serial bus controller [0c80]: Intel Corporation Atom Processor C3000 Series SPI Controller [8086:19e0] (rev 11)
IOMMU Group 16 01:00.0 Co-processor [0b40]: Intel Corporation Atom Processor C3000 Series QuickAssist Technology [8086:19e2] (rev 11)
IOMMU Group 17 04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 18 05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 19 06:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e4] (rev 11)
IOMMU Group 20 06:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e4] (rev 11)
IOMMU Group 21 08:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e5] (rev 11)
IOMMU Group 22 08:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e5] (rev 11)
IOMMU Group 23 06:10.0 Ethernet controller [0200]: Intel Corporation X553 Virtual Function [8086:15c5]
IOMMU Group 24 06:10.2 Ethernet controller [0200]: Intel Corporation X553 Virtual Function [8086:15c5]
IOMMU Group 25 06:10.4 Ethernet controller [0200]: Intel Corporation X553 Virtual Function [8086:15c5]
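A listing like the one above can be produced by walking /sys/kernel/iommu_groups. The post doesn't say which script generated it, so the following is only one way it could be done; the function name list_iommu_groups is made up. It falls back to the bare PCI address when lspci is missing or prints nothing for a device.

```shell
#!/bin/sh
# Sketch only: not necessarily the script used for the listing above.

list_iommu_groups() {
    # $1: iommu_groups directory (defaults to /sys/kernel/iommu_groups)
    root="${1:-/sys/kernel/iommu_groups}"

    # Group directories are named by number; sort them numerically.
    for group in $(ls "$root" 2>/dev/null | sort -n); do
        for dev in "$root/$group"/devices/*; do
            [ -e "$dev" ] || continue
            addr=${dev##*/}
            # Describe the device with lspci when possible.
            desc=$(lspci -nns "$addr" 2>/dev/null || true)
            [ -n "$desc" ] || desc="$addr"
            printf 'IOMMU Group %s %s\n' "$group" "$desc"
        done
    done
}

# Typical use:
#   list_iommu_groups
```

Each VF landing in its own group, as with groups 23 to 25 above, is what makes handing individual VFs to separate guests practical.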
Code:
lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Atom Processor C3000 Series System Agent [8086:1980] (rev 11)
00:04.0 Host bridge [0600]: Intel Corporation Atom Processor C3000 Series Error Registers [8086:19a1] (rev 11)
00:05.0 Generic system peripheral [0807]: Intel Corporation Atom Processor C3000 Series Root Complex Event Collector [8086:19a2] (rev 11)
00:06.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series Integrated QAT Root Port [8086:19a3] (rev 11)
00:0c.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #3 [8086:19a7] (rev 11)
00:0f.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #5 [8086:19a9] (rev 11)
00:10.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #6 [8086:19aa] (rev 11)
00:11.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series PCI Express Root Port #7 [8086:19ab] (rev 11)
00:12.0 System peripheral [0880]: Intel Corporation Atom Processor C3000 Series SMBus Contoller - Host [8086:19ac] (rev 11)
00:13.0 SATA controller [0106]: Intel Corporation Atom Processor C3000 Series SATA Controller 0 [8086:19b2] (rev 11)
00:15.0 USB controller [0c03]: Intel Corporation Atom Processor C3000 Series USB 3.0 xHCI Controller [8086:19d0] (rev 11)
00:16.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series Integrated LAN Root Port #0 [8086:19d1] (rev 11)
00:17.0 PCI bridge [0604]: Intel Corporation Atom Processor C3000 Series Integrated LAN Root Port #1 [8086:19d2] (rev 11)
00:18.0 Communication controller [0780]: Intel Corporation Atom Processor C3000 Series ME HECI 1 [8086:19d3] (rev 11)
00:1c.0 SD Host controller [0805]: Intel Corporation Device [8086:19db] (rev 11)
00:1f.0 ISA bridge [0601]: Intel Corporation Atom Processor C3000 Series LPC or eSPI [8086:19dc] (rev 11)
00:1f.2 Memory controller [0580]: Intel Corporation Atom Processor C3000 Series Power Management Controller [8086:19de] (rev 11)
00:1f.4 SMBus [0c05]: Intel Corporation Atom Processor C3000 Series SMBus controller [8086:19df] (rev 11)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Atom Processor C3000 Series SPI Controller [8086:19e0] (rev 11)
01:00.0 Co-processor [0b40]: Intel Corporation Atom Processor C3000 Series QuickAssist Technology [8086:19e2] (rev 11)
04:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
06:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e4] (rev 11)
06:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e4] (rev 11)
08:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e5] (rev 11)
08:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Connection X553 1GbE [8086:15e5] (rev 11)
Code:
lspci -t
-[0000:00]-+-00.0
           +-04.0
           +-05.0
           +-06.0-[01]----00.0
           +-0c.0-[02]--
           +-0f.0-[03]--
           +-10.0-[04]----00.0
           +-11.0-[05]----00.0
           +-12.0
           +-13.0
           +-15.0
           +-16.0-[06-07]--+-00.0
           |               \-00.1
           +-17.0-[08-09]--+-00.0
           |               \-00.1
           +-18.0
           +-1c.0
           +-1f.0
           +-1f.2
           +-1f.4
           \-1f.5
Code:
lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux 6.6.73 xhci-hcd xHCI Host Controller
Bus 002 Device 001: ID 1d6b:0003 Linux 6.6.73 xhci-hcd xHCI Host Controller
lsusb -t
/: Bus 001.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/4p, 480M
/: Bus 002.Port 001: Dev 001, Class=root_hub, Driver=xhci_hcd/4p, 5000M
Code:
mmc0: SDHCI controller on PCI [0000:00:1c.0] using ADMA 64-bit
mmc0: new HS400 MMC card at address 0001
mmcblk0: mmc0:0001 DG4008 7.28 GiB
mmcblk0: p1 p2 p3
mmcblk0boot0: mmc0:0001 DG4008 4.00 MiB
mmcblk0boot1: mmc0:0001 DG4008 4.00 MiB
mmcblk0rpmb: mmc0:0001 DG4008 4.00 MiB, chardev (244:0)
Code:
sensors
jc42-i2c-0-1a
Adapter: SMBus I801 adapter at e000
temp1: +48.5°C (low = +0.0°C) ALARM (HIGH, CRIT)
       (high = +0.0°C, hyst = +0.0°C)
       (crit = +0.0°C, hyst = +0.0°C)

nct7904-i2c-0-2d
Adapter: SMBus I801 adapter at e000
in1: 1.23 V (min = +0.00 V, max = +4.09 V)
in6: 670.00 mV (min = +0.00 V, max = +4.09 V)
in7: 806.00 mV (min = +0.00 V, max = +4.09 V)
in8: 1.20 V (min = +0.00 V, max = +4.09 V)
in9: 982.00 mV (min = +0.00 V, max = +4.09 V)
in10: 1.24 V (min = +0.00 V, max = +4.09 V)
in11: 1.05 V (min = +0.00 V, max = +4.09 V)
in12: 1.79 V (min = +0.00 V, max = +4.09 V)
in13: 1.27 V (min = +0.00 V, max = +4.09 V)
in14: 1.56 V (min = +0.00 V, max = +4.09 V)
in15: 3.33 V (min = +0.00 V, max = +12.28 V)
in16: 3.08 V (min = +0.00 V, max = +12.28 V)
in19: 3.20 V (min = +12.28 V, max = +12.28 V)
in20: 3.35 V (min = +0.00 V, max = +12.28 V)
fan1: 3506 RPM (min = 164 RPM)
fan2: 3047 RPM (min = 164 RPM)
fan3: 0 RPM (min = 164 RPM)
temp1: +46.0°C (high = +65.0°C, hyst = +60.0°C)
       (crit = +75.0°C, hyst = +70.0°C) sensor = thermal diode
temp2: +51.6°C (high = +65.0°C, hyst = +60.0°C)
       (crit = +75.0°C, hyst = +70.0°C) sensor = thermal diode
temp5: +38.5°C (high = +85.0°C, hyst = +80.0°C)
       (crit = +100.0°C, hyst = +95.0°C) sensor = thermistor
temp6: +0.0°C (high = +85.0°C, hyst = +80.0°C)
       (crit = +100.0°C, hyst = +95.0°C) sensor = Intel PECI
temp7: +0.0°C (high = +85.0°C, hyst = +80.0°C)
       (crit = +100.0°C, hyst = +95.0°C) sensor = Intel PECI
temp8: +0.0°C (high = +85.0°C, hyst = +80.0°C)
       (crit = +100.0°C, hyst = +95.0°C) sensor = Intel PECI
temp9: +0.0°C (high = +85.0°C, hyst = +80.0°C)
       (crit = +100.0°C, hyst = +95.0°C) sensor = Intel PECI

acpitz-acpi-0
Adapter: ACPI interface
temp1: +0.0°C

jc42-i2c-0-18
Adapter: SMBus I801 adapter at e000
temp1: +48.0°C (low = +0.0°C) ALARM (HIGH, CRIT)
       (high = +0.0°C, hyst = +0.0°C)
       (crit = +0.0°C, hyst = +0.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +56.0°C (high = +71.0°C, crit = +91.0°C)
Core 2: +51.0°C (high = +71.0°C, crit = +91.0°C)
Core 6: +53.0°C (high = +71.0°C, crit = +91.0°C)
Core 8: +52.0°C (high = +71.0°C, crit = +91.0°C)
Core 12: +52.0°C (high = +71.0°C, crit = +91.0°C)
Code:
lsmod (Alpine Linux 3.20 6.6.76-0-lts)
Module Size Used by Not tainted
ipv6 786432 16 [permanent]
af_packet 65536 0
wdat_wdt 20480 0
pcspkr 12288 0
efi_pstore 12288 0
acpi_cpufreq 32768 0
qat_c3xxx 12288 0
intel_qat 278528 1 qat_c3xxx
ee1004 16384 0
crc8 12288 1 intel_qat
authenc 12288 1 intel_qat
crypto_null 16384 1 authenc
i2c_i801 40960 0
i2c_smbus 20480 1 i2c_i801
i2c_ismt 32768 0
ixgbe 442368 0
mdio_devres 12288 1 ixgbe
libphy 221184 2 ixgbe,mdio_devres
mdio 12288 1 ixgbe
igb 315392 0
hwmon 40960 2 ixgbe,igb
i2c_algo_bit 12288 1 igb
intel_rapl_msr 20480 0
dca 16384 2 ixgbe,igb
intel_cstate 20480 0
rapl 20480 0
pnd2_edac 24576 0
intel_rapl_common 36864 1 intel_rapl_msr
hed 12288 0
evdev 28672 0
thermal 28672 0
button 24576 0
efivarfs 24576 1
isofs 53248 1
cdrom 81920 1 isofs
uas 32768 0
nls_utf8 12288 0
nls_cp437 16384 0
vfat 20480 0
fat 98304 1 vfat
mmc_block 61440 0
crc32_pclmul 12288 0
crc32c_intel 16384 0
xhci_pci 24576 0
xhci_pci_renesas 16384 1 xhci_pci
xhci_hcd 360448 1 xhci_pci
ahci 49152 0
libahci 61440 1 ahci
libata 454656 2 ahci,libahci
sdhci_pci 98304 0
cqhci 32768 1 sdhci_pci
sdhci 90112 1 sdhci_pci
mmc_core 253952 4 mmc_block,sdhci_pci,cqhci,sdhci
simpledrm 16384 0
drm_shmem_helper 28672 1 simpledrm
drm_kms_helper 249856 1 simpledrm
drm 761856 3 simpledrm,drm_shmem_helper,drm_kms_helper
drm_panel_orientation_quirks 28672 1 drm
usb_storage 86016 2 uas
usbcore 401408 4 uas,xhci_pci,xhci_hcd,usb_storage
usb_common 16384 2 xhci_hcd,usbcore
sd_mod 65536 1
t10_pi 20480 1 sd_mod
crc64_rocksoft 16384 1 t10_pi
crc64 16384 1 crc64_rocksoft
scsi_mod 274432 4 uas,libata,usb_storage,sd_mod
scsi_common 16384 5 uas,libata,usb_storage,sd_mod,scsi_mod
squashfs 86016 1
loop 36864 2