Help with ConnectX-3 SR-IOV with Linux host and Windows guest via KVM

okrasit

Member
Jun 28, 2019
I did that and they both came back exactly as they were before.
I've tried both the default Windows driver and MLNX_VPI_WinOF-5_50_53000_All_Win2019_x64; with the latter there was a driver version mismatch between VPI and ETH. So what's the correct process here?
The Microsoft driver should work, so this points in the direction of Linux OFED :(
I did get that error 10 when I had the WinOF driver on the VPI device and the Microsoft one on the Ethernet device.
 

okrasit

Here's one important thing to notice:
[Attachment: 1623276568604.png]
If "fast start-up" (fast boot) is enabled in Windows, the Mellanox VPI device will fail with error 43 until you reboot the guest or disable and re-enable the device. That is the behavior with a patched kernel driver or a working Linux OFED (if such a thing even exists anymore).
 

dwalme

New Member
Jan 16, 2018
It's been a couple of years and I am running into this same issue: Proxmox 8.3 running the 6.8 or 6.11 kernels, which have the broken 4.0-0 mlx4 kernel driver.

My Linux skills are limited. Before I dive into figuring out how to use this patch, does anyone know offhand whether it will still work with newer kernels?
 

Basriram

New Member
Sep 16, 2021
I have created a quick-and-dirty DKMS script for the above-mentioned patch, for Proxmox running 6.8.12-5, here

Just make sure you have linux-headers installed for your kernel version. You can ignore the "patch unexpectedly ends in middle of line" remarks, but to be safe you may want to do a dry run first.
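For the header check and the dry run, something along these lines could be used. This is only a sketch: the function names are made up for illustration, the -p1 strip level is an assumption about how the patch was generated, and the source directory and patch file are whatever your DKMS tree uses. The header check relies on /lib/modules/<release>/build resolving to a directory, which is normally the case only once the headers package is installed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the DKMS script.

# Succeeds if kernel headers for release $1 appear to be installed, judged
# by /lib/modules/<release>/build being a directory (a dangling symlink
# fails the test). $2 is an optional alternative root, useful for testing.
have_headers() {
    [ -d "${2:-}/lib/modules/$1/build" ]
}

# Dry-run a patch without modifying any file.
# $1 = source directory, $2 = patch file.
# The patch file is opened before the cd, so a relative path still works.
# -p1 is an assumption; adjust it to match the paths inside the patch.
patch_dry_run() {
    ( cd "$1" && patch --dry-run -p1 ) < "$2"
}
```

On the host this might be called as have_headers "$(uname -r)" followed by patch_dry_run with your own source directory and patch file; if the dry run reports failed hunks, the patch does not match that kernel's mlx4 sources.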
 

phyerbarte

New Member
Dec 30, 2024
Hi, I tried your patch on PVE with kernel 6.8.12-5, but error 43 is still there. Does that mean it is not supported? How can I check whether the patch was applied correctly? The script seems to run without errors.

Bash:
modinfo  mlx4_core
filename:       /lib/modules/6.8.12-5-pve/updates/dkms/mlx4_core.ko
version:        99.4.0-0
license:        Dual BSD/GPL
description:    Mellanox ConnectX HCA low-level driver
author:         Roland Dreier
srcversion:     4EC12DB41AD0C1ADF790DF4
alias:          pci:v000015B3d00001010sv*sd*bc*sc*i*
alias:          pci:v000015B3d0000100Fsv*sd*bc*sc*i*
alias:          pci:v000015B3d0000100Esv*sd*bc*sc*i*
alias:          pci:v000015B3d0000100Dsv*sd*bc*sc*i*
alias:          pci:v000015B3d0000100Csv*sd*bc*sc*i*
alias:          pci:v000015B3d0000100Bsv*sd*bc*sc*i*
alias:          pci:v000015B3d0000100Asv*sd*bc*sc*i*
alias:          pci:v000015B3d00001009sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001008sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001007sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001006sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001005sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001004sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001003sv*sd*bc*sc*i*
alias:          pci:v000015B3d00001002sv*sd*bc*sc*i*
alias:          pci:v000015B3d0000676Esv*sd*bc*sc*i*
alias:          pci:v000015B3d00006746sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006764sv*sd*bc*sc*i*
alias:          pci:v000015B3d0000675Asv*sd*bc*sc*i*
alias:          pci:v000015B3d00006372sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006750sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006368sv*sd*bc*sc*i*
alias:          pci:v000015B3d0000673Csv*sd*bc*sc*i*
alias:          pci:v000015B3d00006732sv*sd*bc*sc*i*
alias:          pci:v000015B3d00006354sv*sd*bc*sc*i*
alias:          pci:v000015B3d0000634Asv*sd*bc*sc*i*
alias:          pci:v000015B3d00006340sv*sd*bc*sc*i*
depends:
retpoline:      Y
name:           mlx4_core
vermagic:       6.8.12-5-pve SMP preempt mod_unload modversions
sig_id:         PKCS#7
signer:         DKMS module signing key
sig_key:        2D:F8:AE:B4:09:A5:69:3A:96:9A:15:62:C1:0C:F8:7F:E0:FD:B4:A1
sig_hashalgo:   sha512
signature:      58:66:CA:6E:97:EB:0A:09:49:BA:E1:AE:A5:6D:33:8E:87:EB:59:B4:
                E0:A7:81:68:EB:82:FC:29:25:41:06:88:E0:D6:33:C0:FD:68:AB:89:
                89:23:71:20:C2:6B:9A:EE:CE:E5:76:8E:A8:D8:B4:72:E5:28:BF:BA:
                1E:81:94:A6:F1:80:95:DF:F0:B1:12:57:2E:90:B6:D0:54:37:D3:EE:
                19:BA:54:D7:13:62:66:0E:5A:56:E2:94:B6:4E:0F:AD:A5:A2:37:6E:
                99:78:D0:A3:FA:CB:09:99:86:BA:5A:59:F4:A9:C2:EC:89:0F:6A:C2:
                DB:A8:3D:B0:F8:1F:BB:A8:D6:D4:37:B1:3A:2B:3A:03:5E:91:C7:9B:
                B5:10:C3:52:68:71:04:14:12:FB:37:C2:EE:03:5A:C9:CB:4A:4A:E2:
                1F:41:9F:0B:9E:8B:98:C9:EE:20:46:EF:C8:9D:79:71:67:F3:13:99:
                43:2F:53:06:48:3C:AB:E5:0C:12:DF:DC:18:6A:6D:12:03:05:0F:D7:
                0D:EC:88:62:1A:CD:65:38:6A:E6:90:15:EE:D5:90:EB:41:12:66:01:
                D7:F9:3A:38:DB:82:8C:AB:BB:4F:AC:1A:CE:D8:1E:B7:A8:46:F2:C9:
                F7:11:0B:47:84:09:1C:CC:A3:3B:2A:15:01:48:61:2B
parm:           debug_level:Enable debug tracing if > 0 (int)
parm:           msi_x:0 - don't use MSI-X, 1 - use MSI-X, >1 - limit number of MSI-X irqs to msi_x (int)
parm:           num_vfs:enable #num_vfs functions if num_vfs > 0
num_vfs=port1,port2,port1+2 (array of byte)
parm:           probe_vf:number of vfs to probe by pf driver (num_vfs > 0)
probe_vf=port1,port2,port1+2 (array of byte)
parm:           log_num_mgm_entry_size:log mgm size, that defines the num of qp per mcg, for example: 10 gives 248.range: 7 <= log_num_mgm_entry_size <= 12. To activate device managed flow steering when available, set to -1 (int)
parm:           enable_64b_cqe_eqe:Enable 64 byte CQEs/EQEs when the FW supports this (default: True) (bool)
parm:           enable_4k_uar:Enable using 4K UAR. Should not be enabled if have VFs which do not support 4K UARs (default: false) (bool)
parm:           log_num_mac:Log2 max number of MACs per ETH port (1-7) (int)
parm:           log_num_vlan:Log2 max number of VLANs per ETH port (0-7) (int)
parm:           use_prio:Enable steering by VLAN priority on ETH ports (deprecated) (bool)
parm:           log_mtts_per_seg:Log2 number of MTT entries per segment (0-7) (default: 0) (int)
parm:           port_type_array:Array of port types: HW_DEFAULT (0) is default 1 for IB, 2 for Ethernet (array of int)
parm:           enable_qos:Enable Enhanced QoS support (default: off) (bool)
parm:           internal_err_reset:Reset device on internal errors if non-zero (default 1) (int)
 

Basriram

The patch on its own will not solve the error 43 problem. Just disable the fast start-up option that @okrasit mentioned, and make sure both the bus driver and the network interface driver are from Mellanox, not a mix of Microsoft and Mellanox.
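On the question of how to check the patch, here is a minimal sketch, assuming the DKMS build lands under updates/dkms as in the modinfo output above. It compares the srcversion of the module on disk with the one the kernel actually loaded (read from /sys/module/mlx4_core/srcversion). A mismatch usually means an older copy is still in use, for example one loaded from a stale initramfs. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of the DKMS script.

# Succeeds if a module path lives in a DKMS "updates" directory.
is_dkms_path() {
    case "$1" in
        */updates/dkms/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if both srcversion strings are non-empty and equal.
same_srcversion() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Run on the Proxmox host: report whether the loaded mlx4_core is the DKMS build.
check_mlx4_core() {
    path=$(modinfo -F filename mlx4_core) || return 2
    disk=$(modinfo -F srcversion mlx4_core)
    loaded=$(cat /sys/module/mlx4_core/srcversion 2>/dev/null)
    if ! is_dkms_path "$path"; then
        echo "modinfo resolves to a non-DKMS module: $path"
        return 1
    fi
    if ! same_srcversion "$disk" "$loaded"; then
        echo "loaded module ($loaded) differs from DKMS build ($disk)"
        return 1
    fi
    echo "DKMS mlx4_core is loaded: $path"
}
```

Calling check_mlx4_core on the host prints one line and returns non-zero when something is off. It only tells you which build is running, not whether error 43 in the guest is fixed.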
 

phyerbarte

Thanks for the explanation. It's sad, but it seems there is no way to fix this. The best chance may be to build the 4.9 version of the driver ourselves against a new Linux kernel, but that may be impossible without the necessary code changes, even with the source code. Too bad: a dead end for ConnectX-3 with a Windows guest under SR-IOV...
 

Basriram

I should have been a little clearer. With the DKMS patch for the Proxmox 6.8 kernel you will be able to use the virtual network interface under SR-IOV in Windows. Here are the steps:
1. Install the DKMS scripts on the host and build the mlx4_core.ko module.
2. Enable the VFs by creating the file /etc/modprobe.d/mlx4_core.conf with the required number of virtual functions. In my case, below, I have 2 VFs:
cat /etc/modprobe.d/mlx4_core.conf
options mlx4_core port_type_array=2,2 num_vfs=2 probe_vf=0 enable_64b_cqe_eqe=0 log_num_mgm_entry_size=-1 debug_level=64 msi_x=1 enable_4k_uar=1 enable_qos=1 log_num_mac=7 log_mtts_per_seg=4
3. Run update-initramfs -u -k `uname -r`, then modprobe mlx4_en etc., and you should see the additional virtual adapters with lspci -nn:
09:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
09:00.1 Ethernet controller [0200]: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] [15b3:1004]
09:00.2 Ethernet controller [0200]: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] [15b3:1004]
4. Create a snippet file in the /var/lib/vz/snippets/ folder for any tweaks you need to apply to the specific virtual function. In my example I set a specific MAC address and VLAN tag:
cat /var/lib/vz/snippets/windows.sh
#!/bin/sh
/usr/bin/ip link set enp9s0 vf 0 mac f4:52:14:68:b5:c0
/usr/bin/ip link set enp9s0 vf 0 vlan 30
5. Add this hookscript to your Windows guest's conf, along with the PCI device:
cat /etc/pve/qemu-server/100.conf
hookscript: local:snippets/windows.sh
hostpci1: 0000:09:00.1
6. Install the Mellanox drivers for both the system bus device and the network adapter.
[Attachment: 1736448908311.png]
7. Ensure "Turn on fast start-up" is unchecked in the Windows power settings.
[Attachment: 1736448959634.png]
With this you should be able to use your virtual adapter in Windows. Hope this helps.
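The windows.sh snippet from step 4 runs its ip commands every time Proxmox invokes the hook. A phase-aware variant could look like the sketch below. It assumes Proxmox passes the VM ID as the first argument and the phase (pre-start, post-start, pre-stop, post-stop) as the second, and that pre-start is the right moment to set the VF's MAC and VLAN. The interface name, VF index, MAC and VLAN are the values from the example above; the function names and the IP override are additions for illustration.

```shell
#!/bin/sh
# Sketch of a phase-aware variant of windows.sh.
# Assumption: the hook is called as <script> <vmid> <phase>.

IP="${IP:-/usr/bin/ip}"    # overridable, e.g. IP=echo for a dry run
PF="enp9s0"                # physical function, as in the example above
VF="0"                     # VF index on that physical function
MAC="f4:52:14:68:b5:c0"
VLAN="30"

# Succeeds only for the phase in which the VF should be configured.
should_configure() {
    [ "$1" = "pre-start" ]
}

# Apply MAC address and VLAN tag to the VF through the physical function.
configure_vf() {
    "$IP" link set "$PF" vf "$VF" mac "$MAC" &&
    "$IP" link set "$PF" vf "$VF" vlan "$VLAN"
}

phase="${2:-}"
if should_configure "$phase"; then
    # A non-zero exit in pre-start should keep the guest from starting
    # with an unconfigured VF.
    configure_vf || exit 1
fi
```

Like the original, it would be saved under /var/lib/vz/snippets/ and referenced from the guest conf; the file should also be executable (chmod +x). Running it by hand with IP=echo and the arguments 100 pre-start prints the two ip commands instead of executing them.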