Help with ConnectX-3 SR-IOV with Linux host and Windows guest via KVM

okrasit

Member
Jun 28, 2019
40
31
18
I have 8 probed VFs and 8 not probed. The probed ones are passed through to the LXC containers, the unprobed ones to the VMs. Like I said, Linux VMs have zero issues with this.
Code:
num_vfs=16,0,0 port_type_array=2,2 probe_vf=8,0,0
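For anyone following along, here is a rough sketch of how that line would end up in a modprobe options file. The helper name mlx4_options_line and the file name in the comment are placeholders for illustration only. As far as I understand the mlx4_core triplets, the three values are VFs on port 1, VFs on port 2, and dual-port VFs, and port type 2 means Ethernet.

```shell
# Hypothetical helper: build the "options" line for mlx4_core from the three values above.
mlx4_options_line() {
    # $1 = num_vfs, $2 = port_type_array, $3 = probe_vf
    printf 'options mlx4_core num_vfs=%s port_type_array=%s probe_vf=%s\n' "$1" "$2" "$3"
}

# 16 VFs on port 1, both ports set to Ethernet (2), 8 of the VFs probed by the host.
# Redirect the output into e.g. /etc/modprobe.d/mlx4_core.conf (assumed file name) as root.
mlx4_options_line 16,0,0 2,2 8,0,0
```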
Can you run modinfo mlx4_core, just to verify the driver version you're on?
 

mimino

Active Member
Nov 2, 2018
193
75
28
I posted earlier what I've installed; it replaces the in-tree kernel driver.
Code:
# modinfo mlx4_core
filename:       /lib/modules/5.4.106-1-pve/updates/dkms/mlx4_core.ko
version:        4.9-3.1.5
license:        Dual BSD/GPL
description:    Mellanox ConnectX HCA low-level driver
author:         Roland Dreier
srcversion:     8D66E3E8AB893AB1187D306
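A quick way to double-check which build is in play, sketched under the assumption that DKMS builds land under updates/ and stock modules under kernel/ (the usual layout, but not guaranteed everywhere). The function name mlx4_module_origin is made up for this example; modinfo -F prints a single field.

```shell
# Hypothetical helper: guess from the module path whether it is the DKMS/OFED build or the in-tree one.
mlx4_module_origin() {
    case "$1" in
        */updates/*) echo "out-of-tree" ;;
        */kernel/*)  echo "in-tree" ;;
        *)           echo "unknown" ;;
    esac
}

# Only query the live system if modinfo is available and knows the module.
if command -v modinfo >/dev/null 2>&1; then
    path=$(modinfo -F filename mlx4_core 2>/dev/null || true)
    if [ -n "$path" ]; then
        echo "mlx4_core $(modinfo -F version mlx4_core): $(mlx4_module_origin "$path")"
    fi
fi
```

With the path from the output above, the helper reports "out-of-tree", which matches the OFED DKMS install.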
 

okrasit

I see a problem there: the Windows driver doesn't play nice with that Linux driver version.
You could try adding the log_num_mgm_entry_size=-1 parameter to mlx4_core.
I'd expect it to fail on port query, though.
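If you want to try it, something like this would add the parameter to an existing options line. It's only a sketch: set_mlx4_param is a made-up helper, and the sample line is the one from earlier in the thread. Per the upstream parameter description, -1 asks for device-managed flow steering when the hardware supports it.

```shell
# Hypothetical helper: set key=value on an "options mlx4_core ..." line, replacing any old value.
set_mlx4_param() {
    # $1 = existing options line, $2 = parameter name, $3 = new value
    stripped=$(printf '%s\n' "$1" | sed -E "s/ $2=[^ ]*//")
    printf '%s %s=%s\n' "$stripped" "$2" "$3"
}

set_mlx4_param "options mlx4_core num_vfs=16,0,0 port_type_array=2,2 probe_vf=8,0,0" \
    log_num_mgm_entry_size -1

# After writing the result to the modprobe config, the module has to be reloaded
# (or the host rebooted) before the new value takes effect. On many systems the active
# value can then be read from /sys/module/mlx4_core/parameters/log_num_mgm_entry_size.
```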
 

mimino

So which version does it play nice with? I thought this was the one.

Also, flow steering is disabled by default, which I think is the best choice. I had it set to -1 with the default kernel driver, but it didn't seem to make any difference.
 

okrasit

It's either patch the stock kernel driver or install the latest and greatest OFED. :(
The VHCR command that failed to execute was MLX4_CMD_ALLOC_RES. So, before doing anything else, I'd try that parameter.
 

mimino

Thanks @okrasit, will this patch apply to the 5.4 kernel?
I'm still curious how @efschu3 got this to work with the latest LTS Mellanox OFED and the 5.4 kernel. Supposedly I did the same, but without much luck :(
 

okrasit

Well, you've got the LTS version of OFED. The latest is 5.3-1, whereas yours is the LTS version 4.9-3.
Yes, it should work with 5.4 also!

I just realised they've dropped support for ConnectX-3 in 5.1. Kind of weird :oops:
 

mimino

5.3-1 is the latest, not latest LTS. That's why I wanted a confirmation of the exact version he installed.
EDIT: just realized you said the exact same thing
 

okrasit

This is very weird, as if Mellanox (NVIDIA) is trying to sabotage their older products. I downloaded the 4.9-3.1.5.0 LTS. Looking at the source code, it seems incompatible with the Windows driver. :oops:
 

mimino

Perhaps I should be chasing the matching Windows driver? The choices are pretty limited though; I tried a few and gave up :(
 

klui

Well-Known Member
Feb 3, 2019
589
279
63
Re: Mellanox sabotaging their older products: they want people to buy new products, not use old ones.
 

okrasit

@mimino

I think I know what's happening with your error code 10. There are two devices in Windows, VPI and Ethernet. If the VPI device is loaded with the Microsoft-provided driver and the Ethernet device has the WinOF driver, you'll get error code 10.

Uninstall both devices and check the delete driver option.
 

klui

Would that happen only if a port is set to VPI? I set my CX3's ports to Ethernet and I don't see any VPI devices.

What about updating the VPI device(s) with the WinOF driver?
 

mimino

There's only one Ethernet device, "Mellanox ConnectX-3 Virtual Function Ethernet Adapter". I've tried deleting and installing different drivers without success.
 

mimino

Yes, the "other" is in System Devices
You're right, there is a second VPI device and it's functioning properly according to Windows!
But uninstalling it removes the Ethernet Adapter too.

1623154830850.png
 

okrasit

Yes, after uninstalling them, run the hardware scan from the menu. The thing is, if there's a driver version mismatch between the VPI and Ethernet devices, it will not work. So uninstall, remove the driver, and begin with the Ethernet adapter.
 

mimino

I did that and they both came back exactly as they were before.
I've tried with the default Windows driver and with MLNX_VPI_WinOF-5_50_53000_All_Win2019_x64; in the latter case there was a driver version mismatch between VPI and ETH. So what's the correct process here?