Mellanox NICs in Debian 11 not supported?

fohdeesha

They work perfectly and don't need a driver download from Mellanox. Debian has, and probably always will have, the mlx kernel driver included. Removing it would be like removing the Intel ixgb driver.
 

CheatingLiar

Late reply, but...

AFAIK, it depends on what you are trying to do. Just using the card in the host to do average network card things should work out of the box as others said. Now if you are going to do SR-IOV, you will need to get the source code and massage it.
 

ectoplasmosis

CheatingLiar said:
"Late reply, but...

AFAIK, it depends on what you are trying to do. Just using the card in the host to do average network card things should work out of the box as others said. Now if you are going to do SR-IOV, you will need to get the source code and massage it."

This is incorrect. SR-IOV works fine without having to do or install anything, 'massage' or otherwise.

The included kernel module accepts the same commands relating to SR-IOV as any other Mellanox driver. I'm running ConnectX-4/5 SFP28 cards with SR-IOV in several Debian 11 systems without a problem.
 

efschu3

ectoplasmosis said:
"This is incorrect. SR-IOV works fine without having to do or install anything, 'massage' or otherwise.

The included kernel module accepts the same commands relating to SR-IOV as any other Mellanox driver. I'm running ConnectX-4/5 SFP28 cards with SR-IOV in several Debian 11 systems without a problem."

This is incorrect. SR-IOV on the in-kernel (inbox) driver works well with non-Windows guests, but it does NOT work for Windows guests.
 

efschu3

Could you please provide the output of
modinfo mlx4_en

and the exact version of your Windows driver?

Because SR-IOV for Windows guests stopped working with kernel 4.19 (actually).
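
For anyone gathering the details asked for above, here is a minimal sketch that reads the bound driver and its module version from sysfs. The SYSFS_ROOT override and the interface name used in the usage note are assumptions made for illustration, not anything from this thread. Newer in-tree drivers may not export a version at all, in which case the kernel release from uname -r is the more useful number.

```shell
#!/bin/sh
# Sketch: report which kernel driver backs a network interface and, if the
# module declares one, its version string.
# SYSFS_ROOT is a hypothetical override (not a kernel feature) so the
# functions can be tried against a fake directory tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the name of the PCI driver bound to interface $1,
# e.g. mlx4_core for ConnectX-3 or mlx5_core for ConnectX-4 and later.
nic_driver() {
    drv_link="$SYSFS_ROOT/class/net/$1/device/driver"
    [ -L "$drv_link" ] || { echo "$1: no driver link found" >&2; return 1; }
    basename "$(readlink "$drv_link")"
}

# Print the version string of kernel module $1, or "unknown" when the
# module is not loaded or does not export a version.
module_version() {
    ver_file="$SYSFS_ROOT/module/$1/version"
    if [ -r "$ver_file" ]; then
        cat "$ver_file"
    else
        echo "unknown"
    fi
}
```

Usage would look something like "nic_driver enp3s0; module_version mlx4_en; uname -r", where enp3s0 stands in for whatever your interface is called.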
 

ectoplasmosis

efschu3 said:
"Could you please provide the output of
modinfo mlx4_en

and the exact version of your Windows driver?

Because SR-IOV for Windows guests stopped working with kernel 4.19 (actually)."

mlx4_en version 4.0-0

Win10 21H2 guest VMs, WinOFED2 latest version


Works OK, no Code 10/43 errors, all assigned VF devices appear normally within the Windows guest.

Mellanox card is a MCX4121A-ACAT.
 

crackelf

Does anyone have Mellanox NICs with VFs working in Debian with the mlx5_core driver?

I'm treating this like the ixgbe driver and not getting far. I see "Kernel driver in use: mlx5_core" and am trying to create virtual functions in /etc/modprobe.d/mlx5_core.conf with options like "options mlx5_core num_vfs=12 port_type_array=2 probe_vf=12", but predictably dmesg says it's ignoring those options as they are "unknown".

The docs over at NVIDIA show these as valid options, but only for their MLNX_OFED driver. Any tips and/or resources here? I'm not finding much for the kernel driver other than the kernel.org docs, and those don't say much about VFs. Is the core driver not enough, so that I need to enable the EN driver in the kernel? Thanks for any help!
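
One way to catch "unknown" options before they land in modprobe.d is to check them against the parameters the installed module actually declares. A small sketch, assuming the usual "name:description (type)" line format that "modinfo -p" prints; the helper name has_param is made up for this example.

```shell
#!/bin/sh
# Sketch: read `modinfo -p <module>` output on stdin and succeed only if
# the parameter named in $1 is listed. Each line of that output is
# expected to start with "<name>:". has_param is a hypothetical helper.
has_param() {
    grep -q -e "^$1:"
}
```

For example, "modinfo -p mlx5_core | has_param num_vfs || echo 'num_vfs is not a parameter of this module'" would flag an option that the loaded driver build does not know about.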
EDIT / ANSWER: echo 8 > /sys/devices/pci0000\:00/0000\:00\:1d.0/0000\:03\:00.0/sriov_numvfs
check this thread for more info
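
The sysfs write in the answer can be wrapped in a slightly more defensive helper. This is only a sketch: it goes through /sys/class/net/<iface>/device, which points at the same PCI device directory as the long path above, and the SYSFS_ROOT override and the set_vfs name are assumptions made for this example. It checks the requested count against sriov_totalvfs and resets the count to 0 first, because the kernel refuses to change one non-zero VF count directly to another.

```shell
#!/bin/sh
# Sketch: set the number of SR-IOV virtual functions on the PF behind a
# network interface via sysfs. Needs root on a real system.
# SYSFS_ROOT is a hypothetical override for trying this on a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# set_vfs <interface> <count>
set_vfs() {
    pf_dev="$SYSFS_ROOT/class/net/$1/device"
    if [ ! -f "$pf_dev/sriov_totalvfs" ]; then
        echo "$1: device does not expose SR-IOV in sysfs" >&2
        return 1
    fi
    max_vfs=$(cat "$pf_dev/sriov_totalvfs")
    if [ "$2" -gt "$max_vfs" ]; then
        echo "$1: requested $2 VFs, device allows at most $max_vfs" >&2
        return 1
    fi
    # Going from one non-zero count to another is rejected by the kernel,
    # so drop back to 0 before writing the new value.
    cur_vfs=$(cat "$pf_dev/sriov_numvfs")
    if [ "$cur_vfs" -ne 0 ] && [ "$cur_vfs" -ne "$2" ]; then
        echo 0 > "$pf_dev/sriov_numvfs" || return 1
    fi
    echo "$2" > "$pf_dev/sriov_numvfs"
}
```

Something like "set_vfs enp3s0 8" (interface name hypothetical) would then do the same as the echo above. The setting does not survive a reboot, so it has to be reapplied at boot by whatever mechanism you prefer.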
 

firemeteor

ectoplasmosis said:
"mlx4_en version 4.0-0

Win10 21H2 guest VMs, WinOFED2 latest version

Works OK, no Code 10/43 errors, all assigned VF devices appear normally within the Windows guest.

Mellanox card is a MCX4121A-ACAT."

I think I ran into a bug: the stock Linux driver (mlx4_en 4.0.0) does not handle promiscuous mode well on SR-IOV PF ports.
Switching to the vendor driver (v4.9-4.1.7) immediately solved my problem.
I really hate this policy of releasing an old and buggy driver to the public while keeping the up-to-date, fixed driver out of tree...
And now the up-to-date version is stuck on an ancient kernel version...
 

firemeteor

Thanks for the reminder, Stephan.
Actually, I came across that post before but didn't realize it might have anything to do with my issue.
I haven't tried any Windows guest yet and thus didn't run into any explicit error code.
My issue is that the NIC just silently eats my packets when the SR-IOV PF is put into a host-side bridge.
If I understand correctly, this setup does not need FDB manipulation, which is more about setting up a bridge behind a VF in the guest.
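
For reference, the host-side setup being described (PF enslaved to a Linux bridge) can be written down as a handful of iproute2 commands. The sketch below only prints them so they can be reviewed first; the function name and the interface and bridge names in the usage note are placeholders, and this is one plausible reading of the setup, not the exact commands used here.

```shell
#!/bin/sh
# Sketch: print (do not run) the iproute2 commands that create bridge $2,
# enslave PF interface $1 to it, bring both up, and show the PF's detailed
# link state (which includes its promiscuity counter).
# plan_pf_bridge is a hypothetical helper name.
plan_pf_bridge() {
    printf 'ip link add name %s type bridge\n' "$2"
    printf 'ip link set dev %s master %s\n' "$1" "$2"
    printf 'ip link set dev %s up\n' "$1"
    printf 'ip link set dev %s up\n' "$2"
    printf 'ip -d link show dev %s\n' "$1"
}
```

Running "plan_pf_bridge enp3s0 br0" prints the five commands. Enslaving a port to a bridge is what puts it into promiscuous mode, which is the code path that appears to misbehave with the stock driver in this case.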

BTW, when I was on the vendor driver v4.9, I did run into an FDB-related problem: the 'bridge fdb add xxx' command did not seem to make any difference... But anyway, I was able to find a workaround to live with it....


I have to say that my experience with this card has been pretty bad...
There is no perfect driver that I can use with ease.
For the stock driver, I tried to contact the maintainer but have had no response yet.
For the vendor driver, I'm still trying to build it on a 5.x kernel without much success.
(I heard on a different forum that newer driver versions, which are supposedly not compatible with the CX3, may actually work.
But ironically, for some unknown reason, I can't even build the latest driver, which is supposed to compile on my kernel version...)

I'm wondering whether an X520-DA2 would have behaved any better than the Mellanox CX3 if I had grabbed one instead...
 

Stephan

@firemeteor I am using stock drivers with EMC CX3 rev A3+A4 cards on kernel 5.4 with okrasit's patches. Looked into compiling OFED vendor drivers once, but holy moly what a mess. Since I don't want another hell that is RDMA, which for me has no advantages, I cancelled all plans to make an Archlinux package and just went with the stock driver.

From reading your post my own feeling is SR-IOV is already thin ice, with every device. Putting a bridge on top looks against the KISS principle to me. I like the better trampled paths and when there are only two hits in Google for SRIOV FDB linux bridge mellanox you know the roadrunner has moved over the edge, waiting for gravity to assert itself. If you can simplify the concept, I would.

X520-DA2 also ok, only 10 Gbps of course. Very mature Intel product. CX3 of course of interest to people with FDR cable to run 56 Gbps for low prices.
 

firemeteor

Stephan said:
"@firemeteor I am using stock drivers with EMC CX3 rev A3+A4 cards on kernel 5.4 with okrasit's patches. Looked into compiling OFED vendor drivers once, but holy moly what a mess. Since I don't want another hell that is RDMA, which for me has no advantages, I cancelled all plans to make an Archlinux package and just went with the stock driver."

I would rather avoid the vendor driver mess if the stock one worked for me...
I wasn't looking for any advanced features from the vendor driver either, just bug fixes.
I know very little about RDMA, but if it requires any client-side cooperation it won't be helpful to me...

Thankfully, my driver-building efforts paid off and I was able to get the 4.9 LTS driver running on Debian 11.2 (kernel 5.10).
I had to fix some configuration issues and some code compatibility issues, though.

Stephan said:
"From reading your post my own feeling is SR-IOV is already thin ice, with every device. Putting a bridge on top looks against the KISS principle to me. I like the better trampled paths and when there are only two hits in Google for SRIOV FDB linux bridge mellanox you know the roadrunner has moved over the edge, waiting for gravity to assert itself. If you can simplify the concept, I would."

I think we should blame the poor quality of the stock driver, and the people behind the decision to split the versions like this.
As an old, mature (I suppose?) and well-defined feature, SR-IOV should 'just work' as long as the driver stack is solid.
The concept of SR-IOV is clear, and from an implementation perspective it just comes with an embedded bridge for routing traffic.
Attaching one bridge to another really shouldn't cause too much trouble....
This should be true at least on the PF side of the NIC, which should be plug-and-play and configuration-free.

FDB management is for a more advanced feature, related to security concerns around promiscuous mode on the VF side.
The same security model should apply to other SR-IOV NIC vendors like Intel, too.
I don't really need this and just gave it a try. Maybe I ran into another driver bug, or maybe I used the wrong configuration...

Stephan said:
"X520-DA2 also ok, only 10 Gbps of course. Very mature Intel product. CX3 of course of interest to people with FDR cable to run 56 Gbps for low prices."

Glad that I can stick with my current CX3 card without looking for an Intel replacement.
I didn't realize that the CX3 is capable of 56 Gbps. Is this specific to only some of the models?
What I have should be the low-end CX311/312 cards. I didn't see 56 Gbps mentioned anywhere in the spec.
I also have no idea what an FDR cable really means, since I'm totally new to SFP+ NICs.
What I have is a Finisar AOC cable, which probably doesn't qualify for 56 Gbps either.
But anyway, it's not what I was aiming for, and the other parts of my system probably won't be able to keep up the pace. :)