Solarflare SR-IOV on Linux (Partially Solved)

arglebargle

H̸̖̅ȩ̸̐l̷̦͋l̴̰̈ỏ̶̱ ̸̢͋W̵͖̌ò̴͚r̴͇̀l̵̼͗d̷͕̈
Jul 15, 2018
657
244
43
I'm having some trouble getting VFs going with an SFN5122F (SFN5000 series). I realized there's basically no documentation on troubleshooting these cards, so I thought I'd document the issue so it shows up when anyone else turns to Google for an answer.

I've installed and loaded the vendor DKMS module and enabled everything for SR-IOV using sfboot per the Solarflare documentation. The only odd thing is that the documentation states the firmware needs to be in "full-feature" mode, which I don't seem to be able to actually use on an SFN5/6000 card. I can set the mode, but after a shutdown/reboot ethtool and sfboot don't report full-feature as active.

The documentation states that SFN5000/6000 cards don't fully support SR-IOV in the same way that 7k/8k models do, but I'm unable to find any documentation re: SR-IOV configuration on these older boards.

@WANg Any thoughts? I didn't want to hijack your build thread for technical support on this but I remember you got this working on one of your boards.

Code:
root@ted:~# ethtool -i enp1s0f0np0
driver: sfc
version: 4.13.1.1034
firmware-version: 3.3.2.1000
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: yes
Code:
root@ted:~# sfboot
Solarflare boot configuration utility [v7.1.3]
Copyright Solarflare Communications 2006-2018, Level 5 Networks 2002-2005

enp1s0f0np0:
  Boot image                            Disabled
  PF MSI-X interrupt limit              32
  SR-IOV                                Enabled
  Virtual Functions on each PF          4
  VF MSI-X interrupt limit              8

enp1s0f1np1:
  Boot image                            Disabled
  PF MSI-X interrupt limit              32
  SR-IOV                                Enabled
  Virtual Functions on each PF          4
  VF MSI-X interrupt limit              8
Code:
root@ted:~# dmesg | grep sfc
[    3.829429] sfc: loading out-of-tree module taints kernel.
[    3.838992] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Solarflare NIC detected
[    3.846934] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Part Number : SFN5122F
[    3.846947] sfc 0000:01:00.0: enabling device (0000 -> 0003)
[    3.877240] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
[    4.624154] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
[    4.626797] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Part Number : SFN5122F
[    4.626810] sfc 0000:01:00.1: enabling device (0000 -> 0003)
[    4.656173] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): no PTP support
[    4.666793] sfc 0000:01:00.0 enp1s0f0np0: renamed from eth0
[    4.680311] sfc 0000:01:00.1 enp1s0f1np1: renamed from eth1
Code:
root@ted:~# lspci -tv
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Root Complex
          +-00.2  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) I/O Memory Management Unit
          +-01.0  Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R7 Graphics]
          +-01.1  Advanced Micro Devices, Inc. [AMD/ATI] Kaveri HDMI/DP Audio Controller
          +-02.0  Advanced Micro Devices, Inc. [AMD] Device 1424
          +-02.1-[01]--+-00.0  Solarflare Communications SFC9020 [Solarstorm]
          |            \-00.1  Solarflare Communications SFC9020 [Solarstorm]
          +-03.0  Advanced Micro Devices, Inc. [AMD] Device 1424
          +-03.2-[02]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
          +-04.0  Advanced Micro Devices, Inc. [AMD] Device 1424
          +-10.0  Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller
          +-10.1  Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller
          +-11.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
          +-12.0  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
          +-12.2  Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller
          +-13.0  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
          +-13.2  Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller
          +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
          +-14.1  Advanced Micro Devices, Inc. [AMD] FCH IDE Controller
          +-14.2  Advanced Micro Devices, Inc. [AMD] FCH Azalia Controller
          +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
          +-14.4-[03]--
          +-18.0  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 0
          +-18.1  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 1
          +-18.2  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 2
          +-18.3  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 3
          +-18.4  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 4
          \-18.5  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 5

Edit: I'm striking out on this so far. Here's what pops up in dmesg before SR-IOV fails:

Code:
root@ted:~/tmp/mlx# modprobe sfc max_vfs=4

dmesg:

[  830.657340] pps_core: LinuxPPS API ver. 1 registered
[  830.657343] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[  830.658749] PTP clock support registered
[  830.666762] Solarflare NET driver v4.13.1.1034
[  830.666805] Registered efx_mcdi_proxy genl family as 28
[  830.666901] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Solarflare NIC detected
[  830.668735] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Part Number : SFN5122F
[  830.705680] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
[  831.544552] pci 0000:01:00.2: [1924:1803] type 00 class 0x020000
[  831.544988] iommu: Adding device 0000:01:00.2 to group 15
[  831.545024] iommu: Using direct mapping for device 0000:01:00.2
[  831.545194] pci 0000:01:00.4: [1924:1803] type 00 class 0x020000
[  831.545506] iommu: Adding device 0000:01:00.4 to group 16
[  831.545532] iommu: Using direct mapping for device 0000:01:00.4
[  831.545627] pci 0000:01:00.6: [1924:1803] type 00 class 0x020000
[  831.545938] iommu: Adding device 0000:01:00.6 to group 17
[  831.545961] iommu: Using direct mapping for device 0000:01:00.6
[  831.546065] pci 0000:01:01.0: [1924:1803] type 7f class 0xffffff
[  831.546081] pci 0000:01:01.0: unknown header type 7f, ignoring device
[  831.546254] iommu: Removing device 0000:01:00.6 from group 17
[  831.546377] iommu: Removing device 0000:01:00.4 from group 16
[  831.546602] iommu: Removing device 0000:01:00.2 from group 15
[  832.552640] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): SR-IOV can't be enabled rc -5
[  832.565729] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
[  832.566333] sfc 0000:01:00.0 enp1s0f0np0: renamed from eth0
[  832.567644] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Part Number : SFN5122F
[  832.602788] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): no PTP support
[  832.716523] pci 0000:01:00.3: [1924:1803] type 00 class 0x020000
[  832.716967] iommu: Adding device 0000:01:00.3 to group 15
[  832.717000] iommu: Using direct mapping for device 0000:01:00.3
[  832.717171] pci 0000:01:00.5: [1924:1803] type 00 class 0x020000
[  832.717487] iommu: Adding device 0000:01:00.5 to group 16
[  832.717512] iommu: Using direct mapping for device 0000:01:00.5
[  832.717605] pci 0000:01:00.7: [1924:1803] type 00 class 0x020000
[  832.717896] iommu: Adding device 0000:01:00.7 to group 17
[  832.717918] iommu: Using direct mapping for device 0000:01:00.7
[  832.718003] pci 0000:01:01.1: [1924:1803] type 7f class 0xffffff
[  832.718019] pci 0000:01:01.1: unknown header type 7f, ignoring device
[  832.718169] iommu: Removing device 0000:01:00.7 from group 17
[  832.718272] iommu: Removing device 0000:01:00.5 from group 16
[  832.718365] iommu: Removing device 0000:01:00.3 from group 15
[  833.736574] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): SR-IOV can't be enabled rc -5
[  833.741315] sfc 0000:01:00.1 enp1s0f1np1: renamed from eth0

I've tried adding "pci=realloc" to the kernel command line per the Solarflare docs, but that doesn't help.

SR-IOV works fine on other NICs in this machine.
 

arglebargle

Alright, I've got the driver creating VFs. Apparently I can't create more than 3 VFs per port without the driver failing with "SR-IOV can't be enabled rc -5".

Note: max_vfs=5,1 doesn't work either; 3 VFs is the hard limit for any port.

Here's the relevant section from an older copy of the Solarflare adapter user guide:

[Attached image: Screenshot 2018-09-06 at 10.14.55 AM.png]

Code:
root@ted:~# modprobe sfc max_vfs=3

relevant dmesg:
[ 2233.105076] pps_core: LinuxPPS API ver. 1 registered
[ 2233.105079] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 2233.105883] PTP clock support registered
[ 2233.113302] Solarflare NET driver v4.13.1.1034
[ 2233.113344] Registered efx_mcdi_proxy genl family as 28
[ 2233.113439] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Solarflare NIC detected
[ 2233.115259] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Part Number : SFN5122F
[ 2233.160207] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
[ 2234.100777] pci 0000:01:00.2: [1924:1803] type 00 class 0x020000
[ 2234.101203] iommu: Adding device 0000:01:00.2 to group 15
[ 2234.101239] iommu: Using direct mapping for device 0000:01:00.2
[ 2234.101448] pci 0000:01:00.4: [1924:1803] type 00 class 0x020000
[ 2234.101764] iommu: Adding device 0000:01:00.4 to group 16
[ 2234.101787] iommu: Using direct mapping for device 0000:01:00.4
[ 2234.101879] pci 0000:01:00.6: [1924:1803] type 00 class 0x020000
[ 2234.102205] iommu: Adding device 0000:01:00.6 to group 17
[ 2234.102228] iommu: Using direct mapping for device 0000:01:00.6
[ 2234.102332] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): enabled SR-IOV for 3 VFs, 1 VI per VF
[ 2234.113322] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
[ 2234.115542] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Part Number : SFN5122F
[ 2234.157244] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): no PTP support
[ 2234.264781] pci 0000:01:00.3: [1924:1803] type 00 class 0x020000
[ 2234.265213] iommu: Adding device 0000:01:00.3 to group 18
[ 2234.265249] iommu: Using direct mapping for device 0000:01:00.3
[ 2234.265465] pci 0000:01:00.5: [1924:1803] type 00 class 0x020000
[ 2234.265842] iommu: Adding device 0000:01:00.5 to group 19
[ 2234.265866] iommu: Using direct mapping for device 0000:01:00.5
[ 2234.265996] pci 0000:01:00.7: [1924:1803] type 00 class 0x020000
[ 2234.266384] iommu: Adding device 0000:01:00.7 to group 20
[ 2234.266409] iommu: Using direct mapping for device 0000:01:00.7
[ 2234.266577] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): enabled SR-IOV for 3 VFs, 1 VI per VF
[ 2234.275661] sfc 0000:01:00.1 enp1s0f1np1: renamed from eth1
[ 2234.297080] sfc 0000:01:00.0 enp1s0f0np0: renamed from eth0

I'm a bit stumped here. I'm able to create more VFs on my Mellanox cards in the same machine, though IIRC I'm still limited to something like ~8 per card there as well.
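
A quick way to compare cards is to read the SR-IOV counters the kernel exposes in sysfs. The following is a minimal sketch, assuming the standard `sriov_totalvfs` and `sriov_numvfs` attributes are present for the PF. The helper name `sriov_status` and the `sysfs_root` parameter are made up for illustration, and whether the out-of-tree sfc driver reports the sfboot VF count through `sriov_totalvfs` has not been checked here.

```python
from pathlib import Path


def sriov_status(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Read the SR-IOV VF counters for one physical function.

    sriov_totalvfs is the number of VFs the device advertises;
    sriov_numvfs is the number currently enabled.
    A missing attribute is reported as None.
    """
    device_dir = Path(sysfs_root) / pci_address
    status = {}
    for attr in ("sriov_totalvfs", "sriov_numvfs"):
        attr_file = device_dir / attr
        status[attr] = int(attr_file.read_text().strip()) if attr_file.exists() else None
    return status
```

For the card in this thread the call would be `sriov_status("0000:01:00.0")` and `sriov_status("0000:01:00.1")`.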
 

WANg

Well-Known Member
Jun 10, 2018
1,302
967
113
46
New York, NY
Huh. I'll look into it this weekend - right now the t730 is up and running as my home hypervisor - but I'll take it down and have a look in Proxmox. I don't remember the firmware release on my SFN5122 being that old, though...
 

arglebargle

Alright, I dug into this a bit more today with a second adapter (SFN7122F). Same deal with the VF limitations: no more than 3 VFs per port.

Some notes on VF passthrough to guest operating systems: passthrough to an Ubuntu 18.04 guest works fine, with no driver issues, but passthrough to FreeBSD fails miserably. Both the in-tree FreeBSD sfxge driver and the vendor-supplied driver fail to attach to the device when working with a VF passed through to the guest.

Edit: A little more on the 3-VF-per-port limit: the driver creates VFs for each port on alternating PCIe subaddresses (PCI function numbers). Physical port 1 gets subaddress .0 and physical port 2 gets .1. All VFs for port 1 are then assigned even-numbered subaddresses (VF 1: .2, VF 2: .4, VF 3: .6), and the VFs for port 2 get the odd ones (.3, .5, .7).

Any attempt to create VFs that would need a subaddress > 7 (addresses are zero-indexed) causes the entire block assignment to fail. You can verify this by loading the driver with "max_vfs=3,6": VFs for port 1 should be assigned correctly while VFs for port 2 fail.
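
The numbering above can be sketched with the usual SR-IOV rule that each VF sits at the PF's address plus a first-VF offset plus a multiple of a stride. This is only an illustration: the offset of 2 and stride of 2 are assumptions inferred from the dmesg output in this thread, not values read from the card, and the function names are made up.

```python
# ASSUMPTION: offset and stride inferred from the dmesg output in this thread
# (PF .0 -> VFs .2/.4/.6, PF .1 -> VFs .3/.5/.7), not read from the card.
FIRST_VF_OFFSET = 2
VF_STRIDE = 2
FUNCTIONS_WITHOUT_ARI = 8  # 3-bit function number: 0-7


def vf_devfns(pf_function, num_vfs, offset=FIRST_VF_OFFSET, stride=VF_STRIDE):
    """Return the 8-bit device/function value of each VF of one PF."""
    return [pf_function + offset + i * stride for i in range(num_vfs)]


def split_devfn(devfn):
    """Read an 8-bit devfn the non-ARI way: 5-bit device, 3-bit function."""
    return devfn >> 3, devfn & 0x7


def fits_without_ari(pf_function, num_vfs):
    """True if every VF stays within functions 0-7 of device 0."""
    return all(d < FUNCTIONS_WITHOUT_ARI for d in vf_devfns(pf_function, num_vfs))


if __name__ == "__main__":
    for pf in (0, 1):
        for count in (3, 4):
            names = ["%02x.%d" % split_devfn(d) for d in vf_devfns(pf, count)]
            print(f"PF .{pf} max_vfs={count}: {names} fits={fits_without_ari(pf, count)}")
```

Under these assumptions a fourth VF on the first port would need value 8, which a non-ARI bus reads as device 01, function 0. That lines up with the "pci 0000:01:01.0: unknown header type 7f" line in the failing dmesg output earlier in the thread.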

For some reason I'm unable to address more than 8 PCIe subaddresses on the card; I'm not sure if this is a Solarflare issue or a BIOS issue on my test machine.

Alright, I've figured out the 8-subaddress limit too. The subaddress (the PCI function number) is a 3-bit field, so only addresses 0-7 are available. If your upstream PCIe bridge or port supports ARI forwarding (Alternative Routing-ID Interpretation), the function number widens to 8 bits and you can address up to 256 functions per device. My test machine doesn't support ARIFwd, so that's that.
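
One way to check a machine for this is to look for the ARIFwd flag that `lspci -vv` prints on the DevCap2/DevCtl2 lines of root ports and downstream ports. Below is a rough sketch, assuming that output layout (it varies between pciutils versions, and the capability lines usually need root to be shown). The function names are hypothetical.

```python
import re
import subprocess

# A device entry in lspci output starts in column 0 with its address;
# detail lines below it are indented.
DEVICE_LINE = re.compile(r"^([0-9a-fA-F:.]+)\s")


def ari_forwarding_flags(lspci_vv_text):
    """Return {device address: ['+' or '-', ...]} for devices that print ARIFwd.

    The flags are listed in the order they appear, which is normally the
    capability (DevCap2) first and the current setting (DevCtl2) second.
    """
    flags = {}
    current = None
    for line in lspci_vv_text.splitlines():
        header = DEVICE_LINE.match(line)
        if header:
            current = header.group(1)
        elif current is not None:
            found = re.findall(r"ARIFwd([+-])", line)
            if found:
                flags.setdefault(current, []).extend(found)
    return flags


def read_lspci():
    """Capture `lspci -vv` output (run as root to see capability details)."""
    return subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout
```

Calling `ari_forwarding_flags(read_lspci())` and looking up the bridge above the NIC (00:02.1 in the lspci tree from the first post) shows whether that port advertises ARI forwarding and whether it is switched on.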
 

vanfawx

Active Member
Jan 4, 2015
365
67
28
45
Vancouver, Canada
That's strange. Even Intel's gigabit cards (82576 and newer) support 7 VFs per physical function. If you're using their 10GbE cards, I think you get 31 VFs per physical port.
 

WANg

vanfawx said:
"That's strange. Even on Intel's gigabit cards (82576 and greater) they support 7VF per physical function. If you're using their 10gbe cards, I think you get 31VF's per physical port."

That's not a limitation of the card itself. It has to do with the PCIe root complex's inability to provide Alternative Routing-ID Interpretation (ARI), which is what lets the VFs be created off the PFs and referenced. The hardware is in theory able to do it (it's AMD Kaveri based); the BIOS simply didn't set up the hardware and MMIO space to allow it.
 

arglebargle

Yeah, I've been over that post too. My initial testing was in an HP T730 thin client that I set up as a low-power virtualization host, but I've tested on a bunch of chipsets at this point (H170, Z97, Q87, C216, etc.).
 

llowrey

Active Member
Feb 26, 2018
167
138
43
WANg said:
"That's not a limitation on the card itself - it has to do with the PCIe root complex's inability to provide the alternative routing-id interpretation (ARI) so the VFs can be created off the PFs and referenced. The hardware is in theory able to do it (it's AMD Kaveri based) - the BIOS simply didn't setup the hardware and MMIO space to allow it."

My apologies for reviving a 2-year-old thread, but this just saved me a bunch of time and pain.

I'm running an ASRock Rack EPYCD8 and have my Mellanox CX3 set to 16 VFs. I updated the BIOS to 2.40 and suddenly could only get 7 VFs to work, with the dreaded "!!! Unknown header type 7f" for the remaining VFs. Turns out the new BIOS defaulted ARI to disabled. I flipped it to 'auto' and got my 16 VFs back. I probably would have given up if I had not found this. So, THANK YOU!
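
For anyone hitting the same symptom, the kernel log can be scanned for the "unknown header type 7f" lines shown earlier in the thread. This is a small sketch that assumes the message format from those dmesg excerpts; the function name is made up.

```python
import re

# Matches kernel lines such as:
#   pci 0000:01:01.0: unknown header type 7f, ignoring device
UNKNOWN_HEADER = re.compile(r"pci (\S+): unknown header type 7f", re.IGNORECASE)


def unreachable_functions(dmesg_text):
    """Return the PCI addresses the kernel gave up on with header type 7f."""
    return UNKNOWN_HEADER.findall(dmesg_text)
```

If the addresses returned sit just past function 7 of the NIC's device (for example 01:01.0 for a card at 01:00.x), a disabled or missing ARI setting is a likely suspect.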
 