Solarflare SR-IOV on Linux (Partially Solved)

Discussion in 'Linux Admins, Storage and Virtualization' started by arglebargle, Sep 5, 2018.

  1. arglebargle

    arglebargle H̸̖̅ȩ̸̐l̷̦͋l̴̰̈ỏ̶̱ ̸̢͋W̵͖̌ò̴͚r̴͇̀l̵̼͗d̷͕̈

    Joined:
    Jul 15, 2018
    Messages:
    245
    Likes Received:
    75
    I'm having some trouble getting VFs going with an SFN5122F (SFN5000 series). There's basically no documentation on troubleshooting these cards, so I'm documenting the issue here so it shows up when anyone else turns to Google for an answer.

    I've installed and loaded the vendor DKMS module and enabled everything for SR-IOV using sfboot, per the Solarflare documentation. The only odd thing is that the documentation says the firmware needs to be in "full-feature" mode, which I don't seem to be able to use on an SFN5000/6000 card. I can set the mode, but after a shutdown/reboot neither ethtool nor sfboot reports full-feature as active.

    The documentation states that SFN5000/6000 cards don't fully support SR-IOV in the same way that 7k/8k models do, but I'm unable to find any documentation re: SR-IOV configuration on these older boards.

    @WANg Any thoughts? I didn't want to hijack your build thread for technical support on this but I remember you got this working on one of your boards.

    Code:
    root@ted:~# ethtool -i enp1s0f0np0
    driver: sfc
    version: 4.13.1.1034
    firmware-version: 3.3.2.1000
    expansion-rom-version:
    bus-info: 0000:01:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: no
    supports-register-dump: yes
    supports-priv-flags: yes
    
    Code:
    root@ted:~# sfboot
    Solarflare boot configuration utility [v7.1.3]
    Copyright Solarflare Communications 2006-2018, Level 5 Networks 2002-2005
    
    enp1s0f0np0:
      Boot image                            Disabled
      PF MSI-X interrupt limit              32
      SR-IOV                                Enabled
      Virtual Functions on each PF          4
      VF MSI-X interrupt limit              8
    
    enp1s0f1np1:
      Boot image                            Disabled
      PF MSI-X interrupt limit              32
      SR-IOV                                Enabled
      Virtual Functions on each PF          4
      VF MSI-X interrupt limit              8
    
    Code:
    root@ted:~# dmesg | grep sfc
    [    3.829429] sfc: loading out-of-tree module taints kernel.
    [    3.838992] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Solarflare NIC detected
    [    3.846934] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Part Number : SFN5122F
    [    3.846947] sfc 0000:01:00.0: enabling device (0000 -> 0003)
    [    3.877240] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
    [    4.624154] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
    [    4.626797] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Part Number : SFN5122F
    [    4.626810] sfc 0000:01:00.1: enabling device (0000 -> 0003)
    [    4.656173] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): no PTP support
    [    4.666793] sfc 0000:01:00.0 enp1s0f0np0: renamed from eth0
    [    4.680311] sfc 0000:01:00.1 enp1s0f1np1: renamed from eth1
    
    Code:
    root@ted:~# lspci -tv
    -[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Root Complex
              +-00.2  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) I/O Memory Management Unit
              +-01.0  Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R7 Graphics]
              +-01.1  Advanced Micro Devices, Inc. [AMD/ATI] Kaveri HDMI/DP Audio Controller
              +-02.0  Advanced Micro Devices, Inc. [AMD] Device 1424
              +-02.1-[01]--+-00.0  Solarflare Communications SFC9020 [Solarstorm]
              |            \-00.1  Solarflare Communications SFC9020 [Solarstorm]
              +-03.0  Advanced Micro Devices, Inc. [AMD] Device 1424
              +-03.2-[02]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
              +-04.0  Advanced Micro Devices, Inc. [AMD] Device 1424
              +-10.0  Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller
              +-10.1  Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller
              +-11.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
              +-12.0  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
              +-12.2  Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller
              +-13.0  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
              +-13.2  Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller
              +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
              +-14.1  Advanced Micro Devices, Inc. [AMD] FCH IDE Controller
              +-14.2  Advanced Micro Devices, Inc. [AMD] FCH Azalia Controller
              +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
              +-14.4-[03]--
              +-18.0  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 0
              +-18.1  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 1
              +-18.2  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 2
              +-18.3  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 3
              +-18.4  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 4
              \-18.5  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 5
    

    Edit: I'm striking out on this so far. Here's what pops up in dmesg before SR-IOV fails:

    Code:
    root@ted:~/tmp/mlx# modprobe sfc max_vfs=4
    
    dmesg:
    
    [  830.657340] pps_core: LinuxPPS API ver. 1 registered
    [  830.657343] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
    [  830.658749] PTP clock support registered
    [  830.666762] Solarflare NET driver v4.13.1.1034
    [  830.666805] Registered efx_mcdi_proxy genl family as 28
    [  830.666901] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Solarflare NIC detected
    [  830.668735] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Part Number : SFN5122F
    [  830.705680] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
    [  831.544552] pci 0000:01:00.2: [1924:1803] type 00 class 0x020000
    [  831.544988] iommu: Adding device 0000:01:00.2 to group 15
    [  831.545024] iommu: Using direct mapping for device 0000:01:00.2
    [  831.545194] pci 0000:01:00.4: [1924:1803] type 00 class 0x020000
    [  831.545506] iommu: Adding device 0000:01:00.4 to group 16
    [  831.545532] iommu: Using direct mapping for device 0000:01:00.4
    [  831.545627] pci 0000:01:00.6: [1924:1803] type 00 class 0x020000
    [  831.545938] iommu: Adding device 0000:01:00.6 to group 17
    [  831.545961] iommu: Using direct mapping for device 0000:01:00.6
    [  831.546065] pci 0000:01:01.0: [1924:1803] type 7f class 0xffffff
    [  831.546081] pci 0000:01:01.0: unknown header type 7f, ignoring device
    [  831.546254] iommu: Removing device 0000:01:00.6 from group 17
    [  831.546377] iommu: Removing device 0000:01:00.4 from group 16
    [  831.546602] iommu: Removing device 0000:01:00.2 from group 15
    [  832.552640] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): SR-IOV can't be enabled rc -5
    [  832.565729] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
    [  832.566333] sfc 0000:01:00.0 enp1s0f0np0: renamed from eth0
    [  832.567644] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Part Number : SFN5122F
    [  832.602788] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): no PTP support
    [  832.716523] pci 0000:01:00.3: [1924:1803] type 00 class 0x020000
    [  832.716967] iommu: Adding device 0000:01:00.3 to group 15
    [  832.717000] iommu: Using direct mapping for device 0000:01:00.3
    [  832.717171] pci 0000:01:00.5: [1924:1803] type 00 class 0x020000
    [  832.717487] iommu: Adding device 0000:01:00.5 to group 16
    [  832.717512] iommu: Using direct mapping for device 0000:01:00.5
    [  832.717605] pci 0000:01:00.7: [1924:1803] type 00 class 0x020000
    [  832.717896] iommu: Adding device 0000:01:00.7 to group 17
    [  832.717918] iommu: Using direct mapping for device 0000:01:00.7
    [  832.718003] pci 0000:01:01.1: [1924:1803] type 7f class 0xffffff
    [  832.718019] pci 0000:01:01.1: unknown header type 7f, ignoring device
    [  832.718169] iommu: Removing device 0000:01:00.7 from group 17
    [  832.718272] iommu: Removing device 0000:01:00.5 from group 16
    [  832.718365] iommu: Removing device 0000:01:00.3 from group 15
    [  833.736574] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): SR-IOV can't be enabled rc -5
    [  833.741315] sfc 0000:01:00.1 enp1s0f1np1: renamed from eth0
    
    I've tried adding "pci=realloc" to the kernel command line per the Solarflare docs, but that doesn't help.

    SR-IOV works fine on other NICs in this machine.
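
    For anyone retracing this, one quick cross-check is to ask the kernel how many VFs a PF advertises and how many are currently enabled. The sketch below reads the standard `sriov_totalvfs` and `sriov_numvfs` sysfs attributes. The function name `show_sriov` and its optional sysfs-root argument are made up for illustration, and I'm assuming the out-of-tree sfc driver enables VFs through the normal PCI core path so that `sriov_numvfs` reflects them.

```shell
#!/bin/sh
# Print SR-IOV VF counts for one PCI function.
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root (optional, defaults to /sys; hypothetical parameter
#     added here only so the function can be pointed at a test tree)
show_sriov() {
    dev="$1"
    root="${2:-/sys}"
    dir="$root/bus/pci/devices/$dev"
    # sriov_totalvfs only exists on functions with an SR-IOV capability.
    if [ ! -r "$dir/sriov_totalvfs" ]; then
        echo "$dev: no SR-IOV capability exposed"
        return 1
    fi
    total=$(cat "$dir/sriov_totalvfs")
    active=$(cat "$dir/sriov_numvfs")
    echo "$dev: total=$total active=$active"
}
```

    On the machine in this thread that would be called as `show_sriov 0000:01:00.0` and `show_sriov 0000:01:00.1`. Note that `total` is what the card advertises, not what the platform can actually address.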
     
    #1
    Last edited: Sep 6, 2018
  2. arglebargle

    Alright, I've got the driver creating VFs. Apparently I can't create more than 3 VFs per port; asking for more makes the driver fail with "SR-IOV can't be enabled rc -5".

    Note: max_vfs=5,1 doesn't work either; 3 VFs is the hard limit for any port.

    Here's the relevant section from an older copy of the solarflare adapter user guide:

    [Attachment: Screenshot 2018-09-06 at 10.14.55 AM.png]

    Code:
    root@ted:~# modprobe sfc max_vfs=3
    
    relevant dmesg:
    [ 2233.105076] pps_core: LinuxPPS API ver. 1 registered
    [ 2233.105079] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
    [ 2233.105883] PTP clock support registered
    [ 2233.113302] Solarflare NET driver v4.13.1.1034
    [ 2233.113344] Registered efx_mcdi_proxy genl family as 28
    [ 2233.113439] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Solarflare NIC detected
    [ 2233.115259] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): Part Number : SFN5122F
    [ 2233.160207] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
    [ 2234.100777] pci 0000:01:00.2: [1924:1803] type 00 class 0x020000
    [ 2234.101203] iommu: Adding device 0000:01:00.2 to group 15
    [ 2234.101239] iommu: Using direct mapping for device 0000:01:00.2
    [ 2234.101448] pci 0000:01:00.4: [1924:1803] type 00 class 0x020000
    [ 2234.101764] iommu: Adding device 0000:01:00.4 to group 16
    [ 2234.101787] iommu: Using direct mapping for device 0000:01:00.4
    [ 2234.101879] pci 0000:01:00.6: [1924:1803] type 00 class 0x020000
    [ 2234.102205] iommu: Adding device 0000:01:00.6 to group 17
    [ 2234.102228] iommu: Using direct mapping for device 0000:01:00.6
    [ 2234.102332] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): enabled SR-IOV for 3 VFs, 1 VI per VF
    [ 2234.113322] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Solarflare NIC detected
    [ 2234.115542] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): Part Number : SFN5122F
    [ 2234.157244] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): no PTP support
    [ 2234.264781] pci 0000:01:00.3: [1924:1803] type 00 class 0x020000
    [ 2234.265213] iommu: Adding device 0000:01:00.3 to group 18
    [ 2234.265249] iommu: Using direct mapping for device 0000:01:00.3
    [ 2234.265465] pci 0000:01:00.5: [1924:1803] type 00 class 0x020000
    [ 2234.265842] iommu: Adding device 0000:01:00.5 to group 19
    [ 2234.265866] iommu: Using direct mapping for device 0000:01:00.5
    [ 2234.265996] pci 0000:01:00.7: [1924:1803] type 00 class 0x020000
    [ 2234.266384] iommu: Adding device 0000:01:00.7 to group 20
    [ 2234.266409] iommu: Using direct mapping for device 0000:01:00.7
    [ 2234.266577] sfc 0000:01:00.1 (unnamed net_device) (uninitialized): enabled SR-IOV for 3 VFs, 1 VI per VF
    [ 2234.275661] sfc 0000:01:00.1 enp1s0f1np1: renamed from eth1
    [ 2234.297080] sfc 0000:01:00.0 enp1s0f0np0: renamed from eth0
    
    I'm a bit stumped here. I'm able to create more VFs on my Mellanox cards in the same machine, though IIRC I'm still limited to something like 8 per card there as well.
     
    #2
    Last edited: Sep 6, 2018
  3. WANg

    WANg Active Member

    Joined:
    Jun 10, 2018
    Messages:
    184
    Likes Received:
    69
    Huh. I'll look into it this weekend - right now the t730 is up and running as my home hypervisor - but I'll take it down and have a look in Proxmox. I don't remember the firmware release on my SFN5122 being that old, though...
     
    #3
  4. arglebargle

    Alright, I dug into this a bit more today with a second adapter (SFN7122F). Same deal with the VF limitation: no more than 3 VFs per port.

    Some notes on VF passthrough to guest operating systems: passthrough to an Ubuntu 18.04 guest works fine, with no driver issues, but passthrough to FreeBSD fails miserably. Both the in-tree FreeBSD sfxge driver and the vendor-supplied driver fail to attach to the device when working with a VF passed through to the guest.

    Edit: A little more on the 3-VFs-per-port limit: the driver creates VFs for each port on alternating PCIe subaddresses (the function number after the dot in the PCI address). Physical port 1 gets subaddress .0 and physical port 2 gets .1, then all VFs for port 1 are assigned even-numbered subaddresses (VF 1: .2, VF 2: .4, VF 3: .6). Per the dmesg output above, the VFs for port 2 land on the odd ones (.3, .5, .7).

    Any attempt to create VFs that would need a subaddress > 7 (addresses are zero-indexed) causes the entire block assignment to fail. You can verify this by loading the driver with "max_vfs=3,6": VFs for port 1 should be assigned correctly while VFs for port 2 fail.
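
    The layout described above can be sketched as plain arithmetic. This assumes, going by the dmesg output earlier in the thread, that the two PFs sit at functions 0 and 1 and that each port's VFs follow at a stride of 2. The helper names are hypothetical and not part of any Solarflare tool.

```shell
#!/bin/sh
# Function number of VF $2 (1-based) on port $1 (0-based),
# assuming PFs at .0/.1 and a VF stride of 2, as seen in the logs.
vf_function() {
    echo $(( $1 + 2 * $2 ))
}

# Succeeds if $2 VFs on port $1 all fit in the 3-bit function field (0-7).
vfs_fit() {
    [ "$(vf_function "$1" "$2")" -le 7 ]
}

# How the low 8 bits of a routing ID are read without ARI:
# upper 5 bits are the device number, lower 3 bits the function number.
split_devfn() {
    echo "$(( $1 >> 3 )).$(( $1 & 7 ))"
}
```

    Under these assumptions `vf_function 0 4` gives 8 and `split_devfn 8` gives `1.0`, i.e. device 1, function 0. That lines up with the bogus `0000:01:01.0` device ("unknown header type 7f") that shows up in the failing dmesg output when a fourth VF is requested.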

    For some reason I'm unable to address more than 8 PCIe subaddresses on the card. I'm not sure if this is a Solarflare issue or a BIOS issue on my test machine.

    Alright, I've figured out the 8-subaddress limit too. The subaddress (function number) is a 3-bit field, limited to 0-7. If your upstream PCIe bridge or port supports ARI forwarding (Alternative Routing-ID Interpretation), the device and function fields are combined and you can address up to 256 functions per device. My test machine doesn't support ARIFwd, so that's that.
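
    To check a bridge or root port for ARI forwarding, `lspci -vv` (run as root so the capability list is visible) prints an `ARIFwd+` or `ARIFwd-` flag in its `DevCap2` and `DevCtl2` sections. Below is a rough filter for that output. The function name `ari_status` is my own, and the exact line layout differs between pciutils versions, so treat the parsing as an assumption rather than something guaranteed to match every system.

```shell
#!/bin/sh
# Read `lspci -vv` output for ONE device on stdin and report whether
# ARI forwarding is supported (DevCap2) and enabled (DevCtl2).
ari_status() {
    awk '
        # Remember which register section we are in; some pciutils
        # versions print ARIFwd on a continuation line.
        /DevCap2:/ { reg = "cap" }
        /DevCtl2:/ { reg = "ctl" }
        index($0, "ARIFwd+") {
            if (reg == "cap") cap = 1
            if (reg == "ctl") ctl = 1
        }
        END {
            printf "supported=%s enabled=%s\n",
                (cap ? "yes" : "no"), (ctl ? "yes" : "no")
        }
    '
}
```

    On the test machine here the port above the card is 00:02.1 (see the lspci -tv tree in the first post), so the check would be `lspci -vv -s 00:02.1 | ari_status`.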
     
    #4
    Last edited: Sep 13, 2018