ESXi under Proxmox + pci-passthrough CX3 VF


_alex

Active Member
Jan 28, 2016
866
97
28
Bavaria / Germany
Hi,
for a PoC setup I installed ESXi nested on a Proxmox host.
I use SR-IOV on my ConnectX-3 (CX-3) and pass three VFs through to the ESXi VM.

The problem is that I have very little experience with ESXi and would like to avoid installing a different / older OFED if at all possible.

The result looks 'not so wrong', but the VFs are not showing up as vmnics:

Code:
[root@localhost:~] esxcfg-nics -l
Name    PCI          Driver      Link Speed      Duplex MAC Address       MTU    Description                   
vmnic3  0000:06:12.0 e1000       Up   1000Mbps   Full   2e:82:49:d1:e4:97 1500   Intel Corporation 82540EM Gigabit Ethernet Controller
[root@localhost:~] lspci |grep -i mellan
0000:01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function] [vmnic0]
0000:02:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function] [vmnic1]
0000:04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function] [vmnic2]
The vmnic numbering counts up, so the VFs are apparently recognized as NICs in some way, but they don't show up in the GUI or when listing the NICs on the CLI :(
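
To make the mismatch explicit, here is a minimal sketch that compares the two outputs above. The function name unclaimed_vmnics and the two file arguments are my own assumptions for illustration; it only relies on grep, tr and sort, and I have not verified it on every ESXi shell build.

```shell
# Print vmnic aliases that appear in lspci output but not in `esxcfg-nics -l`.
# $1: file holding saved `lspci` output
# $2: file holding saved `esxcfg-nics -l` output
# (function name and arguments are hypothetical, for illustration only)
unclaimed_vmnics() {
  # Extract aliases such as [vmnic0], strip the brackets, de-duplicate.
  grep -oE '\[vmnic[0-9]+\]' "$1" | tr -d '[]' | sort -u |
  while read -r nic; do
    # A claimed NIC starts a line in the esxcfg-nics listing.
    grep -q "^$nic " "$2" || echo "$nic"
  done
}
```

Usage would be along the lines of: lspci > /tmp/lspci.txt; esxcfg-nics -l > /tmp/nics.txt; unclaimed_vmnics /tmp/lspci.txt /tmp/nics.txt. With the output shown above, this should list the three VF aliases and not vmnic3.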

dmesg shows this (I extracted the parts that look suspicious):

Code:
2017-12-18T20:22:49.173Z cpu7:65956)Loading module nmlx4_core ...
2017-12-18T20:22:49.174Z cpu7:65956)Elf: 2043: module nmlx4_core has license BSD
2017-12-18T20:22:49.178Z cpu7:65956)nmlx: nmlx4_core: init_module called
2017-12-18T20:22:49.178Z cpu7:65956)Device: 191: Registered driver 'nmlx4_core' from 16
2017-12-18T20:22:49.178Z cpu7:65956)Mod: 4968: Initialization of nmlx4_core succeeded with module ID 16.
2017-12-18T20:22:49.178Z cpu7:65956)nmlx4_core loaded successfully.
2017-12-18T20:22:49.179Z cpu5:65927)nmlx4_core: 0000:01:00.0: nmlx4_BarsMap - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_pci.c:174) failed to map bar 0: Out of resources
2017-12-18T20:22:49.179Z cpu5:65927)nmlx4_core: 0000:01:00.0: nmlx4_PciStart - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_pci.c:404) nmlx4_BarsMap failed: Out of resources
2017-12-18T20:22:49.179Z cpu5:65927)nmlx4_core: nmlx4_core_Attach - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_main.c:2469) nmlx4_PciStart failed: Out of resources
2017-12-18T20:22:49.212Z cpu5:65927)nmlx4_core: 0000:02:00.0: nmlx4_BarsMap - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_pci.c:174) failed to map bar 0: Out of resources
2017-12-18T20:22:49.212Z cpu5:65927)nmlx4_core: 0000:02:00.0: nmlx4_PciStart - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_pci.c:404) nmlx4_BarsMap failed: Out of resources
2017-12-18T20:22:49.212Z cpu5:65927)nmlx4_core: nmlx4_core_Attach - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_main.c:2469) nmlx4_PciStart failed: Out of resources
2017-12-18T20:22:49.213Z cpu5:65927)nmlx4_core: 0000:04:00.0: nmlx4_BarsMap - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_pci.c:174) failed to map bar 0: Out of resources
2017-12-18T20:22:49.213Z cpu5:65927)nmlx4_core: 0000:04:00.0: nmlx4_PciStart - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_pci.c:404) nmlx4_BarsMap failed: Out of resources
2017-12-18T20:22:49.213Z cpu5:65927)nmlx4_core: nmlx4_core_Attach - (vmkdrivers/native/BSD/Network/mlnx/nmlx4/nmlx4_core/nmlx4_core_main.c:2469) nmlx4_PciStart failed: Out of resources


[....]
2017-12-18T20:22:51.820Z cpu0:66020)<6>mlx4_core: Initializing 0000:04:00.0
2017-12-18T20:22:51.822Z cpu0:66020)<4>mlx4_core 0000:04:00.0: Detected virtual function - running in slave mode
2017-12-18T20:22:51.822Z cpu0:66020)<4>mlx4_core 0000:04:00.0: Sending reset
2017-12-18T20:22:51.823Z cpu0:66020)<4>mlx4_core 0000:04:00.0: Sending vhcr0
2017-12-18T20:22:51.824Z cpu0:66020)<4>mlx4_core 0000:04:00.0: HCA minimum page size:512
2017-12-18T20:22:51.825Z cpu0:66020)<3>mlx4_core 0000:04:00.0: Unknown pf context behaviour
2017-12-18T20:22:51.825Z cpu0:66020)<3>mlx4_core 0000:04:00.0: Failed to obtain slave caps
2017-12-18T20:22:51.825Z cpu0:66020)WARNING: vmklinux: pci_announce_device:1486: PCI: driver mlx4_core probe failed for device 0000:04:00.0
My guess is that fixing the first error, 'failed to map bar 0: Out of resources', might also resolve the errors that follow.
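
To see at a glance which VFs are affected, a small filter like the following can count the failure lines per PCI address. This is only a sketch: the function name summarize_mlx_failures is made up, and it assumes the log lines keep the format shown above (driver name, PCI address, then a message containing "failed" or "Failed").

```shell
# Count mlx4_core / nmlx4_core failure lines per PCI address.
# Reads a vmkernel log (or dmesg output) on stdin.
# (function name is hypothetical; log format assumed to match the excerpt above)
summarize_mlx_failures() {
  awk '
    /mlx4_core/ && /[Ff]ailed/ {
      # Look for an address of the form dddd:bb:dd.f; lines without one are skipped.
      if (match($0, /[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7]/)) {
        count[substr($0, RSTART, RLENGTH)]++
      }
    }
    END { for (addr in count) print addr, count[addr] }
  ' | sort
}
```

It could be fed with: dmesg | summarize_mlx_failures. Lines such as the nmlx4_core_Attach message carry no PCI address and are therefore not counted.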

I wonder whether the root cause is my VM config in PVE or the mlx4 drivers in ESXi (nmlx4_core / mlx4_core) not working on a VF ...

Any ideas / help on how to get this running would be greatly appreciated.

Alex
 

_alex

Add-on / more information:

- I performed the necessary steps to enable nested virtualization in KVM and enabled the IOMMU etc.
- The machine type of the ESXi VM is q35; I use hostpci0: 02:00.7,pcie=1 in the VM's conf file, with cpu=host (a Haswell).
- I can create Linux VMs on the nested ESXi that run fine and reasonably fast.
- I can pass through 02:00.7 with the same config from Proxmox to a Debian VM and have mlx4_en up and running on it, assign an IP and ping, so the passthrough for the VF seems to work.
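
For anyone wanting to double-check the same settings on their own box, here is a rough sketch that scans a Proxmox VM config for the three items listed above. The function name check_passthrough_conf is invented, and it assumes the config uses "key: value" lines (machine, cpu, hostpciN) as printed by qm config.

```shell
# Check a Proxmox VM config (on stdin) for the passthrough settings listed above.
# (function name is hypothetical; "key: value" line format is an assumption)
check_passthrough_conf() {
  conf=$(cat)
  for pattern in '^machine: .*q35' '^cpu: .*host' '^hostpci[0-9]+: .*pcie=1'; do
    if printf '%s\n' "$conf" | grep -Eq "$pattern"; then
      echo "ok:      $pattern"
    else
      echo "missing: $pattern"
    fi
  done
}
```

Example call, with 100 standing in for the real VM ID: qm config 100 | check_passthrough_conf. It only confirms that the lines exist; it says nothing about whether ESXi can map the BARs of the device.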


@MiniKnight: Never heard of it. Maybe the BIOS that Proxmox passes to ESXi is not right?
 

_alex

Hm, this seems to be more PCIe related, on the memory space mapping.
Maybe I was too sparing in giving only 4 GB of RAM to ESXi, as the CX-3 might require some space for its buffers.
I will un-rack the thing, put some more RAM in, assign more to ESXi and see if its kernel is happy with more RAM.