ConnectX VPI Infiniband card and Solaris 11.1

rune-san

Member
Feb 7, 2014
Hello all,

Long-time lurker, first-time poster. I was unsure whether this would be more suitable in the Networking or Solaris section, but settled on this one. To the point: I am trying to configure my Infiniband cards between my Solaris ZFS storage appliance and my ESXi 5.5 host. I started on the storage side to try to get the card working. The cards are Voltaire HCA 500Ex-D (HCA-00001) units: dual-port DDR 4x Infiniband gear. At first nothing referenced the card, but I noticed that prtconf listed it with no driver attached. After adding the IB packages (basically anything with the label infiniband or ib in the package repository, which included the Mellanox ConnectX driver), I now see the card under prtconf.

Here is a paste of prtconf -v, limited to the PCI instance of the card:

Code:
pci15b3,634a, instance #0
                System software properties:
                    name='ddi-forceattach' type=int items=1
                        value=00000001
                    name='active-dma-flush' type=int items=1
                        value=00000001
                    name='ddi-vhci-class' type=string items=1
                        value='ib'
                    name='iommu-dvma-mode' type=string items=1
                        value='unity'
                Driver properties:
                    name='fm-accchk-capable' type=boolean dev=none
                    name='fm-ereport-capable' type=boolean dev=none
                Hardware properties:
                    name='assigned-addresses' type=int items=10
                        value=83020010.00000000.fbd00000.00000000.00100000.c3020018.00000000.fa000000.00000000.00800000
                    name='reg' type=int items=15
                        value=00020000.00000000.00000000.00000000.00000000.03020010.00000000.00000000.00000000.00100000.43020018.00000000.00000000.00000000.00800000
                    name='compatible' type=string items=13
                        value='pciex15b3,634a.15b3.634a.a0' + 'pciex15b3,634a.15b3.634a' + 'pciex15b3,634a.a0' + 'pciex15b3,634a' + 'pciexclass,0c0600' + 'pciexclass,0c06' + 'pci15b3,634a.15b3.634a.a0' + 'pci15b3,634a.15b3.634a' + 'pci15b3,634a' + 'pci15b3,634a.a0' + 'pci15b3,634a' + 'pciclass,0c0600' + 'pciclass,0c06'
                    name='model' type=string items=1
                        value='InfiniBand'
                    name='power-consumption' type=int items=2
                        value=00000001.00000001
                    name='devsel-speed' type=int items=1
                        value=00000000
                    name='interrupts' type=int items=1
                        value=00000001
                    name='subsystem-vendor-id' type=int items=1
                        value=000015b3
                    name='subsystem-id' type=int items=1
                        value=0000634a
                    name='unit-address' type=string items=1
                        value='0'
                    name='class-code' type=int items=1
                        value=000c0600
                    name='revision-id' type=int items=1
                        value=000000a0
                    name='vendor-id' type=int items=1
                        value=000015b3
                    name='device-id' type=int items=1
                        value=0000634a
                Device Minor Nodes:
                    dev=(307,0)
                        dev_path=/pci@0,0/pci8086,3c04@2/pci15b3,634a@0:devctl
                            spectype=chr type=minor
                            dev_link=/dev/infiniband/hca/hermon0
Namely, it shows that the card has now got the proper hermon driver attached as hermon0 (which should be the default now that Tavor is gone in 11.1). The problem is that cfgadm shows nothing for it. Shouldn't there be an HCA listed? There are currently no lights on any of the ports of my IB card. The host and storage device are directly attached via CX4. No IB switch is involved (I intend to run OpenSM off the ESXi host). I have not worked on the ESXi host side of things yet.
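
For anyone reading along, here is a minimal Python sketch of how the relevant facts can be pulled out of the prtconf -v paste above: the PCI vendor and device IDs, and the /dev/infiniband/hca link that indicates the hermon instance attached. The excerpt is copied from the paste; the helper names `int_property` and `hca_dev_links` are made up for this illustration and are not part of any Solaris tool.

```python
import re

# A few lines copied from the prtconf -v paste above (only what the sketch needs).
PRTCONF_EXCERPT = """\
name='vendor-id' type=int items=1
    value=000015b3
name='device-id' type=int items=1
    value=0000634a
dev_path=/pci@0,0/pci8086,3c04@2/pci15b3,634a@0:devctl
    spectype=chr type=minor
    dev_link=/dev/infiniband/hca/hermon0
"""


def int_property(text, name):
    """Return a single-item int property (printed as hex) as an int, or None."""
    pattern = r"name='%s'[^\n]*\n\s*value=([0-9a-fA-F]+)" % re.escape(name)
    match = re.search(pattern, text)
    return int(match.group(1), 16) if match else None


def hca_dev_links(text):
    """Return every /dev/infiniband/hca/* device link named in the text."""
    return re.findall(r"dev_link=(/dev/infiniband/hca/\S+)", text)


vendor = int_property(PRTCONF_EXCERPT, "vendor-id")
device = int_property(PRTCONF_EXCERPT, "device-id")
print("PCI ID: %04x:%04x" % (vendor, device))
print("HCA links:", hca_dev_links(PRTCONF_EXCERPT))
```

A non-empty list of HCA links only says that the driver created a device node for the card. It says nothing about whether a port has link or whether a subnet manager is running.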

Is there an obvious step I'm missing to get a configurable HCA in cfgadm or some sort of NIC port? While ultimately I want to run SRP, I'd like to have some sort of baseline that I know works. Obviously right now it doesn't work. I appreciate any advice or clues that can be given, and of course I'm happy to pull up any more information that might help! :D

Thanks Community!
 

Chuckleb

Moderator
Mar 5, 2013
Minnesota
Not necessarily a Solaris answer, but I think when I connected my CentOS nodes together w/o a subnet manager, they wouldn't light up or anything. I think that the opensm is needed to get them on the same fabric. I don't remember for sure though, I just fire up opensm by habit now.
 

cactus

Moderator
Jan 25, 2011
CA
what does dladm show?

I only have experience with OmniOS and getting them to work as IPoIB.
 

rune-san

Member
Feb 7, 2014
Hey all, as always, I appreciate the responses! With regard to getting the SM up and running, I'm working on that at the moment, and hopefully it won't take too long to get going on that side (I have work and taxes all through the weekend, so executing some of this stuff may be delayed!).

To answer y'all's questions, I do not see anything referencing IB at all under cfgadm:

Code:
:~# cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
c8                             scsi-sas     connected    configured   unknown
c9                             scsi-sas     connected    unconfigured unknown
sata0/0                        sata-port    empty        unconfigured ok
sata0/1                        sata-port    empty        unconfigured ok
sata0/2                        sata-port    empty        unconfigured ok
sata0/3                        sata-port    empty        unconfigured ok
sata0/4                        sata-port    empty        unconfigured ok
sata0/5                        sata-port    empty        unconfigured ok
usb2/1                         usb-hub      connected    configured   ok
usb2/1.1                       unknown      empty        unconfigured ok
usb2/1.2                       unknown      empty        unconfigured ok
usb2/1.3                       unknown      empty        unconfigured ok
usb2/1.4                       unknown      empty        unconfigured ok
usb2/1.5                       unknown      empty        unconfigured ok
usb2/1.6                       usb-device   connected    configured   ok
usb2/2                         unknown      empty        unconfigured ok
usb3/1                         usb-hub      connected    configured   ok
usb3/1.1                       unknown      empty        unconfigured ok
usb3/1.2                       unknown      empty        unconfigured ok
usb3/1.3                       unknown      empty        unconfigured ok
usb3/1.4                       unknown      empty        unconfigured ok
usb3/1.5                       unknown      empty        unconfigured ok
usb3/1.6                       unknown      empty        unconfigured ok
usb3/1.7                       unknown      empty        unconfigured ok
usb3/1.8                       unknown      empty        unconfigured ok
usb3/2                         unknown      empty        unconfigured ok
You can see the SAS, SATA, and USB ports quite clearly, but no reference to IB like the Oracle documentation shows should be there.
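
As a rough illustration of that check, the Python sketch below scans cfgadm-style output for rows that look InfiniBand-related. It assumes, based on the Oracle documentation examples mentioned above, that IB attachment points use an Ap_Id of `ib`, `ib::...` or `hca:...` and a Type beginning with `IB-`; treat those patterns as assumptions rather than a complete list. The sample text is a shortened copy of the paste, and `ib_attachment_points` is a hypothetical helper name.

```python
# Shortened copy of the cfgadm paste above.
CFGADM_OUTPUT = """\
Ap_Id                          Type         Receptacle   Occupant     Condition
c8                             scsi-sas     connected    configured   unknown
c9                             scsi-sas     connected    unconfigured unknown
sata0/0                        sata-port    empty        unconfigured ok
usb2/1                         usb-hub      connected    configured   ok
usb2/1.6                       usb-device   connected    configured   ok
"""


def ib_attachment_points(cfgadm_output):
    """Return (ap_id, type) pairs for rows that look InfiniBand-related."""
    hits = []
    for line in cfgadm_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "Ap_Id":
            continue  # skip blank lines and the header row
        ap_id, ap_type = fields[0], fields[1]
        # Assumed naming patterns for IB attachment points (see lead-in).
        looks_ib = (
            ap_type.upper().startswith("IB-")
            or ap_id == "ib"
            or ap_id.startswith(("ib::", "hca:"))
        )
        if looks_ib:
            hits.append((ap_id, ap_type))
    return hits


print(ib_attachment_points(CFGADM_OUTPUT))
```

Run against the paste, the list comes back empty, which matches what is visible by eye: only SAS, SATA and USB attachment points are present.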

Dladm is equally empty:

Code:
:~# dladm
LINK                CLASS     MTU    STATE    OVER
net0                phys      1500   up       --
net2                phys      9000   up       --
net3                phys      9000   up       --
net1                phys      9000   up       --
aggr1               aggr      1500   up       net0 net1 net2 net3
No sign of an IB-related interface, just the 4 Ethernet links. As for SRP vs IPoIB, I'm flexible and am actually trying to get IPoIB working right now so that I have an actual working base. Then I thought I'd try experimenting with SRP. So feel free to talk from the position at which you're comfortable :)
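
To make the "just the 4 Ethernet links" reading explicit, here is a small Python sketch that parses the dladm table above and lists any physical link that is not claimed by another link in the OVER column. The function name `unclaimed_phys_links` is invented for this example. An empty result means every physical datalink is already accounted for by aggr1, so nothing is left over that could be an IB port.

```python
# Copy of the dladm paste above.
DLADM_OUTPUT = """\
LINK                CLASS     MTU    STATE    OVER
net0                phys      1500   up       --
net2                phys      9000   up       --
net3                phys      9000   up       --
net1                phys      9000   up       --
aggr1               aggr      1500   up       net0 net1 net2 net3
"""


def unclaimed_phys_links(dladm_output):
    """Return physical links that no other link lists in its OVER column."""
    phys_links = []
    claimed = set()
    for line in dladm_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "LINK":
            continue  # skip blank lines and the header row
        link, link_class, over = fields[0], fields[1], fields[4:]
        if link_class == "phys":
            phys_links.append(link)
        elif over != ["--"]:
            claimed.update(over)  # e.g. the members of an aggregation
    return [link for link in phys_links if link not in claimed]


print(unclaimed_phys_links(DLADM_OUTPUT))
```

This only reasons about the table as pasted. It cannot tell what media type a link is, so it is a sanity check on the output rather than a test for InfiniBand support.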