Infiniband noob with some routing issues

solon

Member
Apr 1, 2021
53
3
8
Hey all,

I've been slowly working on a sort of covid keep-oneself-occupied setup for my home network for a while. The scope eventually crept to a 40gbit infiniband network, which is working, at least for what I originally intended: faster storage over both smb and nfs-rdma. I'm going to do some srp things too, but as that's all 1 hop, that seems unlikely to be a problem.

One nice-to-have is getting the internet connection working over ipoib too, which it currently refuses to do, so I still have a gbit network running to each machine as well.

The server runs two VMs, one for pfsense and one for pihole, which is the network's dns server. Getting the connectX-3 to work on pfsense was a bit of a hassle, eventually requiring me to build the mellanox kernel modules on a sandbox freebsd 12 install and then copy them over to the pfsense box. Pfsense now recognizes the SR-IOV split connectX-3 cards on the server, and it's happy to talk to the server and my remote workstation - sort of.

Despite pfsense having the ib interface listed and working, it doesn't, for instance, route internet traffic over it. Firewall settings for all LANs on the pfsense guest are identical.

Is there something really basic I'm missing that explains why pfsense won't route internet traffic over the ib adapter like it does for the other LANs? Opensm is enabled, and obviously the adapter does have some connectivity.

I've spent a number of hours reading about infiniband configuration, which usually gives me some direction, but here I still feel like I have no idea why it isn't working.

As an example of something that mystifies me: I can ssh from my workstation or server (the one that runs the pfsense guest) to the pfsense LAN address over the gbe network, but not over the ipoib one. I can ssh the other way fine, from pfsense -> workstation or server. Is this something wrong with the SR-IOV setup that only allows partial connectivity?
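
The asymmetry described above can be checked in a repeatable way with a small TCP probe run from the workstation or server. This is only a sketch: the two addresses are hypothetical placeholders for pfsense's real gbe and ipoib LAN addresses, and `tcp_reachable` and `report` are illustrative names, not part of any existing tool.

```python
import socket


def tcp_reachable(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Hypothetical addresses: replace with pfsense's actual LAN IPs.
TARGETS = {
    "pfsense via gbe": "192.168.1.1",
    "pfsense via ipoib": "10.10.10.1",
}


def report(targets=TARGETS, port=22):
    """Probe every target once and return {label: reachable}."""
    return {label: tcp_reachable(addr, port) for label, addr in targets.items()}
```

Calling `report()` and seeing `True` for the gbe entry but `False` for the ipoib entry would reproduce the symptom. A timeout (packets silently dropped) rather than an immediate refusal would point more towards filtering or routing than towards sshd itself.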

Anyone who has a suggestion on some reading material that might help, or perhaps just knows what the problem is, I'd really like to hear about it.

Everything is running in connected mode.

Oh, the switch is an unmanaged IS5022. It seemed fine for my faster storage goal, but it would be nice if it was also capable of moving internet traffic, or at least if I understood why it can't.
 

necr

Active Member
Dec 27, 2017
124
39
28
122
So you can ssh over IB from pfsense to some other servers? Can you ping pfsense over IB?
 

solon

I have not been successful in pinging pfsense over ib from any source, though I have to admit some confusion about the whole ib ping command, which I should probably look into a little more thoroughly now that you mention it.

I can ssh/ping from pfsense to my workstation as well (over ipoib IPs), though not the other way around.

I can reach the host that runs the pfsense VM over ib, with ping/ssh (ipoib, normal ping command) working in both directions over the link. Pfsense responds to pings over the parallel gbit ethernet network from all the same machines, just not to pings over ib.

Expanding a little, with my recent forays trying to get some insight into ib commands, I have noticed that ibnodes doesn't list the SR-IOV virtual ib adapter as a node on the network, but then I don't really know if it's supposed to.
 

necr

"I can ssh/ping from pfsense to my workstation as well (over ipoib IPs), though not the other way around."
This could indicate a firewall or routing issue on pfsense; IPoIB itself doesn't seem to be broken. The output of ip r s and an iptables dump would be great to see.
 

solon

Right, well, I'm running into some limitations of the pfsense image here. There is no iptables; instead one needs to use pfctl to get an idea of what's going on with the fw, the results of which are:

Code:
ipsec rules/nat contents:

miniupnpd rules/nat contents:

natearly rules/nat contents:

natrules rules/nat contents:

openvpn rules/nat contents:

tftp-proxy rules/nat contents:

userrules rules/nat contents:
So probably not very useful.

The ip command doesn't exist on pfsense and can't be installed from the packages available.

I'm thinking of giving opnsense a try, as it can be installed on a regular freebsd install, which should at least give me access to all the normal tools and things with a minimum of hassle. I've already gotten ib working on a freebsd sandbox vm; that's where I built the modules to make ib work at all on the pfsense VM.
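
For reference, the information asked for earlier in the thread has FreeBSD counterparts: `netstat -rn` prints the routing table, and `pfctl -sr` and `pfctl -sn` print the loaded filter and NAT rules. The sketch below gathers all three in one go. Wrapping them in Python is purely an assumption for illustration (a stock pfsense image may not ship an interpreter, in which case the three commands can simply be typed at the shell), and `COMMANDS` and `collect` are hypothetical names.

```python
import shutil
import subprocess

# FreeBSD/pfsense counterparts of `ip r s` and an iptables dump.
COMMANDS = {
    "routing table": ["netstat", "-rn"],
    "filter rules": ["pfctl", "-sr"],
    "NAT rules": ["pfctl", "-sn"],
}


def collect(commands=COMMANDS, timeout=10):
    """Run each command whose binary exists and return {label: output}."""
    results = {}
    for label, argv in commands.items():
        if shutil.which(argv[0]) is None:
            # Binary is missing on this system; record that instead of failing.
            results[label] = f"{argv[0]}: not found on this system"
            continue
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        # Fall back to stderr so permission errors are not silently lost.
        results[label] = proc.stdout or proc.stderr
    return results
```

Printing each entry of `collect()` when run as root on the firewall gives one paste-able dump of the routing table and rule sets, which is roughly what `ip r s` plus an iptables dump would provide on Linux.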
 

Fallen Kell

Member
Mar 10, 2020
45
14
8
What is your exact ConnectX-3 card? Also, if you are really wanting to get the full speed of the network, unless you have an extremely fast CPU, using ipoib will be bottlenecked by your CPU as it requires the CPU to perform many of the standard network features (i.e. it is not using a hardware offload capability of the network card). Most CPUs will max at around 20-28gbps when doing this. If you really want 40gbps links, you need to have a ConnectX-3 VPI or PRO card and change the port type via the mellanox tools to be "ethernet" on the ConnectX-3 card, and then you will be able to use the offload engines in those cards to get proper 40gbe speeds (and pfsense will actually recognize the ports as network interfaces when set this way).
 

solon

The card in the server is an MCX-354a-QCBT, originally Dell or whatever their IS prefix subbrand is, now flashed to the latest official firmware as described somewhere here. I haven't taken the step of trying to flash it to an FCBT (56gbit), though what I'm reading seems to suggest it might be possible. The switch is an IS5022 without a subnet manager. These are connectx-3pro VPI cards. I also have some connectX-2's, but they are in the windows PCs which I use for gaming, and I can spare cores on those to run storage access if need be; they both have far more threads than I'll need for the next few years. All the hardware, except a few of the cables, is Mellanox.

My board has 2x 12 core Xeon E5-2650L-V3's and currently 32GB RAM; an upgrade to 64GB is on the way. Network traffic other than RDMA based stuff will probably never exceed 110mbit, as that's as far as the internet connection goes around here. I'm considering a second cable internet connection, advertised as 600mbit, to load balance via pfsense, but quite honestly the cable companies are liars and the odds of anything I get from them even remotely approaching 600mbit are rather low. Load balanced by pfsense, I'd expect no more than 400mbit tops even with the second connection. I was considering upgrading to two e5-2650L-V4's, but the performance gain looks almost nonexistent, other than that it would give 4 more cores. I could go for higher tdp e5's I suppose, but my feeling is that performance should be adequate.

I've kind of made peace with having 2 cables to each PC, one at 1gbit for the internet connection and infiniband for the rdma stuff. If I can get the storage running at 25gbits, that would be fine; the nvme drives aren't likely to be much faster than that anyway. Also, every time I break something and want to google a solution, it's nice when one of the two network connections still works.

My use case is mostly storage, hence the infiniband. It mainly came about because my gpu passthrough was limiting my add-on card options on my workstation, combined with irritation at moving data to and from my server at 1gbit.
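
A rough sanity check on the 25gbit figure can be sketched by converting link rates into bytes per second. The `encoding_efficiency` parameter and the `QDR_8B10B` constant are illustrative names; the 8b/10b value assumes the link runs at QDR, where a nominal 40gbit link carries 32gbit of data, and protocol overhead on top of that is ignored.

```python
# Fraction of the signalling rate left for data with 8b/10b line encoding
# (assumption: the link is running at QDR).
QDR_8B10B = 8 / 10


def gbit_to_gbyte_per_s(gbit_per_s, encoding_efficiency=1.0):
    """Convert a link rate in Gbit/s to GB/s, optionally after line encoding."""
    return gbit_per_s * encoding_efficiency / 8


if __name__ == "__main__":
    for label, rate, eff in [
        ("1gbit ethernet", 1, 1.0),
        ("25gbit storage target", 25, 1.0),
        ("40gbit QDR link", 40, QDR_8B10B),
    ]:
        print(f"{label}: {gbit_to_gbyte_per_s(rate, eff):.3f} GB/s")
```

On these assumptions 25gbit works out to 3.125 GB/s, and a 40gbit QDR link tops out at about 4 GB/s of data, so a storage target of 25gbit sits comfortably inside what the link can carry.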