Switching doesn't seem to work on Mellanox SX6012


vanfront

Member
Jun 5, 2020
36
14
8
I have an SX6012 (MLNX-OS) and am getting desperate trying to get it up and running. The switch doesn't seem to switch at all, which is, frankly, quite an issue for a switch :D

The desired setup is as follows: I have a 4-node Supermicro server, each node with a ConnectX-4 in a SIOM slot plus a ConnectX-3 in a PCIe slot. The CX4s are connected to ports 1 to 4, the CX3s to ports 9 to 12. Ports 5 to 8 are unoccupied. The cables are Mellanox cables for 56 Gbit Ethernet. All cards are in Ethernet mode. WinOF-2 drivers are installed for the CX4s, but the CX3 cards use the Microsoft drivers. The OS is Windows Server 2019 Datacenter with the roles and features needed for hyper-converged infrastructure. I am using WAC (Windows Admin Center) to set up the cluster, but I am stuck on network connectivity.

I am using two VLANs: 48 for the CX4s and 58 for the CX3s. The switch configuration has VLAN interfaces defined, each with an IP address assigned (192.168.48.1 and 192.168.58.1 respectively). The ports are members of their respective VLAN. All cards are configured in the OS with the respective VLAN ID. MTU in the OS is set to 9000.

All ports are configured as 56 Gbit and shown as such in Windows. To rule out issues with 56 Gbit, I downgraded ports 1 and 2 to 40 Gbit, but to no avail.

Ping doesn't work. I do have firewall exceptions, and the servers can ping each other over their 1 Gbit connections.

TCP doesn't work. I am using iPerf to run a test data stream. Again, it works via 1 Gbit.

ARP resolution doesn't work: arp-ping.exe won't resolve anything.

So far, all tests on CX4s only.
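
When ping, TCP and ARP all fail at once, one way to narrow things down is to ask the switch what it sees. The following is only a sketch: the commands belong to the general MLNX-OS command set, but their exact spelling has not been verified on 3.6.1002, and no output is shown because none was captured in this thread.

```
## Which system profile is active?
show system profile
## Link state, speed, MTU and packet counters for one port
show interfaces ethernet 1/1
## Switchport mode and VLAN membership of the ports
show interfaces switchport
show vlan
## Has the switch learned the servers' MAC addresses?
show mac-address-table
```

If the counters on a port increase while no MAC address is learned on it, that would suggest frames are arriving but being discarded, which points more towards VLAN tagging or port mode than towards cabling.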

Switch OS version: PPC_M460EX 3.6.1002 2016-06-09 20:24:26 ppc

System profile is vpi-single-switch.
[Screenshot: 1609194995787.png]

Current running switch config:
Code:
##
## Running database "boldbrick-dc-3"
## Generated at 2020/12/29 00:24:08 +0100
## Hostname: switch02
##

##
## Running-config temporary prefix mode setting
##
no cli default prefix-modes enable

##
## Port configuration
##
   port 1/1 type ethernet force
   port 1/2 type ethernet force
   port 1/3 type ethernet force
   port 1/4 type ethernet force
   port 1/5 type ethernet force
   port 1/6 type ethernet force
   port 1/7 type ethernet force
   port 1/8 type ethernet force
   port 1/9 type ethernet force
   port 1/10 type ethernet force
   port 1/11 type ethernet force
   port 1/12 type ethernet force
  
##
## Interface Ethernet configuration
##
   interface ethernet 1/1 mtu 9000 force
   interface ethernet 1/2 mtu 9000 force
   interface ethernet 1/3 mtu 9000 force
   interface ethernet 1/3 speed 56000 force
   interface ethernet 1/4 mtu 9000 force
   interface ethernet 1/4 speed 56000 force
   interface ethernet 1/5 speed 56000 force
   interface ethernet 1/6 speed 56000 force
   interface ethernet 1/7 speed 56000 force
   interface ethernet 1/8 speed 56000 force
   interface ethernet 1/9 mtu 9000 force
   interface ethernet 1/9 speed 56000 force
   interface ethernet 1/10 mtu 9000 force
   interface ethernet 1/10 speed 56000 force
   interface ethernet 1/11 mtu 9000 force
   interface ethernet 1/11 speed 56000 force
   interface ethernet 1/12 mtu 9000 force
   interface ethernet 1/12 speed 56000 force
   interface ethernet 1/1 switchport mode hybrid
   interface ethernet 1/2 switchport mode hybrid
   interface ethernet 1/3 switchport mode hybrid
   interface ethernet 1/4 switchport mode hybrid
   interface ethernet 1/9 switchport mode hybrid
   interface ethernet 1/10 switchport mode hybrid
   interface ethernet 1/11 switchport mode hybrid
   interface ethernet 1/12 switchport mode hybrid
  
##
## VLAN configuration
##
   vlan 48
   vlan 58
  
   interface ethernet 1/1 switchport access vlan 48
   interface ethernet 1/2 switchport access vlan 48
   interface ethernet 1/3 switchport access vlan 48
   interface ethernet 1/4 switchport access vlan 48
   interface ethernet 1/9 switchport access vlan 58
   interface ethernet 1/10 switchport access vlan 58
   interface ethernet 1/11 switchport access vlan 58
   interface ethernet 1/12 switchport access vlan 58
   vlan 48 name "DC 40 GbE A (primary)"
   vlan 58 name "DC 40 GbE B (secondary)"
  
##
## STP configuration
##
   spanning-tree port type edge default
  
##
## L3 configuration
##
   ip l3
   ip routing vrf default
   interface vlan 48
   interface vlan 58
   interface vlan 48 ip address 192.168.48.1 255.255.255.0
   interface vlan 48 mtu 9000
   interface vlan 58 ip address 192.168.58.1 255.255.255.0
   interface vlan 58 mtu 9000
  
##
## DCBX PFC configuration
##
   dcb priority-flow-control enable force
   dcb priority-flow-control priority 3 enable
   interface ethernet 1/1 dcb priority-flow-control mode on force
   interface ethernet 1/2 dcb priority-flow-control mode on force
   interface ethernet 1/3 dcb priority-flow-control mode on force
   interface ethernet 1/4 dcb priority-flow-control mode on force
   interface ethernet 1/9 dcb priority-flow-control mode on force
   interface ethernet 1/10 dcb priority-flow-control mode on force
   interface ethernet 1/11 dcb priority-flow-control mode on force
   interface ethernet 1/12 dcb priority-flow-control mode on force
  
##
## LLDP configuration
##
   lldp
  
##
## Network interface configuration
##
no interface mgmt0 dhcp
   interface mgmt0 ip address 192.168.8.3 /24
  
##
## Network interface IPv6 configuration
##
no interface mgmt0 ipv6 address autoconfig default
no interface mgmt0 ipv6 enable
  
##
## Other IP configuration
##
   ip name-server 192.168.8.21
   ip name-server 192.168.8.22
   ip name-server 8.8.8.8
   ip route 0.0.0.0 0.0.0.0 192.168.8.1
   hostname switch02
   ip domain-list corp.boldbrick.com
  
##
## Other IPv6 configuration
##
no ipv6 enable

All ports are green in the MLNX-OS web interface:
[Screenshot: 1609195104605.png]

All networking config as shown in WAC:
[Screenshot: 1609194780238.png]

Any help will be greatly appreciated.
 

Rand__

Well-Known Member
Mar 6, 2014
6,634
1,767
113
Does it work without RDMA setup?

And you are sure you set up the VLANs properly?
 

Rand__

I noticed you have DCB/PFC set up, so I wanted to see if it's working without it, just to get a working starting point.
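
A minimal sketch of what such a baseline could look like on the switch side, for the two ports used in the tests. The "no" form is an assumption that simply mirrors the "mode on force" lines in the running config above; check it against the CLI help before relying on it.

```
## Assumption: "no ... mode force" is the inverse of "... mode on force"
no interface ethernet 1/1 dcb priority-flow-control mode force
no interface ethernet 1/2 dcb priority-flow-control mode force
## Review the resulting PFC state
show dcb priority-flow-control
```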
 

vanfront

Rand__ said:
I noticed you have DCB/PFC set up, so I wanted to see if it's working without it, just to get a working starting point.

Good point. I will try removing both. Note that I do have the DCB feature installed in Windows Server.
 

vanfront

So I disabled DCB PFC for ports 1 to 4. I am running a continuous ping from server A (port 1) to server B (port 2) and vice versa. I also disabled flow control in the CX4s' drivers.

Current config (relevant parts):

Code:
##
## Interface Ethernet configuration
##
   interface ethernet 1/1 mtu 9000 force
   interface ethernet 1/2 mtu 9000 force
   interface ethernet 1/3 flowcontrol receive on force
   interface ethernet 1/3 flowcontrol send on force
   interface ethernet 1/3 mtu 9000 force
   interface ethernet 1/3 speed 56000 force
   interface ethernet 1/4 flowcontrol receive on force
   interface ethernet 1/4 flowcontrol send on force
   interface ethernet 1/4 mtu 9000 force
   interface ethernet 1/4 speed 56000 force
   interface ethernet 1/5 speed 56000 force
   interface ethernet 1/6 speed 56000 force
   interface ethernet 1/7 speed 56000 force
   interface ethernet 1/8 speed 56000 force
   interface ethernet 1/9 mtu 9000 force
   interface ethernet 1/9 speed 56000 force
   interface ethernet 1/10 mtu 9000 force
   interface ethernet 1/10 speed 56000 force
   interface ethernet 1/11 mtu 9000 force
   interface ethernet 1/11 speed 56000 force
   interface ethernet 1/12 mtu 9000 force
   interface ethernet 1/12 speed 56000 force
   interface ethernet 1/1 switchport mode hybrid
   interface ethernet 1/2 switchport mode hybrid
   interface ethernet 1/3 switchport mode hybrid
   interface ethernet 1/4 switchport mode hybrid
   interface ethernet 1/9 switchport mode hybrid
   interface ethernet 1/10 switchport mode hybrid
   interface ethernet 1/11 switchport mode hybrid
   interface ethernet 1/12 switchport mode hybrid
  
##
## VLAN configuration
##
   vlan 48
   vlan 58
  
   interface ethernet 1/1 switchport access vlan 48
   interface ethernet 1/2 switchport access vlan 48
   interface ethernet 1/3 switchport access vlan 48
   interface ethernet 1/4 switchport access vlan 48
   interface ethernet 1/9 switchport access vlan 58
   interface ethernet 1/10 switchport access vlan 58
   interface ethernet 1/11 switchport access vlan 58
   interface ethernet 1/12 switchport access vlan 58
   vlan 48 name "DC 40 GbE A (primary)"
   vlan 58 name "DC 40 GbE B (secondary)"
  
##
## STP configuration
##
   spanning-tree port type edge default
  
##
## L3 configuration
##
   ip l3
   ip routing vrf default
   interface vlan 48
   interface vlan 58
   interface vlan 48 ip address 192.168.48.1 255.255.255.0
   interface vlan 48 mtu 9000
   interface vlan 58 ip address 192.168.58.1 255.255.255.0
   interface vlan 58 mtu 9000
  
##
## DCBX PFC configuration
##
   dcb priority-flow-control enable force
   dcb priority-flow-control priority 3 enable
   interface ethernet 1/9 dcb priority-flow-control mode on force
   interface ethernet 1/10 dcb priority-flow-control mode on force
   interface ethernet 1/11 dcb priority-flow-control mode on force
   interface ethernet 1/12 dcb priority-flow-control mode on force

The switch shows 2 packets per second on either interface:

[Screenshot: 1609198014301.png]
[Screenshot: 1609198029891.png]
 

vanfront

Interesting thing: whenever I run iPerf3 to try connecting from server B (.14 / Eth 1/2) to server A (.13 / Eth 1/1), pings start timing out, instead of "host unreachable":

[Screenshot: 1609198440672.png]
 

vanfront

@Rand__ I reset the switch into the eth-single-switch mode. Then I removed the CX4 devices on both servers A and B and re-added them. It started working! :) Then I configured VLAN 48 on the switch; so far so good, both ping and TCP connections work. However, as soon as I configure VLAN ID 48 in the driver's advanced properties, it stops working. I'm not sure what's going on here.
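
For readers who want to try the same profile change, a rough sketch follows. The profile names are the ones mentioned in this thread. That the change asks for confirmation, discards the running configuration and reboots the switch is an assumption, so keep a copy of the "show running-config" output first.

```
## Show the active profile (vpi-single-switch in the first post)
show system profile
## Keep a copy of the current configuration text
show running-config
## Switch to the Ethernet-only profile
## Assumption: this wipes the configuration and reboots the switch
system profile eth-single-switch
```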
 

vanfront

Quote:
If you set the VLAN ID in the driver's properties, then you should set the switch port to trunk mode.

Ah, thanks for the hint. That might really be necessary then, because Windows Admin Center always seems to try setting a VLAN ID for each NIC it configures.
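
A sketch of what the quoted hint could look like for the two test ports. It assumes trunk mode accepts the same allowed-vlan form that the hybrid lines in this thread's running config use; that has not been verified on this firmware.

```
## Assumption: "switchport trunk allowed-vlan" mirrors the hybrid form
interface ethernet 1/1 switchport mode trunk
interface ethernet 1/1 switchport trunk allowed-vlan 48
interface ethernet 1/2 switchport mode trunk
interface ethernet 1/2 switchport trunk allowed-vlan 48
## Confirm mode and allowed VLANs
show interfaces switchport
```

The idea is that a NIC with a VLAN ID set sends tagged frames, so the switch port has to accept and send frames tagged with VLAN 48 rather than treat the port as untagged.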
 

Rand__

If you have a single VLAN configured at the NIC level, you need access mode with that VLAN; otherwise I usually set hybrid.
 

vanfront

Rand__ said:
If you have a single VLAN configured at the NIC level, you need access mode with that VLAN; otherwise I usually set hybrid.

I did try both access 48 and hybrid, but neither worked. Maybe I was missing some subtle configuration detail.
 

Rand__

Weird, the '12 (or whole line) is the least picky switch I ever used
 

klui

Well-Known Member
Feb 3, 2019
838
459
63
But 8012 isn't completely working. VLAN creation can't be done through the CLI, and there could be other non-obvious issues.
 

vanfront

I was able to get it up and running. What I did:
  1. reset the switch to the eth-single-switch system profile
  2. did all configuration step by step, checking after each step whether pings still worked
  3. on the server side, switched from the Microsoft drivers to the Mellanox WinOF and WinOF-2 drivers
  4. configured networking from PowerShell with the same properties as on the switch, without needing any Mellanox tools
  5. skipped the VLAN specification for the adapters
Now it works like a charm, with RDMA going at 52 Gbit/s. I'm running a Storage Spaces Direct cluster on top. So far the only drawback I have discovered is that I can't team the CX3 and CX4 cards together, as the Switch Embedded Teaming feature requires identical adapters.

This is my final working configuration:

Code:
##
## Interface Ethernet configuration
##
   interface ethernet 1/1 mtu 9000 force
   interface ethernet 1/1 speed 56000 force
   interface ethernet 1/2 mtu 9000 force
   interface ethernet 1/2 speed 56000 force
   interface ethernet 1/3 mtu 9000 force
   interface ethernet 1/3 speed 56000 force
   interface ethernet 1/4 mtu 9000 force
   interface ethernet 1/4 speed 56000 force
   interface ethernet 1/5 mtu 9000 force
   interface ethernet 1/5 speed 56000 force
   interface ethernet 1/6 mtu 9000 force
   interface ethernet 1/6 speed 56000 force
   interface ethernet 1/7 mtu 9000 force
   interface ethernet 1/7 speed 56000 force
   interface ethernet 1/8 mtu 9000 force
   interface ethernet 1/8 speed 56000 force
   interface ethernet 1/9 mtu 9000 force
   interface ethernet 1/9 speed 56000 force
   interface ethernet 1/10 mtu 9000 force
   interface ethernet 1/10 speed 56000 force
   interface ethernet 1/11 mtu 9000 force
   interface ethernet 1/11 speed 56000 force
   interface ethernet 1/12 mtu 9000 force
   interface ethernet 1/12 speed 56000 force
   interface ethernet 1/1 switchport mode hybrid
   interface ethernet 1/2 switchport mode hybrid
   interface ethernet 1/3 switchport mode hybrid
   interface ethernet 1/4 switchport mode hybrid
   interface ethernet 1/9 switchport mode hybrid
   interface ethernet 1/10 switchport mode hybrid
   interface ethernet 1/11 switchport mode hybrid
   interface ethernet 1/12 switchport mode hybrid
  
##
## VLAN configuration
##
   vlan 48
   vlan 58
  
   interface ethernet 1/1 switchport hybrid allowed-vlan 48
   interface ethernet 1/2 switchport hybrid allowed-vlan 48
   interface ethernet 1/3 switchport hybrid allowed-vlan 48
   interface ethernet 1/4 switchport hybrid allowed-vlan 48
   interface ethernet 1/9 switchport hybrid allowed-vlan 58
   interface ethernet 1/10 switchport hybrid allowed-vlan 58
   interface ethernet 1/11 switchport hybrid allowed-vlan 58
   interface ethernet 1/12 switchport hybrid allowed-vlan 58
   vlan 48 name "DC 40 GbE A (primary)"
   vlan 58 name "DC 40 GbE B (secondary)"
  
##
## L3 configuration
##
   interface vlan 48
   interface vlan 58
   interface vlan 48 ip address 192.168.48.1 255.255.255.0
   interface vlan 48 mtu 9000
   interface vlan 58 ip address 192.168.58.1 255.255.255.0
   interface vlan 58 mtu 9000
  
##
## DCBX PFC configuration
##
   dcb priority-flow-control enable force
   dcb priority-flow-control priority 3 enable
   interface ethernet 1/1 dcb priority-flow-control mode on force
   interface ethernet 1/2 dcb priority-flow-control mode on force
   interface ethernet 1/3 dcb priority-flow-control mode on force
   interface ethernet 1/4 dcb priority-flow-control mode on force
   interface ethernet 1/5 dcb priority-flow-control mode on force
   interface ethernet 1/6 dcb priority-flow-control mode on force
   interface ethernet 1/7 dcb priority-flow-control mode on force
   interface ethernet 1/8 dcb priority-flow-control mode on force
   interface ethernet 1/9 dcb priority-flow-control mode on force
   interface ethernet 1/10 dcb priority-flow-control mode on force
   interface ethernet 1/11 dcb priority-flow-control mode on force
   interface ethernet 1/12 dcb priority-flow-control mode on force
  
##
## LLDP configuration
##
   lldp
  
##
## Network interface configuration
##
no interface mgmt0 dhcp
   interface mgmt0 ip address 192.168.8.3 /24
  
##
## Network interface IPv6 configuration
##
no interface mgmt0 ipv6 address autoconfig default
no interface mgmt0 ipv6 enable
  
##
## Other IP configuration
##
   ip name-server 192.168.8.21
   ip name-server 192.168.8.22
   ip name-server 8.8.8.8
   ip route 0.0.0.0 0.0.0.0 192.168.8.1
   hostname switch02
   ip domain-list corp.boldbrick.com
  
##
## Other IPv6 configuration
##
no ipv6 enable

So I guess my next project will be adding another SX6012, configuring MLAG and getting some high availability :)
 

vanfront

Quote:
Very glad to hear it's working now; any idea what the actual culprit was?

I think it was caused by the VLAN setting on the adapters. However, I can't confirm that for sure. I am planning to test it in the future, but getting production traffic onto the new cluster has priority now.

I have a bunch of quite reusable scripts for setting up networking in WS2019 for S2D, which I want to share in a separate thread as time allows. Using Windows Admin Center, which is supposed to be the preferred way, just didn't work out at all.