Mellanox Switches - Tips & Tricks


leonderooij

New Member
Feb 19, 2023
10
4
3
Wow @NablaSquaredG, thank you for the quick reply! OK, I will order two of the SSDs right away. About the fae cable-stamping-unlock: what is that? I currently have a whole bunch of (Q)SFP28 modules and DAC cables that work. They were purchased from fs.com (the Mellanox type). Do I still need to do anything to make the modules work once the switches are upgraded?

thanks again, Leon
 

leonderooij

Ahh, I found some info: fae cable-stamping-unlock is for using modules that are not Mellanox-specific, right?



switch-080852 [vip: master] (config) # fae cable-stamping-unlock ?
<type> Cable or transceiver speed
40g_lr4
eth_100g
100g_lr4
eth_sfp_25g


Interesting :) alright, I'll post here if^H^H when everything has worked out!
 

leonderooij

One final question, if you don't mind. You wrote:

"You will need a MicroUSB to USB OTG adapter for the USB stick with ONIE and the Mellanox OS installer"

I see only a regular USB Type-A port on the front of the switch. How would I use an OTG adapter? Should I not put the stick in the front? Or should I use a MicroUSB connector on the inside of the switch?
 

NablaSquaredG

Layer 1 Magician
Aug 17, 2020
1,360
829
113
I see only a regular USB Type-A on the front of the switch. How would I use an OTG adapter ? Should I not put the stick in the front ? Or should I use a MicroUSB connector on the inside of the switch ?
Oh, hold on. I thought you had an SN2100 (sorry, SN2010 and SN2100 are too similar).

For SN2100, you need an OTG adapter. For SN2010, you do not need an OTG adapter.
 

leonderooij

Alright, clear. I've ordered the SSDs and will try the upgrade when they arrive. Thanks again, will report back :)
 

NablaSquaredG

Alright clear, I've ordered the ssd's, will try the upgrade when they arrive, thanks again, will report back :)
Oh hold on, those SSDs you linked are for SN2700 and full width switches only.

Half-width switches like SN2100 or SN2010 use the M.2 2242 format. For M.2 I recommend the Transcend MTS552T2-I.
 

snek

New Member
Oct 7, 2023
3
0
1
snek.dev
Has anyone experienced the SX6012's management port behaving weirdly? It works fine if I plug it directly into my laptop, but it refuses to connect to my LAN. Neither DHCP nor a static address works. Looking at Wireshark captures, it is completely ignoring ARP, not making any DHCP requests, etc. The rest of the switch works perfectly fine.
 

leonderooij

Oh hold on, those SSDs you linked are for SN2700 and full width switches only.

Half width switches like SN2100 or SN2010 use M.2 2242 format. For M.2 I recommend Transcend MTS552T2-I
Ah, I see, thanks! I've ordered those as well now and will return the other ones.
 

leonderooij

It worked out!! One switch is upgraded :) I want to thank you again, @NablaSquaredG!

I have one more question. I see now during boot (GRUB) that I can choose between two partitions: partition 1 is on version 3.9.3202, while partition 2 is on the upgraded 3.10.4206. Should I remove the old one, or just leave it as-is?
Code:
switch [standalone: master] (config) # show images

Installed images:
  Partition 1:
    version: X86_64 3.9.3202 2021-08-11 15:04:27 x86_64

  Partition 2:
    version: X86_64 3.10.4206 2023-03-08 19:11:42 x86_64

Last boot partition: 2
Next boot partition: 2

Images available to be installed:
  No image files are available to be installed.

Serve image files via HTTP/HTTPS: no

No image install currently in progress.
Boot manager password is set.

Image signing              : trusted signature always required
Admin require signed images: yes

Settings for next boot only:
  Fallback reboot on configuration failure: yes (default)
Next I'll do the other switch and write down every step, will paste here when done.
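
On the two-partition output above: which image boots next can be read straight from "show images". A minimal sketch, assuming the output has been saved to a text file on a workstation (the function name next_boot_version and the file name below are made up for illustration), that prints the version installed in the partition listed as "Next boot partition":

```shell
#!/bin/sh
# Hypothetical helper: read saved `show images` output on stdin and print
# the version of the partition named in the "Next boot partition" line.
next_boot_version() {
    awk '
        /^ *Partition [0-9]+:/   { part = $2; sub(":", "", part) }
        /^ *version:/            { ver[part] = $3 }
        /^Next boot partition:/  { next_part = $4 }
        END {
            # Fail if the next-boot partition has no version recorded.
            if (next_part in ver) print ver[next_part]
            else exit 1
        }
    '
}
```

Usage would look like: next_boot_version < show-images.txt. It only reports what is installed; it does not answer whether the old partition should be kept.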
 

leonderooij

Maybe useful for someone else. Some things about the mgmt VRF were not literally the same as in the docs, so here are the steps I followed:

Preparation:

Download recovery and upgrade from 1.73 GB folder on MEGA

Extract the downloaded Mellanox.zip
Code:
unzip Mellanox.zip
Files are now in Mellanox/ dir

Rename the upgrade image from .zip to .img:
Code:
mv Mellanox/Upgrade/3.10.4206/onyx-X86_64-3.10.4206.zip \
Mellanox/Upgrade/3.10.4206/onyx-X86_64-3.10.4206.img
Download ONIE, two versions are available - see doc Mellanox/SSDReplacementWithONIENOSInstall.pdf page 7:
* 115200 bps: onie-recovery-x86_64-mlnx_x86-r0.iso
* 9600 bps: https://mellanox.box.com/s/dtydz931fa7l6t6ndtn00j43xrron3ck
(click download)

I see now that ONIE is also included in the Mellanox.zip file :)

Write the ONIE iso to USB stick:
Code:
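# /dev/sdx is a placeholder: replace it with your USB stick's device node.
# dd overwrites the target without asking, so double-check the device first.
sudo dd if=onie-recovery-x86_64-mlnx_x86-r0.iso of=/dev/sdx bs=1M
sudo sync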
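
Before moving the stick to the switch, it can be worth checking that the ISO really landed on it. A minimal sketch, not part of the original procedure, assuming GNU cmp (for the -n byte limit); the function name verify_image_written is made up for illustration. It compares only the first N bytes of the target, where N is the size of the image, because the stick is larger than the ISO:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first <image size> bytes of the target
# (a block device or a plain file) are identical to the image file.
verify_image_written() {
    image=$1
    target=$2
    # Image size in bytes; tr strips padding some wc implementations add.
    size=$(wc -c < "$image" | tr -d ' ') || return 2
    # -s: silent, -n: stop comparing after $size bytes (GNU cmp assumed).
    cmp -s -n "$size" "$image" "$target"
}
```

Usage would look like: sudo sh -c '. ./verify.sh; verify_image_written onie-recovery-x86_64-mlnx_x86-r0.iso /dev/sdx && echo OK', where verify.sh is a hypothetical file holding the function and /dev/sdx is the same placeholder as above.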
Replace the SSD in the switch - I used the following: Transcend TS128GMSA452T2

--

Recovery procedure:

Documented in Mellanox/SSDReplacementWithONIENOSInstall.pdf

Make sure serial and ethernet (mgmt) are connected - I'm using DHCP.

Put the USB stick in the switch and turn it on. During boot, press CTRL-B; when asked for a password, enter "admin".

In the BIOS, go to the rightmost tab, named "Save & Exit", and select your USB drive under "Boot Override" - do NOT choose the UEFI variant.

In the GRUB menu that follows, choose "ONIE: Embed ONIE".

After ONIE is installed, the switch reboots.

Remove USB drive.

In GRUB choose "ONIE: Install OS".

After it boots press enter, then - in order to stop auto-discovery - type:
Code:
onie-stop
You should have networking, ping your gateway.

Now scp the recovery image onto the switch into /tmp:
Code:
scp user@workstation:~/Mellanox/Recovery/3.9.3202/X86_64-3.9.3202-installer.bin /tmp
Install the image:
Code:
onie-nos-install /tmp/X86_64-3.9.3202-installer.bin
After that is done, the switch boots from partition 1 containing the new 3.9.3202 software.

Log in with user admin, password admin.

I chose the following options:

* Use the wizard: yes
* Hostname, default <enter>
* Use DHCP, default yes <enter>
* Enable IPv6, default yes <enter>
* Enable IPv6 SLAAC on mgmt0, default no <enter>
* Enable IPv6 DHCPv6 on mgmt0, default yes <enter>
* Update time <enter>
* Enable password hardening, type "no"
* Type admin password: admin
* Confirm admin password: admin
* Type monitor password: monitor
* Confirm monitor password: monitor
* press <enter> again to confirm config

--

Upgrade procedure:
Mostly taken from here: https://enterprise-support.nvidia.c...switch-os-software-on-mellanox-switch-systems

You're booted in Onyx 3.9.3202.

Check network, ping your gateway:
Code:
ping vrf mgmt 1.2.3.4
Steps to upgrade:
Code:
enable
configure terminal
image fetch vrf mgmt scp://user:pass@workstation/dir/to/Mellanox/Upgrade/3.10.4206/onyx-X86_64-3.10.4206.img
image install onyx-X86_64-3.10.4206.img
image boot next
configuration write
reload
The system now reboots. By default it picks the 2nd option (partition) in GRUB; let it continue.
The system upgrades further during boot.
Log in.

Code:
switch [standalone: master] > show version

Product name:      Onyx
Product release:   3.10.4206
Build ID:          #1-dev
Build date:        2023-03-08 19:11:42
Target arch:       x86_64
Target hw:         x86_64
Built by:          sw-r2d2-bot@8503df9ba338
Version summary:   X86_64 3.10.4206 2023-03-08 19:11:42 x86_64

Product model:     x86onie
Host ID:           ************
System serial num: ************
System UUID:       ********-****-****-****-************

Uptime:            2m 15.120s
CPU load averages: 1.05 / 0.40 / 0.15
Number of CPUs:    4
System memory:     2574 MB used / 5229 MB free / 7803 MB total
Swap:              0 MB used / 0 MB free / 0 MB total
Done! :)

Code:
enable
reload halt
 

y.smirnov

New Member
Jul 5, 2023
6
1
3
35
Has anyone set up RoCEv2 on Onyx? Could anyone help?

My setup is:
Mellanox SN2410 with Onyx 3.10.4206
4 * HP DL360 Gen10 with HP 25G 640FLR-SFP28 Ethernet cards


On the VMware side I've set:
Enable PFC and DSCP trust mode
#esxcli system module parameters set -m nmlx5_core -p "dcbx=1 pfctx=0x08 pfcrx=0x08 trust_state=2"
Set DSCP value to 26
#esxcli system module parameters set -m nmlx5_rdma -p "dscp_force=26"
Then rebooted.
vSAN and RDMA traffic has no VLAN in my setup.

The DV switch is set up as described in the NVIDIA guide.
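
For readers wondering where 0x08 and 26 come from: pfctx/pfcrx are bitmasks with one bit per priority, so priority 3 is bit 3 (0x08). A minimal sketch of that arithmetic follows. The DSCP part rests on an assumption: that the switch uses a default DSCP-to-priority mapping that keeps the top three DSCP bits (DSCP divided by 8), which would put DSCP 26 on priority 3. Check the actual mapping on your own switch. Both function names are made up for illustration.

```shell
#!/bin/sh
# Build a PFC bitmask from a list of priorities (bit N = priority N).
pfc_mask() {
    mask=0
    for prio in "$@"; do
        mask=$(( mask | (1 << prio) ))
    done
    printf '0x%02x\n' "$mask"
}

# ASSUMPTION: default mapping where priority = top three DSCP bits.
dscp_to_priority() {
    echo $(( $1 >> 3 ))
}

pfc_mask 3            # mask for priority 3 only
dscp_to_priority 26   # priority under the assumed default mapping
```

Under that assumption both settings point at the same priority that the switch config marks as lossless (switch-priority 3).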

On the Mellanox SN2410 side:
Code:
configure terminal
roce lossless
lldp
interface ethernet 1/1-1/32 qos trust L3
dcb priority-flow-control enable force
dcb priority-flow-control priority 3 enable
no advanced buffer management force
traffic pool roce-reserved type lossless
traffic pool roce-reserved memory percent 50.00
traffic pool roce-reserved map switch-priority 3
interface ethernet 1/1-1/32 traffic-class 6 dcb ets strict
interface ethernet 1/1-1/32 traffic-class 3 congestion-control ecn minimum-absolute 150 maximum-absolute 1500
The stats show no RoCEv2 traffic, and traffic is still handled as lossy:

Code:
mellanox [standalone: master] # sh roce

RoCE mode      : lossless
LLDP           : enabled
Port trust mode: L3

Application TLV:
  Selector: udp
  Protocol: 4791
  Priority: 3

Port congestion-control:
  Mode    : ecn, absolute
  Min (KB): 150
  Max (KB): 1500

PFC              : enabled
switch-priority 3: enabled

RoCE used TCs:
  ----------------------------------------------
  Switch-Priority   TC     Application   ETS  
  ----------------------------------------------
  3                 3      RoCE          WRR 50%
  6                 6      CNP           Strict

RoCE buffer pools:
  ----------------------------------------------------------------------------------------------
  Traffic                  Type      Memory   Switch        Memory actual   Usage    Max Usage
  Pool                               [%]      Priorities                                      
  ----------------------------------------------------------------------------------------------
  lossy-default            lossy     auto     0, 1, 2, 4,   2.9M            0        2.8M    
                                              5, 6, 7                              
  roce-reserved            lossless  50.00    3             2.9M            0        0        

Exception list:
N/A
Code:
mellanox [standalone: master] # show interface ethernet 1/1 counters roce

Eth1/1:
  Rx:
    0                     RoCE PG packets
    0                     RoCE PG bytes
    0                     RoCE no buffer discard
    264368130             CNP PG packets
    1489578479208         CNP PG bytes
    0                     CNP no buffer discard
    0                     RoCE PFC pause packets
    0                     RoCE PFC pause duration
    0                     RoCE buffer usage (bytes)
    0                     RoCE buffer max usage (bytes)
    0                     CNP buffer usage (bytes)
    2492448               CNP buffer max usage (bytes)
    0                     RoCE PG usage (bytes)
    0                     RoCE PG max usage (bytes)
    0                     CNP PG usage (bytes)
    2492448               CNP PG max usage (bytes)

  Tx:
    0                     ECN marked packets
    0                     RoCE TC packets
    0                     RoCE TC bytes
    0                     RoCE unicast no buffer discard
    0                     CNP TC packets
    0                     CNP TC bytes
    0                     CNP unicast no buffer discard
    0                     RoCE PFC pause packets
    0                     RoCE PFC pause duration
    0                     RoCE buffer usage (bytes)
    0                     RoCE buffer max usage (bytes)
    0                     CNP buffer usage (bytes)
    2126016               CNP buffer max usage (bytes)
    0                     RoCE TC usage (bytes)
    0                     RoCE TC max usage (bytes)
    0                     CNP TC usage (bytes)
    0                     CNP TC max usage (bytes)
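
When comparing counters like these across ports or over time, it helps to pull single values out of saved output. A minimal sketch, assuming the output of "show interface ethernet 1/1 counters roce" has been saved to a text file in exactly the layout shown above; the function name roce_counter is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print one counter from saved "counters roce" output.
#   $1 = section ("Rx" or "Tx"), $2 = exact counter label
roce_counter() {
    awk -v section="$1" -v name="$2" '
        # Track which section (Rx/Tx) the following lines belong to.
        /^ *(Rx|Tx): *$/ { cur = $1; sub(":", "", cur); next }
        cur == section {
            value = $1
            label = $0
            sub(/^ *[0-9]+ +/, "", label)   # drop the leading number
            sub(/ +$/, "", label)           # drop trailing spaces
            if (label == name) { print value; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would look like: roce_counter Rx "RoCE PG packets" < counters.txt. In the output above, that value is 0 while "CNP PG packets" on Rx is non-zero, which is the mismatch being described.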
On the ESXi side:
Code:
[root@esxi11:~] vsish -e cat /net/pNics/vmnic5/stats | grep -e "Pause\|PerPrio"
rxPauseCtrlPhy: 0
txPauseCtrlPhy: 0
txPauseStormWarningEvents: 0
txPauseStormErrorEvents: 0
Code:
[root@esxi11:~] esxcli rdma device list
Name     Driver      State    MTU  Speed    Paired Uplink  Description
-------  ----------  ------  ----  -------  -------------  -----------
vmrdma0  nmlx5_rdma  Active  4096  25 Gbps  vmnic4         MT27710 Family  [ConnectX-4 Lx]
vmrdma1  nmlx5_rdma  Active  4096  25 Gbps  vmnic5         MT27710 Family  [ConnectX-4 Lx]
[root@esxi11:~] esxcli system module parameters list -m nmlx5_core | grep 'trust_state\|pfcrx\|pfctx'
pfcrx                int            0x08   Priority based Flow Control policy on RX.
   Notes: Must be equal to pfctx.
pfctx                int            0x08   Priority based Flow Control policy on TX.
   Notes: Must be equal to pfcrx.
trust_state          int            2      Port policy to calculate the switch priority and packet color based on incoming packet
[root@esxi11:~] esxcli system module parameters list -m nmlx5_rdma | grep 'dscp_force'
dscp_force         int   26     DSCP value to force on outgoing RoCE traffic.
[root@esxi11:~] esxcfg-nics -l |grep -E 'Name|Mellanox'
Name    PCI          Driver      Link Speed      Duplex MAC Address       MTU    Description                   
vmnic4  0000:5d:00.0 nmlx5_core  Up   25000Mbps  Full   e0:07:1b:66:d1:d0 9000   Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
vmnic5  0000:5d:00.1 nmlx5_core  Up   25000Mbps  Full   e0:07:1b:66:d1:d1 9000   Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

esxcli network nic dcb status get -n vmnic4
   Nic Name: vmnic4
   Mode: 3 - IEEE Mode
   Enabled: true
   Capabilities:
         Priority Group: true
         Priority Flow Control: true
         PG Traffic Classes: 8
         PFC Traffic Classes: 8
   PFC Enabled: true
   PFC Configuration: 0 0 0 1 0 0 0 0
   IEEE ETS Configuration:
         Willing Bit In ETS Config TLV: 0
         Supported Capacity: 8
         Credit Based Shaper ETS Algorithm Supported: 0x0
         TX Bandwidth Per TC: 50 0 0 50 0 0 0 0
         RX Bandwidth Per TC: 50 0 0 50 0 0 0 0
         TSA Assignment Table Per TC: 2 2 2 2 2 2 2 2
         Priority Assignment Per TC: 0 0 0 3 0 0 6 0
         Recommended TC Bandwidth Per TC: 50 0 0 50 0 0 0 0
         Recommended TSA Assignment Per TC: 2 2 2 2 2 2 2 2
         Recommended Priority Assignment Per TC: 0 0 0 3 0 0 6 0
   IEEE PFC Configuration:
         Number Of Traffic Classes: 8
         PFC Configuration: 0 0 0 1 0 0 0 0
         Macsec Bypass Capability Is Enabled: 0
         Round Trip Propagation Delay Of Link: 0
         Sent PFC Frames: 0 0 0 0 0 0 0 0
         Received PFC Frames: 0 0 0 0 0 0 0 0
   DCB Apps:
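
The "PFC Configuration: 0 0 0 1 0 0 0 0" line above can be read as one flag per priority. A minimal sketch, assuming the flags are listed from priority 0 to priority 7 (that ordering is an assumption, though it agrees with pfctx=0x08); the function name pfc_enabled_priorities is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the positions (counted from 0) whose flag is 1.
# ASSUMPTION: position N in the vector corresponds to priority N.
pfc_enabled_priorities() {
    prio=0
    enabled=""
    # $1 is left unquoted on purpose so the shell splits it into flags.
    for flag in $1; do
        if [ "$flag" = "1" ]; then
            enabled="$enabled${enabled:+ }$prio"
        fi
        prio=$(( prio + 1 ))
    done
    echo "$enabled"
}

pfc_enabled_priorities "0 0 0 1 0 0 0 0"
```

If the host side is configured as intended, the only priority listed should be the one the switch treats as lossless.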
 

y.smirnov

I also found this in /var/log/vmkernel.log:

Code:
2023-11-03T04:01:29.072Z Wa(180) vmkwarning: cpu57:2098375)WARNING: rdmaDriver: RDMAFindTeamDeviceByPortID:3147: Unspported team policy = 8 status = Success
2023-11-03T04:01:29.072Z Wa(180) vmkwarning: cpu57:2098375)WARNING: rdmaDriver: RDMACM_BindLegacy:4306: The provided interface (192.168.205.11) does not have a registered rdma device.
2023-11-03T04:01:29.072Z In(182) vmkernel: cpu57:2098375)RDT: RDTCreateRDMAServer:2754: vmk_RDMACMBind() failed for server Bad parameter
2023-11-03T04:01:29.072Z In(182) vmkernel: cpu57:2098375)RDT: RDTCreateRDMAServer:2787: RDTCreateRDMAServer() exiting with failure
2023-11-03T04:01:29.072Z In(182) vmkernel: cpu57:2098375)RDT: RDTEnableRdmaInt:642: Failed to create listener for address 192.168.205.11, protocol 2, status Bad parameter
2023-11-03T04:01:34.074Z In(182) vmkernel: cpu62:2098375)RDT: RDTDisableRdmaInt:722: SupportedTransportProtocolsMask removes RDMA
 

y.smirnov

OK, if you get the same error as I did in /var/log/vmkernel.log:
Code:
2023-11-03T04:01:29.072Z Wa(180) vmkwarning: cpu57:2098375)WARNING: rdmaDriver: RDMAFindTeamDeviceByPortID:3147: Unspported team policy = 8 status = Success
go to the DV switch, vSAN Distributed Port Group, "Teaming and failover". Set one NIC to active and one to standby, and set load balancing to "Use explicit failover order".
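
To check whether a host is hitting this, the warning can be pulled out of the log. A minimal sketch, assuming the message format shown above (including the log's own misspelling "Unspported"); the function name rdma_team_policy_warnings is made up for illustration. It prints each distinct team policy number reported in the warnings, and prints nothing if there are none:

```shell
#!/bin/sh
# Hypothetical helper: read a vmkernel log on stdin and print the distinct
# team policy values reported by the RDMA teaming warning.
rdma_team_policy_warnings() {
    # "Unspported" is spelled exactly as it appears in the log line.
    sed -n 's/.*RDMAFindTeamDeviceByPortID:[0-9]*: Unspported team policy = \([0-9]*\).*/\1/p' |
        sort -u
}
```

Usage would look like: rdma_team_policy_warnings < /var/log/vmkernel.log, run before and after changing the teaming policy.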
 

Maddox

New Member
May 28, 2022
5
2
3
Ok, if you got the same error as i do at /var/log/vmkernel.log


go to DV switch vSan Distributed Port Group - teaming and failover. set one nic to active and one to standby. set load balancing as "use explicit failover order"
Indeed, I seem to have read that in some NVIDIA documentation, but I can no longer say which...

Thanks for sharing the solution of the problem.
 

hanisirfan

New Member
Nov 14, 2023
5
0
1
I think it's time for a public release of these, the released version is 3.9.3202
ETH supports SPC1, SPC2, SPC3 (SPC4 / SN5××× is not supported)
IB supports SIB1, SIB2, QTM, QTM2

I've just downloaded all the files from this and embedded ONIE with the onie-recovery-x86_64-mlnx_x86-r0.iso file. May I know how I can build the 3.10.4302 binary for both SN2410 and SN2700? Thanks :)
 

NablaSquaredG

I've just downloaded all the files from this and embedded ONIE with onie-recovery-x86_64-mlnx_x86-r0.iso file. May I know how can I build the 3.10.4302 binary for both SN2410 and SN2700? Thanks :)
You'll have to do some research yourself if you want to build your own installers...