Mellanox Switches - Tips & Tricks


DJDN123

New Member
Feb 18, 2025
1
0
1
I think it's time for a public release of these. The released version is 3.9.3202.
ETH supports SPC1, SPC2, SPC3 (SPC4 / SN5xxx not supported)
IB supports SIB1, SIB2, QTM, QTM2

This thread is a little scattered. Just to clarify, are the files in this download for the SN2410, or do they have to be modified? On a side note, I will be doing a clean install using the ONIE ISO image and firmware. Also, will it work for the SN2100 series as well?
 

goodt

New Member
Jan 21, 2025
24
0
1
Hello, I'm new here, and I just want to thank everybody who has contributed to this thread, and to share my story with a pair of SB7700s, just in case it is useful for anybody else:

I have a pair of SB7700s that had been powered on since a colleague of mine installed them almost 8 years ago. Recently, I needed to move them from the rack where they were placed, so I had to turn them off. Of course, they didn't restart when I plugged them in again. One of them showed a message on the console about not having any boot partition. The other one showed nothing at all.
After discovering this thread, I opened both of them. The one with no console output has a blown capacitor, so that's the end for that one. On the other one, I was able to enter the BIOS, and after booting ONIE 2020.11-5.3.0005-115200 from a USB stick, I was able to start tinkering with it. I checked both of the SSDs, but they were both dead. I bought a Kingston SSD and started the process described in this thread (for the SB7700, it seems that the right combination is CSM enabled + non-UEFI ONIE, btw). I should mention that I was getting some warnings about the system not being able to read the serial number from the TLV (or something like that), but the installation carried on. Searching about it, I discovered that it was related to the EEPROM. I checked the content of the EEPROM with onie-syseeprom and it only had the verification checksum; all the other values were empty. But I carried on...
Even though the first post says that the last version for the SB7700 is 3.9.3124, I saw on the NVIDIA support page that they list 3.9.3302 as the last supported version for the SB7700, so I tried the installer located in the mega folder, X86_64-3.9.3202-installer.bin. Unfortunately, after filling all the new partitions, near the end, the installation process failed with an error related to the mlxi2c command (MLXI2C_AUTO_DETECT_FAILED). I could still boot MLNX-OS after the installation, but I had to wait around 15 minutes after logging in until I got to the prompt, and the output of 'show asic-version' showed that no managed switches were detected. I flashed many versions of MLNX-OS via the update procedure, but none of them worked.
Then I started to suspect that the EEPROM was more important than I thought... I started to fill in some fields with the information I could retrieve from the SMBIOS with dmidecode -t1. But the installation of X86_64-3.9.3202-installer.bin from ONIE always failed in the same place, with the same error (mlxi2c related). After more searching, I tried to fill in some of the EEPROM values that I had no idea how to fill... I located the onie-syseeprom dump from someone with an SN2100, and I noticed that there are Vendor Extension (0xfd) fields with hex numbers. I naively copied these fields, and after that the installation with X86_64-3.9.3202-installer.bin worked... but it took all the values from an SN2100, so after restarting the system, MLNX-OS still doesn't detect the ASIC. But I think that those values are the key to successfully recovering the switch.
Could anybody provide the onie-syseeprom dump from a working SB7700? Or maybe there is a way to get those values from the SMBIOS and encode them in the right way, I don't know... Either way, any help would be greatly appreciated.
Did you ever get a response to this? I have a similar problem with an SN2700 switch someone gave me :)
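For anyone who wants to inspect or compare EEPROM contents offline, here is a minimal Python sketch of a parser for the ONIE TlvInfo layout that onie-syseeprom reads. It assumes you already have a raw binary image of the EEPROM (how to obtain one is platform-specific and not covered here). The layout follows the public ONIE documentation as I understand it: an 8-byte "TlvInfo" signature, a version byte, a 2-byte big-endian total length, then type/length/value records ending with a CRC-32 record (0xFE). The function names and all sample values are made up for illustration; in particular, the Vendor Extension (0xFD) payload is a placeholder, not real Mellanox data.

```python
import struct
import zlib

# TLV type codes as listed in the ONIE TlvInfo documentation.
TLV_NAMES = {
    0x21: "Product Name", 0x22: "Part Number", 0x23: "Serial Number",
    0x24: "Base MAC Address", 0x25: "Manufacture Date", 0x26: "Device Version",
    0x27: "Label Revision", 0x28: "Platform Name", 0x29: "ONIE Version",
    0x2A: "MAC Addresses", 0x2B: "Manufacturer", 0x2C: "Country Code",
    0x2D: "Vendor Name", 0x2E: "Diag Version", 0x2F: "Service Tag",
    0xFD: "Vendor Extension", 0xFE: "CRC-32",
}
HEADER_LEN = 11  # 8-byte signature + 1-byte version + 2-byte total length


def parse_tlvinfo(blob: bytes):
    """Return (entries, crc_ok) for a raw TlvInfo EEPROM image."""
    if blob[:8] != b"TlvInfo\x00":
        raise ValueError("missing TlvInfo signature")
    version, total_len = struct.unpack(">BH", blob[8:HEADER_LEN])
    if version != 1:
        raise ValueError(f"unexpected TlvInfo version {version}")
    end = HEADER_LEN + total_len
    if end > len(blob):
        raise ValueError("image is shorter than the declared length")

    entries, crc_ok, pos = [], False, HEADER_LEN
    while pos + 2 <= end:
        code, length = blob[pos], blob[pos + 1]
        if pos + 2 + length > end:
            raise ValueError("TLV runs past the declared length")
        value = blob[pos + 2:pos + 2 + length]
        entries.append((code, TLV_NAMES.get(code, "Unknown"), value))
        if code == 0xFE:
            # The CRC covers everything up to and including the CRC TLV's
            # type and length bytes, but not the 4 CRC value bytes.
            if length != 4:
                raise ValueError("CRC-32 TLV must be 4 bytes long")
            crc_ok = struct.unpack(">I", value)[0] == zlib.crc32(blob[:pos + 2])
            break
        pos += 2 + length
    return entries, crc_ok


def build_tlvinfo(fields):
    """Build a TlvInfo image from (code, value) pairs; for testing only."""
    body = b"".join(bytes([code, len(val)]) + val for code, val in fields)
    total_len = len(body) + 6  # the CRC TLV adds type + length + 4 bytes
    partial = (b"TlvInfo\x00" + struct.pack(">BH", 1, total_len)
               + body + bytes([0xFE, 4]))
    return partial + struct.pack(">I", zlib.crc32(partial))


# Hypothetical sample data, not taken from any real switch.
sample = build_tlvinfo([
    (0x21, b"EXAMPLE-SWITCH"),
    (0x23, b"SN0000000001"),
    (0xFD, b"\x00\x00\x00\x00\xde\xad\xbe\xef"),
])
entries, crc_ok = parse_tlvinfo(sample)
for code, name, value in entries:
    print(f"0x{code:02X} {name:<18} {value.hex(' ')}")
print("CRC OK:", crc_ok)
```

Per the ONIE documentation, a Vendor Extension value starts with a 4-byte IANA enterprise number and the rest is vendor-defined. That would fit the observation above that copying 0xFD fields from an SN2100 makes the installer see an SN2100, although how MLNX-OS actually interprets those bytes is not something this sketch can tell you.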
 

Deleted member 24947

Guest
Anyone have thoughts on the SX series vs SN in 2025? The 25/100 route is still pretty pricey by comparison. Not sure if it’s more useful to homelab Infiniband vs Cumulus / Sonic.
 

Freebsd1976

Active Member
Feb 23, 2018
425
78
28
Does anyone have ONIE 5.3.0008, 5.3.0010, or 5.3.0011? On some of the newest Mellanox switches, 5.3.0005 refuses to partition the new SSD and install ONIE on it (as if /lib/onie-updater does not run or execute). Also, when booted into ONIE embed mode, it does not reboot like normal, under either UEFI or Legacy.
 

cy384

Member
Aug 19, 2022
36
50
18
cy384.com
Anyone have thoughts on the SX series vs SN in 2025?
My opinion:

SX6036 plus a few connectx3: very cheap ($200 or less), you get infiniband or ethernet up to 56gb, still pretty fast, EOL software doesn't really matter if you're just doing basic switching/routing, reasonable power usage

SN2700 plus a few connectx4: somewhat pricey for hobby purposes (maybe $1300 or so), 25/100 is clearly the future (or present), support for basically any linux and can run whatever other software you want, has hardware support for newer protocols like rocev2

Personally, I'm not in a rush to replace my SX6012, but if I see a SN2100 at a good price I'll be very tempted.
 

NablaSquaredG

Bringing 100G switches to homelabs
Aug 17, 2020
1,883
1,263
113
Anyone have thoughts on the SX series vs SN in 2025? The 25/100 route is still pretty pricey by comparison. Not sure if it’s more useful to homelab Infiniband vs Cumulus / Sonic.
SX series can be perfectly fine for homelab.

You might want to grab an SX6710 / SX6720, as they have the newer control plane and run much smoother than the PowerPC SX6036...

There are some known limitations, though.

1. MAGP (All-Active Inter-VLAN Routing with MLAG) on SX series is buggy. Don't use it.
2. SwitchX ASIC does not support ACLs on SVI interfaces (no ACLs for Inter-VLAN routing)
 
Anyone have thoughts on the SX series vs SN in 2025? The 25/100 route is still pretty pricey by comparison. Not sure if it’s more useful to homelab Infiniband vs Cumulus / Sonic.
40/56 is the way to go imo for the price. Cheap cards and switches. QSFP DACs are a bit thick for routing in a rack, but that's the biggest downside. 25 is overrated, and you can hit 32 easily with no tweaks, which also makes 25 a bit underperforming for its premium price.
 

svvolf

New Member
Mar 30, 2017
17
8
3
41
Onyx v3.10.4606 LTS has been officially released for a while now. Has anyone obtained and shared it yet?
 

Deleted member 24947

Guest
It depends, 40G/56G is still really fast for homelab.
Funny thing is that I’m using desktop hardware for my servers, so I end up PCIe limited pretty quickly, especially with PCIe 3.0 systems. The bottleneck is either the NIC’s PCIe lanes or the NVMe storage PCIe lanes. Either way it gets hard and/or expensive to be bottlenecked by a 40 gig switch.
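To put rough numbers on the PCIe bottleneck, here is a small back-of-the-envelope sketch in Python. It only uses the published PCIe 3.0 figures (8 GT/s per lane with 128b/130b encoding) and ignores packet and protocol overhead, so real throughput will be somewhat lower. The function names are made up for illustration.

```python
def pcie3_gbps(lanes: int) -> float:
    """Raw PCIe 3.0 bandwidth in Gbit/s after 128b/130b line encoding."""
    return lanes * 8.0 * 128 / 130


def limiting_factor(link_gbps: float, lanes: int) -> str:
    """Return 'pcie' if the slot is slower than the network link, else 'link'."""
    return "pcie" if pcie3_gbps(lanes) < link_gbps else "link"


for lanes in (4, 8, 16):
    rate = pcie3_gbps(lanes)
    print(f"PCIe 3.0 x{lanes}: {rate:.1f} Gbit/s ({rate / 8:.2f} GB/s), "
          f"40G limited by {limiting_factor(40, lanes)}, "
          f"56G limited by {limiting_factor(56, lanes)}")
```

On these assumptions, a PCIe 3.0 x4 slot (common for the second slot on desktop boards) tops out around 31.5 Gbit/s and cannot fill a 40G port, while an x8 slot at about 63 Gbit/s has headroom for both 40G and 56G.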
 

Stephan

Well-Known Member
Apr 21, 2017
1,105
862
113
Germany
I see the other resident #MellanoxUltras have already answered nicely. ;-)

I'm a SX6012 user myself, with Mellanox and compatible cables for 56 Gbps ethernet. Even through PCIe 3.0 that is around 4 GiB/s of throughput. Of course, if money and noise are no concern, you could buy Robin's three SN2700 with SSD swap already done, or two SN3700C.

Problem these days is that supply has almost dried up. Prices of gear from like Israel asking 2000 or even >500 and change is fresh from fantasy land. CX3 and cables are available, but the switches... Well, here is one: "MELLANOX SX6012 12-Port FDR Infiniband QSFP+ Switch 2x Power Used" on eBay. Otherwise, check out geo-ship.com regularly. I am just in the process of importing two SX6005 into Europe to complete an MSX60-DKIT with two more SX6012 which had the inner rails missing.

Bunch of suitable DAC- and fiber cables if you choose that path:

MC2207130-00A Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 0.5M
MC2207130-001 Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 1M
MC2207130-0A1 Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 1.5M
MC2207130-002 Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 2M
MC2207128-0A2 Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 2.5M
MC2207128-003 Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 3M
MC2207126-004 Mellanox Passive Copper Cable VPI UP TO 56GB/S QSFP 4M
MC2207310-XXX Mellanox Active Fiber Cable VPI UP TO 56GB/S QSFP from 3M up to 100M
MC2207312-XXX Mellanox Active Fiber Cable VPI UP TO 56GB/S QSFP from 3M up to 300M
MC220731V-XXX Mellanox Active Fiber Cable VPI UP TO 56GB/S QSFP from 3M up to 100M
MC2207411-SR4L Mellanox Optical Module IB FDR 56GB/S QSFP MPO 850NM UP TO 30M

038-004-236-01 EMC FDR QSFP+ to QSFP+ copper cable 0.5M
038-004-065-01 EMC FDR QSFP+ to QSFP+ copper cable 1M
038-004-066-01 EMC FDR QSFP+ to QSFP+ copper cable 2M
038-004-067-01 EMC FDR QSFP+ to QSFP+ copper cable 3M
038-900-027-01 EMC FDR QSFP+ to QSFP+ copper cable 5M
038-004-069-01 EMC FDR QSFP+ to QSFP+ copper cable 5M
038-900-030-01 EMC FDR QSFP+ to QSFP+ copper cable 8M

IBM 00W0061 0.5M
IBM 00W0049 1M
IBM 00W0057 3M

And for some 40G long haul links to the tree house and the secret hideout who doesn't like KAIAM XQX2502 QSFP+ 40G LR4 Lite modules for around 4 fiat a piece.
 

Deleted member 24947

Guest
Are there any issues with HPE or IBM SN2xxx switches? I.e. the way that the EMC SX6000 have completely different software and need a bit of work to convert.
 

Freebsd1976

Active Member
Feb 23, 2018
425
78
28
Are there any issues with HPE or IBM SN2xxx switches? I.e. the way that the EMC SX6000 have completely different software and need a bit of work to convert.
HPE SN2xxx switches also use Onyx, but some models need a license to activate ports or upgrade port speed.
 

Deleted member 24947

Guest
Well, I’ve got an IBM SN2410 on the way. Once I get it I’ll start looking into the firmware, updates, etc… plus getting faster NICs and media.

Is there such a thing as reverse breakout, i.e. a 100gbe QSFP28 NIC going to four SFP28 25gbe switch ports?
 

dhf0dXT

New Member
Nov 8, 2023
1
0
1
I have the HPE version of the SN2100.
HPE SN2100M 100GbE 16QSFP28 Half Width Switch Q2F23A

It has a failed SSD, and using the forums here I was successful in installing ONIE (X86_64-3.9.3202-installer.bin), then the recovery image (onie-recovery-x86_64-mlnx_x86-r0.iso), then upgrading to the latest version of 3.10.4504.
I used a 3ME4 DEM24-64GM4 SSD.

My question is... through all of this... it still has the HPE graphics in the web console.
How is this possible?
Does it have some onboard flash separate from the SSD?

Also with the EOL for ONYX, does anyone know if there is a path to Cumulus?
How does one even buy Cumulus?
Of course HPE wants to sell me 4 new switches. (HPE S2T76A)
 


Freebsd1976

Active Member
Feb 23, 2018
425
78
28
“Does it have some onboard flash separate than the SSD?”
No, the OS itself has customization files for OEMs in opt/tms/customization_files/.
For the 2100M, the file is customization.210015:
Code:
console_banner_name_login: "HPE SN2100M Management Console"
console_banner_name: "HPE SN2100M Management Console"
local_login_msg: "HPE Management Console"
remote_login_msg: "HPE Management Console"
motd: "HPE Switch"
hide_welcome_page: "true"
display_mellanox_on_front_panal: "false"
logo_name: "logo_HPE.png"
hide_eula: "true"
hide_xml_api_link: "true"
copyright_msg: ""
copyright_url: ""
eula_doc_name: ""
eula_main_link: ""
user_manual_name: "HP_help_docs/HP_ETH_help_docs/MLNX-OS_Ethernet_User_Manual_for_HP.pdf"
release_notes_name: "HP_help_docs/HP_ETH_help_docs/MLNX-OS_Ethernet_Release_Notes_for_HP.pdf"
xml_api_name: "HP_help_docs/HP_ETH_help_docs/MLNX-OS_HP_ETH_XML_API_Reference_Guide.pdf"
os_name: "Onyx"
tab_name: "Onyx"
show_favicon: "false"
snmp_system_description: "HPE SN2100M 100GbE 16QSFP28 Half Width Switch"
CUSTOMIZATION_FILE_LAST_LINE: "KEEP THIS LINE AT THE BOTTOM FO THE CUSTOMIZATION FILE"
and the stock SN2100 file is:
Code:
console_banner_name_login:  "NVIDIA Onyx Management Console"
console_banner_name: "NVIDIA Onyx Management Console"
os_name: "Onyx"
tab_name: "NVIDIA Onyx"
block_xml_tree: "false"
logo_name: "logo.png"
welcome_page_frame: "FRAME_BIG_OnyX.png"
local_login_msg: "NVIDIA Onyx Switch Management"
remote_login_msg: "NVIDIA Onyx Switch Management"
motd: "NVIDIA Switch"
show_favicon: "true"
snmp_system_description: "NVIDIA MSN2100, 16-Port 100GbE Switch System"
hide_welcome_page: "false"
manufacturer_name: "NVIDIA"
physical_mfg_name: "NVIDIA"
hide_product_docs: "false"
display_mellanox_on_front_panal: "true"
hide_eula: "false"
hide_xml_api_link: "true"
copyright_msg: ""
copyright_url: ""
local_logout_msg: ""
remote_logout_msg: ""
xml_api_name: ""
eula_doc_name: "Onyx_EULA.pdf"
eula_main_link: "http://www.mellanox.com/related-docs/prod_management_software/MLNX_Onyx_EULA.pdf"
user_manual_name: "ETH_help_docs/Onyx_Ethernet_User_Manual.pdf"
release_notes_name: "ETH_help_docs/Onyx_Ethernet_Release_Notes.pdf"
CUSTOMIZATION_FILE_LAST_LINE: "KEEP THIS LINE AT THE BOTTOM FO THE CUSTOMIZATION FILE"
IMO, you can modify the customization.210015 file to change the displayed information.
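Before editing anything, it can help to see exactly which keys differ between the OEM file and the stock one. Here is a minimal Python sketch that parses the `key: "value"` lines shown in the two listings and reports the differences. It assumes every setting sits on one line in that form, which matches the listings but is not a documented file format. The function names are made up, and the two strings below are short excerpts of the files above rather than complete copies.

```python
import re

# One setting per line: key, colon, optional spaces, value in double quotes.
LINE_RE = re.compile(r'^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:\s*"(.*)"\s*$')
SENTINEL = "CUSTOMIZATION_FILE_LAST_LINE"  # marker line, not a real setting


def parse_customization(text: str) -> dict:
    """Parse customization file text into a {key: value} dict."""
    settings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def diff_customization(oem: dict, stock: dict) -> list:
    """List (key, oem_value, stock_value); None marks a missing key."""
    changes = []
    for key in sorted(set(oem) | set(stock)):
        if key == SENTINEL:
            continue
        if oem.get(key) != stock.get(key):
            changes.append((key, oem.get(key), stock.get(key)))
    return changes


# Excerpts from the two files quoted above.
oem_text = '''logo_name: "logo_HPE.png"
os_name: "Onyx"
tab_name: "Onyx"
show_favicon: "false"
'''
stock_text = '''logo_name: "logo.png"
os_name: "Onyx"
tab_name: "NVIDIA Onyx"
show_favicon: "true"
welcome_page_frame: "FRAME_BIG_OnyX.png"
'''

for key, oem_val, stock_val in diff_customization(
        parse_customization(oem_text), parse_customization(stock_text)):
    print(f"{key}: OEM={oem_val!r} stock={stock_val!r}")
```

If you do edit the file on the switch, keep a backup copy first and leave the CUSTOMIZATION_FILE_LAST_LINE entry at the bottom, as the file itself asks.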
 