Mellanox Switches - Tips & Tricks


NablaSquaredG

Bringing 100G switches to homelabs
Aug 17, 2020
1,618
1,072
113
So I upgraded to 3.9 and things went awry.
Have you backed up the SSD before upgrading?

The last version that works is 3.6.8012.

You need to downgrade the BIOS to recover the switch.

Anyway, you will have to make some file-level modifications, either by booting a suitable live Linux distribution or by taking out the SSD.


And yes, I have made this mistake myself and know how to fix it ;)
 

linuxsrc

Member
Oct 1, 2018
34
4
8
Brownsburg, IN
Sadly, no, I did not back it up. I was in too much of a hurry. :(

I would appreciate any help. This is going to be a big part of my backbone, and at the price I could not miss out on the purchase. I am not usually very cautious with this type of thing, but I am replacing most of my systems and have been pushing too hard to get things done.
 

NablaSquaredG

Alright, here's the short version, as I'm in a hurry. I'll leave out the detailed commands.

The crucial part is booting into single-user mode and executing /opt/tms/bin/bios_update.sh default no_reboot with the right BIOS version. Single-user mode is needed because the CLI won't let you log in or run the _shell command when it fails to connect to the ASIC.

The 3.6.8012 x86 image can be found here: https://www.mellanox.com/downloads/Software/onyx-X86_64-3.6.8012.img

This is the last supported version for SwitchX systems (at minimum, all systems whose name starts with SX). DO NOT UPGRADE BEYOND THIS VERSION!
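For anyone who wants a guard against repeating this mistake, here is a minimal Python sketch of the rule above. It only encodes what is said in this post: names starting with SX are treated as SwitchX, and anything newer than 3.6.8012 is refused for them. The function names are made up for illustration, and the check says nothing about other platforms.

```python
import re

# Last release named in this thread as working on SwitchX systems.
LAST_SWITCHX_VERSION = (3, 6, 8012)


def parse_version(text):
    """Extract a dotted version such as '3.9' or '3.6.8012' from a string or image name."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no dotted version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_switchx(model):
    """Assumption taken from the post: a model name starting with 'SX' means SwitchX."""
    return model.strip().upper().startswith("SX")


def upgrade_blocked(model, image_name):
    """Return True only when a SwitchX system would be taken past 3.6.8012."""
    return is_switchx(model) and parse_version(image_name) > LAST_SWITCHX_VERSION


if __name__ == "__main__":
    # Hypothetical model name, used only to exercise the check.
    print(upgrade_blocked("SX6036", "onyx-X86_64-3.6.8012.img"))
    print(upgrade_blocked("SX6036", "3.9"))
```

The same comparison could be run against the version on each boot partition to see whether one at or below 3.6.8012 is still left.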

Another note: Use this opportunity to replace the SSD! The SSDs Mellanox uses in x86 switches, such as the Innodisk 3ME3 or StorFly VSF302XC016G-MLX, are prone to failure even if the firmware upgrade is applied. I recommend Transcend 452T2 SSDs.


@linuxsrc have you upgraded both partitions, or is there still one with a version lower than or equal to 3.6.8012?

I'm just trying to figure out whether I need to prepare a fully fledged disk image for you...
 

linuxsrc

At your leisure. Any help will be great; I am stuck on some of this until it is fixed or I find a replacement.

And yes, I did mess up. I realized that after I started digging and that is the reason why I relegated myself to turning it into a fixture.
 

j_h_o

Active Member
Apr 21, 2015
666
187
43
California, US
I've got an SN2100 deployed in a datacenter.
  1. Is there a way I can tell what SSD is inside, remotely?
  2. If I'm willing to pre-purchase an SSD, which model should I purchase? Is this a 2280 M.2 SATA?
  3. Assuming the drive isn't dead, if I'm able to pop it into a suitable enclosure, what command do I use to image the disk over?
I appreciate any help you may be able to provide :)
 

awedio

Active Member
Feb 24, 2012
779
228
43
I don't understand #3.
Is the switch working or not working?
 

j_h_o

Indeed, but the SN2100 is M.2 SATA, right? Do you know if it's 2280? Has anyone used an SSD that fits in the SN2100? What did you use?
 

tsteine

Active Member
May 15, 2019
178
85
28
@j_h_o Generate a sysdump. Inside the sysdump there will be a scsi.log file, which contains your drive model. In my case, it is a StorFly VSF302XC drive.
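If you would rather not read the log by eye, a rough Python sketch like the one below could pull the model out once scsi.log has been extracted from the sysdump. It assumes the file uses the usual Linux /proc/scsi/scsi layout ("Vendor: ... Model: ... Rev: ..."), which I have not verified for every release. The flagged list only repeats the drive families called out earlier in this thread, and the function names are made up.

```python
import re
import sys

# Assumption: scsi.log follows the /proc/scsi/scsi layout, with lines such as
# "Vendor: ATA  Model: <model name>  Rev: <revision>".
MODEL_LINE = re.compile(r"Model:\s*(.*?)\s*Rev:")

# Drive families named as failure-prone earlier in this thread.
FLAGGED_FAMILIES = ("3ME3", "VSF302XC")


def drive_models(scsi_log_text):
    """Return every drive model string found in the scsi.log text."""
    return [match.group(1) for match in MODEL_LINE.finditer(scsi_log_text)]


def is_flagged(model):
    """True if the model string contains one of the families flagged in the thread."""
    return any(family in model.upper() for family in FLAGGED_FAMILIES)


if __name__ == "__main__":
    # Usage: python drive_model.py path/to/scsi.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for model in drive_models(handle.read()):
            note = "flagged in thread" if is_flagged(model) else "not flagged in thread"
            print(f"{model}: {note}")
```

A model that is not flagged here is not thereby known good; the list only covers what was mentioned in this thread.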

 

j_h_o

Cool. Will do. Anyone have SN2100 replacement drive recommendations? I don't have the switch here and can't tell if it's a 2280 or another size disk. Is the StorFly also unreliable/problematic? (I bought this unit used and have no warranty coverage)

I guess I'll check the size from whatever part is returned from the sysdump.
 

tsteine

As far as I am aware, StorFly is the brand they used to replace the problematic drives back in 2019/2020.

Either way, SSDs are not magic eternity devices. Getting a spare, making a 1:1 copy of the existing drive, testing that it works, then dumping the drive image and keeping the spare somewhere safe is not a bad way to get back up and running quickly. Just don't expect the spare to work as a drop-in replacement years down the line if you haven't powered it on regularly to "refresh" the cells so they keep their data.
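As a rough illustration of the 1:1 copy and image dump, here is a minimal Python sketch that copies a source byte for byte into an image file and returns a SHA-256 checksum so the copy can be verified later. The device path in the usage comment is a placeholder, reading a raw block device normally needs root on Linux, and most people would reach for dd instead; this only shows the idea.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # read and write in 4 MiB pieces


def copy_image(source_path, target_path):
    """Copy source to target byte for byte; return the SHA-256 of the data read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK_SIZE)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path):
    """Return the SHA-256 of a file or device, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example (placeholder paths, adjust to your system):
#   checksum = copy_image("/dev/sdX", "switch-ssd.img")
#   assert checksum == sha256_of("switch-ssd.img")
```

Keeping the checksum next to the image makes it easy to tell later whether the stored file has changed.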
 

NablaSquaredG

Certain types of StorFly SSDs are also known bad:

But I don't know whether this includes the M.2 model used in the SN2100.

Anyway, I used the Transcend TS64GMTS552T-I (MTS552T series with industrial temperature range; you can also use the normal one if it is cheaper, but in Germany the -I is cheaper than the normal one), and when those weren't available I used the TS500GMTS425S.

For SN2700/SN2410/SB7700 I always use the TS128GMSA452T2 model.


I wonder whether the SN2100 supports OTG on the front ports for plugging in a USB stick with an MLNX-OS installer.
@nasbdh9 do you happen to have the recovery guide for SN2100? I wonder how they do it.