Mellanox IB SB7800 Drive Failure


barnstormer

New Member
Dec 27, 2022
Good morning,

The SSD in our SB7800 InfiniBand switch failed, so we installed a new drive with an image cloned from one of our other identical switches. So far everything looks OK in the management interface, aside from the Host ID. However, we do get 1003 and 1006 errors when trying to modify some settings, including the time.

Does anyone know how the Host ID can be changed? I was unable to find it in the manual or the menus.

I also considered handling the SSD failure by installing MLNX-OS from USB with a blank SSD in place. I'm curious whether anyone has had to do this and can recommend the best course of action.

Thank you in advance!
 

necr

Active Member
Dec 27, 2017
If you're running an SB7800 in anything resembling a production environment, I'd rather buy support and RMA the switch.
From my SX6012 experience and a few forum threads here, there are some manufacturing scripts that run on the device and populate an internal database, which is located on a different partition. That DB can include BIOS details, serial numbers and other HW-specific information. TBH I wouldn't waste time on it if the switch performs its functions and monitoring works.
 

barnstormer

It is a 6-year-old switch that we want to keep as a spare for our cluster, so I was hoping someone had been through this process before. I have the ONIE ISO and an image of MLNX-OS 3.10.4006 that I managed to find online.

I suppose I will try a USB install to a blank drive. Hopefully the scripts will run.