Beware of EMC switches sold as Mellanox SX6XXX on eBay

Wuddy

Hello everybody,
I got an SX6036 that has a bad first partition. I can boot from the second one but can't update the switch.
I would appreciate guidance on how to recover it.
Probably the best approach would be to write the latest FW directly to the second partition.
Thanks in advance
 

NablaSquaredG

WARNING to all SX6720 owners:

Do NOT upgrade to MLNX-OS 3.10 - It will mess up your switch.

When you upgrade to 3.10, it will do a BIOS update. This BIOS update messes up the I²C bus. Once the I²C bus is messed up, the switch gets stuck at "Initializing modules, this may take a few minutes" when you try to log in.

The only way back is to downgrade the BIOS. The easiest route: use your backup (you made one, right?) of the previous MLNX-OS version. Flash the backup to the SSD, then mount the SSD under Linux and append "single" to the GRUB config (there are multiple partitions on the SSD; one should be called boot, and it's really easy to figure out).
When you're back in the OS, you can do
Code:
/opt/tms/bin/bios_update.sh default
It will downgrade the BIOS and you'll be good to go.

I'm still working out the latest supported version; so far it looks like 3.9.0300.
I'm testing by grepping for "SX6720" - In supported versions, there are two significant matches in /etc/customer.sh and /etc/customer_rootflop.sh, e.g.
Code:
./etc/customer.sh:    if [ "$SYSTEM_TYPE" = "MSX6720" ]; then
./etc/customer.sh:        [ "$system_type" = "MSX6710" ] || [ "$system_type" = "MSX6720" ] || \
./etc/customer_rootflop.sh:        [ "$system_type" = "MSX6710" ] || [ "$system_type" = "MSX6720" ] || \
In versions newer than 3.9.0300 (3.9.3124 or 3.10.2102), this match is not there, indicating that those versions likely dropped support.
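
A minimal sketch of that grep check, assuming the image's root filesystem has already been unpacked to a local directory. The function name and the example path are made up for illustration; only the two file names and the search string come from the output above.

```shell
# Sketch: does an unpacked MLNX-OS root tree still mention a given model?
# Assumption: $1 is a directory holding the extracted image contents.
model_supported() {
    root=$1
    model=$2
    # customer.sh and customer_rootflop.sh are the two files named above.
    for f in "$root/etc/customer.sh" "$root/etc/customer_rootflop.sh"; do
        if [ -f "$f" ] && grep -qF -- "$model" "$f"; then
            return 0
        fi
    done
    return 1
}

# Example (hypothetical path):
#   model_supported /tmp/mlnx-os-root SX6720 && echo "model string found"
```

A hit only shows that the string is present; it is a hint, not proof that the release actually runs on the hardware.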

I'll upgrade to 3.9.0300 now - Wish me Luck!

Update: Nope, 3.9.0300 also doesn't work.
I'll try a 3.7 now.

Update 2:
Don't try it.

From "Mellanox MLNX-OS Release Notes Software Ver. 3.7.11xx"

Supported Platforms:
"The last MLNX-OS release to provide support for SwitchX based platforms is 3.6.8010."
(Note: or 3.6.8012)

BTW: Let me know if you want the release notes for MLNX-OS - I have extracted them from the OS, after I couldn't find them anywhere online.
I don't want to share them publicly, because MLNX is probably going to file DMCA or whatever.
 

mmx01

Has anyone played with 1G SFP modules in a MAM1Q00A-QSA adapter on the SX6012/18/36?

10G SFP+ modules work without question, but I was puzzled that the switch offers a 1G port speed setting while the general opinion is that 1G won't work. I also looked here: "How can I connect an rj45 plug into a qsfp28 port?"

My observation is that in general it does NOT work. With a MAM1Q00A-QSA in the QSFP port and the port speed set to 1G:
- Mikrotik S-RJ01: link up on the remote device, no link on the switch port, no communication
- 10Gtek GLC-T/SFP-GE: link up on the remote device, no link on the switch port, no communication

Proves the point, no? But there is a BUT. Here is the cheapest module on Amazon:
- 6COMGIGA GLC-T: link up on both ends, communication established...? So it does work, it is just very finicky


Port 1/9 state
identifier : QSFP to SFP adapter
cable/module type : Twisted pair
ethernet speed and type: 1000BASE - T, Unspecified
vendor : OEM
cable length : -
part number : SFP-GE-T
revision :
serial number : CSGE3LC1488


Eth1/9:
Admin state : Enabled
Operational state : Up
Last change in operational status : 2:44:43 ago (13 oper change)
Boot delay time : 0 sec
Description : WIFI-UPLINK
Mac address : 24:8a:07:f4:82:8f
MTU : 1500 bytes (Maximum packet size 1522 bytes)
Fec : auto
Flow-control : receive off send off
Actual speed : 1 Gbps

No errors... Now, some may think this is due to module EEPROM data such as vendor locking. So I desoldered the EEPROMs from the modules (they are standard AT24C08 parts), and a good old TL866II helped to dump all the vendor EEPROM images.

- Flashing the EEPROM image from a working module onto a non-working one did not change anything.
- Flashing the EEPROM image from a non-working module onto a working one did not affect it either. It still works, while identifying itself as the other device.

Interesting, right? It is still not clear what makes most of them fail, but it is clearly not the vendor data. I'm curious whether anyone else has dug into this and could share their experience.

Also, bonus points: the 10GBASE-T SFP+ modules I tested confuse the switch's status reporting. It shows the port as up even with no cable connected, likely reporting the link between the switch and the transceiver itself. The port speed setting follows the same logic: with the speed set to 10G at the switch, you can still connect a 1G device to the 10G transceiver and it works fine (as long as the transceiver itself supports the end device's speed). Now... since the SFP+ is connected to the switch at 10G and the end device to the SFP+ at 1G, how is that possible? It is unlikely those modules have much caching/buffering... so where does the remaining 9G worth of packets go? I will fiddle with some iperf tests to check.

Regards,
M.
 

Freebsd1976

@mmx01 I tried a 1G SFP-to-RJ45 module with a QSA two years ago, and it worked (the other side was an ICX6450). The SFP-to-RJ45 module was an HP OEM part, but I can't remember its part number now.
 

NablaSquaredG

A word of warning:
Do NOT try to directly manufacture 3.6.8012 with settings from the conversion guide! It will not work.

I just tried (and ran with verbose) - The RAMDISK is not big enough for the 3.6.8012 tar!
It will give you the following error:

Code:
== Running version: SX_PPC_M460EX SX_3.2.0100 2012-03-06 22:23:53 ppc
== Image version:   PPC_M460EX 3.6.8012 2019-02-22 07:53:42 ppc
== Image size: 357 MB / 1115 MB uncompressed
==== Uncompressing source image file: /tmp/mng_image_wi/tmpfs/unzip/image-PPC_M460EX-ppc-m460ex-20190222-075342.tgz to /tmp/mnt_image_wi/tmpfs/unzip/image-PPC_M460EX-3.6.8012.tar
zcat: write: No space left on device
zcat: crc error
zcat: Incorrect Length
==== Disk partitioning
Let's see why
Code:
RAMDISK size: 262144
Image from guide:
image-SX_PPC_M460EX-ppc-m460ex-20141215-232742.tgz: 239,8 MiB (251.401.030)

3.6.8012:
image-PPC_M460EX-ppc-m460ex-20190222-075342.tgz: 357,0 MiB (374.388.382)
I believe extending the RAMDISK size should work, I will try and let you know later!
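
The size comparison can be sketched as below. It assumes the RAMDISK size of 262144 is in KiB (256 MiB) and that the whole .tgz has to fit inside it; both are working assumptions, and the function name is invented for the example.

```shell
# Sketch: does a compressed image fit into the manufacturing RAM disk?
# Assumption: the RAMDISK size is given in KiB (262144 KiB = 256 MiB).
fits_in_ramdisk() {
    image_bytes=$1
    ramdisk_kib=$2
    [ "$image_bytes" -le $((ramdisk_kib * 1024)) ]
}

# Byte counts are the two archive sizes listed above.
for size in 251401030 374388382; do
    if fits_in_ramdisk "$size" 262144; then
        echo "$size bytes: fits"
    else
        echo "$size bytes: too large"
    fi
done
```

Passing this check does not guarantee that manufacturing succeeds, since the log shows the archive being unpacked under a separate tmpfs path.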

Problem: The manufacture script does not stop on error. Rookie mistake from them! Hopefully fixed in later versions though.

Update 1: That didn't work. Not Surprising.

writeimage.sh in 3.2 mfg environment
Code:
1637: TMPFS_SIZE_MB=512
2949: mount -t tmpfs -o size=${TMPFS_SIZE_MB}M,mode=700 none ${target_dir} || FAILURE=1
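
For comparison, a quick sketch that sets the 1115 MB uncompressed size from the log against this 512 MB tmpfs. It assumes the tmpfs mounted here is the one the image is unpacked into, and the helper name is hypothetical.

```shell
# Sketch: how far does the uncompressed image overshoot the tmpfs limit?
# Sizes are in MB; 1115 comes from the manufacture log, 512 from TMPFS_SIZE_MB.
tmpfs_shortfall_mb() {
    uncompressed_mb=$1
    tmpfs_mb=$2
    if [ "$uncompressed_mb" -gt "$tmpfs_mb" ]; then
        echo $((uncompressed_mb - tmpfs_mb))
    else
        echo 0
    fi
}

tmpfs_shortfall_mb 1115 512   # prints 603
```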
Update 2:
from writeimage.sh 2965f:
Code:
# If we are manufacturing, we make the partitions a little earlier.
# Of course the wget/curl could fail, but that's a risk we're taking.
Wow. Good job, Tall Maple Systems, Inc. / Mellanox!
 

mmx01

Freebsd1976 said:
mmx01 try 1G sfp to rj45 with qsa two year ago ,and it works( the other side is icx6450) , the sfp to rj45 is hp oem but can't remember its pn now
Thanks. I tested a few more 6COMGIGA GLC-T modules, which are the cheapest on Amazon here in the EU, and all of them work reliably (at least over a week in the home lab)! What an unusual situation, where the more expensive modules fail and the cheapest contender wins.

My only regret is the lack of netmap support for mlx4 on the otherwise affordable ConnectX-3 series, which gives us 56 Gbps Ethernet... :(
 

Stephan

@mmx01 Make sure to test those modules under full load i.e. 1 Gbps in and 1 Gbps out simultaneously. There shouldn't be any large drop in throughput. If there is, experiment with flow control on/off on that port.
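
One way to generate that kind of load is iperf3 in bidirectional mode. A minimal sketch, assuming iperf3 3.7 or newer (which added --bidir) with a server already running on the far side; the address is a placeholder and the helper only prints the command.

```shell
# Sketch: build a simultaneous send/receive test against a remote iperf3 server.
# Assumptions: iperf3 >= 3.7 on both ends; 192.0.2.10 is a placeholder address.
bidir_test_cmd() {
    server=$1
    seconds=${2:-60}
    echo "iperf3 -c $server --bidir -t $seconds"
}

# Print the command for review before running it by hand.
bidir_test_cmd 192.0.2.10
```

Comparing the two directions with flow control on and off should show whether throughput drops under full-duplex load.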
 

Dade49

I read the thread and have a good idea of how this is supposed to work. I'm having trouble getting the tools installed in my VM. Can anyone point me in the right direction? Thanks
 

nbritton

  1. Dump EEPROMs
    1. /opt/tms/bin/mellaggra _read_fru 1 0x51 1000 fru_backplate.bin
    2. /opt/tms/bin/mellaggra _read_fru 0 0x50 1000 fru_cpu.bin
Did I miss a step? It doesn't seem to work...

Code:
[admin@switch-e1c06c ~]# /opt/tms/bin/mellaggra _read_fru 1 0x51 1000 fru_backplate.bin
Read fru 1 - 0x51, sz - 4096 to file fru_backplate.bin
EEPROM 1 - 0x51 was not read.
Also, now that we have shell access, why not just install OpenOCD and a full developer userland environment via an NFS mount or USB flash? I want to update the kernel and U-Boot, and bolt on extra stuff like Open vSwitch, ebtables, iptables, SR-IOV, vyos-cli, OpenWrt, ONIE, et al.
 

Stephan

Try /opt/tms/bin/mellaggra _read_fru 8 0x51 1000 fru_backplate.bin
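
If bus 8 does not answer either, the same read can be tried across the bus numbers in turn. This is only a sketch: it assumes the bus argument ranges over 0 to 9, that a failed read prints "was not read" as in the output above, and the function name and READ_FRU override are invented for the example.

```shell
# Sketch: try the FRU read on several bus numbers and print the first that answers.
# Assumptions: buses 0-9; failure is recognised by the "was not read" message.
# READ_FRU can be pointed at a stub for a dry run away from the switch.
READ_FRU=${READ_FRU:-/opt/tms/bin/mellaggra}

find_fru_bus() {
    addr=$1
    out=$2
    for bus in 0 1 2 3 4 5 6 7 8 9; do
        # Skip buses where the tool itself fails to run or exits non-zero.
        result=$("$READ_FRU" _read_fru "$bus" "$addr" 1000 "$out" 2>&1) || continue
        case $result in
            *"was not read"*) continue ;;
        esac
        echo "$bus"
        return 0
    done
    return 1
}

# Example, run on the switch:
#   find_fru_bus 0x51 fru_backplate.bin
```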

We not only have shell access, but thanks to the recently passed godfather of U-Boot, Wolfgang Denk, also full PPC4xx debug possibilities. I never said it anywhere else but it was Wolfgang who generously supplied me with the necessary BDI binaries. So we could do neat stuff like single stepping through scheduler code, or the bootloader, or anything.

Why not: You will spend a boat load of time on a rare switch, and if you get lucky five people will flash their switch. Probably more satisfying to contribute THAT much time elsewhere in the open source sphere for the profit of a community. Personally I am content with my four SX6012 switches at 56 Gbps as-is.
 

nbritton

Stephan said:
Why not: You will spend a boat load of time on a rare switch, and if you get lucky five people will flash their switch. Probably more satisfying to contribute THAT much time elsewhere in the open source sphere for the profit of a community. Personally I am content with my four SX6012 switches at 56 Gbps as-is.
I was a principal computer systems engineer but the government forced me into early medical retirement, so I have all the time in the world to play with this unicorn. I'm working on creating an entirely new shared memory instruction set architecture and I need complete control over the switching infrastructure to do that, like I need to keep track of RDMA fabric latencies in the kernel scheduler. 56Gb is fast enough for a proof of concept and dirt cheap relatively speaking.

Realize that the PPC460EX board is the computer itself, and that the SwitchX-2 ASIC switching board that is attached to the PPC460EX board is for all intents and purposes just a regular PCIe device. This means you can simply use the flint command to reprogram the SwitchX ASIC just like you can with any ConnectX PCIe add in card. I have taken the liberty of using flint to dump the firmware image from a known good and genuine Mellanox MSX6012F-2BFS running MLNX-OS 3.6.8012. Simply download the fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin file to the tftp booted manufacturing image and then run:

Code:
mst start;
flint -d /dev/mst/mt51000_pci_cr0 --allow_psid_change --image fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin burn;
I have also dumped the firmware configuration file for this particular switch (MT_1270110020). It's just an ASCII text file that contains all of the configurable INI parameters for the SwitchX-2 ASIC. You can modify the parameters individually and then use the mlxburn utility, from the Mellanox Firmware Tools (MFT) package, to compile a new custom firmware image .bin file from the manufacturing firmware file fw-SX-rel-9_4_5110-FIT.mfa:


Code:
mlxburn -wrimage fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin -fw fw-SX-rel-9_4_5110-FIT.mfa -conf fw_SwitchX2-rel-9_4_5110-MT_1270110020.ini
Code:
[admin@sx6012fx2-e1c06c ~]# curl -Oks http://www.qubitdyne.com/storage/app/media/fw_SwitchX2-rel-9_4_5110-MT_1270110020.zip;
[admin@sx6012fx2-e1c06c ~]# unzip -q fw_SwitchX2-rel-9_4_5110-MT_1270110020.zip;
[admin@sx6012fx2-e1c06c ~]# ls -l
total 5870
drwxr-xr-x 2 admin root       0 Dec  5 09:25 __MACOSX
-rw-r--r-- 1 admin root 1254436 Feb 22  2019 fw-SX-rel-9_4_5110-FIT.mfa
-rw-r--r-- 1 admin root    4096 Dec  5 02:57 fw_SwitchX2-fru_backplate-MT_1270110020.bin
-rw-r--r-- 1 admin root      78 Dec  5 02:57 fw_SwitchX2-fru_backplate-MT_1270110020.md5
-rw-r--r-- 1 admin root 1590940 Dec  5 02:57 fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin
-rw-r--r-- 1 admin root   92038 Dec  5 02:57 fw_SwitchX2-rel-9_4_5110-MT_1270110020.ini
-rw-r--r-- 1 admin root      72 Dec  5 02:57 fw_SwitchX2-rel-9_4_5110-MT_1270110020.md5
-rw-r--r-- 1 admin root 2458964 Dec  5 09:25 fw_SwitchX2-rel-9_4_5110-MT_1270110020.zip
-rw-r--r-- 1 admin root  604586 Dec  5 09:09 msx6012fx2-mlnx-os-switch-vpi-config.bin
-rw-r--r-- 1 admin root    2881 Dec  5 09:11 msx6012fx2-mlnx-os-switch-vpi-config.txt
Do any of the MLNX-OS images for the x86 SwitchX-2 contain a newer version of the firmware? If so, you can use this configuration file to build a custom firmware image from that newer version. Allegedly the SwitchX-2 VPI ASIC can also work with Fibre Channel, so there is a lot to play with here.

Code:
[admin@sx6012fx2-e1c06c ~]# mst start
Starting MST (Mellanox Software Tools) driver set
[warn] mst_pci is already loaded, skipping
[warn] mst_pciconf is already loaded, skipping
Create devices
[admin@sx6012fx2-e1c06c ~]# cd /dev/mst
[admin@sx6012fx2-e1c06c mst]# ls -l
total 0
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-0 -> /dev/i2c-0
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-1 -> /dev/i2c-1
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-2 -> /dev/i2c-2
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-3 -> /dev/i2c-3
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-4 -> /dev/i2c-4
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-5 -> /dev/i2c-5
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-6 -> /dev/i2c-6
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-7 -> /dev/i2c-7
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-8 -> /dev/i2c-8
lrwxrwxrwx 1 admin root     10 Dec  5 02:05 dev-i2c-9 -> /dev/i2c-9
crw------- 1 admin root 253, 0 Dec  4 02:00 mt51000_pci_cr0
crw------- 1 admin root 252, 0 Dec  4 02:00 mt51000_pciconf0
[admin@sx6012fx2-e1c06c mst]# flint -d /dev/mst/mt51000_pci_cr0 -override_cache_replacement query full

-W- Firmware flash cache access is enabled. Running in this mode may cause the firmware to hang.
Image type:          FS2
FW Version:          9.4.5110
FW Release Date:     12.2.2019
MIC Version:         2.0.0
Device ID:           51000
Description:         Node             Sys image
GUIDs:               f4521403005a7330 f4521403005a7330
Description:         Base             Switch
MACs:                    f452145a7330     f452145a7390
VSD:                 n/a
PSID:                MT_1270110020
[admin@sx6012fx2-e1c06c ~]# flint -d /dev/mst/mt51000_pci_cr0 -override_cache_replacement hw query

-W- Firmware flash cache access is enabled. Running in this mode may cause the firmware to hang.
HW Info:
  HwDevId               581
  HwRevId               0x2
Flash Info:
  Type                  W25QxxBV
  TotalSize             0x800000
  Banks                 0x2
  SectorSize            0x1000
  WriteBlockSize        0x10
  CmdSet                0x80
  QuadEn                0
  Flash0.WriteProtected   Disabled
  Flash1.WriteProtected   Top,1-SubSectors
[admin@sx6012fx2-e1c06c ~]# flint -d /dev/mst/mt51000_pci_cr0 -override_cache_replacement ri fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin

-W- Firmware flash cache access is enabled. Running in this mode may cause the firmware to hang.
[admin@sx6012fx2-e1c06c ~]# flint -d /dev/mst/mt51000_pci_cr0 -override_cache_replacement dc fw_SwitchX2-rel-9_4_5110-MT_1270110020.ini

-W- Firmware flash cache access is enabled. Running in this mode may cause the firmware to hang.
[admin@sx6012fx2-e1c06c ~]# flint -i fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin cs
-I- Calculating Checksum ...
Checksum: 9867d277d3afab65cf87af679d71ef3c
[admin@sx6012fx2-e1c06c ~]# flint -i fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin cs > fw_SwitchX2-rel-9_4_5110-MT_1270110020.md5

The easy way to cross flash the EMC SwitchX ASIC is to use another Mellanox SX switch: just activate the InfiniBand Subnet Manager in MLNX-OS and then plug the EMC switch into it. Then SSH into the MLNX-OS switch and run enable and "_shell". Then run ibswitches to find the LID, then run flint...
msx6012f-screenshot-1.png
Code:
sx6012fx2-e1c06c [standalone: master] > enable
sx6012fx2-e1c06c [standalone: master] # _shell
[admin@sx6012fx2-e1c06c ~]# ibswitches
Switch    : 0xec0d9a030062a980 ports 12 "SwitchX -  Mellanox Technologies" base port 0 lid 2 lmc 0
Switch    : 0xf4521403005a7332 ports 13 "MF0;sx6012fx2-e1c06c:SX6012/U1" enhanced port 0 lid 1 lmc 0
[admin@sx6012fx2-e1c06c ~]# flint -d lid-2 -image fw_SwitchX2-rel-9_4_5110-MT_1270110020.bin --allow_psid_change burn

    Current FW version on flash:  9.9.1260
    New FW version:               9.4.5110

    Note: The new FW version is older than the current FW version on flash.

Do you want to continue ? (y/n) [n] : y


    You are about to replace current PSID on flash - "EMC1270110020" with a different PSID - "MT_1270110020".
    Note: It is highly recommended not to change the PSID.

Do you want to continue ? (y/n) [n] : y
Burning FS2 FW image without signatures - OK
Restoring signature                     - OK
 

nbritton

The post just prior to this one addressed cross flashing the SwitchX-2 ASIC, which was preconfigured for EMC, to that of a genuine Mellanox switch. In this post I'll address installing MLNX-OS onto the PPC460EX single board computer itself. For this we will TFTP netboot the MLNX-OS version 3.2.0100 manufacturing image from the EMC U-Boot environment.

To do this, connect an RJ45 to DB9 serial console cable (e.g. a Cisco console cable) to the SX6012. I use a MacBook, so I also use a DB9 to FTDI USB adapter and iTerm2 running "screen /dev/tty.usbserial..."

Code:
# Enable tftp and http server on your MacBook, document root is /private/tftp/

MacBook:~ nbritton$ sudo su -
Password:
MacBook:~ root# cd /private/tftp;
MacBook:~ root# launchctl load -F /System/Library/LaunchDaemons/tftp.plist;
MacBook:~ root# python -m SimpleHTTPServer 80;
Hit the escape key just before U-boot loads EMC OS, then run
Code:
setenv mfg_extra_args ramdisk=384000
setenv serverip <ip address of your tftp server with MLNX-OS manufacturing image>
saveenv
run mfg
Once your switch loads the MLNX-OS manufacturing image, you want to use the MLNX-OS version 3.4.1124 image. You must use this image because support for SwitchX-2 ASICs wasn't introduced until release 3.4.0012. More importantly, MLNX-OS version 3.4.1124 has a working "_shell" command in the switch configuration CLI; you just have to enter the license key to activate it.

Code:
manufacture.sh -a -m ppc -u http://<ip_address>/image-PPC_M460EX-3.4.1124.img;
mlnx_mfg_screenshot.png

Reboot after successful imaging. To load the new MLNX-OS installation on the PPC460EX board, hit escape at the EMC U-Boot screen again and then run the U-Boot code block in the MLNX-OS IMAGE ROOT_1 section below.

Code:
MLNX-OS IMAGE ROOT_1:
setenv jffs2_args setenv bootargs root=/dev/mtdblock6 rootfstype=jffs2 rw reset_button=0 loglevel=6 loglevel=3
run jffs2_args boot_common_args;bootm ff000000 - ff1e0000

MLNX-OS IMAGE ROOT_2:
setenv jffs2_args setenv bootargs root=/dev/mtdblock7 rootfstype=jffs2 rw reset_button=0 loglevel=6 loglevel=3
run jffs2_args boot_common_args;bootm ff200000 - ff3e0000
For reference purposes, the flash data structures are as follows...
Code:
4ff000000.nor_flash: Found 1 x16 devices at 0x0 in 16-bit bank
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
erase region 0: offset=0x0,size=0x20000,blocks=127
erase region 1: offset=0xfe0000,size=0x8000,blocks=4
RedBoot partition parsing not available
Creating 6 MTD partitions on "4ff000000.nor_flash":
0x00000000-0x001e0000 : "KERNEL_1"
0x001e0000-0x00200000 : "FDT_1"
0x00200000-0x003e0000 : "KERNEL_2"
0x003e0000-0x00400000 : "FDT_2"
0x00f80000-0x00fa0000 : "UBOOTENV"
0x00fa0000-0x01000000 : "UBOOT"
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xd3 (Micron NAND 1GiB 3,3V 8-bit)
2 NAND chips detected
Scanning device for bad blocks
Creating 4 MTD partitions on "4e0000000.ndfc.nand":
0x00000000-0x20000000 : "ROOT_1"
0x20000000-0x40000000 : "ROOT_2"
0x40000000-0x46400000 : "CONFIG"
0x46400000-0x7c000000 : "VAR"

Please append a correct "root=" boot option; here are the available partitions:
1f00            1920 mtdblock0  (driver?)
1f01             128 mtdblock1  (driver?)
1f02            1920 mtdblock2  (driver?)
1f03             128 mtdblock3  (driver?)
1f04             128 mtdblock4  (driver?)
1f05             384 mtdblock5  (driver?)
1f06          524288 mtdblock6  (driver?)
1f07          524288 mtdblock7  (driver?)
1f08          102400 mtdblock8  (driver?)
1f09          880640 mtdblock9  (driver?)
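
The numbers above are consistent with each other, which can be checked with a little shell arithmetic. A sketch, assuming the NOR flash is mapped at 0xff000000 (inferred from the bootm addresses, not stated in the log); the helper names are made up.

```shell
# Sketch: derive sizes (KiB) and U-Boot addresses from the MTD ranges in the log.
# Assumption: the NOR flash is mapped at 0xff000000, as the bootm addresses suggest.
NOR_BASE=0xff000000

part_kib() {
    # $1 = start offset, $2 = end offset (hex, as printed by the kernel)
    echo $(( ($2 - $1) / 1024 ))
}

nor_addr() {
    # $1 = offset of a partition inside the NOR flash
    printf '%x\n' $(( $NOR_BASE + $1 ))
}

part_kib 0x00000000 0x20000000   # ROOT_1   -> 524288 (size listed for mtdblock6)
part_kib 0x40000000 0x46400000   # CONFIG   -> 102400 (size listed for mtdblock8)
nor_addr 0x001e0000              # FDT_1    -> ff1e0000
nor_addr 0x00200000              # KERNEL_2 -> ff200000
```

The six NOR partitions come first (mtdblock0 to mtdblock5), so ROOT_1 and ROOT_2 land on mtdblock6 and mtdblock7, matching the root= arguments used in the boot commands.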
Code:
reboot
Dec  7 19:45:17 (none) daemon.info init: Starting pid 11588, console /dev/ttyS0: '/sbin/swapoff'
# Dec  7 19:45:17 (none) daemon.info init: The system is going down NOW !!
Dec  7 19:45:17 (none) daemon.info init: Sending SIGTERM to all processes.
Dec  7 19:45:17 (none) syslog.info System log daemon exiting.
Please stand by while rebooting the system.
Restarting system.


U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)

CPU:   AMCC PowerPC 460EX Rev. B at 1000 MHz (PLB=166, OPB=83, EBC=83 MHz)
       Security/Kasumi support
       Bootstrap Option H - Boot ROM Location I2C (Addr 0x52)
       Internal PCI arbiter disabled
       32 kB I-Cache 32 kB D-Cache
Board: Mellanox PPC460EX Board
FDEF:  No
I2C:   ready
DRAM:   2 GB (ECC enabled, 333 MHz, CL3)
FLASH: 16 MB
NAND:  1024 MiB
PCI:   Bus Dev VenId DevId Class Int
PCIE0: link is not up.
PCIE1: successfully set as root-complex
        01  00  15b3  c738  0c06  00
Net:   ppc_4xx_eth0, ppc_4xx_eth1
Hit any key to stop autoboot:  0
=> setenv jffs2_args setenv bootargs root=/dev/mtdblock6 rootfstype=jffs2 rw reset_button=0 loglevel=6 loglevel=3
=> run jffs2_args boot_common_args;bootm ff000000 - ff1e0000
INIT: version 2.86 booting

Starting: PPC_M460EX 3.4.1124 2015-10-25 18:53:14 ppc
Starting udev: [  OK  ]
Setting clock  (utc): Wed Dec  7 19:46:26 UTC 2016 [  OK  ]
Setting hostname localhost:  [  OK  ]
Checking filesystems
Checking all file systems.
[  OK  ]
Remounting root filesystem in read-write mode:  [  OK  ]
Mounting local filesystems:  [  OK  ]
Running vpart script:  [  OK  ]
Applying file system skeletons: base_var base_config .
Running firstboot script Generating SSH1 RSA host key: [  OK  ]
Generating SSH2 RSA host key: [  OK  ]
Generating SSH2 DSA host key: [  OK  ]
    Starting sx_low_level_if:
Loading i2c_mux_pca954x driver  - Success
Loading glue logic low level  - Success
Loading watchdog  - Success
Loading cpld handler  - Success
Loading mellaggra module  - Success
Loading sx i2c module  - Success
Reloading udev:
Loading SX driver:[  OK  ]
Error: mlxi2c failed: cant read system type
[FAILED]
Enabling /etc/fstab swaps:  [  OK  ]
INIT: Entering runlevel: 3
Starting system services
Starting sx_low_level_if:      Starting sx_low_level_if:
    NOTE: i2c_mux_pca954x already loaded
    NOTE: sx low level if module already loaded
    NOTE: watchdog module already loaded
    NOTE: cpld handler module already loaded
    NOTE: mellaggra module already loaded
    NOTE: i2c sx module already loaded
[  OK  ]
Starting openibd:  IPoIB configuration for embedded system
Loading SX driver:[  OK  ]
Loading HCA driver and Access Layer:[  OK  ]
Setting up InfiniBand network interfaces:
Setting up service network . . .[  done  ]
Reloading udev:
[  OK  ]
Starting system logger: [  OK  ]
Starting kernel logger: [  OK  ]
Running renaming interfaces
Renaming: MAC: 50:6B:4B:15:6F:4E ifindex: 2 name: mgmt0
Renaming: MAC: 50:6B:4B:15:6F:4F ifindex: 3 name: mgmt1
Checking for unexpected shutdown

Probing for HRNG module
Starting rngd: [  OK  ]
Running system image: PPC_M460EX 3.4.1124 2015-10-25 18:53:14 ppc
Applying initial configuration: Dec 07 19:47:41 INFO    LOG: Initializing SX log with STDOUT as output file.
trace_emad type:0x0 max_cnt:0x0 direction:1
trace_reg id:0x0 max_cnt:0x0 :

Applying manufacturing configuration:
Starting internal_startup:  [  OK  ]
Starting tc_ingress_policy:  tc_ingress_policy system name is not obtained - use default IS5600MDC
mDNS policing rate=4000kbit burst=400k
Ingress policing enable on interface mgmt0 rate=9000kbit burst=900k
[  OK  ]
Starting clean_issnvram:  Deleting issnvram.txt
[  OK  ]
Starting intr_hndl:      Starting :
Loading int handler module - Success
[  OK  ]
Starting iss-nvram-mac:  [  OK  ]
Starting copy_rh_files_to_vtmp:  [  OK  ]
Starting sx_pra:      Starting proxy arp management:
Loading proxy arp management module - Success
[  OK  ]
Starting udevd:  Reloading udev...
[  OK  ]
Starting pm: [  OK  ]
Starting oops_dump_reg:      Starting kernel reg dump:
Loading kernel reg dump module - Success
[  OK  ]
Starting lnpuppetvar.sh:  [  OK  ]
Starting mst:  Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
[  OK  ]


Mellanox MLNX-OS Switch Management

switch-156f4e login: admin
Password:

Mellanox Switch


Mellanox configuration wizard

Do you want to use the wizard for initial configuration? yes

Step 1: Hostname? [switch-156f4e]
Step 2: Use DHCP on mgmt0 interface? yes
Step 3: Enable IPv6? [yes]
Step 4: Enable IPv6 autoconfig (SLAAC) on mgmt0 interface? [no]
Step 5: Enable DHCPv6 on mgmt0 interface? [no] yes
Step 6: Admin password (Enter to leave unchanged)?

You have entered the following information:

   1. Hostname: switch-156f4e
   2. Use DHCP on mgmt0 interface: yes
   3. Enable IPv6: yes
   4. Enable IPv6 autoconfig (SLAAC) on mgmt0 interface: no
   5. Enable DHCPv6 on mgmt0 interface: yes
   6. Admin password (Enter to leave unchanged): (unchanged)

To change an answer, enter the step number to return to.
Otherwise hit <enter> to save changes and exit.

Choice:

Configuration changes saved.

To return to the wizard from the CLI, enter the "configuration jump-start"
command from configure mode.  Launching CLI...

switch-156f4e [standalone: master] > enable
switch-156f4e [standalone: master] # configure terminal
switch-156f4e [standalone: master] (config) # image fetch http://192.168.1.178/image-PPC_M460EX-3.4.1124.img
100.0%  [#################################################################]
switch-156f4e [standalone: master] (config) # image install image-PPC_M460EX-3.4.1124.img 2
Step 1 of 4: Verify Image
100.0%  [#################################################################]
Step 2 of 4: Uncompress Image
100.0%  [#################################################################]
Step 3 of 4: Create Filesystems
100.0%  [#################################################################]
Step 4 of 4: Extract Image
100.0%  [#################################################################]
switch-156f4e [standalone: master] (config) # _shell
License key: LK2-RESTRICTED_CMDS_GEN2-88A1-NEWD-BPNB-1
[admin@switch-156f4e ~]# cd /config/mfg
[admin@switch-156f4e mfg]# ls
mfdb  mfdb.bak  mfincdb-prev  mfincdb.bak
[admin@switch-156f4e mfg]#
# The manufacture.sh script only writes the image to mtdblock6 (ROOT_1), mtdblock8 (CONFIG), and mtdblock9 (VAR), so now write the image again from your new MLNX-OS install to mtdblock7 (ROOT_2) for backup recovery. At this point you will now have direct access to the /config/mfg/mfdb database which contains the configuration information for the switch, i.e. /dev/mtdblock8.

Now you can start customizing the configuration database, such as the following...

Code:
/opt/tms/bin/mddbreq -v /config/mfg/mfdb set modify - /mfg/mfdb/switchx/system/chassis/config/type string SX6012

/opt/tms/bin/mddbreq -v /config/mfg/mfdb set modify - /mfg/mfdb/switchx/system/chassis/config/profile uint8 0

/opt/tms/bin/mddbreq -v /config/mfg/mfdb set modify - /mfg/mfdb/switchx/system/local_mgmt_pn string MSX6012F-2BFS

/opt/tms/bin/mddbreq -v /config/mfg/mfdb set modify - /mfg/mfdb/switchx/system/local_mgmt_sn string MT1803X05452

/opt/tms/bin/mddbreq -v /config/mfg/mfdb set modify - /mfg/mfdb/system/hostid string 656470A16218
I have attached the manufacture.sh "bash -x" debug output below.
 

Attachments


neggles

You can and should be running `manufacture.sh` with a few more switches, and you can do it straight with 3.6.8012;

Code:
manufacture.sh -v -v -t -a -m ppc -u ftp://ftp-user:ftp@host/image-PPC_M460EX-3.6.8012.img
(or with a HTTP URL if that's what you're using) - this makes it print out progress and skips the reboot at the end, allowing you to mount the filesystem and edit the default/manufacturing database before booting into the image;

Code:
# mount system from mfg and chroot
mkdir -p /mnt/root2 && \
mount -t jffs2 /dev/mtdblock7 /mnt/root2 && \
mount -t jffs2 /dev/mtdblock8 /mnt/root2/config && \
mount -t proc /proc /mnt/root2/proc && \
mount -o bind /sys /mnt/root2/sys && \
mount -o bind /dev /mnt/root2/dev && \
chroot /mnt/root2 /bin/bash

# now we clear the bootloader password
# default val =  $1$yCoib8pn$vSaWSssw2k17iOJRIdmcw/
mddbreq /config/mfg/mfincdb set modify - /system/bootmgr/password string ''
mddbreq /config/mfg/mfincdb.bak set modify - /system/bootmgr/password string ''

# check they're empty now
mddbreq /config/mfg/mfincdb query get - /system/bootmgr/password
mddbreq /config/mfg/mfincdb.bak query get - /system/bootmgr/password

# set the hwname
mddbreq /config/mfg/mfdb set modify - /mfg/mfdb/system/hwname string M460EX

# exit chroot then do this to reboot and run firstboot
umount /mnt/root2/dev && umount /mnt/root2/sys && umount /mnt/root2/proc && umount /mnt/root2/config && umount /mnt/root2 && reboot
License key for the restricted commands and full feature unlock:

Code:
# RESTRICTED_CMDS_GEN2
genlicense 2 RESTRICTED_CMDS_GEN2 '[redacted i guess]'
# output: LK2-RESTRICTED_CMDS_GEN2-88A1-NEWD-BPNB-1

# All switch features - this is actually more than is necessary
genlicense 2 EFM_SX '[redacted i guess]' -o efm_sx_ib_enabled true -o efm_sx_eth_enabled true -o efm_sx_l2_enabled true -o efm_sx_l3_enabled true -o efm_sx_fcf_enabled true -o efm_sx_max_num_hca_ports 64 -o efm_sx_active_ports 64 -o efm_sx_gw_ports 64 -o efm_sx_max_ufm_ports 64
# key:
#   LK2-EFM_SX-5L11-5M11-5K11-5T11-5U11-5G22-05J2-205N-2205-P220-88A2-L5T5-3MYW-9
# or without the unnecessary port limits in there:
#   LK2-EFM_SX-5M11-5K11-5T11-88A1-BBD0-JP82-X
After running through manufacture.sh and the firstboot process, just apply the 3.6.8012 image again through the webUI to update the alternate OS partition.

nbritton said:
Does any of the MLNX-OS images for the x86 SwitchX-2 contain a newer version of the firmware?
That's the last existing firmware for SwitchX-2 - and AFAIK none of the Mellanox SwitchX-2 switches are x86, only third-party ones. You can find the `.mfa` file in the 3.6.8012 firmware too btw - /opt/tms/bin/fw-SX-rel-9_4_5110-FIT.mfa

You'll also find that mainline Linux has a switchdev driver for SwitchX-2, and the SoC used on the COM Express module is just a regular old NXP/Freescale QorIQ PowerPC that has quite good mainline support as well thanks to OpenWrt.

That said if you do end up doing anything particularly weird and wonderful with these, I have one that's partly broken - blew out one of the i2c multiplexers, so 6 out of 12 ports can't read their SFP EEPROMs and therefore don't work - that would be neat to do something with.
 

neggles

NablaSquaredG said:
A word of warning:
Do NOT try to directly manufacture 3.6.8012 with settings from the conversion guide! It will not work.

I just tried (and ran with verbose) - The RAMDISK is not big enough for the 3.6.8012 tar!
It will give you the following error:

[snip]

I believe extending the RAMDISK size should work, I will try and let you know later!
Problem: The manufacture script does not stop on error. Rookie mistake from them! Hopefully fixed in later versions though.

Update 1: That didn't work. Not Surprising.

writeimage.sh in 3.2 mfg environment
Code:
1637: TMPFS_SIZE_MB=512
2949: mount -t tmpfs -o size=${TMPFS_SIZE_MB}M,mode=700 none ${target_dir} || FAILURE=1
Update 2:
from writeimage.sh 2965f:
Code:
# If we are manufacturing, we make the partitions a little earlier.
# Of course the wget/curl could fail, but that's a risk we're taking.
Wow. Good job, Tall Maple Systems, Inc. / Mellanox!
Just saw this post. I see someone (or myself) will have to put all the knowledge of how to do this back in one place again, since things which have been covered multiple times earlier in the thread keep cropping up and causing people trouble.

Code:
# manual mfg boot over tftp from u-boot shell
run mfg_load;if test 0${filesize} -gt 0; then echo Booting mfg ; run mfg_args mfg_common_args;bootm ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr_r} ; else ; echo Failed mfg load ; fi

# apply image over ftp from manufacturing environment without rebooting and with progress
manufacture.sh -v -v -t -a -m ppc -u ftp://ftp-user:ftp@host/image-PPC_M460EX-3.6.8012.img
See post #1072 for more details, but this skips the need to enlarge the RAM disk. It does take longer, though.

More of my notes on pastebin here and further up the thread.
 

nbritton

Neggles,

Thank you. I have another virgin EMC switch that I can try your instructions on. The trouble is that this thread is now 62 pages long, so this collective knowledge gets lost in the clutter. I have at least ten Mellanox SX switches in my lab, so I'm definitely interested in continuing development on the SwitchX-2 platform (particularly for my new shared memory architecture system) and in attempting to consolidate all of the information here into a single authoritative reference.

 

i386

Did I see the Mellanox secret in the command?
I hope Patrick/STH doesn't get a DMCA takedown notification from Nvidia for that...