Beware of EMC switches sold as Mellanox SX6XXX on eBay

andvalb

Member
Feb 15, 2021
27
25
13
Ulyanovsk, Russian Federation
Just used the dodgy route method to update a 'new in box' EMC 6012. I had tried the old method on it first and got as far as not being able to log in to the full mlx-linux to flash the ASIC, so I went back to square one.

Once you have the environment set up, this method is quite painless. The FRU reprogramming process feels a bit clunky, but it worked. I couldn't get my files to match the byte counts in the documentation, but after much checking and rechecking, the FRU conversion worked without a problem.

Does anyone know if there is a way to make the command line fan speed changes that SGS referenced on post #739 persist between reboots?

I would prefer to add this line to a startup script rather than hex-edit the tc file.
Add an admincli user with a password from the switch's web shell, then over SSH:
enable
config terminal
_shell
mddbreq /config/db/initial set modify - /auth/passwd/user/admincli/shell string "/bin/bash"
exit
Log in a second time and run the
reload command to reboot the switch.
After that, use WinSCP to modify the switch file system (log in with the admincli username and password).
Before making changes, use the built-in WinSCP terminal to run
mount -nwo remount,rw /
to remount root as read-write.
Then modify the files as needed.
 

NablaSquaredG

Well-Known Member
Aug 17, 2020
697
311
63
The CLI of my SX6036 is extremely slow. I think it has been like that forever, but I can't say for sure.

I have L3 Ethernet enabled on all of them and they're used in MLAG, could this be the reason?

Also, I've heard that there is some kind of (hardware) difference between standard SX6036 and SX6036G (i.e. an SX6036 that was upgraded with a license vs an SX6036 that already shipped with the L3 license).
Is that true? Could it be that the "original" SX6036G is shipped with the 200MHz clock mentioned earlier?
 

NablaSquaredG

Is there a collection of FRU files?

I think it would be interesting to collect FRU files of as many different switch models as possible for research purposes

Sadly it seems like @SGS is not active any longer
 

dodgy route

Member
Aug 12, 2020
42
57
18
Australia
Is there a collection of FRU files?

I think it would be interesting to collect FRU files of as many different switch models as possible for research purposes

Sadly it seems like @SGS is not active any longer
What do you mean exactly, or what are you trying to achieve?

I have 2 out of 3 relevant ones: one from the SX6012 that SGS provided, which I then converted to SX6018, and it has been tested.
No one has bothered with the SX6036 yet, or it may not be needed; unsure.

It's easy to convert to SX10xx using SGS's scripts too.
 

NablaSquaredG

What do you mean exactly, or what are you trying to achieve?
Research - I'd like to experiment with the backplane FRU files and see how the content is structured, etc...

I have a couple of SX6036, possibly different ASIC versions and OEMs, I will read out the FRU files soon
 

saratoga

New Member
Nov 5, 2021
7
2
3
The CLI of my SX6036 is extremely slow. I think it has been like that forever, but I can't say for sure.
I had the same issue, slow CLI and very slow web interface after updating from old firmware to newer on the HP site. Factory restore fixed that problem and restored it back to being responsive. I think probably since mine was an ebay IB switch the previous owner had loaded some custom settings that got screwed up when updating and switching to ethernet.
 

NablaSquaredG

I had the same issue, slow CLI and very slow web interface after updating from old firmware to newer on the HP site. Factory restore fixed that problem and restored it back to being responsive. I think probably since mine was an ebay IB switch the previous owner had loaded some custom settings that got screwed up when updating and switching to ethernet.
I definitely did a factory reset on them, because I managed to more or less brick one of the switches by crashing the CLI process o_O

Even after the factory reset, the switch was basically unusable, but recovered after some time. I suppose this is related to the known flash bug in some MLNX-OS versions

I think the unresponsiveness of the switch is related to my usage of MLAG... Because that seems to be a feature that just made it into the PPC versions by accident, but was initially developed and optimised for the Spectrum switches

But I will check the contents of the bootstrap ROM, somewhere earlier in this thread somebody mentioned that an upgrade to 3.6.8012 messed up the bootstrap ROM contents
 

solon

Member
Apr 1, 2021
57
5
8
Hey all,

I purchased an SX6012 a while ago and have finally gotten around to messing with it. Despite my best efforts to get ahold of a true Mellanox one, it appears to be an EMC version.

Code:
* Chasis Type        : DINGO
* Number of Ports    : 12
* U-Boot Revision    :U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)
* Firmware Revision  : 9.9.1260
* INI file Revision  : 0x31010016
Now I've found the guide referenced in this thread, but it seems all the links to the firmware hosted at support.hpe.com are dead. Does anyone know of an alternative place to get ahold of all those files?

Additionally, I haven't been able to communicate with it over the mgmt port. I can get in with the serial console and give the unit an IP address (I can't seem to find an option to enable DHCP), and then it will ping my router, but it refuses port 22 connections and no web interface is available. Is that normal for EMC units? I can't seem to find any commands in the serial help to enable an SSH server or a web interface either.

It'd be great to get a 40GbE network up and running if I can get this thing on a regular MLNX-OS, so I'd greatly appreciate any thoughts.
 

dodgy route

Now I've found the guide referenced in this thread, but it seems all the links to the firmware hosted at support.hpe.com are dead. Does anyone know of an alternative place to get ahold of all those files?
Which guide specifically? And the firmware most definitely and absolutely still works from hpe.
 

kellenw

New Member
Jan 15, 2022
15
8
3
Additionally, I haven't been able to communicate with it over the mgmt port. I can get in with the serial console and give the unit an IP address (I can't seem to find an option to enable DHCP), and then it will ping my router, but it refuses port 22 connections and no web interface is available. Is that normal for EMC units? I can't seem to find any commands in the serial help to enable an SSH server or a web interface either.

It'd be great to get a 40GbE network up and running if I can get this thing on a regular MLNX-OS, so I'd greatly appreciate any thoughts.
I just configured two SX6036 switches this week (SX6012 works the same way). Try following the instructions in the manual, starting on page 43. You can launch the wizard, and it'll allow you to set the switch up as a dhcp client on the mgmt port. Once you've done that, hook the mgmt port of the switch up to a network with an active DHCP server. Your Mellanox switch will get a dhcp address in this way, and you should be able to do your thing.
 

solon

Which guide specifically? And the firmware most definitely and absolutely still works from hpe.
Yours I think:

Nothing happens when I follow those links... unless my pihole has decided that hpe is an ad site, it's a little mysterious...
 

solon

I just configured two SX6036 switches this week (SX6012 works the same way). Try following the instructions in the manual, starting on page 43. You can launch the wizard, and it'll allow you to set the switch up as a dhcp client on the mgmt port. Once you've done that, hook the mgmt port of the switch up to a network with an active DHCP server. Your Mellanox switch will get a dhcp address in this way, and you should be able to do your thing.
Thanks! I'll give that a go tomorrow. I was looking for a way to launch the wizard, so that should be exactly what I need.
 

bentwire

New Member
Feb 19, 2022
9
3
3
Does anyone here have any experience with an MSX6720? I just got one off eBay for next to nothing. It seems to have a really old and apparently IBM version of MLNX-OS on it, 3.5.0100, and what I think is an MLNX version, 3.4.3002. I believe this is an IBM switch because it has an IBM-looking FRU part number, but I can't find any reference to it anywhere.

I mangled the URL in the awesome guide here (I followed it for an SX6012 back in early 2020) to get the latest X86_64 img for this thing, but I am not sure what the other versions are that I need to stage the process. Is it the same as what you have listed in your guide? I don't think HP has any x86_64 switches, so I think (?) I need to get them from Mellanox/Nvidia. Is this correct?

I basically want to get a VPI capable version of the OS on here so I can just have the one switch instead of this *and* my converted SX6012.
 

bentwire

Ahh, never mind! I got it updated. I have not figured out how to enable VPI yet... I'm pretty sure I saw that this switch supported it.

For the others out there that may have this switch, I went from the 3.5.0100 to 3.6.8010 then 3.6.8012. To get the files from the site use the last URL in the doc and s/PPC_M460EX/X86_64/ ...
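
For anyone who prefers not to edit the name by hand, the s/PPC_M460EX/X86_64/ substitution can be sketched in Python as below. The helper name to_x86_64 is made up for illustration, and the example file name is just the PPC image name seen elsewhere in this thread; the sketch only rewrites the string and does not check that such a file actually exists on the download site.

```python
def to_x86_64(name):
    """Swap the PowerPC platform tag for the x86_64 one.

    Mirrors sed's s/PPC_M460EX/X86_64/ (first occurrence only).
    """
    return name.replace("PPC_M460EX", "X86_64", 1)


# Example with a file name taken from this thread (assumed naming pattern).
print(to_x86_64("image-PPC_M460EX-3.6.8012.img"))
```

The same call works on a full URL, since it only touches the platform tag.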

Edit:

The guide is even more amazing than before! Through further experimentation and your guide I have all the necessary things enabled and installed!
 

dodgy route

Yours I think:

Nothing happens when I follow those links... unless my pihole has decided that hpe is an ad site, it's a little mysterious...
I did not get a notification for some reason and only just checked back by chance, weird issue, it works fine here, those links are well and working :/
 

Stephan

Well-Known Member
Apr 21, 2017
562
360
63
Germany
Looks like I just bricked my SX6012 by trying manufacture.sh with image-PPC_M460EX-3.6.8012.img:

== Calling writeimage to image system
zcat: write: No space left on device
zcat: crc error
zcat: Incorrect length

No more console, U-Boot gone? System appears to have NOR and NAND flash, any way to switch to a backup U-Boot or will a BDI2000 be the only way out of this? Thanks... ^^ :-/
 

andvalb

This is strange. It looks like you tried to manufacture directly from the latest version of the image, which does not work, but I have tried this too and it is safe.
Can you provide a full log?
Also, if you have a problem with the bootstrap, there will still be a console but at a different baud rate. Try other settings in the range 1200...115200.
Also, check the serial connection.
There is no such thing as a backup U-Boot.
# 16 MB NOR Flash:
# /dev/mtd0 [nmp] kernel 1 (raw partition, uImage kernel, cp / dd)
# /dev/mtd1 [nmp] kernel fdt 1 (raw partition, DTB, cp / dd)
# /dev/mtd2 [nmp] kernel 2 (raw partition, uImage kernel, cp / dd)
# /dev/mtd3 [nmp] kernel fdt 2 (raw partition, DTB, cp / dd)
# /dev/mtd4 [nmp] u-boot env (raw partition, U-boot env, cp / dd)
# /dev/mtd5 [nmp] u-boot (raw partition, U-boot, cp / dd)
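
For the baud-rate sweep suggested above, a minimal Python sketch of the order in which to try the rates might look like this. The list of standard rates and the choice to try 9600 first are my assumptions, not something taken from the switch documentation, and the function name rates_to_try is made up. It only prints the candidates and does not open a serial port.

```python
# Common serial console rates; the set itself is an assumption.
STANDARD_BAUD_RATES = [1200, 2400, 4800, 9600, 19200, 38400, 57600, 115200]


def rates_to_try(first=9600, low=1200, high=115200):
    """Return the standard rates within [low, high], with `first` moved to the front."""
    rates = [r for r in STANDARD_BAUD_RATES if low <= r <= high]
    if first in rates:
        rates.remove(first)
        rates.insert(0, first)
    return rates


for rate in rates_to_try():
    print(f"try console at {rate} baud")
```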
 

Stephan

Tried 1200...115200 in the usual steps but no output or console at all. Cable works because another switch works right away at 9600. Console output on that one (which I haven't touched yet) is also basically instant after power-good. Cable is one of those Cisco baby blue console cables.

I didn't run manufacture.sh with -v -v this time so nothing more in the console log. Original story is EMC -> use manufacture.sh to reflash to 3.4 successfully -> boot into OS, login as admin/admin -> do emc_to_6012 + delete u-boot password in config and eeprom using eetool -> reboot into manufacture env to flash 3.6.8012 -> then what you see above with script just hanging there. Mistake was probably to also use -B in "/sbin/manufacture.sh -a -m ppc -B -u http://192.168.128.100:8100/image-PPC_M460EX-SX_3.4.0012.img".

I checked JTAG options for the 460EX: slim. There is the Lauterbach TRACE32, but availability of a working bundle on the used market looks to me equal to zero. That really leaves me only the BDI 2000 with the right firmware, i.e. "pp4" (not "ppc"), with b20pp4gd + pp4jed20 / pp4jed21 depending on the BDI revision. The board silkscreen also calls for a BDI. I guess they designed the CPU board close to the AMCC Canyonlands.

Bash:
# /sbin/manufacture.sh -a -m ppc -B -u http://192.168.128.100:8100/image-PPC_M460EX-3.6.8012.img
====== Starting manufacture at 20220225-171318
====== Called as: /sbin/manufacture.sh -a -m ppc -B -u http://192.168.128.100:8100/image-PPC_M460EX-3.6.8012.img

==================================================
 Manufacture script starting
==================================================

== Using model: ppc
== Using kernel type: uni
== Using layout: MFL1
== Using partition name-size list:
== Using device list: /dev/mtd
== Using interface list: mgmt0 mgmt1
== Using interface naming: ifindex-sorted
== Smartd disabled
== Cluster enable: no
== Cluster ID: (none)
== Cluster description: (none)
== Cluster interface: (none)
== Cluster master virtual IP address: 0.0.0.0
== Cluster master virtual IP masklen: 0
== Cluster shared secret: (none)
== Cluster expected number of nodes: 0
- Assigning specified interface names in ifindex-sorted order
-- Mapping MAC: xxxxxxxxxx from: eth0 to: mgmt0
-- Mapping MAC: yyyyyyyyyy from: eth1 to: mgmt1
== Using image from URL: http://192.168.128.100:8100/image-PPC_M460EX-3.6.8012.img

== Calling writeimage to image system
zcat: write: No space left on device
zcat: crc error
zcat: Incorrect length
 

andvalb

You do not need to run manufacturing a second time once the first run (to the old image version) has completed successfully.
You just need to upgrade through the standard sequential update procedure (max version diff is 2, i.e. 3.2 -> 3.4 -> 3.6) to get to the latest version.
Manufacturing directly to the latest image release doesn't work.
A U-Boot image is already present in the firmware image archive.
You can use it to program the NOR flash with a simple and cheap programmer from AliExpress (a CH341 programmer).
Also, check the content of the bootstrap EEPROM on the CPU board using a programmer, as it can be damaged (better to do this as the first step).
Note: DO NOT USE A TEST CLIP to read or write the EEPROM/NOR chips on the CPU board. This causes the CPU to start up, and the chip content can be damaged.
Desolder the chip instead, and solder it back after reading/writing.
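
To illustrate the sequential update rule above (at most two minor versions per hop), here is a rough Python sketch. It assumes the "diff" is counted on the minor number of 3.x releases and that an image exists for every intermediate step, neither of which is confirmed here; the function name plan_upgrade is hypothetical.

```python
def plan_upgrade(current_minor, target_minor, max_step=2):
    """Return the list of 3.x minor versions to install, in order.

    Each hop moves forward by at most `max_step` minor versions.
    """
    if target_minor < current_minor:
        raise ValueError("downgrades are not covered by this sketch")
    path = [current_minor]
    while path[-1] < target_minor:
        path.append(min(path[-1] + max_step, target_minor))
    return path


# The example from the post: 3.2 -> 3.4 -> 3.6
print(" -> ".join(f"3.{minor}" for minor in plan_upgrade(2, 6)))
```

In practice the intermediate hops depend on which image files you can actually download, so treat the output as a starting point only.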
 

Stephan

Thank you for all the tips @andvalb, highly appreciated. I have been spending the better part of today on research about the BDI 2000. I have also been messaging people via e-mail, Twitter, and forums, including forums I just registered for, about the pp4 firmware and logic file. I found the ppc version over on eevblog, but that is for a different lineage of CPUs. Archive.org says the former CEOs of Abatron AG in Switzerland are Max Vock and Rudolf Dummermuth. If desperate enough, I might even try to call them up in their retirement and ask for help. I also already own a sizeable collection of AMCC 460EX BDI configuration files.

Fortunately I have a dump from eetool, which seems to be of a serial EEPROM storing MAC addresses, serial numbers, etc., so re-populating the U-Boot vars should not be a problem. My view of the switch's memory complex is:
  • 2 GB DDR2 ECC RAM
  • 16 MB NOR flash (U-Boot etc.) a JS28F128 on the backside of the CPU daughterboard
  • 1024 MB SLC NAND flash a Samsung K9WAG08U1D on topside CPU daughterboard
  • 24C02 256 byte EEPROM holding 0x52 "Bootstrap Option H - Boot ROM Location I2C (Addr 0x52)" topside CPU daughterboard
  • 24C32 4KB EEPROM holding 0x50 "fru_cpu.bin" topside CPU daughterboard
  • 24C32 4KB EEPROM holding 0x51 "fru_backplate.bin" on motherboard U33 next to daughterboard connector
Will be interesting to see what manufacture.sh has damaged. I think I have a backup of every crucial bit of data in those chips.

NOR-Layout:
Creating 6 MTD partitions on "4ff000000.nor_flash":
0x00000000-0x001e0000 : "KERNEL_1"
0x001e0000-0x00200000 : "FDT_1"
0x00200000-0x003e0000 : "KERNEL_2"
0x003e0000-0x00400000 : "FDT_2"
0x00f80000-0x00fa0000 : "UBOOTENV"
0x00fa0000-0x01000000 : "UBOOT"

NAND-Layout:
Creating 4 MTD partitions on "4e0000000.ndfc.nand":
0x00000000-0x20000000 : "ROOT_1"
0x20000000-0x40000000 : "ROOT_2"
0x40000000-0x46400000 : "CONFIG"
0x46400000-0x7c000000 : "VAR"
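
As a quick sanity check on the layout above, here is a small Python sketch that turns the start-end ranges into partition sizes. It assumes the end address on each line is exclusive (so size = end - start); the names LAYOUT and partition_sizes are made up for this example. The same function can be pointed at the NOR lines.

```python
import re

# NAND partition lines copied from the kernel log above.
LAYOUT = '''
0x00000000-0x20000000 : "ROOT_1"
0x20000000-0x40000000 : "ROOT_2"
0x40000000-0x46400000 : "CONFIG"
0x46400000-0x7c000000 : "VAR"
'''

LINE = re.compile(r'0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\s*:\s*"([^"]+)"')


def partition_sizes(text):
    """Map each partition name to its size in bytes (end treated as exclusive)."""
    sizes = {}
    for start, end, name in LINE.findall(text):
        sizes[name] = int(end, 16) - int(start, 16)
    return sizes


MIB = 1024 * 1024
for name, size in partition_sizes(LAYOUT).items():
    print(f"{name:8s} {size // MIB:5d} MiB")
```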

Will be sidetracked for the next two weeks, switch will soon live again I think. This kind of hardware build quality deserves it.
 