Beware of EMC switches sold as Mellanox SX6XXX on eBay

Discussion in 'Networking' started by mpogr, Aug 2, 2016.

  1. JTF195

    JTF195 New Member

    Joined:
    Nov 15, 2017
    Messages:
    9
    Likes Received:
    4
    After getting past the tty boot loop with the trick above, I'm sitting at 'Step 3: Creating the file systems on the switch and extracting the distro'

    I immediately ran into an issue. The commands
    Code:
    mount -t proc none /proc
    mount -t sysfs none /sys
    both fail.

    I'm not really sure why, and I also noticed they were already included in the /etc/init.d/rcS file in the jffs2 image, so it's possible they've already been run during init?

    I've been taking a break from working on this for a while.

    I'm thinking this might be part of my issue as well. I grabbed the one from the HP support site as well, but mpogr confirmed it was the correct image.

    What differences have you noticed? (And would you be willing to send me a copy by chance?)
     
    #181
  2. Terry Wallace

    Terry Wallace PsyOps SysOp

    Joined:
    Aug 13, 2018
    Messages:
    63
    Likes Received:
    17
    JTF,
    Those first two commands failed because /proc and /sys are already mounted, which you can check by running 'mount'. You can safely ignore those failures.
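A minimal sketch of that check, assuming the scratch Linux shell has awk and a readable /proc/mounts. The helper names is_mounted and mount_once are made up for illustration and are not part of the guide. The idea is to mount /proc and /sys only when nothing is mounted there yet, so the step can be rerun without errors.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MOUNT_POINT appears as the second field
# of a mount table (default /proc/mounts; pass another file for testing).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}" 2>/dev/null
}

# Mount a pseudo-filesystem only if nothing is mounted on the target yet.
mount_once() {
    fstype=$1
    target=$2
    if is_mounted "$target"; then
        echo "$target already mounted, skipping"
    else
        mount -t "$fstype" none "$target"
    fi
}
```

Usage would be `mount_once proc /proc` followed by `mount_once sysfs /sys`. If /proc itself is not mounted, the table cannot be read, the check fails, and the mount is attempted.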
    The only reason I suspect I got the wrong firmware is that when I burned it, it had an HPxxxxxxxx PSID rather than the MLXxxxxxx PSID, and now MLNX-OS on the switch says no switch ASIC was found.

    I'll PM you about the image
     
    #182
    Last edited: Oct 17, 2018
  3. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    3,273
    Likes Received:
    447
    Step 8: (optional) Flash Firmware locally

    Code:
    To flash the firmware locally you need shell access to the running MLNX-OS
    
    The steps are
    
    -Boot into scratch linux and adjust admin shell in passwd to /bin/bash (you will need to remove the symlink of /mnt/root2/etc/passwd and create a local copy in /mnt/root2/etc).
    
    -transfer created firmware file to the switch using tftp or scp (either in scratch Linux or MLNX-OS)
    
    -Boot MLNX-OS
    
    -identify target device
    
    mst status
    
    -make sure one more time you have the correct image
    
    flint -i <firmwarename>.bin q
    
    -flash firmware
    
    flint -allow_psid_change -override_cache_replacement -d <device> -i <firmwarename>.bin b
    
    Reboot the switch.
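Since an image with the wrong PSID was suspected earlier in the thread of leaving the switch without a visible ASIC, it may be worth checking the PSID of the image before burning. A rough sketch, assuming the flint query output contains a line of the form `PSID: <value>`; the helper names are hypothetical and not part of flint.

```shell
#!/bin/sh
# Hypothetical helper: read flint query output on stdin and print the
# value of the first "PSID:" line (assumed format: "PSID:   <value>").
psid_from_query() {
    awk '$1 == "PSID:" { print $2; exit }'
}

# Succeed only when the PSID ($1) starts with the given prefix ($2).
psid_has_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, `psid=$(flint -i <firmwarename>.bin q | psid_from_query)` followed by `psid_has_prefix "$psid" HP && echo "HP image, do not burn"` would flag the HP image before the burn step.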
     
    #183
    Last edited: Oct 17, 2018
  4. Terry Wallace

    Terry Wallace PsyOps SysOp

    Joined:
    Aug 13, 2018
    Messages:
    63
    Likes Received:
    17
    will try it now.. thanks
     
    #184
  5. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    3,273
    Likes Received:
    447
    Welcome. It's actually a part I added (along with a few other comments/improvements), which I thought would make its way into the "official" guide but might not have. GL
     
    #185
  6. Terry Wallace

    Terry Wallace PsyOps SysOp

    Joined:
    Aug 13, 2018
    Messages:
    63
    Likes Received:
    17
    And we're up and running. Thanks to @mpogr and @Rand__ for the help.

    Has anyone tried upgrading the firmware the normal way, using the standard Mellanox OS upgrades, after getting the switch converted?
    Say, going from 3.6.1002 up to 3.6.5009? Or will I run into trouble? Just thought I would ask.

    Thanks again to all

    Terry
     
    #186
  7. Rand__

    Rand__ Well-Known Member

    Joined:
    Mar 6, 2014
    Messages:
    3,273
    Likes Received:
    447
    Congrats.
    I have not tried since I expect trouble.
    Anything juicy in the newer fw?
     
    #187
  8. Kal G

    Kal G Active Member

    Joined:
    Oct 29, 2014
    Messages:
    147
    Likes Received:
    38
    Bare metal VXLAN is available starting in version 3.6.3004. Improvements to VXLAN were made in 3.6.4006.
     
    #188
  9. jspuij

    jspuij New Member

    Joined:
    Oct 11, 2018
    Messages:
    2
    Likes Received:
    1
    And another successful conversion. Many thanks to mpogr for providing the necessary tools!
    Stupid question maybe: I can't seem to get the link speed on InfiniBand from 40 Gb/s to 56 Gb/s. The switch seems to support 56 Gb/s according to its profile, but I cannot change the port speed. Does anybody have a tip?

    Edit: never mind, the ibportstate <lid> <port> espeed 1 command followed by ibportstate <lid> <port> reset did the trick. However, it takes a minute or so for the link to reconfigure in my case, and by that time I had already disabled and re-enabled the port and was back to 40 Gb/s.
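The two ibportstate commands from the edit above can be wrapped in a small script that waits before anything else touches the port. This is only a sketch: the function names, the DRY_RUN switch, and the 90-second default wait are assumptions for illustration (the post only says "a minute or so").

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: enable extended speed on LID/PORT, reset the port,
# then wait so the link can renegotiate before the port is toggled again.
set_espeed_and_reset() {
    lid=$1
    port=$2
    run ibportstate "$lid" "$port" espeed 1 || return 1
    run ibportstate "$lid" "$port" reset || return 1
    run sleep "${WAIT_SECS:-90}"
}
```

Running it with `DRY_RUN=1` first prints the commands without touching the fabric, which is a cheap way to confirm the LID and port number.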
     

    #189
    Last edited: Oct 23, 2018
  10. jspuij

    jspuij New Member

    Joined:
    Oct 11, 2018
    Messages:
    2
    Likes Received:
    1
    I'm trying to get two switches to form a HA IB cluster. Joining the second node fails however with the following messages in the log:

    Code:
    Nov  2 15:18:10 switch1 clusterd[25872]: [clusterd.ERR]: HMAC verification failed
    Nov  2 15:18:10 switch1 clusterd[25872]: [clusterd.ERR]: cl_verify_hmac_md5(), cl_comm.c:1503, build 1: Error code 14613 returned
    Nov  2 15:18:10 switch1 clusterd[25872]: [clusterd.ERR]: cl_ccl_msg_recv(), cl_comm.c:1554, build 1: HMAC verification failed
    
    Could this have anything to do with the shared secret being different between the two switches, and does it mean I have to reinstall one of the switches so that they share the same secret?

    Update: indeed, the shared secret set during database setup has to be the same on both switches:

    Code:
    /mfg/mfdb/cluster/config/shared-secret
    
    Note that a running system has this secret copied to the initial database, where it can be adjusted by changing the initial database (if you still have access to a shell):

    Code:
    mddbreq -c /config/db/initial set modify - /cluster/config/shared-secret string <YOURSECRETSTRING>
    
    Unfortunately, this means that converted switches will never be able to form an HA cluster with a genuine Mellanox switch unless somebody is willing to extract the Mellanox shared secret from a genuine switch.
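To illustrate why a secret mismatch shows up as "HMAC verification failed" in the log, here is a small sketch using openssl. It only demonstrates the general HMAC-MD5 idea; the actual clusterd message format is not known here, and the function names are made up for the example.

```shell
#!/bin/sh
# Illustration only: print the HMAC-MD5 of MESSAGE ($2) under SECRET ($1).
hmac_md5() {
    printf '%s' "$2" | openssl dgst -md5 -hmac "$1" | awk '{ print $NF }'
}

# The receiver recomputes the HMAC with its own secret and compares it
# with the tag that arrived with the message.
# $1 = receiver's secret, $2 = message, $3 = HMAC sent with the message
verify() {
    [ "$(hmac_md5 "$1" "$2")" = "$3" ]
}
```

A tag computed with one secret verifies only on a node holding the same secret, so two switches manufactured with different shared secrets reject each other's messages.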
     
    #190
    Last edited: Nov 3, 2018
  11. metag

    metag New Member

    Joined:
    Apr 26, 2016
    Messages:
    26
    Likes Received:
    4
    Just to report another successful conversion. I cannot thank @mpogr enough for the work and the detailed guide! I also want to thank @JSLEnterprises for the base image, @Rand__ for the local firmware upgrade instructions, and everyone else sharing their experience in this thread: a lot of collective wisdom there!
     
    #191
  12. cecil1783

    cecil1783 New Member

    Joined:
    Aug 9, 2017
    Messages:
    7
    Likes Received:
    8
    Hey all,

    I'm having some trouble flashing my switch. I've gotten everything up through step 6 of the guide, but when I try to boot into the new OS I get an endless stream of errors:

    Code:
    U-Boot 2009.01 SX_PPC_M460EX SX_3.2.0330-82-EMC ppc (Feb 27 2013 - 12:13:42)
    
    CPU:   AMCC PowerPC 460EX Rev. B at 1000 MHz (PLB=166, OPB=83, EBC=83 MHz)
           Security/Kasumi support
           Bootstrap Option H - Boot ROM Location I2C (Addr 0x52)
           Internal PCI arbiter disabled
           32 kB I-Cache 32 kB D-Cache
    Board: Mellanox PPC460EX Board
    FDEF:  No
    I2C:   ready
    DRAM:   2 GB (ECC enabled, 333 MHz, CL3)
    FLASH: 16 MB
    NAND:  1024 MiB
    PCI:   Bus Dev VenId DevId Class Int
    PCIE0: link is not up.
    PCIE1: successfully set as root-complex
            01  00  15b3  c738  0c06  00
    Net:   ppc_4xx_eth0, ppc_4xx_eth1
    Hit any key to stop autoboot:  0
    INIT: version 2.86 booting
    
    Starting: PPC_M460EX 3.6.1002 2016-06-09 20:24:26 ppc
    Starting udev: [  OK  ]
    Setting clock  (utc): Thu Nov 15 05:08:18 UTC 2018 [  OK  ]
    Setting hostname localhost:  [  OK  ]
    Checking filesystems
    Checking all file systems.
    [  OK  ]
    Remounting root filesystem in read-write mode:  [  OK  ]
    Mounting local filesystems:  [  OK  ]
    Running vpart script:  [  OK  ]
    Applying file system skeletons: base_var base_config .
    Running firstboot script usage: /sbin/aiset.sh -i [-l NEXT_BOOT_ID] [-p MD5_PASSWORD] [-r] [-f {true,false}] [-F] [-E]
    usage: /sbin/aiset.sh -m -d BOOT_DISK [-L LAYOUT] [-l NEXT_BOOT_ID]
              [-p MD5_PASSWORD]
    
    -i: not running at manufacture time (generally image install)
    -m: running at manufacture time
    
    -l NEXT_BOOT_ID: image location to boot from: 1 or 2
    -d BOOT_DISK: (mfg only) /dev/sda or /dev/hda
    -L LAYOUT: (mfg only) image layout, like STD
    -w HWNAME: (mfg only) hardware name (usually optional on x86)
    -p MD5_PASSWORD: MD5 encrypted password
    -r: (install only) re-install the bootmgr itself (GRUB or u-boot)
    -f {true,false}: enable or disable fallback reboot behavior for next boot
    -I IMAGE_LOCATION_ID -s IMAGE_LOCATION_STATE : exclusive with -l
            States are: 0=invalid; 1=active; 2=fallback; 3=manual
    -F FIPS: Use this flag to add fips=1 flag for command line run
    -E FIPS_DISABLE: User this flag to set fips=0 for command line run
    
    Writes a grub.conf which use the selected next boot location,
       and which contains the installed image version strings.
    
    Generating SSH1 RSA host key: [  OK  ]
    Generating SSH2 RSA host key: [  OK  ]
    Generating SSH2 DSA host key: [  OK  ]
        Starting sx_low_level_if:
    Loading i2c_mux_pca954x  - Success
    Loading sx_glue_if  - Success
    Loading watchdog  - Success
    Loading cpld_handler  - Success
    Loading mellaggra_mod  - Success
    Loading switchx  - Success
    Reloading udev:
    Loading SX driver:[  OK  ]
    Stopping iss-nvram-mac
    Stopping sx_low_level_if
    switchx module unloaded
    mellaggra_mod module unloaded
    cpld_handler module unloaded
    watchdog module unloaded
    i2c_mux_pca954x module unloaded
    mlx system_profile: 3
    mlx system_type: SX6018
    mlx system_oid: 1.3.6.1.4.1.33049.1.1.1.6018
        Starting sx_low_level_if:
    Loading i2c_mux_pca954x  - Success
    NOTE: sx_glue_if module is already loaded
    Loading watchdog  - Success
    Loading cpld_handler  - Success
    Loading mellaggra_mod  - Success
    Loading switchx  - Success
    [  OK  ]
    Enabling /etc/fstab swaps:  [  OK  ]
    INIT: Entering runlevel: 3
    Starting system services
    Starting sx_low_level_if:      Starting sx_low_level_if:
    NOTE: i2c_mux_pca954x module is already loaded
    NOTE: sx_glue_if module is already loaded
    NOTE: watchdog module is already loaded
    NOTE: cpld_handler module is already loaded
    NOTE: mellaggra_mod module is already loaded
    NOTE: switchx module is already loaded
    [  OK  ]
    Starting openibd:  IPoIB configuration for embedded system
    Loading SX driver:[  OK  ]
    Loading HCA driver and Access Layer:[  OK  ]
    Setting up InfiniBand network interfaces:
    Setting up service network . . .[  done  ]
    Reloading udev:
    [  OK  ]
    Starting system logger: [  OK  ]
    Starting kernel logger: [  OK  ]
    Starting fips_post:  [  OK  ]
    Renaming: no changes required
    /etc/rc3.d/S15rename_ifs: line 326: return: can only `return' from a function or sourced script
    Renaming: no changes for: MAC: 00:02:C9:64:18:4C ifindex: 2 name: mgmt0
    Renaming: no changes for: MAC: 00:02:C9:64:18:4D ifindex: 3 name: mgmt1
    Running renaming interfaces
    /etc/rc3.d/S15rename_ifs: line 384: do_rename_ifs: comma__nand_correct_data: uncorrectable ECC errornd not found
    Checking for une
    __nand_correct_data: uncorrectable ECC errorxp__nand_correct_data: uncorrectable ECC errorec
    __nand_correct_data: uncorrectable ECC errorted shutd__nand_correct_data: uncorrectable ECC errorow
    __nand_correct_data: uncorrectable ECC errorn
    
    __nand_correct_data: uncorrectable ECC errorblk_update_request: 22 callbacks suppressed
    end_request: I/O error, dev mtdblock6, sector 0
    quiet_error: 22 callbacks suppressed
    Buffer I/O error on device mtdblock6, logical block 0
    
    Probing for HRNG module
    Starting rngd: [  OK  ]
    Running syst__nand_correct_data: uncorrectable ECC errorem image: PPC_M460EX 3.6.1002 2016-06-09 20:24:2
    __nand_correct_data: uncorrectable ECC error6 ppc
    __nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC errorend_request: I/O error, dev mtdblock6, sector 8
    Buffer I/O error on device mtdblock6, logical block 1
    __nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    end_request: I/O error, dev mtdblock6, sector 16
    Buffer I/O error on device mtdblock6, logical block 2
    __nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC errorend_request: I/O error, dev mtdblock6, sector 24
    Buffer I/O error on device mtdblock6, logical block 3
    Applying initial configuration: __nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC errorend_request: I/O error, dev mtdblock6, sector 0
    Buffer I/O error on device mtdblock6, logical block 0
    __nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC error__nand_correct_data: uncorrectable ECC error
    __nand_correct_data: uncorrectable ECC errorend_request: I/O error, dev mtdblock6, sector 0
    
    I can still boot into the EMC OS and the scratch Linux area.

    Any thoughts? Did I edit a file wrong somewhere?


    Thanks!
     
    #192
  13. metag

    metag New Member

    Joined:
    Apr 26, 2016
    Messages:
    26
    Likes Received:
    4
    It's normal. I just waited it out. I believe it's a first time thing.
     
    #193
  14. cecil1783

    cecil1783 New Member

    Joined:
    Aug 9, 2017
    Messages:
    7
    Likes Received:
    8
    Well I'll be a monkey's uncle...that worked. Thanks a ton for the quick reply!
     
    #194
  15. Poco

    Poco New Member

    Joined:
    Nov 16, 2018
    Messages:
    3
    Likes Received:
    0
    Hi.

    I just bought one of these SX6005 switches before seeing this post.
    If you're still able to help with this I'd greatly appreciate it.

    Kind regards.
     
    #195
  16. cecil1783

    cecil1783 New Member

    Joined:
    Aug 9, 2017
    Messages:
    7
    Likes Received:
    8
    The SX6005 cannot be converted to a managed switch, as it does not have the necessary hardware. You should be able to flash it to normal SX6005 firmware, though, using the publicly available firmware (http://www.mellanox.com/page/firmware_table_SwitchX) and flint (http://www.mellanox.com/page/management_tools). You will need to have opensm running on an InfiniBand-connected machine to do the flash.
     
    #196
  17. herby

    herby Active Member

    Joined:
    Aug 18, 2013
    Messages:
    158
    Likes Received:
    35
    Would this process be less challenging than flashing an SX6012/6018/6036?

    For my application 40GbE unmanaged would be fine and the SX6005 is cheaper.
     
    #197
  18. cecil1783

    cecil1783 New Member

    Joined:
    Aug 9, 2017
    Messages:
    7
    Likes Received:
    8
    The unmanaged switches (SX6xx5) are not capable of Ethernet. Ethernet mode requires a license on the managed switches (though no license is actually required on our flashed switches).

    To your first question, yes, flashing the unmanaged switches would be considerably easier.

    Edit: another thought: if you've got ConnectX VPI cards at all ends, and you're willing to run opensm on one system somewhere, you could run IPoIB on the unmanaged switches. That nets the same result, but requires that everything along the path supports it.

     
    #198
    Last edited: Nov 21, 2018
  19. herby

    herby Active Member

    Joined:
    Aug 18, 2013
    Messages:
    158
    Likes Received:
    35
    Oh well, that's too bad. I'm running FreeNAS at the moment and it doesn't do InfiniBand.
    Thanks for the info. I'll have to think about whether I want to gamble on one of the SX6012s.
     
    #199
  20. cecil1783

    cecil1783 New Member

    Joined:
    Aug 9, 2017
    Messages:
    7
    Likes Received:
    8
    I just finished setting up my FreeNAS, actually, and I was pretty disappointed to see there was no IB support. It does handle 56GbE pretty well (everything connected, anyway, haven't had a chance to test throughput yet), though.


     
    #200
