Modding/upgrading Arista switches?

Discussion in 'Networking' started by oddball, May 18, 2018.

  1. WANg

    WANg Active Member

    Joined:
    Jun 10, 2018
    Messages:
    494
    Likes Received:
    190
    Well, that's the thing - I think most DCS-7050QX-32S models are Crow (since it was first announced in late 2015, pretty close to the Athlon II Neo EOL), but some of the early production units are Raven, at least to the point where they specify it. If you look at the filesystem on your switch, where they specify the familial relationships for the NorCal family of switches, you'll see this:

    Code:
    find /usr/share/NorCal/ -type f -name "*.fdl" | xargs grep 7050QX-32S\"
    
    /usr/share/NorCal/ClearlakeCrowS1.fdl:baseSku = "DCS-7050QX-32S"
    /usr/share/NorCal/ClearlakeRavenS1.fdl:baseSku = "DCS-7050QX-32S"
    Arista definitely specified both boards as officially supported in the model.

    And yeah, the machine is called ClearLake(Crow|Raven): the ClearLake switch board (that's 32 QX/40Gbit QSFP+ ports + 4 10GbE SFP+ ports in the S model) + the (Crow|Raven) CPU/management board.
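    If you want to tabulate which CPU boards are listed for each SKU, the grep output above can be grouped with a few lines of Python. This is only a sketch: the two sample lines are copied from the output above, and the assumption that the board name ("Crow" or "Raven") appears in the .fdl file name is inferred from those two file names, nothing more.

```python
import re
from collections import defaultdict

# Sample lines copied from the grep output quoted in the post.
GREP_OUTPUT = '''\
/usr/share/NorCal/ClearlakeCrowS1.fdl:baseSku = "DCS-7050QX-32S"
/usr/share/NorCal/ClearlakeRavenS1.fdl:baseSku = "DCS-7050QX-32S"
'''

# One grep hit: <path>:baseSku = "<sku>"
LINE_RE = re.compile(r'^(?P<path>[^:]+):baseSku = "(?P<sku>[^"]+)"$')

# CPU board names mentioned in this thread (assumed to appear in the file name).
BOARDS = ("Crow", "Raven")


def boards_by_sku(grep_output):
    """Map each baseSku to the CPU board names found in the matching file names."""
    result = defaultdict(list)
    for line in grep_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a grep hit
        file_name = match.group("path").rsplit("/", 1)[-1]
        for board in BOARDS:
            if board in file_name:
                result[match.group("sku")].append(board)
    return dict(result)


print(boards_by_sku(GREP_OUTPUT))
```

    On a real switch you would feed it the full output of the find/grep command instead of the two sample lines.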

    As for whether non-ECC is kosher, that's an even trickier question. Technically the GX420CA APU can support both ECC and non-ECC consumer (aka desktop) DDR3/3L RAM, and in most installations that's usually 1.35v (whether it supports 1.5v depends on the hardware implementation). The ECC is the unbuffered, unregistered type. I know that the APU has a built-in single-channel RAM controller, and it's supposedly quad rank max. 4GB and 8GB ECC DIMMs will definitely work (since 8GB ECC RAM is shipped on the ClearLakePlusCrow models, that is, the DCS-7050QX2-32S).
    The thing I am not sure about is whether the APU can deal with 16GB DIMMs, since most implementors of GX420CA hardware, like the HP t620 Plus, roll out dual DIMM slots, and in the case of the t620, 2 8GB laptop DIMMs get it to 16GB max. @fossexplorer ordered a pair of 16GB SODIMMs, so if the RAM works on the t620 Plus, then we know there is a chance of getting it to work on the Crow. This is useful since the Crows only have a single DIMM slot, and 16GB is a great amount for a switch.

    As for 1.35 versus 1.5? I am not 100% sure - you might have a revision that can do both 1.35 and 1.5. I have a few 4GB DDR3 (1.5v) desktop DIMMs in the office, but I haven't had a chance to try them yet. As for non-ECC? I think @oddball said that he tried non-ECC and it worked, but registered RAM will puke. For toying around, a regular desktop DIMM will work, but if it's something you rely upon, I'd go ECC.
     
    #81
    Last edited: Nov 26, 2018
  2. spali

    spali Member

    Joined:
    Nov 4, 2018
    Messages:
    32
    Likes Received:
    3
    The soldering points are on the board... so in theory it could also be possible to add a second slot. Did anyone take pictures of the bottom of the board? Usually these pins are not too small and I think it should be possible to solder it. But I don't know where to get a slot from ;)
    It's not clear if he spoke about the "normal" or the ECC registered. What I got from my research... ECC registered is not compatible with ECC unregistered. I think if he tried ECC unregistered it would work as it does on mine. But I could be wrong.
     
    #82
  3. WANg

    WANg Active Member

    You could, in theory. Just look through Mouser or Newark (the people behind Element14) for a 240-pin DDR3 DIMM socket, or you could pull one from a dead board. As for the entire RAM thing, he was referring to ECC unregistered. I looked through the datasheet for the GX420CA, and it specifically mentions UDIMMs or SODIMMs, no RDIMMs, and I really doubt that the APU (optimized for embedded applications) has support for them baked in.
     
    #83
    Last edited: Nov 26, 2018
  4. WANg

    WANg Active Member

    Okay, so I can now confirm that why yes, the DCS-7050QX-32S can take non-ECC RAM @ 1.5v. Managed to find a stick of HyperX Fury DDR3-1866 8GB RAM in a spare machine at work, and to my surprise, it seems to work just fine. I don't recommend using non-ECC for production, but if you don't mind your homelab going down once in a while, this is probably okay. Next step is to hunt down some 16GB desktop DDR3L units and see how that goes.

    [IMG]

    Also, it looks like the 4 pin USB2 header works just fine with the 9 pin USB DOM connector. That's the Sandisk Cruzer yet again sticking out of the USB port. So there we go, how to avoid spending stupid money on something unnecessary (like a USB DOM).

    [IMG]

    BTW, @Patrick, something funny with the forum software? Both photos in this post look fine on preview but only the top one seems to render correctly.
     
    #84
    Last edited: Feb 23, 2019
  5. fohdeesha

    fohdeesha Kaini Industries

    Joined:
    Nov 20, 2016
    Messages:
    1,391
    Likes Received:
    1,120
    got my M2 drive in and got EOS installed and booting off of it:

    Code:
    ##FORMAT M2
    ##like the USB DOM, it doesn't want partitions
    ##if you mkfs.ext4 to sda1 instead of sda, it complains on boot
    enable
    bash
    sudo umount /mnt/drive
    sudo mkfs.ext4 /dev/sda
    exit
    ##save the config and confirm when prompted
    reload
    It'll reboot, /mnt/drive will now be available in aboot and EOS. Back in EOS, run the install image script:

    Code:
    enable
    bash
    image-install -d /mnt/drive /mnt/usb1/EOS-4.21.2F.swi
    reboot
    it'll reboot the switch, now booting off the M2 SSD instead of the USB DOM. It definitely boots faster, however there's at least one (pretty big) caveat so far: it seems some of their python scripts, at least the reload script, do not like "dir" EOS installs:

    Code:
    7050QX-32S#reload
    
    % Internal error
    % To see the details of this error, run the command 'show error 0'
    
    Code:
    =============== Exception raised in 'ConfigAgent     -d -i --dlopen -p -f  -l libLoadDynamicLibs.so procmgr libProcMgrSetup.so --daemonize ' (PID 1401; PPID 1156) ===============
    Local variables by frame (innermost frame last):
    
      File "/usr/lib/python2.7/site-packages/Cli.py", line 295, in runFrontendCmds
                  currThread = <CliThread(Thread-5, started -902825152)>
                     excInfo = (<type 'exceptions.ValueError'>, ValueError("Unknown URL scheme: 'dir:'",), <traceback object at 0xcbcb6694>)
    
    That's the only command I've found so far that errors out after a dir install, but I haven't tried a whole lot (it's 3am here). Who knows. I'll poke around more tomorrow after some sleep
     
    #85
  6. spali

    spali Member

    Got my SSD too.

    For me it worked, but I created a DOS partition table first with fdisk. Not sure if this makes the difference. It does not complain in my case.
    updated: it did during the first boot, but now I don't get any complaints anymore.
    updated3: tried some other scenarios. It does not complain if you boot from the drive, but you are right, it complains about the drive with a partition table if you boot from the image on flash.
    So I suggest formatting it as you did (the whole drive as ext4) without a partition table.

    Code:
    EOS Image on flash
    ------------------
    0:13    Press Control-C now to enter Aboot shell
    1:10    Switching rootfs
    2:18    Starting Power OCompleting EOS initialization
    3:33    console login
    4:13    green led
    
    Unpacked EOS on SSD (drive)
    ---------------------------
    0:11    Press Control-C now to enter Aboot shell
    0:17    Switching rootfs
    0:59    Starting Power OCompleting EOS initialization
    2:00    Login prompt on console
    2:35    Leds switching from red to green
    
    EOS Image on SSD (drive)
    ---------------------------
    0:11    Press Control-C now to enter Aboot shell
    0:30    Switching rootfs
    1:40    Starting Power OCompleting EOS initialization
    2:54    Login prompt on console
    3:34    Leds switching from red to green
    The last variant in the list boots with the image file on the SSD, but still packed (drive:EOS-4.21.2F.swi). It's also a bit faster than booting from flash. I assume the time saved comes from the image unpacking faster on the SSD just before the switch to the unpacked rootfs.
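    To put numbers on the table above, here is a small sketch that converts the "green LED" timestamps to seconds and works out how much each SSD variant saves against the image on flash. The timestamps are taken from the table; the dictionary labels are just shorthand for the three headings.

```python
def to_seconds(stamp):
    """Convert an 'm:ss' timestamp from the boot table into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


# Time until the LEDs turn green, copied from the three boot runs above.
GREEN_LED = {
    "EOS image on flash": "4:13",
    "Unpacked EOS on SSD": "2:35",
    "EOS image on SSD": "3:34",
}


def savings_vs(baseline, timings):
    """Seconds saved by every other variant compared with the baseline run."""
    base = to_seconds(timings[baseline])
    return {
        name: base - to_seconds(stamp)
        for name, stamp in timings.items()
        if name != baseline
    }


for name, saved in savings_vs("EOS image on flash", GREEN_LED).items():
    print(f"{name}: {saved} s faster to green LED")
```

    That works out to 98 seconds saved for the unpacked install and 39 seconds for the packed image on the SSD, measured to the green LED.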

    I can reproduce your error with the reload command.
    I also noticed that booting from the unpacked image seems to leave the filesystem a bit different. /usr/sbin/image-install is not there if you boot from a directory. I haven't figured out the reason for this behavior yet.

    updated2: reload also does not work with drive:EOS-4.21.2F.swi. It does not throw an error, it just logs you out.
    updated3: I looked at the scripts that throw the error. They do some checks on the boot-config and the image specified. They use a library that just does not support the "dir:" format. So this could theoretically be patched.
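    To illustrate the kind of check that produces "Unknown URL scheme: 'dir:'", here is a minimal sketch of a scheme lookup. It is not Arista's code: the function name, the scheme table and its contents are all hypothetical, and only the error text is taken from the traceback above. It just shows why adding "dir" to such a table would be the obvious place to patch.

```python
# Hypothetical table of accepted schemes; the real EOS library is not shown in this thread.
KNOWN_SCHEMES = frozenset({"flash", "drive"})


def split_url(url, schemes=KNOWN_SCHEMES):
    """Split 'scheme:rest' and reject schemes that are not in the table."""
    scheme, sep, rest = url.partition(":")
    if not sep:
        raise ValueError("Not a URL: %r" % url)
    if scheme not in schemes:
        # Same wording as the message in the traceback above.
        raise ValueError("Unknown URL scheme: %r" % (scheme + ":"))
    return scheme, rest


# A packed image on the SSD parses fine.
print(split_url("drive:EOS-4.21.2F.swi"))

# A directory install is rejected until the scheme is added to the table.
try:
    split_url("dir:/mnt/drive")
except ValueError as error:
    print(error)

# "Patching" here means nothing more than extending the accepted set.
print(split_url("dir:/mnt/drive", schemes=KNOWN_SCHEMES | {"dir"}))
```

    Whether the real scripts can be fixed this simply depends on what they do with the URL afterwards, which I can't tell from the traceback alone.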
     
    #86
    Last edited: Nov 28, 2018
  7. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    153
    Likes Received:
    48
    In a wild twist of fate I was able to find a 7150S-24 on eBay that the seller claimed had no internals (drive, software, plus PSU/fans) for $200. Turns out they had no idea what they were selling. Switch was intact sans psu/fans, which I swapped from a 7124sx.

    I had modded my 7124 with a 16GB USB DOM and a 128GB SATA SSD and 8GB of RAM. Pulled all out, dumped in the 7150 and it booted like a champ. Since both switches have 24 10G ports even the configuration was pulled over.

    What's interesting about this switch is the AgilePorts. They allow you to group 4x 10G into a single 40G port. The switch uses an Intel chip and can convert from 10G to 40G in cut-through mode without dropping to store-and-forward. So in theory you can get a 10G to 40G speed change at 350ns of latency.

    So now I will have a 7124sx without psu/fans I need to sell... I'll be racking the 7150 and connecting it to the 7050qx-32s' this week hopefully.
     
    #87
  8. WANg

    WANg Active Member

    #88
  9. HomelyPowder

    HomelyPowder New Member

    Joined:
    Feb 23, 2019
    Messages:
    2
    Likes Received:
    0
    In general I wouldn't assume that every hardware configuration that has configs was actually produced in quantity. It's common at most companies to support one-off configurations in software that never ship, in order to aid development (and those configurations might get used as dev systems for a while).

    In response to another comment: you'll find that while the .swi files are ZIP format, the squashfs (and maybe the other components?) is actually stored uncompressed. By searching through the boot scripts you might find that it mounts the .swi file directly by applying an offset into it.
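    As a sketch of what "applying an offset" means: for a ZIP member that is stored without compression, the raw data sits in the archive at a fixed byte offset, which can be computed from the member's local file header. The snippet below builds a tiny stand-in archive in memory so it can run anywhere; the member name "rootfs.sqsh" and the payload are placeholders, not the actual contents of a .swi file.

```python
import io
import struct
import zipfile


def stored_member_offset(zip_bytes, member):
    """Return (offset, size) of an uncompressed member's raw data inside the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        info = archive.getinfo(member)
        if info.compress_type != zipfile.ZIP_STORED:
            raise ValueError("%s is compressed and cannot be used in place" % member)
    # The local file header is 30 fixed bytes followed by the file name and extra field.
    fixed = zip_bytes[info.header_offset:info.header_offset + 30]
    signature, name_len, extra_len = struct.unpack("<4s22xHH", fixed)
    if signature != b"PK\x03\x04":
        raise ValueError("local file header not found")
    return info.header_offset + 30 + name_len + extra_len, info.file_size


# Build a small stand-in archive; ZIP_STORED is the default for ZipFile in write mode.
payload = b"pretend this is a squashfs image"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("version", b"placeholder\n")
    archive.writestr("rootfs.sqsh", payload)  # placeholder member name
data = buffer.getvalue()

offset, size = stored_member_offset(data, "rootfs.sqsh")
print(offset, size, data[offset:offset + size] == payload)
```

    With a real image, an offset found this way is the sort of value you would hand to something like `mount -o loop,offset=...`, assuming the member really is stored uncompressed as described above.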

    While it's correct that EOS is based on a customized Fedora, it's actually a 64-bit kernel with a (mostly?) 32-bit userland. It's been a while since I poked at it, so I don't know if that changed to fully 64-bit at some point.
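    One way to check the kernel/userland split yourself is to compare the kernel's reported machine type with the ELF class of a userland binary. The sketch below reads the class byte from an ELF header (1 means 32-bit, 2 means 64-bit). The function names are mine, and the demo uses synthetic header bytes so that it runs anywhere; on a switch you would point it at a binary such as the shell, assuming Python is available there.

```python
import platform


def elf_bits(header):
    """Return 32 or 64 based on the EI_CLASS byte of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4:5]
    if ei_class == b"\x01":
        return 32
    if ei_class == b"\x02":
        return 64
    raise ValueError("unknown ELF class")


def binary_bits(path):
    """Read just enough of a file on disk to classify it."""
    with open(path, "rb") as handle:
        return elf_bits(handle.read(5))


# Synthetic headers standing in for a 32-bit and a 64-bit binary.
print(elf_bits(b"\x7fELF\x01"))
print(elf_bits(b"\x7fELF\x02"))

# The kernel side: reports e.g. x86_64 on a 64-bit kernel.
print(platform.machine())
```

    A 64-bit machine type combined with 32-bit userland binaries would match the split described above.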

    If any of you get your hands on a clearlake+ you might notice other differences than what has been mentioned so far.
     
    #89
  10. NaCl

    NaCl New Member

    Joined:
    Dec 15, 2018
    Messages:
    18
    Likes Received:
    2
    Greetings,

    I have a 7050T-64 and am curious _how_ to remove the cover on this thing. I've pulled the outside screws I can see, pulled the supplies and fans. Nothing's moving except for the front part of the shell because it's so flimsy. What's the trick? I'm not wanting to break it.

    Thanks!
     
    #90
  11. NaCl

    NaCl New Member

    NM...seems a rubber mallet was what it required.

    Any chance one of these headers is the vga port for the Radeon 4200?
     
    #91
  12. GuybrushThreepwood

    Joined:
    Aug 2, 2015
    Messages:
    70
    Likes Received:
    26
    Let me guess, you're gonna load up DOOM on the switch? :D
     
    #92
  13. NaCl

    NaCl New Member

    Ha! No plans for anything like that atm. Was poking around, found a VGA controller in the lspci output, and curiosity ensued. Seems like the 10-pin header near the DPM silkscreen is the likeliest candidate, if the signals are exposed at all. I have a ReadyNAS Pro that does this, so it's not unheard of.
     
    #93
  14. fohdeesha

    fohdeesha Kaini Industries

    VGA output is not brought out anywhere, sadly, not even to unpopulated test pads. The 10-pin headers near DPM had something to do with power rail testing at the factory. I can't remember the specifics, but a year or so ago I had the management card off of the thing and in pieces trying to find the VGA output, to no avail.
     
    #94
  15. spali

    spali Member

    Because my stack of ICX6610s arrives soon, I wanted to prepare my lab a bit for it. The ICX6610s will replace my SG500, which is currently running DHCP, and there is a known problem with the DHCP service on the 6610s, so I plan to run glass-isc-dhcp in Docker on the Arista.
    I noticed two things that may be of interest regarding booting from the extracted image on the M.2.
    1. Booting from the M.2 keeps the Docker daemon from starting... I couldn't figure out why yet. I assume it has something to do with the mounts Docker uses in the filesystem, which are probably not compatible with the mounted EOS dir.
    2. Because of this I switched back to booting from the packed swi on the USB DOM. I noticed that booting from the M.2 leaves me around 1.6GB of RAM free. Booting from the USB DOM consumes almost all of my 4GB of RAM. My guess is that this is because of the overlay fs in RAM, which holds volatile changes to the fs.

    update regarding point 1:
    To explain, I used a boot event script to enable and start the Docker service on every boot. But somehow, when using the unpacked fs, systemd seems to have a problem... I always need to stop the service (even though it's not running) and then start it. Then it works.
    But this makes me feel that the unpacked swi boot is not really clean.
    Another note... if using the SWI, the Docker start script automatically creates a tmpfs for /var/lib/docker, which runs out of space really fast. But it works if you just bind mount a directory from the M.2 onto it before starting the Docker service.
     
    #95
    Last edited: Jun 12, 2019
  16. spali

    spali Member

    Just tried EOS-4.22.0F, seems to work... I did it remotely, so I couldn't check the console output. But I found no errors so far.
    They seem to have made a big kernel bump:
    Code:
    -bash-4.3# uname -a
    Linux coreswitch 4.9.122.Ar-12091609.4220F #1 SMP PREEMPT Sat May 4 07:36:03 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
    
    Does anyone know what the EOS64 swi files are? Are they the awaited switch to a 64-bit Linux distro?
    I haven't tried it yet, but I may have some time at the weekend to put the console cable in and try it :D

    Edit: forgot to mention, tested it on the 7050QX-32S

    update: couldn't wait :rolleyes: just did it remotely.... EOS64-4.22.0F.swi boots fine. At first glance, it's exactly the same except that a lib64 directory is there with the 64-bit libs... so it looks like the 64-bit userland version of EOS.
     
    #96
    Last edited: Jun 14, 2019
  17. spali

    spali Member

    Still no problems found. Seems to work fine.
    While playing with Docker on the switch, I found out that with the 64-bit userland it's easier to get the right images. By default, in the 32-bit userland it downloads the i386 platform images from the Docker manifests, which some images do not provide. With the 64-bit userland it works flawlessly.
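    To make the i386 point concrete: a multi-arch image is published as a manifest list with one entry per platform, and the client picks the entry matching its own architecture. The sketch below does that lookup on a made-up manifest list (the digests are fake and the structure is trimmed to the fields used). Note that Docker's name for 32-bit x86 in these manifests is "386".

```python
# Made-up manifest list, trimmed to the fields used here; the digests are fake.
SAMPLE_MANIFEST_LIST = {
    "manifests": [
        {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}},
    ]
}


def digest_for(manifest_list, architecture, os_name="linux"):
    """Return the digest of the entry matching the platform, or None if there is none."""
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("architecture") == architecture and platform.get("os") == os_name:
            return entry["digest"]
    return None


# A 32-bit userland asks for "386" and finds nothing; a 64-bit one asks for "amd64".
for arch in ("386", "amd64"):
    print(arch, "->", digest_for(SAMPLE_MANIFEST_LIST, arch))
```

    An image that only publishes amd64 and arm64 entries, like this sample, is exactly the case that fails on the 32-bit userland and works on EOS64.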
    The only thing I couldn't get to work so far is swarm mode. You can init a swarm, but deploying services fails. My first bit of research points to a kernel support problem with networking in swarm mode.
     
    #97
  18. HomelyPowder

    HomelyPowder New Member

    The top cover slides forward a little and then lifts up. If it's being stubborn, the way I always did it was to put both palms down on the top and then push on the ports with your thumbs while pulling forward with your hands. Refreshing my memory, it looks like the 7050T-64 has plenty of surface to push on.

    If it's *really* difficult to get off, someone might have screwed up putting the cover on at some point. If the little posts near the back don't go into the slot they go over the top instead which bends the whole cover and makes it difficult to get on and off after that.

    To avoid this, make sure you're pressing down on the back of the cover while sliding it on.
     
    #98