Modding/upgrading Arista switches?


WANg

Well-Known Member
Jun 10, 2018
1,302
967
113
46
New York, NY
Checked mine after reading this, cpuinfo states GX-420CA.
Now I'm totally confused about what kind of RAM I need to look for for my 32S. But that's not only your fault ;) I found a pin-compatible old 1GB RAM module (SU3U1333B1G9-B) and gave it a shot... it worked... except that EOS could not boot properly, because the oom-killer killed every new process during boot :). But at least it was detected (though not as 1GB):
Code:
System RAM: 746104 kB
in the console.
And this module is (according to memory4less.com) 1.5V non-ECC. So it seems that the switch isn't that picky?
@oddball mentioned in another thread what he had in his:

I have TRF7251U67G1600G8-NYCBP and SU3U1333B1G9-B was also detected. So my conclusion: 1.35V and 1.5V seems to work in mine and also ECC and non ECC.
I'm not a memory expert, so I can't say what's a must for compatibility and what isn't... the only thing I know is that ECC vs non-ECC should, at least in theory, not matter, which I proved on my side.

So I have following questions:
  • 1.35V vs 1.5V compatibility?
  • Single Rank vs Dual Rank compatibility?
  • Buffered vs Unbuffered compatibility?
  • Registered vs Unregistered compatibility (I think not compatible)?
The problem with finding 16GB ECC for the 32S seems to come mostly from the Unregistered and Single/Dual Rank requirements.
Well, that's the thing - I think most DCS-7050QX-32S models are Crow (since it was first announced in late 2015, pretty close to the Athlon II Neo EOL), but some of the early production units are Raven, at least to the point where they specify it. If you look at the files on your switch that specify the familial relationships for the NorCal family of switches, you'll see this:

Code:
find /usr/share/NorCal/ -type f -name "*.fdl" | xargs grep 7050QX-32S\"

/usr/share/NorCal/ClearlakeCrowS1.fdl:baseSku = "DCS-7050QX-32S"
/usr/share/NorCal/ClearlakeRavenS1.fdl:baseSku = "DCS-7050QX-32S"
Arista definitely specified both boards as officially supported in the model.
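
For anyone who wants to tabulate that grep output across more SKUs, here is a minimal Python sketch. The `boards_by_sku` helper and its regex are my own illustration (not an Arista tool), and it assumes the output keeps the `path:baseSku = "..."` shape shown above.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Matches lines such as:
#   /usr/share/NorCal/ClearlakeCrowS1.fdl:baseSku = "DCS-7050QX-32S"
LINE_RE = re.compile(r'^(?P<path>[^:]+\.fdl):\s*baseSku\s*=\s*"(?P<sku>[^"]+)"')


def boards_by_sku(grep_output):
    """Group .fdl file names (without extension) by the baseSku they declare."""
    result = defaultdict(list)
    for line in grep_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            board = PurePosixPath(match.group("path")).stem
            result[match.group("sku")].append(board)
    return dict(result)


# The two lines quoted in this post.
sample = '''
/usr/share/NorCal/ClearlakeCrowS1.fdl:baseSku = "DCS-7050QX-32S"
/usr/share/NorCal/ClearlakeRavenS1.fdl:baseSku = "DCS-7050QX-32S"
'''

if __name__ == "__main__":
    for sku, boards in boards_by_sku(sample).items():
        print(sku, "->", ", ".join(boards))
```

Dropping the grep filter and feeding it every `baseSku` line should, under the same assumption about the file format, show which other models list more than one board.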

And yeah, the machine is called ClearLake(Crow|Raven). ClearLake switch (that's 32 QX/40GBit QSFP+ ports + 4 10GbE SFP+ ports in the S model) + (Crow|Raven) CPU/management board.

As for whether non-ECC is kosher, that's an even trickier question. Technically the GX-420CA APU can support both ECC and non-ECC consumer (aka desktop) DDR3/3L RAM, and in most installations that's usually 1.35v (whether it supports 1.5v depends on the hardware implementation). The ECC is the unbuffered, unregistered type. I know that the APU has a built-in single-channel RAM controller, and it's supposedly quad rank max. 4 and 8GB ECC RAM denominations will definitely work (since 8GB ECC RAM is shipped on the ClearLakePlusCrow models, that is, the DCS-7050QX2-32S).
The thing I am not sure about is whether the APU can deal with 16GB DIMMs, since most implementors of GX-420CA hardware, like the HP t620 Plus, roll out dual DIMM slots, and in the case of the t620, two 8GB laptop DIMMs will get it to 16GB max. @fossexplorer ordered a pair of 16GB SODIMMs, so if that RAM works on the t620 Plus, then we know there is a chance of getting it to work on the Crow. This is useful since the Crows only have a single DIMM slot, and 16GB is a great amount for a switch.

As for 1.35 versus 1.5? I am not 100% sure - you might have a revision that can do both 1.35 and 1.5. I have a few 4GB DDR3 (1.5v) desktop DIMMs in the office, but I haven't had a chance to try them yet. As for non-ECC? I think @oddball said that he tried non-ECC and it worked, but registered RAM will puke. For toying around, a regular desktop DIMM will work, but if it's something you rely upon, I'd go ECC.
 

spali

Member
Nov 4, 2018
32
3
8
HP t620 Plus will roll out dual DIMM slots
The soldering points are on the board... so in theory it could also be possible to add a second slot. Did anyone take pictures of the bottom of the board? Usually these pins are not too small and I think it should be possible to solder them. But I don't know where to get a slot from ;)
As for non-ECC? Well, @oddball mentioned that he saw some weird stuff on the bootlog when he tried it, and it doesn't seem to work. So once again, maybe it's board revision, firmware revision, bad/incompatible RAM, who knows.
It's not clear if he was talking about "normal" RAM or ECC registered. From what I found in my research... ECC registered is not compatible with ECC unregistered. I think if he tried ECC unregistered, it would work as it does on mine. But I could be wrong.
 

WANg

You could, in theory. Just look through Mouser or Newark (the people behind Element14) for a 240-pin DDR3 DIMM socket, or you could pull one from a dead board. As for the entire RAM thing, he was referring to ECC unregistered. I looked through the datasheet for the GX-420CA, and it specifically mentions UDIMMs or SODIMMs, no RDIMMs, and I really doubt that the APU (optimized for embedded applications) baked in support for them.
 

WANg

Okay, so I can now confirm that why yes, the DCS-7050QX-32S can take non-ECC RAM @ 1.5v. I managed to find a stick of HyperX Fury DDR3-1866 8GB RAM in a spare machine at work, and to my surprise, it seems to work just fine. I don't recommend using non-ECC for production, but if you don't mind your homelab going down once in a while, this is probably okay. Next step is to hunt down some 16GB desktop DDR3L units and see how that one jibes.



Also, it looks like the 4-pin USB2 header works just fine with the 9-pin USB DOM connector. That's the Sandisk Cruzer yet again sticking out of the USB port. So there we go: how to avoid spending stupid money on something unnecessary (like a USB DOM).



BTW, @Patrick, something funny with the forum software? Both photos on this posting looks fine on preview but only the top one seems to render correctly.
 

fohdeesha

Kaini Industries
Nov 20, 2016
2,727
3,075
113
33
fohdeesha.com
Got my M.2 drive in and got EOS installed and booting off of it:

Code:
## FORMAT THE M.2 DRIVE
## like the USB DOM, it doesn't want partitions
## if you mkfs.ext4 to sda1 instead of sda, it complains on boot
enable
bash
sudo umount /mnt/drive
sudo mkfs.ext4 /dev/sda
exit
## reload the switch (save and confirm changes when prompted)
reload
It'll reboot, and /mnt/drive will now be available in Aboot and EOS. Back in EOS, run the image install script:

Code:
enable
bash
image-install -d /mnt/drive /mnt/usb1/EOS-4.21.2F.swi
reboot
It'll reboot the switch, now booting off the M.2 SSD instead of the USB DOM. It definitely boots faster; however, there's at least one (pretty big) caveat so far: it seems some of their Python scripts, at least the reload script, do not like "dir" EOS installs:

Code:
7050QX-32S#reload

% Internal error
% To see the details of this error, run the command 'show error 0'
Code:
=============== Exception raised in 'ConfigAgent     -d -i --dlopen -p -f  -l libLoadDynamicLibs.so procmgr libProcMgrSetup.so --daemonize ' (PID 1401; PPID 1156) ===============
Local variables by frame (innermost frame last):

  File "/usr/lib/python2.7/site-packages/Cli.py", line 295, in runFrontendCmds
              currThread = <CliThread(Thread-5, started -902825152)>
                 excInfo = (<type 'exceptions.ValueError'>, ValueError("Unknown URL scheme: 'dir:'",), <traceback object at 0xcbcb6694>)
That's the only command I've found so far that errors out after a dir install, but I haven't tried a whole lot (it's 3am here). Who knows. I'll poke around more tomorrow after some sleep
 

spali

Got my SSD too.

if you mkfs.ext4 to sda1 instead of sda, it complains on boot
For me it worked, but I created a DOS partition table first with fdisk. Not sure if this makes the difference. It does not complain in my case.
updated: it did during the first boot, but now I don't get any complaints anymore.
updated3: tried some other scenarios. It does not complain if you boot from the drive, but you are right, it complains about the drive with a partition table if you boot from the image on flash.
So I suggest formatting it as you did (the whole drive as ext4) without a partition table.

It definitely boots faster
Code:
EOS Image on flash
------------------
0:13    Press Control-C now to enter Aboot shell
1:10    Switching rootfs
2:18    Starting Power OCompleting EOS initialization
3:33    console login
4:13    green led

Unpacked EOS on SSD (drive)
---------------------------
0:11    Press Control-C now to enter Aboot shell
0:17    Switching rootfs
0:59    Starting Power OCompleting EOS initialization
2:00    Login prompt on console
2:35    Leds switching from red to green

EOS Image on SSD (drive)
---------------------------
0:11    Press Control-C now to enter Aboot shell
0:30    Switching rootfs
1:40    Starting Power OCompleting EOS initialization
2:54    Login prompt on console
3:34    Leds switching from red to green
The last variant in the list boots from the image file on the SSD, but still packed (drive:EOS-4.21.2F.swi). It's also a bit faster than booting from flash. I assume the time saved comes from the image unpacking faster on the SSD just before the switch to the unpacked rootfs.
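
To put numbers on that, here is a small Python sketch that converts the "LED green" timestamps from the three runs to seconds and compares each against booting the image from flash. The dictionary keys are just my own labels for the three variants.

```python
def to_seconds(stamp):
    """Convert an 'm:ss' timestamp to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


# Time until the LEDs turn green, taken from the three runs listed above.
green_led = {
    "image on flash": "4:13",
    "unpacked on SSD": "2:35",
    "image on SSD": "3:34",
}

baseline = to_seconds(green_led["image on flash"])
savings = {name: baseline - to_seconds(stamp) for name, stamp in green_led.items()}

if __name__ == "__main__":
    for name, saved in savings.items():
        print(f"{name}: {saved} s faster than flash")
```

So the unpacked install reaches green LEDs 98 seconds sooner than flash, and the packed image on the SSD 39 seconds sooner.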

I can reproduce your error with the reload command.
I also noticed that booting from the unpacked image seems to leave something different in the filesystem: /usr/sbin/image-install is not there if you boot from a directory. I haven't figured out the reason for this behavior yet.

updated2: reload also does not work with drive:EOS-4.21.2F.swi. It does not throw an error, but just logs you out.
updated3: I looked at the scripts that throw the error. They do some checks on the boot-config and the image specified. They use a library that just does not support the "dir:" format. So this could theoretically be patched.
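
To illustrate the kind of failure and patch I mean, here is a toy Python sketch. It is NOT Arista's code: `split_url`, `KNOWN_SCHEMES` and the example URLs are all made up for illustration, and the real library may be structured quite differently. Only the error message text is taken from the traceback above.

```python
# Hypothetical stand-in for a URL scheme whitelist; not the real EOS library.
KNOWN_SCHEMES = frozenset({"flash:", "drive:"})


def split_url(url, schemes=KNOWN_SCHEMES):
    """Split 'scheme:rest' and reject any scheme that is not whitelisted."""
    scheme, sep, rest = url.partition(":")
    scheme += sep
    if not sep or scheme not in schemes:
        raise ValueError("Unknown URL scheme: %r" % scheme)
    return scheme, rest


# The "patch" in this toy model is simply teaching the whitelist about dir:.
PATCHED_SCHEMES = KNOWN_SCHEMES | {"dir:"}

if __name__ == "__main__":
    print(split_url("flash:EOS-4.21.2F.swi"))
    try:
        split_url("dir:/mnt/drive/EOS")  # made-up path, for illustration only
    except ValueError as err:
        print("unpatched:", err)
    print("patched:", split_url("dir:/mnt/drive/EOS", PATCHED_SCHEMES))
```

Whether the real fix is that simple depends on what the checks do with the image afterwards, which I have not verified.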
 

oddball

Active Member
May 18, 2018
206
121
43
42
In a wild twist of fate I was able to find a 7150S-24 on eBay that the seller claimed had no internals (drive, software, plus PSU/fans) for $200. Turns out they had no idea what they were selling. Switch was intact sans psu/fans, which I swapped from a 7124sx.

I had modded my 7124 with a 16GB USB DOM and a 128GB SATA SSD and 8GB of RAM. Pulled all out, dumped in the 7150 and it booted like a champ. Since both switches have 24 10G ports even the configuration was pulled over.

What's interesting about this switch is the AgilePorts. They allow you to group 4x 10G into a single 40G port. The switch uses an Intel chip and can convert from 10G to 40G in cut-through mode without dropping to store-and-forward. So in theory you can get the 10G to 40G speed change at 350ns of latency.

So now I will have a 7124sx without psu/fans I need to sell... I'll be racking the 7150 and connecting it to the 7050qx-32s' this week hopefully.
 

HomelyPowder

New Member
Feb 23, 2019
2
0
1
In general I wouldn't assume that every hardware configuration that has configs was actually produced in quantity. It's common at most companies to support one-off configurations in software that never ship in order to aid development (and those configurations might get used as dev systems for a while).

In response to another comment: you'll find that while the .swi files are ZIP format, the squashfs (and maybe the other components?) is actually stored uncompressed. By searching through the boot scripts you might find that it mounts the .swi file directly by applying an offset into it.
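
As a rough sketch of how you could find that offset yourself, the Python below reads the ZIP local file header of a stored (uncompressed) member and returns where its raw bytes start. The `stored_member_offset` helper and the member name `rootfs.sqsh` in the demo are my own assumptions, so check the actual member names in your .swi first.

```python
import io
import struct
import zipfile


def stored_member_offset(fileobj, member):
    """Return (offset, size) of the raw data of an uncompressed ZIP member."""
    with zipfile.ZipFile(fileobj) as zf:
        info = zf.getinfo(member)
        if info.compress_type != zipfile.ZIP_STORED:
            raise ValueError("%s is compressed, no direct offset" % member)
        header_offset, size = info.header_offset, info.file_size
    # The fixed part of a local file header is 30 bytes; the name and extra
    # field lengths are the two little-endian uint16 values at bytes 26..29.
    fileobj.seek(header_offset)
    header = fileobj.read(30)
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return header_offset + 30 + name_len + extra_len, size


if __name__ == "__main__":
    # Demo on an in-memory archive with a fake payload (not a real squashfs).
    payload = b"hsqs" + bytes(60)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("version", "demo\n")
        zf.writestr("rootfs.sqsh", payload)
    offset, size = stored_member_offset(buf, "rootfs.sqsh")
    buf.seek(offset)
    print(offset, size, buf.read(size) == payload)
```

With a real image you would open the .swi in "rb" mode instead of using BytesIO; the resulting offset is the kind of value a loop mount with an offset option would need.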

While it's correct that EOS is based on a customized Fedora, it's actually a 64-bit kernel with a (mostly?) 32-bit userland. It's been a while since I poked at it, so I don't know if that changed to fully 64-bit at some point.

If any of you get your hands on a clearlake+ you might notice other differences than what has been mentioned so far.
 

NaCl

New Member
Dec 15, 2018
25
2
3
Greetings,

I have a 7050T-64 and am curious _how_ to remove the cover on this thing. I've pulled the outside screws I can see, pulled the supplies and fans. Nothing's moving except for the front part of the shell because it's so flimsy. What's the trick? I'm not wanting to break it.

Thanks!
 

NaCl

NM...seems a rubber mallet was what it required.

Any chance one of these headers is the vga port for the Radeon 4200?
 

NaCl

Let me guess, you're gonna load up DOOM on the switch? :D
Ha! No plans for anything like that atm. I was poking around, found a VGA controller in the lspci output, and curiosity ensued. Seems like the 10-pin header near the DPM silkscreen is the likeliest candidate, if the signals are exposed at all anyway. I have a ReadyNAS Pro that does this, so it's not unheard of.
 

fohdeesha

VGA output is not brought out anywhere, sadly, not even to unpopulated test pads. The 10-pin header near DPM had something to do with power rail testing at the factory. I can't remember the specifics, but a year or so ago I had the management card off of the thing and in pieces trying to find the VGA output, to no avail.
 

spali

Because my stack of ICX6610s arrives soon, I wanted to prepare my lab a bit for it. The ICX6610s will replace my SG500, which currently runs DHCP, and there is a known problem with the DHCP service on 6610s, so I plan to run glass-isc-dhcp in Docker on the Arista.
I noticed two things that may be of interest regarding booting from the extracted image on the M.2:
  1. Booting from the M.2 keeps the Docker daemon from starting... I couldn't figure out why yet. I assume it has something to do with the mounts Docker uses in the filesystem, which are probably not compatible with the mounted EOS dir.
  2. Because of this I switched back to booting from the packed swi on the USB DOM. I noticed that booting from the M.2 leaves me around 1.6GB of RAM free. Booting from the USB DOM consumes almost all of my 4GB of RAM. My guess is that this is because of the overlay fs in RAM, which holds volatile changes to the fs.

update regarding point 1:
To explain, I used a boot event script to enable and start the docker service on every boot. But somehow, with the unpacked fs, it seems that systemd has a problem... I always need to stop the service (even though it's not running) and then start it. Then it works.
But this makes me feel that the unpacked swi boot is not really clean.
Another note... when using the SWI, the docker start script automatically creates a tmpfs for /var/lib/docker, which runs out of space really fast. But it works if you just bind mount a directory from the M.2 to it before starting the docker service.
 

spali

Just tried EOS-4.22.0F, seems to work... I did it remotely, so I couldn't check the console output. But I found no errors so far.
They seem to have made a big kernel bump:
Code:
-bash-4.3# uname -a
Linux coreswitch 4.9.122.Ar-12091609.4220F #1 SMP PREEMPT Sat May 4 07:36:03 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
Does anyone know what the EOS64 swi files are? Are they the awaited switch to a 64-bit Linux distro?
I haven't tried it yet, but maybe I'll have some time on the weekend to put the console cable in and try it :D

Edit: forgot to mention, I tested it on the 7050QX-32S.

update: couldn't wait :rolleyes: just did it remotely.... EOS64-4.22.0F.swi boots fine. At first glance, it's exactly the same except that a lib64 directory is there with the 64-bit libs... so it looks like the 64-bit userland version of EOS.
 

spali

Still no problems found. Seems to work fine.
While playing with Docker on the switch, I found out that with the 64-bit userland it's easier to get the right images. By default, the 32-bit userland downloads the i386 platform images from the Docker manifests, which some images do not provide. With the 64-bit userland it works flawlessly.
The only thing I couldn't get to work so far is swarm mode. You can init a swarm, but deploying services fails. My first bit of research points to a kernel support problem with networking in swarm mode.
 

HomelyPowder

Greetings,
I have a 7050T-64 and am curious _how_ to remove the cover on this thing. I've pulled the outside screws I can see, pulled the supplies and fans. Nothing's moving except for the front part of the shell because it's so flimsy. What's the trick? I'm not wanting to break it.
The top cover slides forward a little and then lifts up. If it's being stubborn, the way I always did it was to put both palms down on the top and then push on the ports with your thumbs while pulling forward with your hands. Refreshing my memory, it looks like the 7050T-64 has plenty of surface to push on.

If it's *really* difficult to get off, someone might have screwed up putting the cover on at some point. If the little posts near the back don't go into the slot, they go over the top instead, which bends the whole cover and makes it difficult to get on and off after that.

To avoid this, make sure you're pressing down on the back of the cover while sliding it on.
 

Manbeard

New Member
Apr 15, 2020
2
1
3
Indianapolis, IN
I just received my 7124SX and this thread has me wanting to rip it open and toss in those upgrades!

Now if I could just get the newer firmware... it's currently running 4.8.3.
 

SRussell

Active Member
Oct 7, 2019
327
152
43
US

Did you purchase your switch from someone on the board?