Modding/upgrading Arista switches?

Discussion in 'Networking' started by oddball, May 18, 2018.

  1. BLinux

    BLinux cat lover server enthusiast

    Joined:
    Jul 7, 2016
    Messages:
    2,370
    Likes Received:
    838
    Can you share the paper you are referring to? I work in this space, and I've never seen Bro perform very well. Processing 50Gbps with Bro is no small task... so I'm very curious about this. Please share if you don't mind.
     
    #21
  2. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
  3. pcmoore

    pcmoore Member

    Joined:
    Apr 14, 2018
    Messages:
    100
    Likes Received:
    23
    I was curious, so I quickly skimmed the paper, and it appears the Arista switches aren't actually doing any of the IDS inspection; they simply pass the traffic to a separate Bro cluster that performs the primary IDS inspection and traffic enforcement/ACLs. In their particular setup the Bro cluster is five nodes, each equipped with two 3.5GHz six-core Ivy Bridge processors (with HT enabled). The paper expects the five-node cluster to support 50Gbps of throughput; additional nodes would be needed to grow beyond that.
     
    #23
  4. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    Yes, I re-read the paper and saw that as well. And now that I've had a chance to load some VMs onto the switches themselves, I can see why...
     
    #24
  5. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    As luck would have it, I ended up with another 7050QX-32S for pennies on the dollar. I talked to a network company I’ve purchased from before, and they had a scratch-and-dent model: it was missing RAM and a USB DOM. Perfect coincidence - I had both of those lying on my desk from the upgrades. So $400 later I had my pair...

    I finally got to build an MLAG with these things. It’s a really cool feature: both switches appear as a single device, and you have port-channels that span the pair. Hosts get a full 20/80Gbps and it’s fully redundant without spanning tree.
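    For reference, the EOS side of this is pretty compact. A rough sketch of one peer's config (the VLAN, names, and addresses below are illustrative only, not my actual config; the second switch mirrors it with the local/peer addresses swapped):

```
vlan 4094
   trunk group mlagpeer
!
interface Port-Channel10
   description MLAG peer-link
   switchport mode trunk
   switchport trunk group mlagpeer
!
interface Vlan4094
   ip address 10.0.0.1/30
!
mlag configuration
   domain-id mlag1
   local-interface Vlan4094
   peer-address 10.0.0.2
   peer-link Port-Channel10
```

    Host-facing port-channels then just get the same `mlag <id>` on both switches, so the pair presents a single logical port-channel downstream.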

    Next step is to get the IDSes rolling. I have done some testing and will try Snort and Suricata in Docker or in a VM. For comparison, I’ll run the same software on a 1U server.

    I’m wondering if EOS can do a port-mirror to a bridged VM...

    I’m also trying to get some flow monitoring up in a Docker image as well. I’ve tried a few tools and nothing has impressed me yet.
     
    #25
  6. Luzer

    Luzer New Member

    Joined:
    Mar 1, 2015
    Messages:
    9
    Likes Received:
    1
    Is your 7050QX-32S still running OK? I'm thinking about ordering one and installing an M.2 SSD (if the one I get has a slot).

    Do you have a link to the RAM you purchased?
     
    #26
  7. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    Still running like a champ.

    I don't have a link; I think I purchased the RAM from someone on eBay. It was one of those RAM-warehouse deals - the RAM was new and had a few years of warranty.
     
    #27
  8. WANg

    WANg Active Member

    Joined:
    Jun 10, 2018
    Messages:
    496
    Likes Received:
    192
    If the Crow CPU board is a GX-420CA, then congratulations: the chip itself can address 16GB of RAM. The GX-420CA is the exact APU found in the HP t620 Plus thin clients that others on this forum use for pfSense, and I know for a fact that those take up to 16GB of DDR3L (2x8GB). You could probably go above 16GB if your board accepted 16GB DDR3L modules and had more than one slot.

    As for the Raven, the specs almost match the Turion II Neo N36L/N40L/N54L found in the HP MicroServer Gen7 - it's most likely an OEM version of the Geneva (45nm) core. In that case it should definitely support up to 16GB, but it will be very picky about RAM. I'm not sure about 32GB - I strongly doubt it. If you have spare RAM that fits the footprint, give it a try.

    From what I understand, Arista EOS is just a tweaked x64 Fedora (FC14 on the older builds; not sure what it is nowadays).

    Hey, do me a favor when you get to a bash shell? Can you run the following:
    lspci -vvv
    Look for anything that contains ACS+ or ARI+.
    Also check whether dmesg/the boot logs mention AMD-Vi or the IOMMU being enabled.
    I want to see if Arista has a provision with AMD to implement single-root I/O virtualization (SR-IOV) on that SoC. If it does, it could mean even better performance for virtualized networking on your Arista gear.
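    To make the check easy to repeat, here's roughly what I'm after, as a tiny helper (a sketch - the function name and file argument are mine, and it assumes you've saved the boot log first with dmesg > dmesg.txt):

```shell
# Classify IOMMU support from a saved boot log. The match strings come
# from typical dmesg output on these AMD platforms: "AMD-Vi" for a real
# IOMMU, "using GART IOMMU" for the old GART-based shim.
check_iommu() {
    if grep -qi 'AMD-Vi' "$1"; then
        echo "AMD-Vi IOMMU enabled"
    elif grep -qi 'using GART IOMMU' "$1"; then
        echo "GART IOMMU only (no AMD-Vi)"
    else
        echo "no IOMMU found"
    fi
}

# ACS/ARI capabilities, when present, appear in verbose lspci output:
#   lspci -vvv | grep -E 'Access Control Services|Alternative Routing-ID'
```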
     
    #28
    Last edited: Nov 6, 2018
  9. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    No mention of ACS or ARI from lspci on any switch. Here are the other relevant details.

    7050qx-32s
    [admin@spine1 ~]$ dmesg | grep IOM
    [ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup


    From the 7124sx:
    [admin@Toolbox ~]$ dmesg | grep IOM
    [ 0.000000] Please enable the IOMMU option in the BIOS setup
    [ 0.346160] PCI-DMA: using GART IOMMU.
    [ 0.346160] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture


    From a 7050q-16:

    [admin@leaf1 ~]$ dmesg | grep IOM
    [ 0.000000] Please enable the IOMMU option in the BIOS setup
    [ 0.404668] PCI-DMA: using GART IOMMU.
    [ 0.404672] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture


    So it appears the Raven board has IOMMU enabled, whereas the Crow does not.

    Here are the CPUs for the various models:
    7050qx-32s (Crow)
    [admin@spine1 proc]$ cat cpuinfo
    processor : 0
    vendor_id : AuthenticAMD
    cpu family : 22
    model : 0
    model name : AMD GX-420CA SOC with Radeon(tm) HD Graphics
    stepping : 1
    microcode : 0x700010f
    cpu MHz : 2000.100
    cache size : 2048 KB
    physical id : 0
    siblings : 4
    core id : 0
    cpu cores : 4
    apicid : 0
    initial apicid : 0
    fpu : yes
    fpu_exception : yes
    cpuid level : 13
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt topoext perfctr_nb perfctr_l2 arat hw_pstate proc_feedback npt lbrv svm_lock nrip_save tsc_scale flushbyasid decodeassists pausefilter pfthreshold vmmcall bmi1 xsaveopt
    bugs : fxsave_leak
    bogomips : 4000.20
    TLB size : 1024 4K pages
    clflush size : 64
    cache_alignment : 64
    address sizes : 40 bits physical, 48 bits virtual
    power management: ts ttp tm 100mhzsteps hwpstate [11]


    7124sx (Raven)

    processor : 1
    vendor_id : AuthenticAMD
    cpu family : 16
    model : 6
    model name : AMD Turion(tm) II Neo N41L Dual-Core Processor
    stepping : 3
    cpu MHz : 1499.888
    cache size : 1024 KB
    physical id : 0
    siblings : 2
    core id : 1
    cpu cores : 2
    apicid : 1
    initial apicid : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
    bogomips : 3000.22
    TLB size : 1024 4K pages
    clflush size : 64
    cache_alignment : 64
    address sizes : 48 bits physical, 48 bits virtual
    power management: ts ttp tm stc 100mhzsteps hwpstate


    7050q-16
    processor : 1
    vendor_id : AuthenticAMD
    cpu family : 16
    model : 6
    model name : AMD Turion(tm) II Neo N41H Dual-Core Processor
    stepping : 3
    microcode : 0x10000b6
    cpu MHz : 1499.929
    cache size : 1024 KB
    physical id : 0
    siblings : 2
    core id : 1
    cpu cores : 2
    apicid : 1
    initial apicid : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 5
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate npt lbrv svm_lock nrip_save
    bogomips : 2999.85
    TLB size : 1024 4K pages
    clflush size : 64
    cache_alignment : 64
    address sizes : 48 bits physical, 48 bits virtual
    power management: ts ttp tm stc 100mhzsteps hwpstate
     
    #29
  10. WANg

    WANg Active Member

    Joined:
    Jun 10, 2018
    Messages:
    496
    Likes Received:
    192
    The Crows have one desktop RAM slot, while the Ravens have two, right?
     
    #30
  11. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    Yes, Crow has one and Raven two.
     
    #31
  12. WANg

    WANg Active Member

    Joined:
    Jun 10, 2018
    Messages:
    496
    Likes Received:
    192
    Oh, that's... inconvenient.
    The APU on the Crows is more advanced and likely more future-proof, but it will support only a single 16GB DDR3L stick max. The Ravens can get to 16GB via 2x8GB DDR3L. That being said, if you are in the market for a switch that can do 40GbE via QSFP, those Aristas are probably good bang for the buck. I am thinking of buying an Arista 7050 to play around with - how noisy and power-hungry are they?
     
    #32
    Last edited: Nov 6, 2018
  13. i386

    i386 Well-Known Member

    Joined:
    Mar 18, 2016
    Messages:
    1,710
    Likes Received:
    423
    With default settings they are too loud for an office.
    With the fans set to 30% they are quieter - ~39dBA according to my Nexus 5X.
     
    #33
  14. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    #34
  15. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    I thought about this for a while last night. If you have a Raven and a Crow, I *think* you can try something. The motherboard has a connector on it: the front of the control plane is where the ASICs and the chip reside, while the back holds the RAM, flash, and power connectors - the back card is what makes it a Raven vs. a Crow. I would be curious to see what happens if you put the Raven card (with its two RAM slots) behind the better ASICs. If you were to do that, you could jump to 32GB of RAM.
     
    #35
  16. fohdeesha

    fohdeesha Kaini Industries

    Joined:
    Nov 20, 2016
    Messages:
    1,463
    Likes Received:
    1,221
    @oddball can you link to the type of M.2 SSD you're using in the 7050QX-32S, or post a pic of the M.2 slot? From the pictures I've seen (like the one attached, from another thread), I just see what appears to be a full-size SATA data + power port coming off the mobo vertically, for a caddy of some type. Is there an actual M.2 slot out of view? Want to make sure I order the right drive :)

    Finally have one of these things on its way, excited for yet another 40GbE switch I have no need for
     

    Attached Files:

    #36
    Last edited: Nov 13, 2018
  17. fohdeesha

    fohdeesha Kaini Industries

    Joined:
    Nov 20, 2016
    Messages:
    1,463
    Likes Received:
    1,221
    I don't have my switch yet, but it seems the management CPU is on the rear card (the AMD chip to the right of the USB DOM in the pic I posted) and the ASIC is on the front - the ASIC and management CPU are not on the same card. Either way, I started a deep dive into the EOS firmware earlier tonight and came across something you might be interested in. In the EOS filesystem, navigate to /usr/share/NorCal

    All of these .fdl files define the chassis data; just cat one to view it.

    Contents of ClearlakeCrowS1.fdl (our 7050qx-32S):

    Code:
    # This file describes everything specific to the switch that consists of
    # a Clearlake switch card and a Crow CPU card
    description = "32 QSFP+ + 4 SFP+ 1RU"
    baseSku = "DCS-7050QX-32S"
    
    numSfpPorts = 4
    numTenGigFortyGigPorts = 24
    numFortyGigOnlyPorts = 8
    numQsfpPorts = numTenGigFortyGigPorts + numFortyGigOnlyPorts
    numPortsWithMacAddr = numTenGigFortyGigPorts * 4 + numFortyGigOnlyPorts + numSfpPorts
    # this file stitches the Crow cpu card together with a Clearlake switch card
    # the cards share a connector that connects PCIE busses, an Smbus and gpios between
    # the two cards
    # the Crow.fin file provides the following variables to allow components on the
    # switch card to be plumbed into the system.
    # SwitchPciRoot -- a dict of invPciBridge objects that hook up to the pci topology
    #                  on the switch card. the number of items is based on the
    #                  Sw2CpuPcieBridgeNums dict below.
    # _configCpuSmbusDevices( smbus ) -- a function in the cpu card called when the
    #                  smbus object for the smbus is created on the switch card, this
    #                  should be used to initialize all smbus devices on the cpu card
    
    # the pcie topology between the switch and CPU card, first bridge is connected to the
    # Trident, the second to the scd
    # configure how the CPU card creates the pcie links to the swtich card
    
    # configure how the switch card looks up the pci root for the pcie components
    Sw2CpuPcieBridgeNums = { 0 : ( 0x2, 1 ), 1 : ( 0x2, 2 ) }
    
    backToFrontCoolingSupported = True
    frontToBackCoolingSupported = True
    
    # XXX_THEBE: These are cloverdale values
    forwardInletTempToCoolingLevel ={25.0: 2, 33.33: 3, 37.5: 4, 40.0: 5}
    reverseInletTempToCoolingLevel ={20.0: 2, 30.0: 3, 35.0: 4, 40.0: 5}
    
    You can see it's quite modular/flexible, it even states "#this file stitches the Crow cpu card together with a Clearlake switch card" implying arbitrary stitching is possible (given physical limitations of course).
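    Side note: the port math in there checks out - each tri-speed QSFP reserves four MAC addresses (one per 10G breakout lane), while the 40G-only ports and SFP+ cages get one each. That's my reading of the variables, not anything Arista documents:

```shell
# Reproduce numPortsWithMacAddr from the fdl variables above.
numSfpPorts=4
numTenGigFortyGigPorts=24
numFortyGigOnlyPorts=8
echo $(( numTenGigFortyGigPorts * 4 + numFortyGigOnlyPorts + numSfpPorts ))  # prints 108
```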

    And, indeed, it seems a descriptor file for exactly what you wanted to do (combine a raven management plane with the 7050qx-32S ASIC board "codename clearlake") already exists!

    ClearlakeRavenS1.fdl -

    Code:
    # This file describes everything specific to the switch that consists of
    # a Clearlake switch card and a Raven CPU card
    description = "32 QSFP+ + 4 SFP+ 1RU"
    baseSku = "DCS-7050QX-32S"
    
    numSfpPorts = 4
    numTenGigFortyGigPorts = 24
    numFortyGigOnlyPorts = 8
    numQsfpPorts = numTenGigFortyGigPorts + numFortyGigOnlyPorts
    numPortsWithMacAddr = numTenGigFortyGigPorts * 4 + numFortyGigOnlyPorts + numSfpPorts
    # this file stitches the raven cpu card together with a Clearlake switch card
    # the cards share a connector that connects PCIE busses, an Smbus and gpios between
    # the two cards
    # the Raven.fin file provides the following variables to allow components on the
    # switch card to be plumbed into the system.
    # SwitchPciRoot -- a dict of invPciBridge objects that hook up to the pci topology
    #                  on the switch card. the number of items is based on the
    #                  Sw2CpuPcieBridgeNums dict below.
    # _configCpuSmbusDevices( smbus ) -- a function in the cpu card called when the
    #                  smbus object for the smbus is created on the switch card, this
    #                  should be used to initialize all smbus devices on the cpu card
    
    # the pcie topology between the switch and CPU card, first bridge is connected to the
    # Trident, the second to the scd
    # configure how the CPU card creates the pcie links to the swtich card
    
    # configure how the switch card looks up the pci root for the pcie components
    Sw2CpuPcieBridgeNums = { 0 : ( 0x4, 0 ), 1 : ( 0x9, 0 ) }
    
    backToFrontCoolingSupported = True
    frontToBackCoolingSupported = True
    
    # XXX_THEBE: These are cloverdale values
    forwardInletTempToCoolingLevel ={25.0: 2, 33.33: 3, 37.5: 4, 40.0: 5}
    reverseInletTempToCoolingLevel ={20.0: 2, 30.0: 3, 35.0: 4, 40.0: 5}
    The only values differing between the two (other than comments) are the Sw2CpuPcieBridgeNums entries, which I assume are the PCIe bridge/lane identifiers telling the management CPU where to expect the switch ASIC.

    I wouldn't count on the switch being smart enough to automatically load the correct profile when mixing in a new mgmt board. I have some more digging to do, but it seems the config that gets loaded is based on FDL identifier data stored in the switch's nonvolatile storage. Separate from all the USB/SATA/etc. flash storage, there's an MXIC SPI flash chip soldered to the board; it contains Aboot, the FDL chassis ident data, serials, the burned-in MAC, etc.:

    Code:
       'norcal4' : {
                   'total' :    { 'start' : 0x000000, 'length' : 0x0800000 },
                   'prefdl' :   { 'start' : 0x010000, 'length' : 0x000f000 },
                   'mfgdata' :  { 'start' : 0x01f000, 'length' : 0x0001000 },
                   'fdl' :      { 'start' : 0x020000, 'length' : 0x0010000 },
                   'fallback' : { 'start' : 0x030000, 'length' : 0x03f0000 },
                   'image' :    { 'start' : 0x420000, 'length' : 0x03db000 },
    The script /usr/bin/flashUtil details some of the contents and addresses of that SPI flash, and it appears to fully support dumping those data sections to files - and the reverse (writing a file you provide to the flash chip). I would not run the script yet, though, even with just a help argument passed; if something goes wrong or behaves unexpectedly, wiping out the Aboot section of SPI flash would brick the switch. I have tools for raw SPI flash reading/writing (and Arista was nice enough to break the chip out to an SPI header), so when my switch arrives I'll see what I can get the script to do. I'd imagine that, worst case, changing the chassis ident to a different mgmt CPU/ASIC combo would just be something like "flashUtil -w fdl customfdl.bin".
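    If the norcal4 offsets above are right, carving a section out of a raw dump (taken read-only with an external programmer) is just block arithmetic. A sketch - spi-dump.bin is an assumed filename, and the zero-filled stand-in line only exists so you can dry-run it without hardware:

```shell
# 'fdl' starts at 0x020000 (= 2 x 64KiB) and runs 0x010000 (= 1 x 64KiB),
# so with a 64KiB block size the carve is skip=2, count=1.
FDL_START=$((0x020000))
FDL_LEN=$((0x010000))

# Zero-filled 8MiB stand-in (matches the 'total' length of 0x0800000),
# only created if you don't already have a real dump to work on.
[ -f spi-dump.bin ] || dd if=/dev/zero of=spi-dump.bin bs=65536 count=128 2>/dev/null

# Read-only carve: extract just the fdl region into its own file.
dd if=spi-dump.bin of=fdl.bin bs=65536 \
   skip=$(( FDL_START / 65536 )) count=$(( FDL_LEN / 65536 )) 2>/dev/null
```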

    For someone who enjoys reverse engineering things until they break, the openness of EOS has been quite enjoyable, even if I only have it running in a VM currently.
     
    #37
    Last edited: Nov 13, 2018
  18. fohdeesha

    fohdeesha Kaini Industries

    Joined:
    Nov 20, 2016
    Messages:
    1,463
    Likes Received:
    1,221
    Interesting sidenote: the FDL file for the 7050 that ships with an SSD (DCS-7050QX-32S-SSD) has this appended to the end (everything else is the same):

    Code:
    invStorageDeviceDir = system.component.newEntity( "Inventory::StorageDeviceDir",
                                                         "storageDeviceDir" )
    storageDevice = invStorageDeviceDir.newStorageDevice( "ssd1" )
    storageDevice.description = "SSD drive"
    storageDevice.address = Value( "Inventory::PciAddress", domain=0, bus=0, slot=0x11, function=0 )
    storageDevice.type = "ssd"
    storageDevice.sizeGB = 120
    storageDevice.disableWriteCache = True
    I remember you got yours working with an added-on SSD no problem (on a chassis using the stock non-SSD FDL chassis descriptor), so I wonder what the above would be needed for, if anything.
     
    #38
  19. i386

    i386 Well-Known Member

    Joined:
    Mar 18, 2016
    Messages:
    1,710
    Likes Received:
    423
    Maybe the SSD is for containers/Docker?
     
    #39
  20. oddball

    oddball Active Member

    Joined:
    May 18, 2018
    Messages:
    155
    Likes Received:
    48
    Yes, mine have the M.2 slot. Let me see if I can find a picture on my phone and upload it. I purchased a cheap Crucial 256GB on Amazon for $100 - it's an M.2 SATA SSD.

    But... for a different switch I also purchased an industrial SATA SSD that hooks into that slot you pointed out. I got it on eBay - 128GB for $35 from China - and it looked like this: (16GB SSD SATA SERIAL ATA HDD HARD DRIVE FOR COMPUTER THINCLIENT DOC DOM FLASH | eBay)

    The form factor is perfect, it snaps in and doesn't extend. It took forever to ship/receive. Maybe three or four weeks. I was worried it was lost in customs, but it eventually arrived.

    The switches detected the drive, but I had to partition it with fdisk and format it, then reboot. After that everything worked fine. You can do a "dir drive:" and it'll show the drive.
     
    #40