
Brocade ICX Series (cheap & powerful 10gbE/40gbE switching)


MrRedHat

New Member
May 25, 2012
10
0
1
I have an LB6M that I flashed to the Brocade firmware a while back. The switch has been working well for me, but I am looking to “upgrade”. I’d really like to have activity lights on my switch. :) I have 18 10G SFP+ network devices, so the Brocade ICX6650 seems the most logical choice. I see a few on eBay, but none in the US, for less than $1,400. Is there anything else I should be looking at, or should I just wait to see if something cheaper comes along?
 

Jason Antes

Active Member
Feb 28, 2020
224
76
28
Twin Cities
Would you be interested in a fully licensed VDX6740-48-F? Updated with the help of Fohdeesha earlier in the year. Had to replace the CF card in it. Also have a bunch of Brocade Gb SFPs for it.
 

rootwyrm

Member
Mar 25, 2017
74
93
18
www.rootwyrm.com
So @ske4za, thank you for the awesome pointers... I'm still doing testing here but of course, can't get a hold of anyone at the damned provider to reset the tables. AGAIN. And I can't actually test without that.

Couple notes though:
  • Brocade developers, if you are reading this, you are idiots. And I mean that. The 6450 supports a maximum of 32 MAC filters. Not 1024. Not 128. 32. That means you do not even have enough filters for the ports on the switch. It's lazy, sloppy, and stupid.
  • 802.1X and VMware with dvSwitches or (gods help you) NSX? Is a recipe for disaster. If you want to run VMware on your Brocade, avoid everything in the 802.1X section and pretend 802.1X does not exist. Seriously.
  • no lldp enable ports ethe 1/1/1 to 1/1/2 ethe 1/2/1 to 1/2/4 - just... REALLY?!

I think I might be where I need to be, but again, until I can get somebody there who doesn't struggle to operate velcro, I can't test. The net configuration pieces look.. well.. like crap. But mirror says it's working.

Code:
mac filter 1 permit any 589c.fc44.0000 ffff.ffff.0000
mac filter 2 permit any 589c.fc46.0000 ffff.ffff.0000
mac filter 3 permit any 0001.5c00.0000 ffff.ff00.0000
mac filter 32 deny any any

interface ethernet 1/1/1
 mac filter-group 1 to 3 32
 mac-learn-disable
 spanning-tree root-protect
 trust dscp

interface ethernet 1/1/2
 mac filter-group 1 to 3 32
 mac-learn-disable
 spanning-tree root-protect
 trust dscp

no lldp enable ports ethe 1/1/1 to 1/1/2
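For anyone not used to the address/mask syntax in those filters, here is a minimal Python sketch of how an address/mask pair selects a range of MACs. It assumes that set bits in the mask mark the bits that must match (so `ffff.ffff.0000` compares only the first four octets); `mac_to_int` and `mask_match` are hypothetical helper names for illustration, not anything from FastIron.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC such as '589c.fc44.0000' (or colon-separated) to an integer."""
    return int(mac.replace(".", "").replace(":", ""), 16)


def mask_match(addr: str, pattern: str, mask: str) -> bool:
    """Return True if addr equals pattern on every bit that is set in mask.

    Assumption: a 1 bit in the mask means "this bit must match".
    """
    m = mac_to_int(mask)
    return (mac_to_int(addr) & m) == (mac_to_int(pattern) & m)


# Filter 1 from the config above: 589c.fc44.0000 with mask ffff.ffff.0000
print(mask_match("589c.fc44.12ab", "589c.fc44.0000", "ffff.ffff.0000"))  # True
print(mask_match("589c.fc45.12ab", "589c.fc44.0000", "ffff.ffff.0000"))  # False
```

Under that assumption, each of filters 1 to 3 covers a whole prefix of addresses rather than a single MAC.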
I have to ask - WTF service are you using that is forcing you to do this? I'd like to have a word with their engineers. However, you're correct, this isn't really what FastIron was built for. There are some similar-ish mechanisms like what @ske4za posted above, but they're only going to get you 70% of the way there in your application. The proper router/service provider line (NetIron) will do complex MAC filtering and assignment like this quite happily, but obviously we're not running NetIron. One of my biggest pet peeves with Brocade was the harsh market segmentation between FastIron and NetIron: stuff like this is missing from FastIron not because the ASICs aren't capable of it, but because they didn't think their "access switch" line would ever need it. Juniper is *much* better about this.
This is the experience you will get with one of the largest cable overbuilders in the United States. I've TOLD them I don't know HOW many times that their entire provisioning setup is just wrong and that is not how provisioning is supposed to work. Because the way they have it misconfigured means that it is literally IMPOSSIBLE for pretty much ANY CPE to NOT be capable of creating a misbalance. They are, idiotically, trying to combine a static MAC configuration (strict lease permit) with a dynamic MAC configuration (CPE MAC authentication) without understanding either or configuring either one even remotely correctly. Disclaimer: I am grossly oversimplifying and leaving out so many moving pieces.

I don't want to drag us too off-topic in the Brocade thread, but since we're already talking about the general concepts anyways... basically, a strict lease permit is like Comcast or Verizon where you have to give them the MAC of the device(s) connected to the CPE. Those MACs are then authorized in the provisioning system, and will be allowed to get a DHCP lease. All other DHCP requests get ignored. Hence, "strict lease permit." CPE MAC authentication is the reverse; lease authorization is based on the CPE MAC being added to the CMTS provisioning database and an unprovisioned CPE cannot establish BPI or BPI+ or lands on the capture network. Then you're supposed to use MAC limits outside of the UMAC/DMAC and preferably inside the DHCP configuration to limit the number of IPs handed out to a single CPE.
What happens when you put the MAC limits in the UMAC/DMAC is, because DOCSIS is an inherently asymmetric physical layer with a single L2 broadcast domain, you create a race condition ANY time you broadcast ARP down the CPE or a device behind the CPE link-floods. (If you replicated this in an Ethernet network, you'd get the same problem.) The link-flooding device sends down the wire and is learned by the DMAC before the DHCP request. The DHCP request is then learned by the UMAC and the DMAC. Now you have 1 learned MAC on the UMAC because it sent the DHCP response and 2 learned MACs on the DMAC because it got spammed with junk traffic with no attached L3 address. And the DMAC will now block responses to a third MAC while the UMAC will learn it and make broadcast visible to it. (And part of this mess is because somebody had the brilliant idea that isolating the L3 broadcast domain is the same thing.)

I would gladly invite you to join the dogpile, except I'm fairly certain they laid off everyone with a clue who hasn't already quit. I don't claim to be a senior network engineer by any stretch (I'm really not; I can't for the life of me figure out how to make OSPF do what I want, my BGP sucks, but if you need ATM or IS-IS expertise I'm your guy.) But with what they've done to their network since Juniper washed their hands of it, I doubt very much I could do worse. I at least understand that announcing literally everything as a /24 is idiotic and unacceptable.
 

ArmedAviator

Member
May 16, 2020
91
56
18
Kansas
I'm having trouble suddenly with LACP. It has been working fine for a long time and suddenly after making some VLAN changes to ports unrelated to this LAG, it is being problematic.

I have two identical servers with two identical LAGs each - one on dual 1G links and one on dual 10G links. In this case, the proton-10G-LAG is the problematic one.

Code:
lag neutron-10G-LAG dynamic id 6
 ports ethernet 1/2/5 ethernet 1/2/10
 primary-port 1/2/5
 lacp-timeout short
 deploy
 port-name "neutron 10G LAG" ethernet 1/2/5
 port-name "neutron 10G LAG" ethernet 1/2/10
!
lag neutron-mgmt dynamic id 10
 ports ethernet 1/1/17 ethernet 1/1/19
 primary-port 1/1/17
 lacp-timeout short
 deploy
 port-name "neutron Management" ethernet 1/1/17
 port-name "neutron Management" ethernet 1/1/19
!                                                                 
lag proton-10G-LAG dynamic id 5
 ports ethernet 1/2/4 ethernet 1/2/9
 primary-port 1/2/4
 lacp-timeout short
 deploy
 port-name "proton 10G LAG" ethernet 1/2/4
 port-name "proton 10G LAG" ethernet 1/2/9
!
lag proton-mgmt dynamic id 9
 ports ethernet 1/1/13 ethernet 1/1/15
 primary-port 1/1/13
 lacp-timeout short
 deploy
 port-name "proton Management" ethernet 1/1/13
 port-name "proton Management" ethernet 1/1/15
Code:
SSH@brocore(config)#show lag proton-10G-LAG
Total number of LAGs:          11
Total number of deployed LAGs: 11
Total number of trunks created:11 (109 available)
LACP System Priority / ID:     1 / 748e.f8e7.b4b0
LACP Long timeout:             120, default: 120
LACP Short timeout:            3, default: 3

=== LAG "proton-10G-LAG" ID 5 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/2/4 e 1/2/9
   Port Count:    2
   Primary Port:  1/2/4
   Trunk Type:    hash-based
   LACP Key:      20005
   LACP Timeout:  short
Deployment: HW Trunk ID 2
Port       Link    State   Dupl Speed Trunk Tag Pvid Pri MAC             Name
1/2/4      Up      Blocked Full 10G   5     No  4    0   748e.f8e7.b4e4  proton 10G LAG
1/2/9      Up      Forward Full 10G   5     No  4    0   748e.f8e7.b4e4  proton 10G LAG

Port       [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/4           1        1   20005   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina
1/2/9           1        1   20005   Yes   S   Agg  Syn  Col  Dis  No   No   Ope
                                                                  

 Partner Info and PDU Statistics
Port          Partner         Partner     LACP      LACP     
             System ID         Key     Rx Count  Tx Count 
1/2/4    1-0000.0000.0000       67   310800   8773259
1/2/9    65535-0002.c93b.6130       15   337450   8772796
After much head scratching, hair pulling, and keyboard thrashing last night (at 3AM) I was able to get one of the two ports to cooperate. For a good while, both ports would be in LACP-BLOCKED mode. No configuration change got the port working; it was simply some combination of undeploying the LAG, removing the ports from the VLAN, disabling the ports, deploying the LAG, adding the ports to the VLAN, and enabling the primary port. Don't quote me on the order, I repeatedly tried different methods. I also rebooted the server numerous times to no avail.

On the Linux host:
Code:
# Start bonding driver with 2 bonds in LACP mode
modprobe bonding max_bonds=2 mode=4 lacp_rate=1 xmit_hash_policy=1

# Management network (1 Gig)
ip link set bond0 up
ifenslave bond0 enp10s0f0 enp10s0f1

# SAN VLAN network (10 Gig)
ip link set bond1 up
ifenslave bond1 enp9s0 enp9s0d1
ip addr add 10.1.4.3/24 dev bond1
ip link set dev bond1 mtu 9000
Code:
rich@proton ~ $ cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v5.8.8_1

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable

Slave Interface: enp9s0
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 00:02:c9:3b:61:30
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 1

Slave Interface: enp9s0d1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:02:c9:3b:61:31
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 1
Partner Churned Count: 1
Code:
bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 9000
        inet 10.1.4.3  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::202:c9ff:fe3b:6130  prefixlen 64  scopeid 0x20<link>
        ether 00:02:c9:3b:61:30  txqueuelen 1000  (Ethernet)
        RX packets 13347485  bytes 98321391464 (91.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7774116  bytes 2785512572 (2.5 GiB)
        TX errors 0  dropped 4 overruns 0  carrier 0  collisions 0

enp9s0: flags=6147<UP,BROADCAST,SLAVE,MULTICAST>  mtu 9000
        ether 00:02:c9:3b:61:30  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp9s0d1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 9000
        ether 00:02:c9:3b:61:30  txqueuelen 1000  (Ethernet)
        RX packets 13347485  bytes 98321391464 (91.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7774116  bytes 2785512572 (2.5 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 

ArmedAviator

Member
May 16, 2020
91
56
18
Kansas
Out of more testing and hatred of myself, I disabled the port again to see if there's any change. As stated before, now both ports are LACP-BLOCKED with no other configuration changes:

Code:
SSH@brocore(config-if-e10000-1/2/4)#disable

SSH@brocore(config-if-e10000-1/2/4)#show int eth 1/2/4
  10GigabitEthernet 1/2/4 is disabled, line protocol is down
  Port down for 5 second(s)
  Hardware is   10GigabitEthernet , address is 748e.f8e7.b4e4 (bia 748e.f8e7.b4e4)
  Configured speed 10Gbit, actual unknown, configured duplex fdx, actual unknown
  Configured mdi mode AUTO, actual unknown
  Member of L2 VLAN ID 4, port is untagged, port state is DISABLED
  BPDU guard is Disabled, ROOT protect is Disabled, Designated protect is Disabled
  Link Error Dampening is Disabled
  STP configured to ON, priority is level0, mac-learning is enabled
  Openflow is Disabled, Openflow Hybrid mode is Disabled,  Flow Control is config disabled, oper disabled
  Mirror disabled, Monitor disabled
  Mac-notification is disabled
  Member of active trunk ports 1/2/4,1/2/9, primary port is 1/2/4
  Member of configured trunk ports 1/2/4,1/2/9, primary port is 1/2/4
  Port name is proton 10G LAG
  MTU 10200 bytes, encapsulation ethernet
  300 second input rate: 0 bits/sec, 0 packets/sec, 0.00% utilization
  300 second output rate: 0 bits/sec, 0 packets/sec, 0.00% utilization
  440558812 packets input, 645249133300 bytes, 0 no buffer
  Received 13 broadcasts, 310855 multicasts, 440247944 unicasts
  0 input errors, 0 CRC, 0 frame, 0 ignored
  0 runts, 0 giants
  474219943 packets output, 2764246386949 bytes, 0 underruns
  Transmitted 2570307 broadcasts, 13721883 multicasts, 457927753 unicasts
  0 output errors, 0 collisions
  Relay Agent Information option: Disabled

Egress queues:
Queue counters    Queued packets    Dropped Packets
    0           460711139                   0
    1                   0                   0
    2                   0                   0
    3                   0                   0
    4                   0                   0
    5                  42                   0
    6                   0                   0
    7            13508762                   0



SSH@brocore(config-if-e10000-1/2/4)#no disable  
SSH@brocore(config-if-e10000-1/2/4)#exit

SSH@brocore(config)#show lag proton-10G-LAG
Total number of LAGs:          11
Total number of deployed LAGs: 11
Total number of trunks created:11 (109 available)
LACP System Priority / ID:     1 / 748e.f8e7.b4b0
LACP Long timeout:             120, default: 120
LACP Short timeout:            3, default: 3

=== LAG "proton-10G-LAG" ID 5 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/2/4 e 1/2/9
   Port Count:    2
   Primary Port:  1/2/4
   Trunk Type:    hash-based
   LACP Key:      20005
   LACP Timeout:  short
Deployment: HW Trunk ID 2
Port       Link    State   Dupl Speed Trunk Tag Pvid Pri MAC             Name
1/2/4      Up      Blocked Full 10G   5     No  4    0   748e.f8e7.b4e4  proton 10G LAG
1/2/9      Up      Blocked Full 10G   5     No  4    0   748e.f8e7.b4e4  proton 10G LAG

Port       [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/4           1        1   20005   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina
1/2/9           1        1   20005   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina
                                                             

Partner Info and PDU Statistics
Port          Partner         Partner     LACP      LACP
             System ID         Key     Rx Count  Tx Count
1/2/4    1-0000.0000.0000       67   310800   8774191
1/2/9    1-0000.0000.0000       72   338405   8773725

SSH@brocore(config)#show int eth 1/2/4
  10GigabitEthernet 1/2/4 is up, line protocol is down (LACP-BLOCKED)
  Port down (LACP-BLOCKED) for 1 minute(s) 10 second(s)
  Hardware is   10GigabitEthernet , address is 748e.f8e7.b4e4 (bia 748e.f8e7.b4e4)
  Configured speed 10Gbit, actual 10Gbit, configured duplex fdx, actual fdx
  Configured mdi mode AUTO, actual none
  Member of L2 VLAN ID 4, port is untagged, port state is BLOCKING
  BPDU guard is Disabled, ROOT protect is Disabled, Designated protect is Disabled
  Link Error Dampening is Disabled
  STP configured to ON, priority is level0, mac-learning is enabled
  Openflow is Disabled, Openflow Hybrid mode is Disabled,  Flow Control is config disabled, oper disabled
  Mirror disabled, Monitor disabled
  Mac-notification is disabled
  Member of active trunk ports 1/2/4,1/2/9, primary port is 1/2/4
  Member of configured trunk ports 1/2/4,1/2/9, primary port is 1/2/4
  Port name is proton 10G LAG
  MTU 10200 bytes, encapsulation ethernet
  300 second input rate: 0 bits/sec, 0 packets/sec, 0.00% utilization
  300 second output rate: 888 bits/sec, 0 packets/sec, 0.00% utilization
  440558812 packets input, 645249133300 bytes, 0 no buffer
  Received 13 broadcasts, 310855 multicasts, 440247944 unicasts
  0 input errors, 0 CRC, 0 frame, 0 ignored
  0 runts, 0 giants
  474220007 packets output, 2764246395141 bytes, 0 underruns
  Transmitted 2570307 broadcasts, 13721947 multicasts, 457927753 unicasts
  0 output errors, 0 collisions
  Relay Agent Information option: Disabled

Egress queues:
Queue counters    Queued packets    Dropped Packets
    0           460711139                   0
    1                   0                   0
    2                   0                   0
    3                   0                   0
    4                   0                   0
    5                  42                   0
    6                   0                   0
    7            13508826                   0

SSH@brocore(config)#show int eth 1/2/9
  10GigabitEthernet 1/2/9 is up, line protocol is down (LACP-BLOCKED)
  Port down (LACP-BLOCKED) for 1 minute(s) 13 second(s)
  Hardware is   10GigabitEthernet , address is 748e.f8e7.b4e4 (bia 748e.f8e7.b4e9)
  Configured speed 10Gbit, actual 10Gbit, configured duplex fdx, actual fdx
  Configured mdi mode AUTO, actual none
  Member of L2 VLAN ID 4, port is untagged, port state is BLOCKING
  BPDU guard is Disabled, ROOT protect is Disabled, Designated protect is Disabled
  Link Error Dampening is Disabled
  STP configured to ON, priority is level0, mac-learning is enabled
  Openflow is Disabled, Openflow Hybrid mode is Disabled,  Flow Control is config disabled, oper disabled
  Mirror disabled, Monitor disabled
  Mac-notification is disabled
  Member of active trunk ports 1/2/4,1/2/9, primary port is 1/2/4
  Member of configured trunk ports 1/2/4,1/2/9, primary port is 1/2/4
  Port name is proton 10G LAG
  MTU 10200 bytes, encapsulation ethernet
  300 second input rate: 0 bits/sec, 0 packets/sec, 0.00% utilization
  300 second output rate: 872 bits/sec, 0 packets/sec, 0.00% utilization
  268988020 packets input, 546972198876 bytes, 0 no buffer
  Received 865724 broadcasts, 341082 multicasts, 267781214 unicasts
  0 input errors, 0 CRC, 0 frame, 0 ignored
  0 runts, 0 giants
  485536726 packets output, 2720495805565 bytes, 0 underruns
  Transmitted 532846 broadcasts, 8922831 multicasts, 476081049 unicasts
  0 output errors, 0 collisions
  Relay Agent Information option: Disabled

Egress queues:
Queue counters    Queued packets    Dropped Packets
    0           476744109                   0
    1                   0                   0
    2                   0                   0
    3                   0                   0
    4                   0                   0
    5                 138                   0
    6                   0                   0
    7             8792479                   0
However, once I undeploy and then deploy the LAG, I get one of the ports Forwarding again:

Code:
SSH@brocore(config)#lag proton-10G-LAG
SSH@brocore(config-lag-proton-10G-LAG)#no deploy
Secondary port 1/2/9 disabled automatically upon LAG un-deploy to avoid potential loop
LAG proton-10G-LAG un-deployed successfully!
SSH@brocore(config-lag-proton-10G-LAG)#deploy  
LAG proton-10G-LAG deployed successfully!

SSH@brocore(config-lag-proton-10G-LAG)#show lag proton-10G-LAG
Total number of LAGs:          11
Total number of deployed LAGs: 11
Total number of trunks created:11 (109 available)
LACP System Priority / ID:     1 / 748e.f8e7.b4b0
LACP Long timeout:             120, default: 120
LACP Short timeout:            3, default: 3

=== LAG "proton-10G-LAG" ID 5 (dynamic Deployed) ===
LAG Configuration:
   Ports:         e 1/2/4 e 1/2/9
   Port Count:    2
   Primary Port:  1/2/4
   Trunk Type:    hash-based
   LACP Key:      20005
   LACP Timeout:  short
Deployment: HW Trunk ID 2
Port       Link    State   Dupl Speed Trunk Tag Pvid Pri MAC             Name
1/2/4      Up      Blocked Full 10G   5     No  4    0   748e.f8e7.b4e4  proton 10G LAG
1/2/9      Up      Forward Full 10G   5     No  4    0   748e.f8e7.b4e4  proton 10G LAG

Port       [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/4           1        1   20005   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina
1/2/9           1        1   20005   Yes   S   Agg  Syn  Col  Dis  No   No   Ope
                                                                 

Partner Info and PDU Statistics
Port          Partner         Partner     LACP      LACP    
             System ID         Key     Rx Count  Tx Count
1/2/4    1-0000.0000.0000       67   310800   8774919
1/2/9    65535-0002.c93b.6130       15   338598   8774452
 

infoMatt

Active Member
Apr 16, 2019
222
100
43
Code:
Port       [Sys P] [Port P] [ Key ] [Act][Tio][Agg][Syn][Col][Dis][Def][Exp][Ope]
1/2/4           1        1   20005   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina
1/2/9           1        1   20005   Yes   S   Agg  Syn  Col  Dis  No   No   Ope
Pay close attention to the show lag output: the blocked port is in the "Def" (default) state.
From the documentation:
Def: Indicates whether the port is using default link aggregation values. The port uses default values if it has not received link aggregation information through LACP from the port at the remote end of the link. This field can have one of the following values:
  • Def - The port has not received link aggregation values from the port at the other end of the link and is therefore using its default link aggregation LACP settings.
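If you want to check this across many ports, a rough Python sketch like the one below could pick out the ports whose Def column reads "Def". It assumes the flag rows have exactly the whitespace-separated layout shown in the show lag output in this thread; `parse_flag_row` and `ports_using_defaults` are hypothetical names, not a Brocade tool.

```python
FLAG_COLUMNS = ["Act", "Tio", "Agg", "Syn", "Col", "Dis", "Def", "Exp", "Ope"]


def parse_flag_row(row: str) -> dict:
    """Split one LACP flag row into port, key, and the nine flag columns."""
    fields = row.split()
    port, _sys_pri, _port_pri, key = fields[:4]
    flags = dict(zip(FLAG_COLUMNS, fields[4:]))
    return {"port": port, "key": key, **flags}


def ports_using_defaults(rows):
    """Return ports that have not received LACP info from the partner."""
    return [r["port"] for r in map(parse_flag_row, rows) if r["Def"] == "Def"]


rows = [
    "1/2/4           1        1   20005   Yes   S   Agg  Syn  Col  Dis  Def  No   Ina",
    "1/2/9           1        1   20005   Yes   S   Agg  Syn  Col  Dis  No   No   Ope",
]
print(ports_using_defaults(rows))  # ['1/2/4']
```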
To me it looks like there is a problem on the Linux side. The two interfaces have different Aggregator IDs, and one of the two is marked as "churned"...

Looking around, it seems to be a problem with systemd: "systemd LACP bond mess up aggregation ID for NICs", but no solution is provided there.
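A quick way to spot the split-aggregator symptom on the host is to compare the Aggregator ID of each slave in /proc/net/bonding. The sketch below is only an illustration that assumes the field names shown in the output posted above ("Slave Interface:" and "Aggregator ID:"); `slave_aggregators` is a hypothetical helper, and the sample text is trimmed from that output.

```python
import re


def slave_aggregators(bonding_text: str) -> dict:
    """Map each slave interface name to its Aggregator ID."""
    result = {}
    current = None
    for line in bonding_text.splitlines():
        line = line.strip()
        m = re.match(r"Slave Interface:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"Aggregator ID:\s*(\d+)", line)
        # Ignore any bond-level Aggregator ID that appears before the first slave.
        if m and current is not None:
            result[current] = int(m.group(1))
    return result


sample = """\
Slave Interface: enp9s0
MII Status: down
Aggregator ID: 1
Slave Interface: enp9s0d1
MII Status: up
Aggregator ID: 2
"""
aggs = slave_aggregators(sample)
print(aggs)
print("single aggregator" if len(set(aggs.values())) == 1 else "split aggregators")
```

On a real host you would pass in the contents of /proc/net/bonding/bond1 instead of the trimmed sample.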
 

xerxies

New Member
Oct 30, 2018
3
0
1
33
Saint Joseph, Missouri
I haven't been able to find anything like BGP Peer Groups for a subnet, but does fastiron support anything like BGP Dynamic Neighbors in IOS? Or does every BGP neighbor need to be a known IP configured ahead of time?
 

rootwyrm

Member
Mar 25, 2017
74
93
18
www.rootwyrm.com
To me it looks like there is a problem on the Linux side. The two interfaces have different Aggregator IDs, and one of the two is marked as "churned"...

Looking around, it seems to be a problem with systemd: "systemd LACP bond mess up aggregation ID for NICs", but no solution is provided there.
Correct; this is a known bug due to (drumroll please) systemd having terminally broken behavior. I'm sure everyone is so very shocked by this.

There is no answer, because there is no fix. It's an expression of a design defect in systemd which cannot and will not be fixed. Use NetworkManager and only NetworkManager. Do not ever use systemd-networkd for anything. Period. Especially not LACP. It is literally impossible for it to ever work correctly.

I haven't been able to find anything like BGP Peer Groups for a subnet, but does fastiron support anything like BGP Dynamic Neighbors in IOS? Or does every BGP neighbor need to be a known IP configured ahead of time?
No, there is no IOS-style dynamic neighbor support. Brocade/Arris' routing engine basically mirrors Quagga capabilities. (Might even just be Quagga, frankly, because...) If you have a set of known IP addresses you can configure them as a peer-group, e.g. `neighbor group1 peer-group`; `neighbor 10.10.10.10 peer-group group1`; `neighbor 10.10.10.11 peer-group group1`.
Yes, this is precisely identical to Quagga.
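Since every neighbor has to be listed explicitly, one workaround is to generate the statements from a list of addresses. This is a small Python sketch under that assumption; `peer_group_config` is a hypothetical helper, and it only emits the two statement forms mentioned above (a real config would need more, such as a remote AS for the group, which is not shown in this thread).

```python
import ipaddress


def peer_group_config(group: str, neighbors) -> list:
    """Build peer-group statements for a known set of neighbor IPs."""
    lines = [f"neighbor {group} peer-group"]
    for addr in neighbors:
        ip = ipaddress.ip_address(addr)  # raises ValueError on a bad address
        lines.append(f"neighbor {ip} peer-group {group}")
    return lines


for line in peer_group_config("group1", ["10.10.10.10", "10.10.10.11"]):
    print(line)
```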
 

ArmedAviator

Member
May 16, 2020
91
56
18
Kansas
@infoMatt @rootwyrm

Unfortunately I do not have systemd on this Linux install (Void Linux), and my LAGs, VLANs, etc. are configured entirely via a shell script (posted in my first problem post). It is just extremely odd that two identical systems (hardware and software) are acting completely differently, when they both worked fine for over a year on this switch and longer on other switches.
 

fohdeesha

Kaini Industries
Nov 20, 2016
2,727
3,075
113
33
fohdeesha.com
  • Brocade developers, if you are reading this, you are idiots. And I mean that. The 6450 supports a maximum of 32 MAC filters. Not 1024. Not 128. 32. That means you do not even have enough filters for the ports on the switch. It's lazy, sloppy, and stupid.
I haven't tried to see the actual max on a 6450, but on all the other ICX's I've used it's a system setting, and 32 is just the default limit. Looks like the 6610 will let you configure the limit up to 512:

Code:
ICX1#show default values | inc (mac-filter|Current)
System Parameters    Default    Maximum    Current    Configured
mac-filter-port      16         256        16         16
mac-filter-sys       32         512        32         32
to change it would just be `system-max mac-filter-sys 512` in conf t mode
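To see how much headroom each parameter has, the table could be parsed with something like the Python sketch below. It assumes the five-column layout shown above (name, default, maximum, current, configured); `parse_system_params` is a hypothetical helper name.

```python
def parse_system_params(output: str) -> dict:
    """Parse rows of the form: name default maximum current configured."""
    params = {}
    for line in output.splitlines():
        fields = line.split()
        # The header row has non-numeric columns, so it is skipped here.
        if len(fields) == 5 and all(f.isdigit() for f in fields[1:]):
            default, maximum, current, configured = map(int, fields[1:])
            params[fields[0]] = {
                "default": default,
                "maximum": maximum,
                "current": current,
                "configured": configured,
            }
    return params


sample = """\
System Parameters    Default    Maximum    Current    Configured
mac-filter-port      16         256        16         16
mac-filter-sys       32         512        32         32
"""
for name, p in parse_system_params(sample).items():
    print(f"{name}: current {p['current']}, can be raised to {p['maximum']}")
```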
 

tommybackeast

Active Member
Jun 10, 2018
286
105
43
fohdeesha - off-topic question: now that 10GbE has become somewhat common for homelab users, what is next a few years from now?

I mean 5-6 years from now, when today's $3,000 enterprise switches get sold for $200-300.

25GbE? 40GbE? 50GbE?

What is the current cutting edge for large data centers? 200GbE? More?
 

klui

Well-Known Member
Feb 3, 2019
824
453
63
40G is no longer the way forward in the data center. It's 25, 50, 100, 200.... If you get your hands on a Mellanox CX6 with its 200G bandwidth you would think you just need a 200G switch, but they're not available. Only 400G switches are being sold, by Arista for example. Just look at their current offerings to get an idea of what other vendors are doing. A more up-to-date lower-end high-speed switch would be a 7160-48YC6, with 48 10/25G ports and 6 4x10/4x25/2x50/40/100G ports.

Arista 100G 7060CX-32S switches are coming down in price, but one idles at 120W with only the management cable attached. Fan noise is sorta OK at 30% PWM. I recall this switch was over 20K when it was first released. It's 1.2K now on the used market. The Mellanox SX6012 idles at 30W and its fans can be made to run very quietly at 30% PWM. The only drawback with the 40G switches being dumped on eBay at this time, including the SX6012, is that they will not be able to link at 25G using breakouts.

EDIT: here's a very informative video about the state of the art in transceivers from the NANOG channel.

If you have 2 hours, check out their video about optical
 

rootwyrm

Member
Mar 25, 2017
74
93
18
www.rootwyrm.com
@infoMatt @rootwyrm

Unfortunately I do not have systemd on this Linux install (Void Linux), and my LAGs, VLANs, etc. are configured entirely via a shell script (posted in my first problem post). It is just extremely odd that two identical systems (hardware and software) are acting completely differently, when they both worked fine for over a year on this switch and longer on other switches.
So, this comes down to understanding the actual OS code as opposed to all the kids out there acting like they're the smartest ones in the room when they don't even know how to compile a kernel. When Linux spins up LAGs, it breaks the rules, period. Linux, because self-important basement dwelling children, does a lot of very stupid things which are all very wrong. And then demands everyone change for them.

Behold the stupid for yourself. Stupid: "will setup a network device, with an ip address. No mac address will be assigned at this time."
So if you bring up two bond interfaces at the same time either on the switch side (such as by rebooting the switch) or the host side within your 802.3ad window, then you have two 802.3ad dynamics both without a MAC address. That behavior occurs any time you have a full link drop; it will reinit initially with no MAC. This is deeply broken and utterly stupid. It becomes especially broken when you have, say, two machines that both have "eth0", because they will both present the same port identifiers in the 802.3ad structures, so as far as a switch can tell? It's the same host.

OTHER switches don't "work." They reject the malformed 802.3ad, and the link only comes up when it actually presents valid structs on a later tick. Brocade doesn't reject the malformed struct and assumes that it will become valid on the next tick. Except by that point you've already confused matters so badly it cannot possibly recover. Arista used to have the same behavior ages back.

I haven't tried to see the actual max on a 6450, but on all the other ICX's I've used it's a system setting, and 32 is just the default limit. Looks like the 6610 will let you configure the limit up to 512:

Code:
ICX1#show default values | inc (mac-filter|Current)
System Parameters    Default    Maximum    Current    Configured
mac-filter-port      16         256        16         16
mac-filter-sys       32         512        32         32
to change it would just be `system-max mac-filter-sys 512` in conf t mode
I am very much going to stand by my statement, because it's an absolutely idiotic default value by any measure. But yes, I missed where it can be bumped to 512. Which, of course, is not what Brocade's documentation says, but at least it's closer to the truth than Cisco's. According to Brocade's documentation, the 6250, 6450, and 6610 should all have a system limit of 1024. (Never mind that it's deeply unusual to have to change system maximums off defaults...)