InterVLAN Routing Issues - Something missing?

Kubowski

New Member
Aug 3, 2022
11
0
1
An existing network I am overseeing had a flat layout, using 10.0.0.0/24. The SonicWALL firewall at 10.0.0.230 acted as the gateway, and all devices, servers, VMs, etc. were on that one subnet. No VLANs at all. I've replaced the old switches with a stack of Brocade ICX 7250s with a base configuration, and all of the existing network works fine on the new devices.

Now I'm starting to introduce VLANs to begin segmenting this flat network. The switch stack currently has VE 1 on VLAN 1 set to 10.0.0.211. I created new VLANs, 10 and 50, with 10.10.10.0/24 and 10.10.50.0/24 respectively. I've modified all of the network devices to use the switch as the gateway, including configuring DHCP to assign 10.0.0.211 as the gateway. The switch has 10.0.0.230 as the gateway of last resort for 0.0.0.0/0.

For testing, I set ports 4/1/46 and 4/1/48 as VLANs 10 and 50, and I have two devices set statically as 10.10.10.150 and 10.10.50.150 in those VLANs. The ports are untagged for the respective VLANs.

10.10.50.150 can ping 10.10.10.150, but the reverse is not true, and... I have no idea why. It feels like a routing issue, but, the switch should be configured for interVLAN routing correctly, unless I'm missing something. I've attached the switch config with this post.

Some assistance would be lovely, to save some sanity / hair. :)
 


Kubowski

Some additional details: it does indeed seem to be a routing issue on the switch itself, I just don't understand why. The switch is running the L3 routing firmware.

show ip route:

Total number of IP routes: 4
Type Codes - B:BGP D:Connected O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP Codes - i:iBGP e:eBGP
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2
STATIC Codes - v:Inter-VRF
      Destination      Gateway       Port    Cost   Type   Uptime
1     0.0.0.0/0        10.0.0.230    ve 1    1/1    S      15d15h
2     10.0.0.0/24      DIRECT        ve 1    0/0    D      15d15h
3     10.10.10.0/24    DIRECT        ve 10   0/0    D      1d14h
4     10.10.50.0/24    DIRECT        ve 50   0/0    D      1d20h

But, if I do:

ping 10.10.50.150 source 10.10.10.1

Sending 1, 16-byte ICMP Echo to 10.10.50.150, timeout 5000 msec, TTL 64
Type Control-c to abort
Request timed out.
No reply from remote host.

Which should certainly work..
 

adman_c

Active Member
Feb 14, 2016
145
63
28
Chicago
You should set up the IP of each VE as the default gateway for DHCP assignments in their respective subnets. So for DHCP assignments in VLAN 50 you should set the default gateway as 10.10.50.1, and for DHCP assignments in VLAN 10 it should be 10.10.10.1. At that point, as long as 10.10.50.150 can ping 10.10.50.1, it should be able to reach 10.10.10.150. Honestly, with the default gateway as 10.0.0.211, I’m not sure how even one host can ping the other—it looks to me like they should be totally isolated, since 10.10.50.150 shouldn’t be able to reach 10.0.0.211 as its first hop.
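
The first-hop point above can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `gateway_is_on_link` is made up for illustration, and the addresses are the ones quoted in this thread. It only tests whether a gateway address falls inside the host's own subnet, which is the precondition for the host being able to ARP for it.

```python
import ipaddress

def gateway_is_on_link(host_cidr: str, gateway: str) -> bool:
    """Return True if the gateway address lies inside the host's own subnet."""
    iface = ipaddress.ip_interface(host_cidr)
    return ipaddress.ip_address(gateway) in iface.network

# A host in VLAN 50 with the VE for VLAN 50 as its gateway: reachable at layer 2.
print(gateway_is_on_link("10.10.50.150/24", "10.10.50.1"))

# The same host pointed at the VLAN 1 VE: the gateway is in a different subnet.
print(gateway_is_on_link("10.10.50.150/24", "10.0.0.211"))
```

A result of `False` for a host's configured gateway suggests that host has no usable first hop for off-subnet traffic.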
 

Kubowski

I haven't even gotten to DHCP configuration, just some test devices in each VLAN with static IP configuration, including the respective correct gateway for its own VLAN / subnet.

Device in VLAN10, static network settings are: IP 10.10.10.150, its gateway is 10.10.10.1
Device in VLAN50, static network settings are: IP 10.10.50.150, its gateway is 10.10.50.1

The devices can ping their own gateways, they can ping anything in VLAN1 (10.0.0.0/24) and via the switch (which has the firewall as its gateway of last resort) they correctly route to everything outside the network.
 

Kubowski

From Device in VLAN 50 with IP 10.10.50.150:

Tracing Route to 10.10.10.150
over a maximum of 30 hops:

1 1 ms 1 ms 1 ms 10.10.50.1
2 * * * Request timed out.
3 * * * Request timed out.
4 * * * Request timed out.
... etc

From Device in VLAN 10 with IP 10.10.10.150:

Tracing Route to 10.10.50.150
over a maximum of 30 hops:

1 1 ms 1 ms 1 ms 10.10.10.1
2 * * * Request timed out.
3 * * * Request timed out.
4 * * * Request timed out.
... etc
 

Kubowski

It almost feels like the "gateway of last resort" on the switch is taking priority over the connected route. If that were happening, I could understand those traceroutes, because the firewall (the gateway of last resort) has a static route that sends the traffic back to the switch at 10.0.0.211.

But, that just shouldn't happen. The Route table on the switch is as follows:

Total number of IP routes: 4
Type Codes - B:BGP D:Connected O:OSPF R:RIP S:Static; Cost - Dist/Metric
BGP Codes - i:iBGP e:eBGP
OSPF Codes - i:Inter Area 1:External Type 1 2:External Type 2
STATIC Codes - v:Inter-VRF
      Destination      Gateway       Port    Cost   Type   Uptime
1     0.0.0.0/0        10.0.0.230    ve 1    1/1    S      16d17h
2     10.0.0.0/24      DIRECT        ve 1    0/0    D      16d17h
3     10.10.10.0/24    DIRECT        ve 10   0/0    D      3h10m
4     10.10.50.0/24    DIRECT        ve 50   0/0    D      3h11m

The "cost" of the connected route is zero, which is lower than the static default's, so traffic SHOULD go to the respective VE and thus VLAN.
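
For reference, standard IP forwarding picks the most specific matching prefix first, and only compares distance/metric between routes to the same prefix. So a /24 connected route wins over 0.0.0.0/0 regardless of cost. Below is a minimal sketch of that longest-prefix-match lookup against the four routes shown above. It is a general model written with Python's `ipaddress` module, not a description of how the ICX implements its forwarding table, and the `lookup` helper and next-hop strings are illustrative only.

```python
import ipaddress

# The four routes from the "show ip route" output in this thread.
ROUTES = [
    ("0.0.0.0/0", "10.0.0.230 via ve 1"),
    ("10.0.0.0/24", "DIRECT ve 1"),
    ("10.10.10.0/24", "DIRECT ve 10"),
    ("10.10.50.0/24", "DIRECT ve 50"),
]

def lookup(dest: str) -> str:
    """Return the next hop of the most specific route containing dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins; the default route matches everything but is /0.
    _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return next_hop

print(lookup("10.10.50.150"))  # matches the /24 on ve 50 and the default
print(lookup("10.10.10.150"))  # matches the /24 on ve 10 and the default
print(lookup("8.8.8.8"))       # matches only the default route
```

Under this model the default route cannot shadow either connected /24, which is consistent with the table being correct and the fault lying elsewhere.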
 

Kubowski

Good Question. This is in production right now so I can't test at the moment. But I'll add that to my "things to test" when I have a service window. Additional items I'm currently going to test:

1) Change the native VLAN to something different. Supposedly the native VLAN has some quirks.
2) Re-configure the connection between the switch and the firewall so that there's a completely independent "transit VLAN". Right now the switch, the firewall and most of the flat network share what will eventually become the transit VLAN, 10.0.0.0/24.
 

adman_c

Yeah, my inter-vlan routing works, but I have done both 1) and 2). The only problem I had was having the wrong gateway set on several of my machines with manually-set IPs.
 

Kubowski

-nods- It'll take me some time to schedule a service window, so I won't have an update for a bit. Annoyingly. I would have already done those things if this were a greenfield implementation, but alas. Lessons learned for additional things to change when I'm staging other locations, assuming one of these things is the underlying issue.

Thanks!
 

fohdeesha

Kaini Industries
Nov 20, 2016
2,525
2,698
113
31
fohdeesha.com
what's the result of running "ping 10.10.10.1 source 10.10.50.1" on the switch at the enable level? if it succeeds, the issue is outside the switch. I'd triple check your client configs and ensure they have the correct subnet mask first, then triple check gateway and IP. check their routing table too to ensure they don't have some other old gateway entry left in there. are there any other devices between these clients and the 7250?

as a last sanity check you can try running "clear mac-addr" and "clear arp" on the switch
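
To illustrate why the subnet mask is worth checking first: a client decides between "deliver directly" and "send to the gateway" purely from its own address and mask. The sketch below models that decision with Python's `ipaddress` module. The `first_hop` helper is hypothetical, and the /16 mask is an invented misconfiguration for the sake of the example, not something reported in this thread.

```python
import ipaddress

def first_hop(host_cidr: str, gateway: str, dest: str) -> str:
    """Return the address the host will ARP for when sending to dest."""
    iface = ipaddress.ip_interface(host_cidr)
    if ipaddress.ip_address(dest) in iface.network:
        return dest      # destination looks on-link: ARP for it directly
    return gateway       # destination is off-link: hand the packet to the gateway

# Correct /24 mask: traffic to the other VLAN goes to the VE.
print(first_hop("10.10.10.150/24", "10.10.10.1", "10.10.50.150"))

# Hypothetical wrong /16 mask: the host believes 10.10.50.150 is a neighbour,
# ARPs for it inside VLAN 10, and never involves the switch's router.
print(first_hop("10.10.10.150/16", "10.10.10.1", "10.10.50.150"))
```

With the wrong mask the second call returns the destination itself, so the packet never reaches the gateway even though the gateway and IP settings look right.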
 

Kubowski

Damn.. "ping 10.10.10.1 source 10.10.50.1":

Ping self done.

I checked both device 10.10.50.150 and 10.10.10.150. The gateways and subnet masks are correct, there's no additional / stray gateway entries, both have their respective VE as the gateway.

The two clients are plugged directly into the switch, on two ports designated for the specific VLANs. There's an entire network of devices currently in VLAN1, the native VLAN, all on the 10.0.0.0/24 subnet. The two devices in VLAN10 and VLAN50 can ping everything in VLAN1, but things in VLAN1 can't ping VLAN10 or VLAN50. No ACLs in play at this time.

I did manage to try moving things off the native VLAN, and still no dice.

Could I be missing something more fundamental? My next step at this point is to take a separate switch with a clean configuration, set a few things up, plug devices directly into that and see where that leads me.
 

Kubowski

A traceroute from a device in VLAN1, say 10.0.0.10, trying to reach 10.10.50.150 gives:

Tracing route to 10.10.50.150
over a maximum of 30 hops:

1 1 ms 1 ms 1 ms 10.0.0.211
2 * * * Request timed out.
3 * * * Request timed out.
4 * * * Request timed out.
5 * * * Request timed out.
6 * * * Request timed out.
7 * * * Request timed out.
... etc

Which makes sense, 10.0.0.211 is the gateway for the 10.0.0.0/24 subnet, so it gets to the correct switch VE, and then.. ?
 

Kubowski

I think I need a few drinks.

I've spent hours troubleshooting this today. We have a product called Cisco Umbrella, so I tried uninstalling that on one of my machines.. suddenly things work. But.. it now also works on the machine that still has Cisco Umbrella installed...

I also went and created a new DHCP scope for one of the VLANs during testing so maybe that was the cause? So I created yet another new VLAN, VE, etc.. put one of the machines in that VLAN, did NOT configure a DHCP scope.. and the new VLAN device works fine.

I added DNS servers (there were none on the configs before) for each of my test devices.. so, to see if that somehow was causing the issue, I removed the DNS servers, rebooted the device.. OMG what??? WHAT???

So.. all of my routing issues on Windows devices were because I had no DNS servers set? Dear goodness why?? Ungh..
 

Kubowski

-facepalm- I know. I even have an "It's always DNS" sticker on my laptop.

In seriousness though, I'm still confused why Windows requires a DNS server set for it to respond to pings. To its IP address. Routing only. Bloody Windows.
 

adman_c

I know right? I'll be like "I know it's always DNS, but I know that and I have DNS configured right, so it's not DNS."

Narrator: "It was DNS"