Decreased performance when adding switch

Cape

Member
Oct 28, 2015
36
6
8
Hey,
I'm rebuilding the lab, and one part will be going from ESXi to Proxmox. Before moving my pfSense setup, I wanted to establish a baseline for what performance I get today. Current setup is ETTH -> pfSense on ESXi -> internal network. I have a 1G/1G uplink.
Doing a speed test (from a client on the wired LAN) on this yielded ~880 Mbps down/930 Mbps up. Decent real-world performance.

The new setup will have multiple nodes with WAN access, so the next step is moving the ETTH uplink to my switch on a VLAN segment and cabling the ESXi port that previously held the WAN cable into that same VLAN. New speed test. And now I got 450 Mbps down. Wat.

The switch is a HP 1810G, and checking the data sheet it should be able to do 48 Gbps switching, so it doesn't seem like I should be hitting some cap there.
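
For what it's worth, the 48 Gbps figure is consistent with the usual way switching capacity is quoted: number of ports times port speed, counted in both directions. A minimal sketch of that arithmetic, assuming a 24-port gigabit model (the port count is an assumption, it is not stated in the post) and a hypothetical helper name:

```python
def switching_capacity_gbps(ports: int, port_speed_gbps: float = 1.0) -> float:
    """Aggregate capacity if every port sends and receives at line rate."""
    # Multiply by 2 because full duplex counts both directions of each port.
    return ports * port_speed_gbps * 2


# Assumption: 24 gigabit ports; the post only gives the 48 Gbps total.
capacity = switching_capacity_gbps(24)

# A saturated 1G/1G uplink is 1 Gbps in each direction.
wan_load_gbps = 2 * 1.0
print(capacity)                  # 48.0
print(wan_load_gbps / capacity)  # fraction of the quoted capacity in use
```

Under that assumption a single 1G/1G uplink is a small fraction of the quoted capacity, which supports the reasoning that the backplane itself is unlikely to be the cap.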

Any ideas? While I basically only use the full bandwidth a few times per month or so when downloading something from Steam, I don't really like the idea of dumping half the performance.
 

itronin

Well-Known Member
Nov 24, 2018
1,233
793
113
Denver, Colorado
Your performance number looks suspiciously like a switch port that either auto-negotiated poorly or was set to half-duplex.
Since your direct wired connection performance was good, suspect the switch ports first.
Check the current status of the involved switch ports and check their configuration to make sure they are set to auto-negotiate duplex (ideally), or configure the duplex to FULL.

If the switch ports look fine: I have occasionally seen a port set to auto while the interface card on the other end negotiated to half, so you'll have to check the ETTH device's interface status and settings as well as the physical ESXi interface mapped to pfSense.
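
As a rough illustration of the host-side check, here is a minimal sketch that reads the negotiated speed and duplex of an interface. It assumes a Linux machine (for example the wired test client), where the kernel exposes these values under `/sys/class/net/<iface>/`; it does not apply to ESXi or pfSense, which have their own tools. The function names and the interface name `eth0` are hypothetical.

```python
from pathlib import Path


def link_status(iface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Return negotiated speed (Mb/s) and duplex for a Linux interface.

    Reading these files can fail or return -1 / "unknown" when the link
    is down, so callers should expect OSError in that case.
    """
    base = Path(sysfs_root) / iface
    speed = int((base / "speed").read_text().strip())
    duplex = (base / "duplex").read_text().strip()
    return {"iface": iface, "speed_mbps": speed, "duplex": duplex}


def looks_misnegotiated(status: dict, expected_mbps: int = 1000) -> bool:
    """Flag a link that is not full duplex or is slower than expected."""
    return status["duplex"] != "full" or status["speed_mbps"] < expected_mbps


# Example call on a Linux host (interface name is an assumption):
#   print(looks_misnegotiated(link_status("eth0")))
```

The `sysfs_root` parameter exists only so the function can be pointed at a test directory; on a real host the default is what you want.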

itr
 

Cape

That is a very good point! Not sure why it didn't cross my mind... However, the switch reports both ports as full duplex and 1G autoneg, so at least the switch believes we're doing full speed. There isn't really an "ETTH device", I just get an ethernet port by my apartment door :) (Well, there's a big switch somewhere in the basement, but nothing I have access to).
pfSense also says full duplex.

EDIT: Doing a new speed test, I now get 530 Mbps/680 Mbps. So while it sounded plausible with the initial numbers, it doesn't seem to be it. Thanks for the idea, though!
 

itronin

> pfSense also says full duplex.
> EDIT: Doing a new speed test, I now get 530 Mbps/680 Mbps. So while it sounded plausible with the initial numbers, it doesn't seem to be it. Thanks for the idea, though!

Your VLAN is untagged at both ends or just on the carrier connection?
Is your VLAN tagged coming into ESXi or untagged?

Re. pfSense says full duplex: is that the interface from pfSense's view or ESXi's view? What does ESXi say about the actual interface?

If you have a small gig "dumb switch" you might try that instead of using a VLAN just to test and see what the perf numbers look like... That will also force a disconnect and the various switch ports to auto negotiate.

ETTH -> dumb switch <- ESXi phys interface <- pfSense virt interface

FWIW, the times I have seen a carrier switch auto-negotiate poorly, the carrier was using Bay Networks or older Cisco Catalyst gear (35xx, 37xx) on metro Ethernet. I doubt there is a Bay Networks switch in your basement...
 

Evan

Well-Known Member
Jan 6, 2016
3,346
598
113
The switch isn't being forced to do some layer 3 type work instead of a basic layer 2 config, is it?
 

Cape

> Your VLAN is untagged at both ends or just on the carrier connection?
> Is your VLAN tagged coming into ESXi or untagged?
It is untagged coming into ESXi.

> Re. pfSense says full duplex: is that the interface from pfSense's view or ESXi's view? What does ESXi say about the actual interface?
Both ESXi and pfSense, sorry for being unclear.

> If you have a small gig "dumb switch" you might try that instead of using a VLAN just to test and see what the perf numbers look like... That will also force a disconnect and the various switch ports to auto negotiate.
> I doubt there is a Bay networks switch in your basement...
No idea what is used, but the ISP's CTO moved in here a few years ago (some years after we signed up with them, so no funny business), so I don't think the equipment is "known bad" at least.
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,140
594
113
New York City
www.glaver.org
> The switch is a HP 1810G, and checking the data sheet it should be able to do 48 Gbps switching, so it doesn't seem like I should be hitting some cap there.
Switch / router performance numbers are generally calculated "downhill, with a tail wind" - in other words, optimally sized packets. Normally, "optimally" means 9K jumbos. If one side of the switch is connected to your ISP, you don't have control over what they're willing to send / accept. What is the MTU setting on your host (hypervisor and application)?
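
To put rough numbers on the packet-size point, here is a minimal sketch of the best-case TCP goodput on a 1 Gbit/s link for a given MTU. It assumes untagged Ethernet frames, IPv4, and TCP without options; the constant and function names are made up for illustration.

```python
# Per-frame bytes on the wire that carry no IP data:
# preamble + SFD (8) + Ethernet header (14) + FCS (4) + inter-frame gap (12).
ETH_OVERHEAD = 38
# IPv4 header (20) + TCP header (20), assuming no options.
IP_TCP_HEADERS = 40


def max_tcp_goodput_mbps(mtu: int, line_rate_mbps: float = 1000.0) -> float:
    """Best-case TCP payload rate for full-size frames at the given MTU."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETH_OVERHEAD
    return line_rate_mbps * payload / on_wire


for mtu in (1500, 9000):
    print(mtu, round(max_tcp_goodput_mbps(mtu), 1))
# 1500 949.3
# 9000 991.4
```

By this estimate the ~930 Mbps measured without the switch is already close to the ceiling for a 1500-byte MTU, and the gap between standard and jumbo frames is only a few percent, so framing overhead on its own would not account for a drop to 450 Mbps.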

I'm not familiar with HP switches (and from what I've heard, they're OEM'd from several suppliers), but on Cisco switches you can monitor the CPU load. That figure can be misleading, since a lot of the processing happens in hardware: on the ancient 2900XL switches, 80 to 90 percent of the CPU was used to flash the pretty LEDs on the front panel, but because switching happened in hardware, that didn't matter. Some things tend to punt packets to the processor, though, so it can be useful to check, even on the latest switches.
> Any ideas? While I basically only use the full bandwidth a few times per month or so when downloading something from Steam, I don't really like the idea of dumping half the performance.
Since you reported 3 very different sets of numbers, make sure it isn't actually something outside your LAN - try the same configuration at multiple times of the day / night and see how much things change. I run a 40Gbit/sec speedtest.net instance and looking at the reports sorted by client IP address, there's quite a bit of variation, presumably all from their ISP since I have 40GbE links to multiple upstreams and several peering points. Some of the differences may be several customers with different connection speeds behind a carrier grade NAT, but I've seen the differences with providers who I know don't use CGN.
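
One way to keep track of repeated runs is to log each result with the configuration it was taken in and compare the spread per configuration. A minimal sketch, using only the download figures reported earlier in this thread; the labels "direct" and "via switch" and the function name are assumptions for illustration:

```python
from collections import defaultdict
from statistics import mean


def summarize(runs):
    """Group (config, mbps) samples by config and report n, min, max, mean."""
    grouped = defaultdict(list)
    for config, mbps in runs:
        grouped[config].append(mbps)
    return {
        config: {
            "n": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for config, values in grouped.items()
    }


# Download results from the thread, in Mb/s.
download_runs = [("direct", 880), ("via switch", 450), ("via switch", 530)]

summary = summarize(download_runs)
for config, stats in summary.items():
    print(config, stats)
```

With only one or two samples per configuration the comparison says very little; adding runs from different times of day to the same list would show whether the spread within one configuration is as large as the difference between configurations.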