Weird issue


lahatte

New Member
Jul 8, 2020
27
3
3
I've been having this weird issue for 6 months now and I can't figure it out.

The issue is that when a certain A/C unit turns on, or when there is a thunderstorm, the internet goes out for 3 seconds or so and then goes right back to normal. At first I thought it was an electrical issue, so I had an electrician check the entire house, make sure no low-voltage lines were near higher-voltage lines, etc. I also had him run a dedicated power line to the server closet so it wouldn't be on the same main service line as the A/C unit. None of that fixed the issue. Plus, everything is connected to the battery backup, so power shouldn't be an issue, especially when the volt meters are rock solid with negligible fluctuation. Also, the A/C unit is about 50ft from the server closet and about 20ft from my main computer. All the network lines run away from the A/C unit, since my main computer is in between the server closet (upstairs) and the A/C unit (downstairs). The A/C unit is outside on a concrete slab. (The thunderstorm issue is really stumping me, especially when the thunder is way off in the distance and still affects the internet)... but it's only when the switch is involved.

Equipment I have:

- Spectrum 1g internet service using their modem,

- Orbi PRO router with 3 satellites

- Netgear XS748T ProSAFE 48-Port 10-Gigabit Smart Managed Switch

- Luxul 24 port unmanaged switch 1gbe

- Cyberpower PR3000RTXL2UN | Smart App Sinewave UPS

- Cat6a run through entire house and 2 Cat7 lines

- Three 40gbe fiber lines (not connected yet)


Originally I had a Synology router mesh system, so I replaced it with Orbi... problem still persisted. Then I replaced the Orbi with an Orbi PRO, and the problem still persisted. I had Spectrum come out and they monitored the modem and line for a week or so and detected no issues. I've also had the modem replaced twice.

I plug my computer directly into the modem and the problem goes away. I plug the computer directly into the router, problem goes away. I plug the computer into the Netgear, problem re-appears. So could that Netgear switch be sensitive and picking up something? Netgear's tech support was stumped. It's an expensive switch that I purchased brand new back in Dec 2019.

Should I just replace the Netgear switch with a better 10gbe switch, as well as replace the Luxul 24 port with a better switch? Any recommendations? Would be nice to have all 10gbe and better. I recently purchased a SX6018 for my 40gbe lines, and was going to connect it to the Netgear 10gbe switch with 4 ports link aggregated together. But that probably won't solve the issue.

Any input, tips, hints, tricks, etc. would be greatly appreciated. I downloaded Wireshark in hopes of finding something, but I don't know how to use it. In the Netgear logs it shows up as if the cable was unplugged and plugged back in quickly, with messages such as "%NT_LLDP-N-LLDP_MED_TRAP: LLDP-MED: Remote MED End Point Device Disconnected on ifindex xg32." or "%NT_LLDP-N-LLDP_MED_TRAP: LLDP-MED: Remote MED End Point Device Connected on ifindex xg32." but never both "disconnected" and "connected" together... it's only one or the other, which is odd...

I've also tried a few other ports on Netgear switch all with same issue...

really weird
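One way to make sense of the unpaired log entries is to tally them per port. Below is a minimal sketch in Python, assuming the switch log has been saved as plain text and that the relevant lines look like the two messages quoted above. The names `count_link_events` and `SAMPLE_LINES` are made up for illustration and are not part of any Netgear tool.

```python
import re
from collections import Counter

# Pattern derived only from the two messages quoted in the post; other
# firmware versions may format these lines differently (assumption).
LLDP_EVENT = re.compile(
    r"LLDP-MED: Remote MED End Point Device (Connected|Disconnected) "
    r"on ifindex (\S+?)\.?$"
)

# The two log messages quoted in the post, used as sample input.
SAMPLE_LINES = [
    "%NT_LLDP-N-LLDP_MED_TRAP: LLDP-MED: Remote MED End Point Device "
    "Disconnected on ifindex xg32.",
    "%NT_LLDP-N-LLDP_MED_TRAP: LLDP-MED: Remote MED End Point Device "
    "Connected on ifindex xg32.",
]


def count_link_events(lines):
    """Return a Counter keyed by (port, state) for LLDP-MED trap lines."""
    counts = Counter()
    for line in lines:
        match = LLDP_EVENT.search(line.strip())
        if match:
            state, port = match.groups()
            counts[(port, state)] += 1
    return counts
```

A port whose "Disconnected" and "Connected" counts differ a lot would be a candidate for the flaky link, though the log may simply be dropping messages.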
 

klui

Well-Known Member
Feb 3, 2019
834
457
63
This is just a swag. I'm not an electrician.

Try another computer and make sure it's not its network port. Also try another switch if you have spares.

Take a look at your Netgear and router and see if they seem to reset during these events.

Then find some way to obtain line voltage measurements for your Netgear and router during these events. Ensure there are no over- or under-voltages. Your Netgear should have a grounding screw. Connect that to a good ground. If you have a rack, make sure your rack is grounded and connect a ground wire from your Netgear to your rack, ensuring the screw's washer bites through the paint.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,709
517
113
Canada
I suspect a power supply issue in the Netgear switch. What else do you have connected to it? Remove everything and just try your computer on it, then add your other stuff one port at a time and watch for when the issue re-appears.
 

pod

New Member
Mar 31, 2020
15
7
3
Assuming you're on a cable modem, check your signal/noise ratio as best you can. I had this happen at the end of a Comcast cable run. The signal was just enough in the best of times, but too weak in the presence of high noise.
 

madbrain

Active Member
Jan 5, 2019
212
44
28
Are you losing Internet, or are you actually losing LAN connectivity?

Sorry to say, but Orbi is the root of all evil. I had the original Orbi. Had problems to no end with a large number of devices, especially Chromecast Audio. There have been bugs for years that Netgear just won't fix.

I went to a Unifi USG wired router last year, and 4 x Unifi NanoHD access points. Two of the APs are running wired, two as bridged wirelessly.
This is definitely more complex to set up than Orbi. But it also works damn reliably. The NanoHD are only AC access points, not AX, but I still don't have a single AX device, and I'm not sure when we will have one. Orbi (even the first iteration) is faster than the NanoHD by about 20-25%, but speed is useless when devices don't work all the time, which they definitely didn't. Netgear kept introducing regressions in the firmware for its Orbi router and APs. And there was no way to opt out of firmware updates. Stay the heck away from that stuff.

When it comes to Ethernet switches, though, Netgear is generally pretty good, in my experience. It's of course possible you have a defective unit, or a bad port.

If you really want to narrow things down, I suggest running a program called Smokeping, and have it ping a bunch of your home devices, as well as Internet sites, or just the first few hops that show up in a traceroute. I run Smokeping 24/7 on a single-board computer, an Odroid XU4. I also run my Unifi controller software on it. You can just use a PC to run Smokeping of course, but it will cost you much more in electricity to keep it up 24/7. That's why I use a single-board computer. I would suggest a Raspberry Pi 4 for its better community support if you are new to single-board computers. The Odroid XU4 is an old box. If you are going to run Smokeping, run it on a computer that is wired, preferably, not on WiFi. With Orbi WiFi, you never know if a client connects to the main router or a satellite.

I have configured Smokeping to send 5 ping packets every 60 seconds to every host, rather than the default 300 seconds. This allows me to pick up when Comcast drops, since a cable modem resync takes about a minute. I live on a hill at the very end of the cable line and Comcast is frequently called to fix their shitty network ...

In your case, you could connect the machine running Smokeping to your Netgear switch. And then directly on to the router. And you can compare the resulting data. It's possible only a specific port on your Netgear switch is bad, also. Hope this helps.
 

madbrain

Active Member
Jan 5, 2019
212
44
28
Also if you manage to configure sendmail or postfix (it's a PITA with SMTP authentication setup), you can get smokeping to send you emails when certain hosts go down. So you don't have to query the data. I have set rules for Smokeping to send me email alerts if the second, third and fourth hop from my cable modem lose more than 50% of ping packets for 3 minutes.

As it turns out, I just got a short outage around 2am. Noticed it, too, as I was awake running network tests. I got an email a bit later that looks like this :

Fri Jul 17 02:05:08 2020

Alert "hostdown" is active for http://odroid/cgi-bin/smokeping.cgi?target=Internet.ThirdHop

Pattern
-------
>50%,>50%,>50%

Data (old --> now)
------------------
loss: 100%, 100%, 100%
rtt: U, U, U

Comment
-------
Massive loss for one minute

Actually, 3 emails, one for each hop. Not too bad, as this was the first such email this month. Had to bug Comcast to come out, and for a credit, last month. A 5-day prorated bill discount doesn't begin to cover the headache of intermittent internet.
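For anyone who wants to reason about that alert rule outside of Smokeping, the ">50%,>50%,>50%" pattern boils down to "the last three loss samples are each above 50%". Here is a minimal Python sketch of that idea. It is a simplified stand-in, not Smokeping's actual matcher, and the function name `matches_loss_pattern` and its parameters are made up for illustration.

```python
def matches_loss_pattern(loss_history, threshold=50.0, consecutive=3):
    """Return True if the newest `consecutive` samples all exceed `threshold`.

    loss_history is a list of packet-loss percentages, oldest first,
    one entry per polling step (60 seconds in the setup described above).
    """
    if len(loss_history) < consecutive:
        # Not enough samples yet to evaluate the pattern.
        return False
    recent = loss_history[-consecutive:]
    return all(loss > threshold for loss in recent)
```

With a 60-second step, three consecutive samples correspond to the "3 minutes" in the rule, so a drop that lasts only a few seconds would not trigger it.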
 

lahatte

New Member
Jul 8, 2020
27
3
3
Thanks madbrain... I was using the Synology mesh setup with all satellites wired with Cat6a and was having the same issue. I switched to Orbi with all satellites wired and still had the same issue. And same with the Orbi PRO.

I narrowed it to the switch for sure. I plugged my computer into the Luxul 24 port unmanaged switch and had the same issue, but not as frequently. I noticed the internet line ran into the Netgear first and the Luxul switch connects to the Netgear. So I disconnected the internet line from the Netgear, plugged it directly into the Luxul, plugged my computer directly into the Luxul, and the problem has gone away. So that Netgear switch seems to be the problem, which sucks considering how much I paid for that dang thing new.

I'm now trying to find an Arista DCS-7050T-52 switch to replace the Netgear with. I actually would like to get 2 Arista switches to replace Netgear and Luxul switch... I'll try to get Netgear to replace or fix this switch, but I don't have a lot of faith in it now.. lol
 

madbrain

Active Member
Jan 5, 2019
212
44
28
Have you tried using different ports and/or cables to connect the two switches together? It could be that you have one bad port, but the way you describe your testing, it's not really possible to tell for sure which switch has a bad port.

If you really think you have narrowed it down to the Netgear switch, the least they could do is offer a warranty replacement. If you are lucky, you just have a defective unit.
 

lahatte

New Member
Jul 8, 2020
27
3
3
@madbrain I did try different ports with my main computer. I left the same ports connected for the two switches connecting to each other. But you bring up a good point, because I didn't test different ports for the internet cat6 though...

Netgear hasn't responded to the ticket yet, hopefully they won't be too slow... lol
 

madbrain

Active Member
Jan 5, 2019
212
44
28
If you left the same ports in use between the two switches, it could be just that port that's bad on either switch. Or it could be the one cable between the two switches. Since it sounds like the two switches are nearby, it should be simple enough to
a) take off the old cable, and use a new known good cable
b) connect that new cable on a different port on the Luxul
c) connect that new cable on a different port on the Netgear
Then just turn your A/C on/off and look for the problem, since it sounds like it's reproducible.
You can watch for it with either Smokeping, or just keep the "ping" command running on a system in a terminal under Linux, or "ping -t" under Windows. You can ping things that are connected to either switch, or to the router. You should be able to figure that out.
It doesn't sound like the Internet has anything to do with your problem, so just ping targets within your LAN, not WAN targets.
In fact you could just disconnect the modem to your ISP from your router altogether, assuming the router will still provide a local DNS server in this case. If not, just use IP addresses in your ping commands rather than hostnames.
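If you would rather have timestamps than stare at a terminal, the "keep ping running" idea can be sketched in Python. This is only a rough sketch, assuming the system `ping` command is available and behaves like the Linux (iputils) or Windows versions. The function names are made up for illustration, and treating exit code 0 as "reachable" is a simplification you should verify on your own system.

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_command(host, timeout_s=1):
    """Build a one-shot ping command for Windows or Linux."""
    if platform.system() == "Windows":
        # -n is the packet count, -w is the timeout in milliseconds
        return ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    # Linux iputils: -c is the packet count, -W is the timeout in seconds
    return ["ping", "-c", "1", "-W", str(timeout_s), host]


def is_reachable(host, timeout_s=1):
    """Send one ping; exit code 0 is treated as a reply (simplification)."""
    result = subprocess.run(
        ping_command(host, timeout_s),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_transitions(samples, initially_up=True):
    """Given (timestamp, reachable) pairs, return only the state changes."""
    transitions = []
    previous = initially_up
    for stamp, reachable in samples:
        if reachable != previous:
            transitions.append((stamp, "UP" if reachable else "DOWN"))
            previous = reachable
    return transitions


def monitor(hosts, interval_s=1.0):
    """Ping each host forever and print a line whenever its state flips."""
    up = {host: True for host in hosts}
    while True:
        for host in hosts:
            now = datetime.now().isoformat(timespec="seconds")
            sample = [(now, is_reachable(host))]
            for stamp, state in find_transitions(sample, up[host]):
                print(f"{stamp} {host} {state}")
                up[host] = (state == "UP")
        time.sleep(interval_s)
```

Calling `monitor()` with the IP addresses of the router and of one device on each switch, then cycling the A/C, should show which paths drop and when. Stop it with Ctrl+C.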
 

lahatte

New Member
Jul 8, 2020
27
3
3
Update... I replaced the Netgear switch with an Arista DCS-7050T-64 and have the same issue. Is there software that can monitor all ports and log issues? Or will I literally have to connect 1 port at a time and wait to find the line that's the culprit? (That will take a long time with 40+ ports :( )
 

Mwilliamson

New Member
Aug 15, 2020
19
7
3
Fenton, Michigan, USA
@lahatte since your outage is so short, is it possible that what you're seeing is a Spanning Tree issue? Spanning Tree is used by switches to exchange topology information with other switches so they can avoid forwarding loops, and it can cause a few headaches when not configured properly: a port that sees a link change may briefly stop forwarding while the protocol re-checks the topology. Since your Arista is a managed switch, take a look at how you have your client-facing interfaces configured and add the command "spanning-tree portfast" to them.
 

lahatte

New Member
Jul 8, 2020
27
3
3
*UPDATE* The intermittent dropping was due to using the metal Trendnet Cat6A Keystone Jacks. I replaced those with the plastic keystone jacks from Monoprice and it fixed the issue. Hope this helps someone in the future.
 