Troubleshooting some extremely strange behavior. Can re-create this by turning on my Swarm.
Came home from being out of town for work only to find that nothing on my network was working. As I dug into it, it's become clear that every few minutes my switch drops every packet for about 30 seconds. This has resulted in my VMs being corrupted.
This is just me doing a ping to the management IP of the USW from a PC directly connected to the switch and on the same subnet.
/var/log/messages from the switch
It's so bad that it's causing my LAG to go up and down on FreeNAS.
Code:
Reply from 192.168.10.30: bytes=32 time=3ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=4ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=10ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=5ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=2ms TTL=64
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.10.233: Destination host unreachable.
Request timed out.
Request timed out.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.10.30: bytes=32 time=2006ms TTL=64
Reply from 192.168.10.30: bytes=32 time=2ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=23ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=29ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=26ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=11ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=5ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=17ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=6ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
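For anyone who wants to measure the dropouts instead of eyeballing the scrollback, here is a rough Python sketch that finds runs of failed pings in saved ping output. It assumes the Windows ping format shown above, where a successful reply contains `TTL=`; the function name `find_outages` is made up for this example.

```python
def find_outages(lines):
    """Return (start_index, lost_count) for each run of consecutive failed pings.

    Assumes Windows-style ping output: a successful reply contains "TTL=",
    while "Request timed out." and "Destination host unreachable." do not.
    """
    lines = [line for line in lines if line.strip()]  # ignore blank lines
    outages = []
    run_start = None
    for i, line in enumerate(lines):
        if "TTL=" not in line:
            # Failed ping: open a new run if one is not already open.
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # First good reply after a run of failures closes the run.
            outages.append((run_start, i - run_start))
            run_start = None
    if run_start is not None:
        # Output ended while pings were still failing.
        outages.append((run_start, len(lines) - run_start))
    return outages


sample = [
    "Reply from 192.168.10.30: bytes=32 time=1ms TTL=64",
    "Reply from 192.168.10.30: bytes=32 time=2ms TTL=64",
    "Request timed out.",
    "Reply from 192.168.10.1: Destination host unreachable.",
    "Request timed out.",
    "Reply from 192.168.10.30: bytes=32 time=2006ms TTL=64",
]
print(find_outages(sample))
```

The count is in lost pings, not seconds, since timeouts and "unreachable" replies do not come back at the same rate.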
Code:
Device Main Switch: Connected
BusyBox v1.19.4 (2017-04-13 15:57:14 PDT) built-in shell (ash)
Enter 'help' for a list of built-in commands.
US.v3.7.55# tail -f /var/log/messages
May 11 01:05:02 MainSwitch user.info syslog: libubnt_webrtc.get_next_sdp_offer(): SDP request: uuid<d66f5aa0-b387-4dd6-97ba-1766520dcecd>, stun<>, turn<>, username<>
May 11 01:05:02 MainSwitch user.info syslog: libubnt_webrtc.get_next_sdp_offer(): turn_query<-1>, reg_url<>, creds_url<>
May 11 01:05:02 MainSwitch user.info syslog: utermd.session_request_handler(): Create new session node, channel name:"d66f5aa0-b387-4dd6-97ba-1766520dcecd", stun:"", turn:""
May 11 01:05:06 MainSwitch user.info syslog: utermd.house_keeper(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" offer start creating
May 11 01:05:06 MainSwitch user.info syslog: utermd.house_keeper(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" (1,1) offer sent, now waiting for answer
May 11 01:05:06 MainSwitch user.info syslog: utermd.house_keeper(): === v=0 o=- 8406355376 21431512 IN IP4 127.0.0.1 s=EvoStream_WebRTC t=0 0 a=tool:ubnt_webrtc version 496d505-develop a=disable-sctp-checksum a=msid-semantic: WMS m=application 1 DTLS/SCTP 5000 c=IN IP4 0.0.0.0 a=ice-ufrag:TkT212TUt9f0mjfs a=ice-pwd:2/UfE9vSwRGjGexTUzS38vdm a=fingerprint:sha-256 1E:F6:8E:72:F5:4E:9A:65:93:D3:4D:D0:E6:3A:B7:7F:86:DE:26:D0:24:B2:F1:FD:40:D0:48:5B:27:17:C6:E9 a=setup:actpass a=mid:data a=sctpmap:5000 webrtc-datachannel 1024 a=
May 11 01:05:08 MainSwitch user.info syslog: utermd.session_request_handler(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" recving answer
May 11 01:05:08 MainSwitch user.info syslog: libubnt_webrtc.set_sdp_answer(): SDP answer: uuid<d66f5aa0-b387-4dd6-97ba-1766520dcecd>, answer<v=0^M o=- 3883655987703657141 2 IN IP4 127.0.0.1^M s=-^M t=0 0^M a=msid-semantic: WMS^M m=application 9 DTLS/SCTP 5000^M c=IN IP4 0.0.0.0^M b=AS:30^M a=ice-ufrag:qte0^M a=ice-pwd:nj5zlVenNVG9/GzCd+RbePA7^M a=fingerprint:sha-256 94:FF:FF:92:9B:00:68:E3:F3:03:37:E6:6A:83:11:5B:17:EB:3F:C8:05:A9:7C:E2:5B:7A:0E:E2:CE:36:4F:5C^M a=setup:active^M a=mid:data^M a=sctpmap:5000 webrtc-datachannel 1024^M a=ca
May 11 01:05:08 MainSwitch user.info syslog: libubnt_webrtc.set_sdp_answer(): SDP success: uuid<d66f5aa0-b387-4dd6-97ba-1766520dcecd>
May 11 01:05:08 MainSwitch user.info syslog: utermd.house_keeper(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" answer parsed successfully, going to build up connection
May 11 01:13:51 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:13:51 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 40 seconds ago), rc=4
May 11 01:14:12 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:14:12 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #2 (last inform: 61 seconds ago), rc=4
May 11 01:14:32 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:14:32 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #3 (last inform: 82 seconds ago), rc=4
May 11 01:14:32 MainSwitch user.err syslog: ace_reporter.reporter_fail(): [STATE] entering SELFRUN!!!!
May 11 01:14:51 MainSwitch user.info syslog: ace_reporter.reporter_set_managed(): [STATE] enter MANAGED
May 11 01:15:54 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:15:54 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 59 seconds ago), rc=4
May 11 01:16:14 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:16:14 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #2 (last inform: 80 seconds ago), rc=4
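To line the inform failures up against the ping dropouts, something like the following Python sketch could pull the timestamps out of a saved copy of the switch log. It only assumes the "inform failed #N (last inform: X seconds ago)" wording seen in the log above; `inform_failures` is a hypothetical helper name, not anything from the UniFi tooling.

```python
import re

# Matches the syslog timestamp plus the "inform failed" counter and age,
# based only on the message format visible in the log above.
INFORM_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*"
    r"inform failed #(\d+) \(last inform: (\d+) seconds ago\)"
)


def inform_failures(log_lines):
    """Return (timestamp, failure_number, seconds_since_last_inform) tuples."""
    results = []
    for line in log_lines:
        match = INFORM_RE.search(line)
        if match:
            results.append(
                (match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return results


sample_log = [
    "May 11 01:13:51 MainSwitch user.err syslog: ace_reporter.reporter_fail(): "
    "Timeout (http://192.168.10.216:8080/inform)",
    "May 11 01:13:51 MainSwitch user.err syslog: ace_reporter.reporter_fail(): "
    "inform failed #1 (last inform: 40 seconds ago)",
]
for entry in inform_failures(sample_log):
    print(entry)
```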
Code:
freenas.davis.local kernel log messages:
> igb1: link state changed to DOWN
> igb0: link state changed to DOWN
> lagg0: link state changed to DOWN
> igb0: link state changed to UP
> lagg0: link state changed to UP
> igb1: link state changed to UP
> igb0: link state changed to DOWN
> igb1: link state changed to DOWN
> lagg0: link state changed to DOWN
> igb1: link state changed to UP
> lagg0: link state changed to UP
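If it helps to quantify the flapping, a small Python sketch like this could count how often each interface went down. It assumes the "link state changed to UP/DOWN" wording from the FreeNAS kernel messages above; `count_link_downs` is just an illustrative name.

```python
import re
from collections import Counter

# Interface name followed by the link state message seen in the FreeNAS log.
LINK_RE = re.compile(r"(\w+): link state changed to (UP|DOWN)")


def count_link_downs(log_text):
    """Count DOWN transitions per interface in kernel log text."""
    downs = Counter()
    for match in LINK_RE.finditer(log_text):
        iface, state = match.groups()
        if state == "DOWN":
            downs[iface] += 1
    return dict(downs)


sample_kernel_log = """\
> igb1: link state changed to DOWN
> igb0: link state changed to DOWN
> lagg0: link state changed to DOWN
> igb0: link state changed to UP
> lagg0: link state changed to UP
> igb1: link state changed to UP
"""
print(count_link_downs(sample_kernel_log))
```

Seeing both igb members and lagg0 drop together, as in the messages above, points at the switch side rather than a single bad cable or NIC, though that is an inference and not something the log proves.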