Docker Swarm + Unifi Switch = Massive Packet Loss?

nitrobass24 · May 11, 2017

Troubleshooting some extremely strange behavior. Can re-create this by turning on my Swarm.

Came home from being out of town for work only to find that nothing on my network was working. As I dug into it, it became clear that every several minutes my switch drops every packet for about 30 seconds. This has resulted in my VMs being corrupted.

This is just me doing a ping to the management IP of the USW from a PC directly connected to the switch and on the same subnet.

Code:

Reply from 192.168.10.30: bytes=32 time=3ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=4ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=10ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=5ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=2ms TTL=64
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.10.233: Destination host unreachable.
Request timed out.
Request timed out.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.10.30: bytes=32 time=2006ms TTL=64
Reply from 192.168.10.30: bytes=32 time=2ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=23ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=29ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=26ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=11ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=5ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=17ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=6ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
Reply from 192.168.10.30: bytes=32 time=1ms TTL=64
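A capture like the one above is easier to compare across test runs if the failed probes are grouped into runs. The following is a minimal sketch, assuming Windows `ping` output in the format shown; `find_outages` and `REPLY_RE` are hypothetical names introduced here, not part of any tool used in this thread. It counts a probe as good only when the reply comes from the pinged address, so the "Destination host unreachable" replies from 192.168.10.1 and 192.168.10.233 count as failures.

```python
import re

# Matches a successful Windows ping reply, e.g.
# "Reply from 192.168.10.30: bytes=32 time=1ms TTL=64" (also "time<1ms").
REPLY_RE = re.compile(r"^Reply from (\S+): bytes=\d+ time[=<](\d+)ms TTL=\d+$")


def find_outages(lines, target):
    """Return (first_probe_index, probe_count) for each run of failed probes.

    A probe is successful only if the echo reply came from `target`.
    Timeouts and "Destination host unreachable" replies sent by other
    hosts are treated as failures.
    """
    probes = [ln.strip() for ln in lines if ln.strip()]
    outages = []
    run_start = None
    for i, line in enumerate(probes):
        m = REPLY_RE.match(line)
        ok = m is not None and m.group(1) == target
        if not ok and run_start is None:
            run_start = i  # first failed probe of a new run
        elif ok and run_start is not None:
            outages.append((run_start, i - run_start))
            run_start = None
    if run_start is not None:  # capture ended while still failing
        outages.append((run_start, len(probes) - run_start))
    return outages


# Shortened sample in the same format as the capture above.
sample = """\
Reply from 192.168.10.30: bytes=32 time=2ms TTL=64
Request timed out.
Request timed out.
Reply from 192.168.10.233: Destination host unreachable.
Reply from 192.168.10.1: Destination host unreachable.
Reply from 192.168.10.30: bytes=32 time=2006ms TTL=64
Reply from 192.168.10.30: bytes=32 time=2ms TTL=64
""".splitlines()

outages = find_outages(sample, "192.168.10.30")
```

In the full capture above, the single outage spans 17 consecutive failed probes.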

/var/log/messages from the switch

Code:

Device Main Switch: Connected


BusyBox v1.19.4 (2017-04-13 15:57:14 PDT) built-in shell (ash)
Enter 'help' for a list of built-in commands.


US.v3.7.55# tail -f /var/log/messages
May 11 01:05:02 MainSwitch user.info syslog: libubnt_webrtc.get_next_sdp_offer(): SDP request: uuid<d66f5aa0-b387-4dd6-97ba-1766520dcecd>, stun<>, turn<>, username<>
May 11 01:05:02 MainSwitch user.info syslog: libubnt_webrtc.get_next_sdp_offer():              turn_query<-1>, reg_url<>, creds_url<>
May 11 01:05:02 MainSwitch user.info syslog: utermd.session_request_handler(): Create new session node, channel name:"d66f5aa0-b387-4dd6-97ba-1766520dcecd", stun:"", turn:""
May 11 01:05:06 MainSwitch user.info syslog: utermd.house_keeper(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" offer start creating
May 11 01:05:06 MainSwitch user.info syslog: utermd.house_keeper(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" (1,1) offer sent, now waiting for answer
May 11 01:05:06 MainSwitch user.info syslog: utermd.house_keeper(): === v=0 o=- 8406355376 21431512 IN IP4 127.0.0.1 s=EvoStream_WebRTC t=0 0 a=tool:ubnt_webrtc version 496d505-develop a=disable-sctp-checksum a=msid-semantic: WMS m=application 1 DTLS/SCTP 5000 c=IN IP4 0.0.0.0 a=ice-ufrag:TkT212TUt9f0mjfs a=ice-pwd:2/UfE9vSwRGjGexTUzS38vdm a=fingerprint:sha-256 1E:F6:8E:72:F5:4E:9A:65:93:D3:4D:D0:E6:3A:B7:7F:86:DE:26:D0:24:B2:F1:FD:40:D0:48:5B:27:17:C6:E9 a=setup:actpass a=mid:data a=sctpmap:5000 webrtc-datachannel 1024 a=
May 11 01:05:08 MainSwitch user.info syslog: utermd.session_request_handler(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" recving answer
May 11 01:05:08 MainSwitch user.info syslog: libubnt_webrtc.set_sdp_answer(): SDP answer: uuid<d66f5aa0-b387-4dd6-97ba-1766520dcecd>, answer<v=0^M o=- 3883655987703657141 2 IN IP4 127.0.0.1^M s=-^M t=0 0^M a=msid-semantic: WMS^M m=application 9 DTLS/SCTP 5000^M c=IN IP4 0.0.0.0^M b=AS:30^M a=ice-ufrag:qte0^M a=ice-pwd:nj5zlVenNVG9/GzCd+RbePA7^M a=fingerprint:sha-256 94:FF:FF:92:9B:00:68:E3:F3:03:37:E6:6A:83:11:5B:17:EB:3F:C8:05:A9:7C:E2:5B:7A:0E:E2:CE:36:4F:5C^M a=setup:active^M a=mid:data^M a=sctpmap:5000 webrtc-datachannel 1024^M a=ca
May 11 01:05:08 MainSwitch user.info syslog: libubnt_webrtc.set_sdp_answer(): SDP success: uuid<d66f5aa0-b387-4dd6-97ba-1766520dcecd>
May 11 01:05:08 MainSwitch user.info syslog: utermd.house_keeper(): Session "d66f5aa0-b387-4dd6-97ba-1766520dcecd" answer parsed successfully, going to build up connection
May 11 01:13:51 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:13:51 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 40 seconds ago), rc=4
May 11 01:14:12 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:14:12 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #2 (last inform: 61 seconds ago), rc=4
May 11 01:14:32 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:14:32 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #3 (last inform: 82 seconds ago), rc=4
May 11 01:14:32 MainSwitch user.err syslog: ace_reporter.reporter_fail(): [STATE] entering SELFRUN!!!!
May 11 01:14:51 MainSwitch user.info syslog: ace_reporter.reporter_set_managed(): [STATE] enter MANAGED
May 11 01:15:54 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:15:54 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #1 (last inform: 59 seconds ago), rc=4
May 11 01:16:14 MainSwitch user.err syslog: ace_reporter.reporter_fail(): Timeout (http://192.168.10.216:8080/inform)
May 11 01:16:14 MainSwitch user.err syslog: ace_reporter.reporter_fail(): inform failed #2 (last inform: 80 seconds ago), rc=4
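The "inform failed #N" entries in the switch log restart at #1 each time the switch loses the controller, so they can be grouped into episodes and lined up against the ping outages. Below is a rough sketch, assuming each log entry sits on one line (the terminal wrapping undone) and that the year is 2017, since syslog timestamps carry no year. `inform_failure_episodes` and `FAIL_RE` are hypothetical names made up for this example.

```python
import re
from datetime import datetime

# Matches e.g. "May 11 01:13:51 MainSwitch user.err syslog:
# ace_reporter.reporter_fail(): inform failed #1 (last inform: ...)"
FAIL_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ \S+ syslog: "
    r"ace_reporter\.reporter_fail\(\): inform failed #(\d+)"
)


def inform_failure_episodes(lines, year=2017):
    """Group consecutive inform failures into (first_ts, last_ts, count).

    A new episode starts whenever the failure counter is back at #1.
    `year` is an assumption: the log lines themselves do not include it.
    """
    episodes = []
    for line in lines:
        m = FAIL_RE.match(line)
        if m is None:
            continue  # not an inform-failure line
        ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        count = int(m.group(2))
        if count == 1 or not episodes:
            episodes.append([ts, ts, count])
        else:
            episodes[-1][1] = ts
            episodes[-1][2] = count
    return [tuple(e) for e in episodes]


prefix = "MainSwitch user.err syslog: ace_reporter.reporter_fail(): "
log = [
    "May 11 01:13:51 " + prefix + "inform failed #1 (last inform: 40 seconds ago), rc=4",
    "May 11 01:14:12 " + prefix + "inform failed #2 (last inform: 61 seconds ago), rc=4",
    "May 11 01:14:32 " + prefix + "inform failed #3 (last inform: 82 seconds ago), rc=4",
    "May 11 01:14:51 MainSwitch user.info syslog: ace_reporter.reporter_set_managed(): [STATE] enter MANAGED",
    "May 11 01:15:54 " + prefix + "inform failed #1 (last inform: 59 seconds ago), rc=4",
    "May 11 01:16:14 " + prefix + "inform failed #2 (last inform: 80 seconds ago), rc=4",
]

episodes = inform_failure_episodes(log)
```

On the excerpt above this yields two episodes: three failures between 01:13:51 and 01:14:32, then two more starting at 01:15:54.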

It's so bad that it's causing my LAG to go up and down on FreeNAS.

Code:

freenas.davis.local kernel log messages:
> igb1: link state changed to DOWN
> igb0: link state changed to DOWN
> lagg0: link state changed to DOWN
> igb0: link state changed to UP
> lagg0: link state changed to UP
> igb1: link state changed to UP
> igb0: link state changed to DOWN
> igb1: link state changed to DOWN
> lagg0: link state changed to DOWN
> igb1: link state changed to UP
> lagg0: link state changed to UP

Monoman · May 11, 2017

restart the switch? I actually had this issue this morning. A reboot fixed it. I have no idea what was the cause. If it happens again I'm afraid it's not a fluke and will need further investigation.

Patrick · May 11, 2017

Perhaps worth a ticket on their beta product.

nitrobass24 · May 11, 2017

Monoman said:
restart the switch? I actually had this issue this morning. A reboot fixed it. I have no idea what was the cause. If it happens again I'm afraid it's not a fluke and will need further investigation.

Rebooted, upgraded firmware, and checked the memory: it's under 50%. I mean this is a home network, not a lot going on!

Plus I can reproduce almost on demand! If I start my three Docker VMs and give it 10 minutes, the whole network is FUBAR.

At this point I'm not even mad, I'm impressed!

Monoman · May 11, 2017

yep exactly, a ticket.

I'll give the test a go in a couple hours. Maybe I can replicate... Which USG-SW?

nitrobass24 · May 11, 2017

USW-24 250w - I opened a ticket last night and posted to the beta forum.

Monoman · May 11, 2017

I have a couple, but will test on the usw-16-150w

Is the VM host proxmox or esxi?

nitrobass24 · May 11, 2017

ESXi


nitrobass24 · May 11, 2017

Totally nuts.

Built three new VMs, this time on Ubuntu, no firewall, no SELinux.
Stood up some containers.
Half an hour later the whole network takes a dump.

Monoman · May 12, 2017

Didn't have time to replicate yesterday, hopefully will shortly/this weekend

Can you provide some details on this? I'm using proxmox for my VM.

-Switch FW version?

Code:

Controller is 5.5.9
FW was 3.7.55.6308

-Docker on (assuming centos or ubuntu) latest version?
-vlans on a trunked(all) connection to the switch or specific port for only this traffic?

What did you do exactly to generate the switch taking a dump?
-Provision 3 VM's. (one nic or two) how is the network setup?
-Install docker via official repos
-create swarm, add other two nodes to swarm
-add a couple containers(which ones specifically?)
-wait 30 min and self destruct?

Thanks!

Monoman · May 12, 2017

I found your beta forum post on ubnt. I'm running current stable versions, so at least we'll be able to say whether it's beta related or not. I do not plan to update to their beta branch at the moment.

CookiesLikeWhoa · May 12, 2017

Is STP enabled or disabled on the switch?

I had a similar issue, though completely unrelated to a docker swarm. Disabling STP fixed it.

PigLover · May 12, 2017

STP problems were my first thought too - but the 30 minute delay before meltdown made me hold off posting it. STP problems usually get exposed in seconds or less. Nonetheless, it might be worth disabling STP in the switch to see if that helps.

I'm thinking it's more likely the "chat" between the nodes in the swarm is somehow overwhelming some table in the switch or tickling a bug (likely a memory leak bug). Do you have another switch you could throw on the cluster, even temporarily, to try to sniff out the root cause? Something with a more stable track record than the Ubiquiti Beta switch you are using?

Monoman · May 12, 2017

@PigLover are we calling ubnt unifi switches all beta? Maybe I just don't get the reference...

PigLover · May 12, 2017

Since he referenced opening a ticket on the Unifi Beta Blog there is an implication that he was using a beta product, thus the comment. But since you mention it - and given their propensity to release products to GA with well documented major defects and then take seemingly forever to fix them - it's not completely unfair to consider all Unifi switches Beta (or worse).

nitrobass24 · May 12, 2017

I do have STP enabled, will try disabling it tonight. Unfortunately, I don't have any other managed switches besides smaller UBNTs.

My switch is a GA model, but I am running a beta version of the Controller and was running a beta firmware. I uploaded the latest GA LTS version of the firmware and see similar results.

My guess is it has to do with the chatting between the manager nodes to build the Raft Consensus. On my Ubuntu cluster, I could never get the 3rd manager node to fully join as a manager. It would join the swarm, take workloads, but the "Manager" status was always unreachable.

Very strange behavior. Ubuntu 16.10 VMs w/ 8 cores, 8 GB RAM, 1 NIC, all on the same subnet. You would think they could talk to each other pretty easily.
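On the Raft guess above: Raft only makes progress while a strict majority of manager nodes can reach each other. The arithmetic is sketched below; the function names are illustrative only and are not part of Docker.

```python
def raft_quorum(managers: int) -> int:
    """Smallest number of managers that forms a strict majority."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """How many managers can be unreachable while quorum still holds."""
    return managers - raft_quorum(managers)


for n in (1, 2, 3, 5):
    print(f"{n} managers: quorum={raft_quorum(n)}, "
          f"tolerates {tolerated_failures(n)} unreachable")
```

With three managers the quorum is two, so a swarm whose third manager shows as unreachable can still schedule work, which would be consistent with the behavior described above. It has no headroom left, though: losing one more manager stops the control plane.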

Monoman · May 12, 2017

PigLover said:
Since he referenced opening a ticket on the Unifi Beta Blog there is an implication that he was using a beta product, thus the comment. But since you mention it - and given their propensity to release products to GA with well documented major defects and then take seemingly forever to fix them - it's not completely unfair to consider all Unifi switches Beta (or worse).

This is what I thought you meant.

I wasn't sure if we (sth) were calling it beta to be cheeky haha.

Ubnt offers many different fw versions, one happens to be the beta release. Not sure how familiar you are with their release cycle. Just wondering.

nitrobass24 · May 12, 2017

RSTP --> Disabled = Same problem exists.

I am now on a 2-node single manager swarm. This experiment might be over pretty fast. Starting to cause me a lot of trouble.
Only containers I am running are Portainer, Sonarr and NZBGet (idle).

nitrobass24 · May 25, 2017

So just a few updates for those following along.

I went to a single Manager swarm = same problem
No Swarm, Single Node = same problem - At least we know it's not the swarm.

Only containers I am now running are:
1. Portainer
2. NZBGet (though it was idle when it happened the last two times)
3. Sonarr (high memory usage)

Right now I am wondering if it's either some sort of weird issue with VMware or my Sonarr container. With a single node, I wouldn't have expected it to break both ESXi hosts, but both of them completely lost connectivity to my FreeNAS box.

Wonder if it has anything to do with my docker /data folders being stored via NFS async? I might try bringing the storage local to the VM, to see if I get different behavior.

nitrobass24 · May 26, 2017

I ran last night still on the shared storage, but with just my Sonarr container running. Woke up to a dead network.

Rebuilt all my containers locally inside the single VM, been running all day without issue. I also completely destroyed the Sonarr container since it was acting weird.

Will try moving it back to an NFS share next week, when I am back home.
Basically I'm down to either a totally FUBARd Sonarr container or Docker over NFS async doesn't play nice with FreeNAS.
