This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

restarting Docker 1.10 after launching weave causes Docker to die #1959

Closed
bergtwvd opened this issue Feb 7, 2016 · 9 comments

Comments

@bergtwvd

bergtwvd commented Feb 7, 2016

Installed docker 1.10 on Ubuntu:
bergtwvd@app-docker01:~$ uname -a
Linux app-docker01.xxx.xxx.xxx 3.13.0-76-generic #120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

At installation time I set up a Consul KV store on another host and adapted DOCKER_OPTS to point to it.
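
The report does not show the options that were used. As a rough sketch only, pointing the Docker 1.10 daemon at a Consul store is typically done with the --cluster-store and --cluster-advertise daemon flags. The host name, port and interface below are placeholders, not values from this report.

```shell
# /etc/default/docker (sketch)
# ASSUMPTION: consul.example.com:8500 and eth0:2376 are placeholder values;
# the actual DOCKER_OPTS from this report are not shown in the thread.
DOCKER_OPTS="--cluster-store=consul://consul.example.com:8500 --cluster-advertise=eth0:2376"
```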

Later on I installed weave via weave launch.

At some point I edited DOCKER_OPTS in /etc/default/docker to add other options and restarted the service.

This fails.

After some cleanup (removing all images and containers, and deleting all network bridges), I started the daemon manually and saw:

bergtwvd@app-docker01:~$ sudo docker daemon
INFO[0000] [graphdriver] using prior storage driver "aufs"
INFO[0000] Graph migration to content-addressability took 0.00 seconds
INFO[0000] Firewalld running: false
WARN[0000] Could not get list of networks during endpoint cleanup: could not find endpoint count key docker/network/v1.0/endpoint_count/4e93760ae64cd929abd098e930b4f02324270ca22e6cadac69707741c958727f/ for network weave while listing: Key not found in store
ERRO[0000] could not find endpoint count key docker/network/v1.0/endpoint_count/4e93760ae64cd929abd098e930b4f02324270ca22e6cadac69707741c958727f/ for network weave while listing: Key not found in store
INFO[0000] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[0000] Loading containers: start.

INFO[0000] Loading containers: done.
INFO[0000] Daemon has completed initialization
INFO[0000] Docker daemon commit=590d5108 execdriver=native-0.2 graphdriver=aufs version=1.10.0
INFO[0000] API listen on /var/run/docker.sock

bergtwvd@app-docker01:~$ docker network ls
NETWORK ID NAME DRIVER

bergtwvd@app-docker01:~$ ifconfig
br-85814b5480fb Link encap:Ethernet HWaddr 02:42:22:d1:74:e0
inet addr:172.18.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

docker0 Link encap:Ethernet HWaddr 02:42:88:4e:2b:f1
inet addr:172.17.0.6 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

docker_gwbridge Link encap:Ethernet HWaddr 02:42:97:86:73:2a
inet addr:172.20.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

eth0 Link encap:Ethernet HWaddr 00:50:56:9a:2a:1a
inet addr:134.221.44.65 Bcast:134.221.44.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1139564 errors:0 dropped:0 overruns:0 frame:0
TX packets:168946 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1547591291 (1.5 GB) TX bytes:41282974 (41.2 MB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:1344 errors:0 dropped:0 overruns:0 frame:0
TX packets:1344 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:74818 (74.8 KB) TX bytes:74818 (74.8 KB)

On a host without the weave plugin I have no problem. This seems to be an issue between Docker and network plugins.

bergtwvd changed the title from "Docker 1.10 and Weave plugin" to "Docker 1.10 and Weave plugin issue" on Feb 7, 2016
@rade
Member

rade commented Feb 7, 2016

This seems to be an issue between Docker and network plugins.

It is. Docker gets very upset if it cannot talk to a plugin associated with a previously created network. When you were "removing all images and containers", you removed the weaveplugin container, but the 'weave' network still exists and references the now-defunct plugin.

The only way out of this mess that I know of is to stop docker and then rm -rf /var/lib/docker/network/files.
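
The recovery above can be sketched as a small script. This is a sketch under assumptions, not a vetted procedure: the names run, DRY_RUN and reset_docker_network_state are made up for illustration, and the script assumes Docker is managed through "service" as elsewhere in this thread. It prints the commands by default and only executes them when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the recovery described above: stop Docker, delete its local
# network state, then start it again.
# ASSUMPTION: run, DRY_RUN and reset_docker_network_state are illustrative
# names, not part of Docker or Weave.

# Path taken from the comment above; override only for experiments.
NETWORK_STATE_DIR="${NETWORK_STATE_DIR:-/var/lib/docker/network/files}"

run() {
    # Print the command in dry-run mode (the default); otherwise execute it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_docker_network_state() {
    run sudo service docker stop
    run sudo rm -rf "$NETWORK_STATE_DIR"
    run sudo service docker start
}
```

Calling reset_docker_network_state with the default settings only prints the three commands; DRY_RUN=0 runs them. Deleting this directory most likely also discards any other locally stored network definitions, so those networks would need to be recreated afterwards.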

@bergtwvd
Author

bergtwvd commented Feb 8, 2016

The rm indeed solved the issue. I will raise this issue with Docker too.

Thanks.

@bboreham
Contributor

bboreham commented Feb 8, 2016

This fails.

What exactly fails? I reported something similar with Docker 1.9, at moby/libnetwork#813, but from my testing they fixed that in Docker 1.10.

@bergtwvd
Author

bergtwvd commented Feb 8, 2016

When I restart docker with "service docker restart" it does not come back up.

@rade
Member

rade commented Feb 8, 2016

I have just reproduced this. Running "weave launch; sudo service docker restart" results in the following docker daemon log entries:

INFO[0263] Processing signal 'terminated'               
ERRO[0263] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
ERRO[0263] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
ERRO[0263] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
ERRO[0263] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
INFO[0273] Container 907c3adf946a3276ccc3325937d336193e871fb261f58b2b940076a223d0d12d failed to exit within 10 seconds of SIGTERM - using the force 
WARN[0273] Unable to connect to plugin: /run/docker/plugins/weavemesh.sock, retrying in 1s 
ERRO[0273] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
ERRO[0273] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
WARN[0000] /!\ DON'T BIND ON ANY IP ADDRESS WITHOUT setting -tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING /!\ 
INFO[0000] Graph migration to content-addressability took 0.00 seconds 
INFO[0000] Firewalld running: false                     
WARN[0000] Could not get list of networks during endpoint cleanup: could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
ERRO[0000] could not find endpoint count key docker/network/v1.0/endpoint_count/db83448d82ff23cfbcb12f8655015442ce7deb3401fef004d20cf04990b5c2b0/ for network weave while listing: Key not found in store 
INFO[0000] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address 
FATA[0000] Error starting daemon: Error initializing network controller: Error creating default "bridge" network: failed to allocate gateway (172.17.0.1): Address already in use 

The problem, including all the above warnings/errors, goes away when I remove the call to RemoveNetwork in the plugin signal handler. Evidently that call never returns and, moreover, corrupts docker's state (note that the fatal error ostensibly has absolutely nothing to do with weave).
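
Both log excerpts in this thread share one signature line: "could not find endpoint count key ... for network weave". A quick way to check a daemon log for it is sketched below. The function name has_endpoint_count_error is made up for illustration, and the log location in the usage note is an assumption about Ubuntu 14.04 with upstart, not something stated in this thread.

```shell
#!/bin/sh
# Sketch: detect the error signature seen in the daemon logs above.
# ASSUMPTION: has_endpoint_count_error is an illustrative name.

has_endpoint_count_error() {
    # Reads the file(s) given as arguments, or stdin when none are given.
    # Exit status 0 means the signature was found.
    grep -q 'could not find endpoint count key .* for network weave' "$@"
}
```

For example, "has_endpoint_count_error /var/log/upstart/docker.log && echo affected" would report a match, assuming the daemon logs to that file on the host in question.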

Until we fix this, the options are:

a) stick to docker 1.9
b) launch weave with WEAVE_NO_PLUGIN=1, and don't use the plugin
c) stop the plugin before restarting docker (though this obviously won't help with uncontrolled restarts/reboots)
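
Options b) and c) can be sketched as shell functions. WEAVE_NO_PLUGIN=1 comes from the list above. The "weave stop-plugin" subcommand is an assumption about the weave script of this era and should be checked against "weave --help" before relying on it. The names run, DRY_RUN, launch_without_plugin and restart_docker_safely are made up for illustration. Commands are printed by default and only executed when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of workarounds b) and c) from the list above.
# ASSUMPTION: run, DRY_RUN and both function names are illustrative.

run() {
    # Print the command in dry-run mode (the default); otherwise execute it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Option b: launch weave without the plugin.
launch_without_plugin() {
    run env WEAVE_NO_PLUGIN=1 weave launch
}

# Option c: stop the plugin first, then restart Docker.
# ASSUMPTION: "weave stop-plugin" is available in the installed weave script.
restart_docker_safely() {
    run weave stop-plugin
    run sudo service docker restart
}
```

As noted in c), this only covers restarts that are initiated deliberately; an uncontrolled restart or reboot skips the stop step.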

@rade rade added this to the 1.4.4 milestone Feb 8, 2016
rade changed the title from "Docker 1.10 and Weave plugin issue" to "restarting Docker 1.10 after launching weave causes Docker to die" on Feb 8, 2016
@bergtwvd
Author

bergtwvd commented Feb 8, 2016

Thanks for checking this out. I will opt for WEAVE_NO_PLUGIN=1, as I am relying on a fix in 1.10 (assignable MAC address in overlay networks).

@bboreham bboreham self-assigned this Feb 9, 2016
awh added a commit that referenced this issue Feb 9, 2016
Don't remove weave network on plugin shutdown because it bricks Docker 1.10
@awh
Contributor

awh commented Feb 9, 2016

Fixed by #1963.

@awh awh closed this as completed Feb 9, 2016
@bboreham
Contributor

bboreham commented Feb 9, 2016

Filed moby/moby#20140

@bboreham
Contributor

bboreham commented Feb 9, 2016

The fix for this issue is now released in Weave 1.4.4


4 participants