[1.6.0] cannot run against swarm in attached mode #2857
Comments
Below is the Docker Swarm (1.1.0) debug output obtained by running the Compose file shown; docker-compose (1.6.0) hits the same timeout on its side. Docker Compose times out while creating the first service (crc). In the Swarm output I am puzzled by these two lines: and a few lines further down the latter shows a different address than the previous proxy request.
DOCKER COMPOSE FILE:
SWARM DEBUG OUTPUT:
I am still stuck on this issue. In the meantime I have eliminated swarm and am running on a single host. So I just run docker-compose on a single host and nothing else. A few hints:
Next, I put a proxy between Compose and the Docker daemon to capture the traffic. Attached are the verbose output from Docker Compose and the output from the proxy. From another run I captured the Docker daemon output too: How exactly does Compose connect to the daemon (protocol, address, and port format)?
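For anyone who wants to repeat the capture step described above, here is a minimal sketch of a logging TCP proxy in Python. It is not the netcat setup used in this thread, only an illustration of the same idea. The function names (pipe, relay, run_proxy, print_chunk) and the example addresses are made up for this sketch. Each connection gets its own thread, because an attach request stays open while other requests need to pass through.

```python
import socket
import threading


def pipe(src, dst, label, log):
    """Copy bytes from src to dst until src reaches EOF, logging each chunk."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            log(label, data)
            dst.sendall(data)
    except OSError:
        pass  # a reset on either side is treated like EOF
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the EOF on to the other side
        except OSError:
            pass


def relay(client, target, log):
    """Relay one accepted client connection to target, a (host, port) pair."""
    with client, socket.create_connection(target) as upstream:
        worker = threading.Thread(
            target=pipe, args=(client, upstream, ">>", log), daemon=True
        )
        worker.start()
        pipe(upstream, client, "<<", log)
        worker.join()


def print_chunk(label, data):
    """Default logger: '>>' is client to daemon, '<<' is daemon to client."""
    print(label, data.decode("latin-1"))


def run_proxy(listen_addr, target, log=print_chunk):
    """Accept connections forever, relaying each one in its own thread."""
    with socket.create_server(listen_addr) as listener:
        while True:
            client, _ = listener.accept()
            threading.Thread(
                target=relay, args=(client, target, log), daemon=True
            ).start()
```

Assuming a daemon listening on tcp://127.0.0.1:2375 (an example address, not taken from this thread), calling run_proxy(("127.0.0.1", 2376), ("127.0.0.1", 2375)) and pointing DOCKER_HOST at tcp://127.0.0.1:2376 would print every request and response, including any Connection Upgrade headers.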
From the daemon logs it seems to be responding to the attach request just fine. The attach call should block and wait for the process in the container to exit. What happens in the "WORKS" cases? What happens if you inspect the container state using
In the WORKS cases the containers run fine, as expected. I have a similar setup in AWS and never have these problems. The issues only appeared when I moved to the set of VMs provided in-house. The hang happens on the very first container that Compose tries to start. The container appears created, but not running:
Since I moved from AWS to in-house, I would also like to take a closer look at the networking. In AWS, all ports and protocols are open on all VMs for both incoming and outgoing traffic. I use the same versions of Docker Engine, Compose and Swarm.
Another deduction from the logs: So the start container invocation in Compose is missing, which is in line with what I see on the Docker Engine side.
I narrowed the problem down further to execute_convergence_plan in https://github.com/docker/compose/blob/master/compose/service.py. When I add the "-d" flag to docker-compose up, everything starts and runs correctly, so the problem is related to the container.attach_log_stream call.
As a sanity check I looked up the 1.4.2 and 1.5.0 code. In 1.4.2, execute_convergence_plan in service.py has no attach_log_stream call; in 1.5.0 it does. The change was added with #2254.
As a test I put a software proxy (http://notes.tweakblogs.net/blog/7955/using-netcat-to-build-a-simple-tcp-proxy-in-linux.html) between the Docker client and the daemon to monitor the message traffic, and fired up a container (docker run -id busybox). I can successfully attach to it from different hosts (using TCP), although I need to press ctrl-C in one of the shells to get stdout flushed and force the Connection Upgrade message. The following message is printed by the proxy when I press ctrl-C in one of the shells:
Next I reran the test with docker-compose, also using the software proxy. The attach comes through at the docker daemon, but I noticed that the Connection Upgrade is never sent from the client. The difference with the previous busybox test is however that the container is in the status Created (and not running yet). For reference the execute_convergence_plan snippet:
Thanks for tracking this down! So to summarize, the fix we put in place in #2254 to attach before start does not work against swarm. That fix was a bit of a hack anyway; we should try to use logs instead of attach. So it's possible that the changes we need to make to logs will resolve this. I believe this is only an issue when run against swarm. It works correctly for engine. It's probably worth opening an issue on docker/swarm as well, so we can track it on both sides. If this works with engine it may be that it can be fixed in swarm.
Please note that in my analysis it turned out that the issue is not related to swarm (I also changed the title of the issue). I run into exactly the same issue when I use Docker Compose against Docker Engine, using e.g. a single image like busybox.
From: Daniel Nephin [mailto:[email protected]]
Thanks for tracking this down! So to summarize, the fix we put in place in #2254 to attach before start does not work against swarm. That fix was a bit of a hack anyway, we should try to use logs instead of attach. So it's possible that the changes we need to make to logs will resolve this. I believe this is only an issue when run against swarm. It works correctly for engine. It's probably worth opening an issue on docker/swarm as well, so we can track it on both sides. If this works with engine it may be that it can be fixed in swarm.
Oh, I missed that. That is definitely not the case most of the time (we have many tests, and many users that use attach). There are a couple of related issues about this: #812, #2338, #494. We need to figure out what is different about your setup. It sounds like you're running against a VM? Is that created by
I do not use Docker Machine; I followed the instructions on the Docker.com website for installing Docker Engine on Ubuntu (running in a VM). Docker Compose is installed in the same VM, so everything runs on a single host. When I use a Unix socket to connect to Engine it works; when I use a TCP connection it doesn't. Does this affect the Connection Upgrade message in any way?
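To make the Unix socket versus TCP distinction concrete, here is a small sketch that reports which transport a given DOCKER_HOST value selects. It is not Compose's own parsing code; describe_docker_host is a hypothetical helper for this illustration, and it only handles the unix:// and tcp:// forms discussed in this thread. The default shown is the usual Docker client default on Linux.

```python
from urllib.parse import urlsplit

# Usual client default on Linux when DOCKER_HOST is unset.
DEFAULT_HOST = "unix:///var/run/docker.sock"


def describe_docker_host(value=None):
    """Return the transport details implied by a DOCKER_HOST-style URL.

    Hypothetical helper: only unix:// and tcp:// are handled here.
    """
    url = urlsplit(value or DEFAULT_HOST)
    if url.scheme == "unix":
        return {"transport": "unix", "path": url.path}
    if url.scheme == "tcp":
        return {"transport": "tcp", "host": url.hostname, "port": url.port}
    raise ValueError("unsupported scheme in this sketch: %r" % url.scheme)
```

Calling describe_docker_host(os.environ.get("DOCKER_HOST")) (after import os) on the failing and the working setups would confirm which of the two code paths each run is actually taking.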
Since it only happens with TCP, it sounds very likely that a proxy or firewall is causing this issue.
That is what I thought too, but when I start a container by hand and attach to it later, it works fine, even when attaching from other hosts. And with the software proxy in between I also see the Connection Upgrade message passing by. Question: can you attach to a container that is created but not yet started (as Docker Compose appears to do)? When I use the docker command-line client I am not permitted to do this.
That check seems to happen on the client side in
I followed up with the network engineer and the system admin. Neither the proxy nor the host firewall blocks or manipulates any traffic on this subnetwork, which rules out these factors.
Hi, I encountered a similar problem. I have 3 VMs, and if I remove all the proxy settings (/etc/profile.d/proxy.sh, /etc/environment, /etc/default/docker), it works. I have to pull all images first before running docker-compose.
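A quick way to see whether proxy settings like these are visible to a client process is to list the proxy-related environment variables it inherits. The sketch below assumes the files above export the conventional http_proxy-style variables, which this thread does not confirm; proxy_settings and PROXY_VARS are hypothetical names. It inspects only the environment of the process that runs it, not the daemon's.

```python
import os

# Conventional proxy variable names; matched case-insensitively.
PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy", "no_proxy")


def proxy_settings(environ=None):
    """Return the non-empty proxy-related variables found in environ."""
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in environ.items()
        if name.lower() in PROXY_VARS and value
    }
```

Running print(proxy_settings()) in the same shell that launches docker-compose shows what an HTTP client library in that process could pick up; an empty result suggests the environment is not the source of any proxying.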
Issue grooming: After ~3 years with no further complaints I'm going to take a guess that either this was fixed along the way somehow or it is otherwise no longer relevant. If you can reproduce with modern versions of everything then please do reopen (or open a new issue with the new info).
Using Docker 1.10, Compose 1.6.0 / 1.5.x / 1.4.x, and Swarm 1.1.0.
I set up a swarm with 4 nodes. The swarm master is started on node 1 with:
docker run -p 5000:2375 -it swarm --debug manage -H tcp://0.0.0.0:2375 "nodes://x.x.x.[65:68]:2375"
I set DOCKER_HOST to point to the swarm master.
I use the following compose YML file for test purposes:
docker-compose-1.4.2 -f myfile.yml up => WORKS OK
docker-compose-1.5.0 -f myfile.yml up => EVENTUALLY TIMES OUT
docker-compose-1.5.1 -f myfile.yml up => EVENTUALLY TIMES OUT
docker-compose-1.5.2 -f myfile.yml up => EVENTUALLY TIMES OUT
docker-compose-1.6.0 -f myfile.yml up => EVENTUALLY TIMES OUT
It seems that a change introduced in Compose 1.5.0 causes a problem in combination with Swarm 1.1.0.
See also docker-archive/classicswarm#1765.
LOGS: