Docker commands in container do not work anymore #54
Comments
There was a recent update of the images, but I'm not seeing a regression on this. How are you launching the container? |
I'm using an azure container services deployment with Kubernetes. There is a volume mount for the docker socket in the configuration.
Any suggestions what could have changed? |
Ok, the issue occurs only when using the docker-compose CLI - missed that part. I'll look into it. |
Actually it still isn't reproducing for me (I had forgotten to volume mount the docker socket). Only docker-compose gives the error message you are seeing, but only when I don't volume mount the socket. In this case, the docker CLI is more helpful:
I think what's happened is Kubernetes restarted the Docker daemon while your container was running and the running container lost the socket connection. We've observed this behavior in the past and confirmed with Kubernetes experts that the Docker daemon is sometimes restarted on a node. Can you try recreating the Kubernetes pod to see if the issue disappears? |
I deleted and recreated the agent pods, no luck. Still the same error. BTW: The agent capabilities list docker. |
Interesting. Can you manually exec into the container and see if "docker images" works? Alternatively, you could try to run a VSTS Docker task (rather than a Docker Compose task). |
I executed a build with a shell script where I executed |
I changed the build process to use the docker task and it works now. Currently I can live with that, but I still wonder what causes the error with docker-compose |
That is quite strange. I've never seen a state where docker works and docker-compose does not. It's possible that if I upgrade the version of docker-compose (currently 1.15) to the latest that it would work. If you're interested, you could try to run the following command inside the container to upgrade to the latest docker-compose CLI and try running a docker compose VSTS task after that:
Since I can't reproduce the issue, I won't be able to see if this fixes it, but if you're able to validate for me, I'll make the change and push new versions of the images. Thanks! |
Unfortunately I still have the same error. Docker-compose was updated but ends with the same error about docker.
|
Well that's unfortunate. I wonder if there's something about how your VSTS task is configured. In order to rule that out, can you manually exec a console into the container and run the following to see if you have the same issue independent of running VSTS tasks?
(You can exec in using kubectl: first run "kubectl get pod" to find the name of the pod that is running the VSTS agent, then run "kubectl exec -it <pod-name> bash".) |
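A minimal sketch of a manual check that could be run from a shell inside the agent container, independent of any VSTS task. The function name `check_docker_socket` and its messages are illustrative assumptions, not something shipped in the image; it only reports whether a Unix socket exists at the expected path and whether `DOCKER_HOST` is overriding it.

```shell
# Hypothetical helper: report whether the Docker socket is visible in this container.
check_docker_socket() {
  local sock=${1:-/var/run/docker.sock}

  # If DOCKER_HOST is set, docker and docker-compose talk to that address
  # rather than to the default socket path.
  if [ -n "${DOCKER_HOST:-}" ]; then
    echo "DOCKER_HOST is set to $DOCKER_HOST"
  fi

  # -S is true only when the path exists and is a socket.
  if [ ! -S "$sock" ]; then
    echo "No Unix socket at $sock: was the host socket volume mounted?"
    return 1
  fi

  echo "Socket found at $sock"
}
```

If the check passes, running `check_docker_socket && docker images` exercises the daemon connection with the plain docker CLI, which, as noted earlier in the thread, gives a more helpful message than docker-compose.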
I have the same issue. I just pulled down today. This is the command I'm using to run the agent on my local machine:
I'm using the Docker Compose tasks in the agent pipeline. This one fails when executing a
2017-12-05T22:32:56.8237380Z ##[section]Starting: Run services
2017-12-05T22:32:56.8595120Z ==============================================================================
2017-12-05T22:32:56.8606180Z Task : Docker Compose
2017-12-05T22:32:56.8619340Z Description : Build, push or run multi-container Docker applications. Task can be used with Docker or Azure Container registry.
2017-12-05T22:32:56.8631180Z Version : 0.4.7
2017-12-05T22:32:56.8641410Z Author : Microsoft Corporation
2017-12-05T22:32:56.8653500Z Help : [More Information](https://go.microsoft.com/fwlink/?linkid=848006)
2017-12-05T22:32:56.8666490Z ==============================================================================
2017-12-05T22:33:01.0701670Z [command]/usr/local/bin/docker-compose -f /vsts/agent/_work/1/s/docker-compose.test.yml -f /vsts/agent/.docker-compose.1512513178668.yml -p [GIT REPO NAME] up --build --abort-on-container-exit
2017-12-05T22:33:01.0728600Z Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
2017-12-05T22:33:01.0737520Z
2017-12-05T22:33:01.0749700Z If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
2017-12-05T22:33:01.0920530Z ##[error]Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
2017-12-05T22:33:01.1028470Z ##[error]If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
2017-12-05T22:33:01.1122350Z ##[error]/usr/local/bin/docker-compose failed with return code: 1
2017-12-05T22:33:01.1182160Z ##[section]Finishing: Run services
My Docker Compose file is version |
Ok, this is concerning. I couldn't reproduce the issue by using docker-compose directly. I'll try running a task via the VSTS agent and see if I can recreate the problem. |
Actually @Sawtaytoes, the command you are using to run the agent locally is giving an expected error because you didn't volume mount the docker socket into the container. See the "docker images" section of this page. @marc-mueller, I'm still trying to reproduce the issue in your case. |
@marc-mueller , I'm unable to reproduce this issue. I ran the VSTS agent in Kubernetes using your exact replication controller definition and queued a VSTS build on it that runs the exact same VSTS task configured the same way (the docker-compose task configured to build services). It works fine. Would you be able to send me or summarize how exactly your VSTS docker-compose task is configured? I'm particularly interested to know if you ever at some point set the "Docker Host Connection" property to something. If you add another docker-compose VSTS task and disable the existing one does it work? I feel like there's something in VSTS that is trying to override which Docker daemon is used but for some reason that isn't working any more with the latest agent image. |
Thanks @stepro. I see it now: |
Hi @stepro. Sorry, I was on a business trip and my inbox is more than full. I need to set up a new environment to test it again (it was deleted in the meantime). I'll come back to you with the results (around Friday). |
I'm seeing a similar behaviour. VSTS build tasks fail (using the docker-compose task) with the same If i |
@obfu5c8 , when you |
I am unsure if this is related but using either the ubuntu-16.04-docker-17.06.0-ce-standard or the ubuntu-16.04-docker-17.12.0-ce-standard I am seeing the following errors. This is the error I get with 17.06.
This is the error I get with 17.12
The build is utilizing the following Microsoft task.
|
Seems more and more people are seeing this, yet I'm still unable to reproduce it myself. Unfortunately I can't help without more information. @ephos, can you provide the command line you are using to run the container? Or if you're running it in an orchestrator like Kubernetes, the steps for running it there? Thanks. |
Hi @stepro, no orchestration at this time; I am just using commands. Below is the command I used. sudo docker run --env VSTS_ACCOUNT=myaccount --env VSTS_TOKEN=REDACTED --env VSTS_AGENT='vsts-docker-$(hostname)' --env VSTS_POOL=VSTS-Container-Pool -d microsoft/vsts-agent:ubuntu-16.04-docker-17.12.0-ce-standard |
Hi @ephos, if you want Docker to work, you need to volume mount the host's docker socket into the container using "-v /var/run/docker.sock:/var/run/docker.sock". The README shows an example of this. |
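As a sketch, the command quoted above with the socket mount added might look like this. The account, token, pool, and image tag are the values from that command; the `run` and `start_agent` helpers and the `DRY_RUN` switch are illustrative assumptions, added only so the command can be printed without being executed.

```shell
# Illustrative wrapper: with DRY_RUN=1 it prints the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start the agent container with the host's Docker socket mounted into it.
start_agent() {
  run docker run -d \
    --env VSTS_ACCOUNT="${VSTS_ACCOUNT:-myaccount}" \
    --env VSTS_TOKEN="${VSTS_TOKEN:-REDACTED}" \
    --env VSTS_AGENT='vsts-docker-$(hostname)' \
    --env VSTS_POOL="${VSTS_POOL:-VSTS-Container-Pool}" \
    -v /var/run/docker.sock:/var/run/docker.sock \
    microsoft/vsts-agent:ubuntu-16.04-docker-17.12.0-ce-standard
}
```

Calling `DRY_RUN=1 start_agent` prints the full command for review; calling `start_agent` on its own runs it (prefix with sudo if your host requires it, as in the original command).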
@stepro Sorry I had definitely missed that. I created new containers with that parameter but still seem to get the following error. Command
Build Error
|
@stepro you are correct, I believe my issue was an SELinux problem. I'm able to spin up the containers without a problem now! |
@ephos glad you got it working! |
@stepro I'm having the same error when using the Hosted Linux Preview build agent with the Docker Compose task. 2018-03-05T17:12:41.7220507Z ##[error]Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running? |
Well it definitely shouldn't be a problem there. @chrisrpatterson, @dakale, could you follow up? |
@promontis Can you provide any more information about your build definition? I could not reproduce using the Docker Compose task on Hosted Linux Preview, and running two simple phases that just executed |
I haven't got my failing code to hand, but iirc my issue turned out to be a syntax error in the compose file - I think it was an illegal character in the container name - which must have caused an exception that was reported as a connection error rather than a syntax error. @promontis could you post your compose file - it might be the same root cause here. If that is the case, I guess the next step would be confirming whether this bug is inherent to the docker-compose executable universally (i.e. when used outside the vsts agent image) |
@obfu5c8 that's indeed the case!! I had an illegal character in the container name! |
@promontis You can open an issue in https://github.com/Microsoft/vsts-tasks/issues if you'd like. If not, could you provide the character/line that caused the issue so I can report it? In my test, I tried using an illegal field, something like "imag_s" instead of images. That, however, reported a proper error |
@dakale I had a dot in the image name. Could you try that? |
@promontis I tried this docker-compose file:
I also tried a few variations for the image format such as If possible, could you post the image name you used that caused the original error? Or if it's private, what image format you used? i.e., from Compose docs:
|
@dakale I had this:
So it could be either the image name or the service name. Based on your answer it probably is the service name. It seems long service names are not what you want anyhow, because you have to type them in the CLI if you want to manually deploy. So people might not do this that often. Still, I think the error should be more descriptive. |
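One way to separate compose-file problems from genuine daemon connection problems is to validate the file on its own before running the task. The sketch below uses `docker-compose config -q`, which parses and validates the file without printing it and, as far as I know, does not need a daemon connection. The helper name `validate_compose_file` and its return codes are illustrative assumptions, and whether this catches the specific naming problem discussed here is not confirmed in this thread.

```shell
# Hypothetical helper: validate a compose file before handing it to a build task.
validate_compose_file() {
  local file=$1

  if [ ! -f "$file" ]; then
    echo "compose file not found: $file" >&2
    return 2
  fi

  if ! command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose is not installed" >&2
    return 3
  fi

  # "config -q" only validates; a non-zero exit status means the file was rejected.
  docker-compose -f "$file" config -q
}
```

For example, `validate_compose_file docker-compose.test.yml` could be run from a shell step ahead of the Docker Compose task.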
I too have the same problem as OP @marc-mueller. In my case I've been using a VSTS Docker Compose task to build an image from the compose file created by Visual Studio for an ASP.NET Core Web Application on a private Linux build agent I created myself. I've blogged about the setup here and it's all working fine. I now want to build the image on a private build agent hosted on AKS. As with @marc-mueller I have successfully created the agent, in my case using this (very similar) deployment:
However, when I now run a build it fails at the Build service images stage with the previously reported error:
In contrast to @marc-mueller, if I switch to building the Dockerfile directly using a plain Docker task I get this error:
The Dockerfile is of the multi-stage build type and it looks like this is being rejected, despite being supported by the installed version of docker. In terms of troubleshooting I've run the VSTS Docker and Docker Compose tasks with basic commands ( |
@GrahamDSmith I'll continue doing some investigation into the first point. I'm also running some agent containers in AKS, but I'm not mounting the docker socket, so I'll see if I can reproduce your issue when I get a moment (probably tomorrow morning). In the meantime, I'm curious if that error is correct. As seen earlier in the thread, that error cropped up simply because of a syntax error in the compose file. Maybe it's possible the issue is coming from trying to build images via docker-compose on a host that doesn't support your Dockerfiles? If that's the problem you could of course just convert back to single stage building the service images with that |
Thanks @dakale for setting me right regarding multi-stage builds - I'd forgotten I needed to look at the k8s layer for the version of docker engine. I'll add my support to those asking for an upgrade. My compose file is generated by VS 2017 (15.5.6) and very simple so I hope I can rule out a syntax error:
However I too was wondering if the compose error was a symptom of an invalid Dockerfile for the version of docker running on k8s. I'll investigate and report back. |
@dakale I've now tested my setup with a single stage Dockerfile and everything works fine, so looks like in my case the error message was a red herring and the multi-stage Dockerfile was the problem. I saw I had a cluster upgrade option (1.9.2) so I upgraded but sadly the docker engine version is still only at 1.13.1. |
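Multi-stage builds were introduced in Docker 17.05, so a 1.13.1 engine predates them. A minimal sketch of a check for this follows; the function name `supports_multistage_build` is an illustrative assumption, and it only compares the major and minor parts of a version string.

```shell
# Hypothetical helper: succeed if the given Docker Engine version is 17.05 or newer.
supports_multistage_build() {
  local version=$1
  local major=${version%%.*}   # text before the first dot
  local rest=${version#*.}     # text after the first dot
  local minor=${rest%%.*}      # text before the next dot

  # Reject anything that is not purely numeric in the parts we compare.
  case "$major" in ''|*[!0-9]*) return 2 ;; esac
  case "$minor" in ''|*[!0-9]*) return 2 ;; esac

  # Drop one leading zero so "05" is compared as 5.
  minor=${minor#0}
  minor=${minor:-0}

  [ "$major" -gt 17 ] || { [ "$major" -eq 17 ] && [ "$minor" -ge 5 ]; }
}
```

On a node or in a container with the socket mounted, the server version can be passed in with `supports_multistage_build "$(docker version --format '{{.Server.Version}}')"`.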
While we certainly understand the desire to do builds in your cluster, and the desire to have AKS on a current docker version, we are preparing to ship a native container build service. Please message me if you'd like to be part of the preview. |
@SteveLasker how do we message you :) would like to be part of the preview |
@SteveLasker Me 2 ;). |
@SteveLasker Me also please. I've emailed my contact details to you. |
@GrahamDSmith @jmezach @promontis and others interested... you can sign up for ACR Preview features here: http://aka.ms/acr/preview/signup |
@SteveLasker Cheers! |
It looks like everyone's issues have been resolved on this thread by workaround, reading how to correctly configure the VSTS agent and/or understanding Docker's confusing error messages for these scenarios. I'm going to close out this issue. |
I've recently had the same issue. We've been running our VSTS agents (Microsoft Docker Hub images with Ubuntu) for almost a year now on our AKS cluster (first preview, then GA, now GA with RBAC enabled). We've never changed the deployment YAML (which looks pretty much identical to the one mentioned in the thread). However, since moving to the RBAC-enabled cluster, it seems that every three days or so the agents lose connection with the Docker daemon on the nodes, with the same error: "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?". This happens in build steps, but also when I use kubectl exec to log into the containers. The only way I can resolve it is to use the kubectl scale command: scale down to 0, wait for all of the pods to be terminated, and scale back up to the number I had. Then the docker commands work again. Killing and respawning without scaling to 0 does not work. I don't know if it has something to do with the fact that we moved to an RBAC-enabled cluster and I need to specify additional permissions in the deployment YAML. |
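The scale-down and scale-up workaround described above can be sketched as follows. The deployment name, replica count, and label selector are parameters you must supply; the `run` and `restart_agents` helpers and the `DRY_RUN` switch are illustrative assumptions rather than anything from the agent image.

```shell
# Illustrative wrapper: with DRY_RUN=1 it prints the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Scale a deployment to 0, wait until its pods are gone, then scale back up.
restart_agents() {
  local deployment=$1 replicas=$2 selector=$3

  run kubectl scale deployment "$deployment" --replicas=0

  # Poll until no pods match the selector (skipped in dry-run mode).
  if [ "${DRY_RUN:-0}" != 1 ]; then
    while [ -n "$(kubectl get pods -l "$selector" --no-headers 2>/dev/null)" ]; do
      sleep 5
    done
  fi

  run kubectl scale deployment "$deployment" --replicas="$replicas"
}
```

For example, `DRY_RUN=1 restart_agents vsts-agent 3 app=vsts-agent` prints the two scale commands for a hypothetical deployment named vsts-agent without touching the cluster.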
Thanks, this was it for me.
|
Hi,
I just updated our build containers to the latest version of ubuntu-16.04-docker-17.06.0-ce-standard.
If I'm using the agent with VSTS, the following error occurs when building a docker image:
What changed so far? Did I miss something? I only updated the container, nothing else.
Thanks
Marc