This repository has been archived by the owner on Jul 17, 2020. It is now read-only.

Docker commands in container do not work anymore #54

Closed
marc-mueller opened this issue Dec 1, 2017 · 49 comments

Comments

@marc-mueller

Hi,

I just updated our build containers to the latest version of ubuntu-16.04-docker-17.06.0-ce-standard.

If I'm using the agent with VSTS, the following error occurs when building a docker image:

Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
/usr/local/bin/docker-compose failed with return code: 1

What changed so far? Did I miss something? I only updated the container, nothing else.

Thanks
Marc

@stepro
Member

stepro commented Dec 1, 2017

There was a recent update of the images, but I'm not seeing a regression on this. How are you launching the container?

@marc-mueller
Author

I'm using an Azure Container Service deployment with Kubernetes. The configuration includes a volume mount for the Docker socket:

apiVersion: v1
kind: ReplicationController
metadata:
  name: vsts-agent
  namespace: vsts
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: vsts-agent
        version: "0.1"
    spec:
      containers:
      - name: vsts-agent
        image: xxxxx.azurecr.io/vsts-agent:ubuntu-16.04-docker-17.06.0-ce-standard
        env:
          - name: VSTS_ACCOUNT
            valueFrom:
              secretKeyRef:
                name: vsts
                key: VSTS_ACCOUNT
          - name: VSTS_TOKEN
            valueFrom:
              secretKeyRef:
                name: vsts
                key: VSTS_TOKEN
          - name: VSTS_POOL
            value: k8sAgents
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-volume
      volumes:
      - name: docker-volume
        hostPath:
          path: /var/run/docker.sock
      imagePullSecrets:
        - name: xxxx

Any suggestions what could have changed?

@stepro
Member

stepro commented Dec 1, 2017

Ok, the issue occurs only when using the docker-compose CLI - missed that part. I'll look into it.

@stepro
Member

stepro commented Dec 1, 2017

Actually it still isn't reproducing for me (I had forgotten to volume mount the docker socket). Only docker-compose gives the error message you are seeing, but only when I don't volume mount the socket. In this case, the docker CLI is more helpful:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

I think what's happened is Kubernetes restarted the Docker daemon while your container was running and the running container lost the socket connection. We've observed this behavior in the past and confirmed with Kubernetes experts that the Docker daemon is sometimes restarted on a node. Can you try recreating the Kubernetes pod to see if the issue disappears?

@marc-mueller
Author

marc-mueller commented Dec 1, 2017

I deleted and recreated the agent pods, no luck. Still the same error.

BTW: The agent capabilities list docker.

@stepro
Member

stepro commented Dec 1, 2017

Interesting. Can you manually exec into the container and see if "docker images" works? Alternatively, you could try to run a VSTS Docker task (rather than a Docker Compose task).

@marc-mueller
Author

I ran a build with a shell script that executed docker images, and it returned the correct results. So it definitely has to be something with docker-compose.

@marc-mueller
Author

I changed the build process to use the Docker task and it works now. I can live with that for the moment, but I still wonder what causes the error with docker-compose.

@stepro
Member

stepro commented Dec 2, 2017

That is quite strange. I've never seen a state where docker works and docker-compose does not. It's possible that upgrading docker-compose (currently 1.15) to the latest version would fix it.

If you're interested, you could try to run the following command inside the container to upgrade to the latest docker-compose CLI and try running a docker compose VSTS task after that:

curl -fSL "https://github.com/docker/compose/releases/download/1.17.1/docker-compose-`uname -s`-`uname -m`" -o /usr/local/bin/docker-compose

Since I can't reproduce the issue, I won't be able to see if this fixes it, but if you're able to validate for me, I'll make the change and push new versions of the images. Thanks!

@marc-mueller
Author

Unfortunately I still have the same error. Docker-compose was updated but ends with the same error about docker.

******************************************************************************
Starting: Shell Script
******************************************************************************
==============================================================================
Task         : Bash
Description  : This is an early preview. Run a Bash script on macOS, Linux, or Windows
Version      : 3.121.0
Author       : Microsoft Corporation
Help         : [More Information](https://go.microsoft.com/fwlink/?LinkID=613738)
==============================================================================
Generating script.
/bin/bash --noprofile --norc /vsts/agent/_work/_temp/f7c3c959-cd47-452f-8aad-060bd8aa8d5d.sh
docker-compose version 1.15.0, build e12f3b9
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   617    0   617    0     0   1040      0 --:--:-- --:--:-- --:--:--  1040
  0 8649k    0 16360    0     0  12733      0  0:11:35  0:00:01  0:11:34 12733
 49 8649k   49 4273k    0     0  1890k      0  0:00:04  0:00:02  0:00:02 4357k
100 8649k  100 8649k    0     0  3442k      0  0:00:02  0:00:02 --:--:-- 7036k
docker-compose version 1.17.1, build 6d101fb
******************************************************************************
Finishing: Shell Script
******************************************************************************
******************************************************************************
Starting: Build services
******************************************************************************
==============================================================================
Task         : Docker Compose
Description  : Build, push or run multi-container Docker applications. Task can be used with Docker or Azure Container registry.
Version      : 0.4.7
Author       : Microsoft Corporation
Help         : [More Information](https://go.microsoft.com/fwlink/?linkid=848006)
==============================================================================
/usr/local/bin/docker-compose -f /vsts/agent/_work/1/s/docker-compose.yml -f /vsts/agent/.docker-compose.1512211278257.yml -p FastFood.Services.Orders build
Building fastfood.services.orders
Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
Building fastfood.services.orders
Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
/usr/local/bin/docker-compose failed with return code: 1

@stepro
Member

stepro commented Dec 2, 2017

Well that's unfortunate. I wonder if there's something about how your VSTS task is configured. In order to rule that out, can you manually exec a console into the container and run the following to see if you have the same issue independent of running VSTS tasks?

docker images
echo version: \"2\" > docker-compose.yml
docker-compose up

(You can exec in using kubectl: first run "kubectl get pod" to find the name of the pod that is running the VSTS agent, then run "kubectl exec -it <pod-name> bash".)
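
For reference, the two kubectl steps described above could be wrapped roughly like this. It is only a sketch: the function name `exec_into_agent` is made up here, and the namespace and label it takes as arguments are assumptions (for example `vsts` and `app=vsts-agent`, as in the replication controller posted earlier in the thread).

```shell
# Open a shell in the first pod that matches a label selector.
# Arguments: namespace, label selector (both supplied by the caller).
exec_into_agent() {
  ns="$1"; selector="$2"
  # Look up the pod name instead of copying it by hand from "kubectl get pod".
  pod=$(kubectl get pod -n "$ns" -l "$selector" \
        -o jsonpath='{.items[0].metadata.name}') || return 1
  [ -n "$pod" ] || { echo "no pod matches $selector in $ns" >&2; return 1; }
  kubectl exec -it -n "$ns" "$pod" -- bash
}
```

Calling `exec_into_agent vsts app=vsts-agent` should drop you into the agent container, where the three test commands above can be run by hand.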

@Sawtaytoes

Sawtaytoes commented Dec 5, 2017

I have the same issue. I just pulled the image today. This is the command I'm using to run the agent on my local machine:

docker run -e VSTS_ACCOUNT=companionprotect -e VSTS_TOKEN=[TOKEN] -it "microsoft/vsts-agent:ubuntu-16.04-docker-17.06.0-ce"

I'm using the Docker Compose tasks in the agent pipeline. This one fails when executing a docker-compose run command using the Docker Compose file **/docker-compose.test.yml:

2017-12-05T22:32:56.8237380Z ##[section]Starting: Run services
2017-12-05T22:32:56.8595120Z ==============================================================================
2017-12-05T22:32:56.8606180Z Task         : Docker Compose
2017-12-05T22:32:56.8619340Z Description  : Build, push or run multi-container Docker applications. Task can be used with Docker or Azure Container registry.
2017-12-05T22:32:56.8631180Z Version      : 0.4.7
2017-12-05T22:32:56.8641410Z Author       : Microsoft Corporation
2017-12-05T22:32:56.8653500Z Help         : [More Information](https://go.microsoft.com/fwlink/?linkid=848006)
2017-12-05T22:32:56.8666490Z ==============================================================================
2017-12-05T22:33:01.0701670Z [command]/usr/local/bin/docker-compose -f /vsts/agent/_work/1/s/docker-compose.test.yml -f /vsts/agent/.docker-compose.1512513178668.yml -p [GIT REPO NAME] up --build --abort-on-container-exit
2017-12-05T22:33:01.0728600Z Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
2017-12-05T22:33:01.0737520Z 
2017-12-05T22:33:01.0749700Z If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
2017-12-05T22:33:01.0920530Z ##[error]Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
2017-12-05T22:33:01.1028470Z ##[error]If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
2017-12-05T22:33:01.1122350Z ##[error]/usr/local/bin/docker-compose failed with return code: 1
2017-12-05T22:33:01.1182160Z ##[section]Finishing: Run services

My Docker Compose file is version 3.3, and this works fine when using the Hosted Linux build agent.

@stepro
Member

stepro commented Dec 6, 2017

Ok, this is concerning. I couldn't reproduce the issue by using docker-compose directly. I'll try running a task via the VSTS agent and see if I can recreate the problem.

@stepro
Member

stepro commented Dec 7, 2017

Actually @Sawtaytoes, the command you are using to run the agent locally is giving an expected error because you didn't volume mount the docker socket into the container. See the "docker images" section of this page. @marc-mueller, I'm still trying to reproduce the issue in your case.

@stepro
Member

stepro commented Dec 7, 2017

@marc-mueller , I'm unable to reproduce this issue. I ran the VSTS agent in Kubernetes using your exact replication controller definition and queued a VSTS build on it that runs the exact same VSTS task configured the same way (the docker-compose task configured to build services). It works fine. Would you be able to send me or summarize how exactly your VSTS docker-compose task is configured? I'm particularly interested to know if you ever at some point set the "Docker Host Connection" property to something. If you add another docker-compose VSTS task and disable the existing one does it work? I feel like there's something in VSTS that is trying to override which Docker daemon is used but for some reason that isn't working any more with the latest agent image.
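
One way to test the "something is overriding the daemon" theory is to print the endpoint the tools would use, both inside a build step and from a manual exec session, and compare the two. The sketch below assumes any override would show up as the DOCKER_HOST environment variable, which this thread does not confirm; `docker_endpoint` is a hypothetical helper name.

```shell
# Print the Docker endpoint that docker and docker-compose would target.
# Both tools honor DOCKER_HOST and otherwise fall back to the local Unix socket.
docker_endpoint() {
  printf '%s\n' "${DOCKER_HOST:-unix:///var/run/docker.sock}"
}
```

If the value printed by the build step differs from the one in the exec session, the task (or the agent environment) is redirecting the client.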

@Sawtaytoes

Thanks @stepro. I see it now: -v /var/run/docker.sock:/var/run/docker.sock. I must have glanced over it. Thanks!

@marc-mueller
Author

Hi @stepro

Sorry, I was on a business trip and my inbox is more than full. I need to set up a new environment to test it again (it was deleted in the meantime). I'll come back to you with the results (around Friday).

@obfu5c8

obfu5c8 commented Feb 13, 2018

I'm seeing similar behaviour. VSTS build tasks fail (using the docker-compose task) with the same "Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?" errors.

If I docker exec the command into the running container from the host, it works as expected, which leads me to think it might be something in the build agent scripts that is mucking with the environment.

@stepro
Member

stepro commented Feb 13, 2018

@obfu5c8 , when you docker exec into the container, what commands are you trying?

@ephos

ephos commented Feb 21, 2018

I am unsure if this is related, but with either the ubuntu-16.04-docker-17.06.0-ce-standard or the ubuntu-16.04-docker-17.12.0-ce-standard image I am seeing the following errors.

This is the error I get with 17.06.

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
/usr/local/bin/docker failed with return code: 1

This is the error I get with 17.12.

failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: no such file or directory

The build is utilizing the following Microsoft task.

Task : Docker
Description : Build, tag, push, or run Docker images, or run a Docker command. Task can be used with Docker or Azure Container registry.
Version : 0.3.4
Author : Microsoft Corporation
Help : More Information

@stepro
Member

stepro commented Feb 22, 2018

Seems more and more people are seeing this, yet I'm still unable to reproduce it myself. Unfortunately I can't help without more information. @ephos, can you provide the command line you are using to run the container? Or if you're running it in an orchestrator like Kubernetes, the steps for running it there? Thanks.

@ephos

ephos commented Feb 22, 2018

Hi @stepro, no orchestration at this time; I am just using plain commands. Below is the command I used.

sudo docker run --env VSTS_ACCOUNT=myaccount --env VSTS_TOKEN=REDACTED --env VSTS_AGENT='vsts-docker-$(hostname)' --env VSTS_POOL=VSTS-Container-Pool -d microsoft/vsts-agent:ubuntu-16.04-docker-17.12.0-ce-standard

@stepro
Member

stepro commented Feb 22, 2018

Hi @ephos, if you want Docker to work, you need to volume mount the host's docker socket into the container using "-v /var/run/docker.sock:/var/run/docker.sock". The README shows an example of this.

@ephos

ephos commented Feb 23, 2018

@stepro Sorry I had definitely missed that. I created new containers with that parameter but still seem to get the following error.

Command

sudo docker run \
--env VSTS_ACCOUNT=myaccount \
--env VSTS_TOKEN=REDACTED \
--env VSTS_AGENT='vsts-docker-$(hostname)' \
--env VSTS_POOL=VSTS-Container-Pool \
-v /var/run/docker.sock:/var/run/docker.sock \
-d microsoft/vsts-agent:ubuntu-16.04-docker-17.12.0-ce-standard

Build Error

time="2018-02-23T15:21:59Z" level=error msg="failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: permission denied"

@stepro
Member

stepro commented Feb 23, 2018

@ephos I wonder if your Docker daemon isn't configured to listen on the default local socket file. Can you take a look at the docs here to see if perhaps the daemon is configured differently?
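
The two errors reported above ("no such file or directory" and "permission denied") point at different causes, so it can help to classify what the container actually sees at the socket path. This is a rough sketch only; `check_docker_socket` is a hypothetical helper, and it tests ordinary file permissions, so a denial enforced elsewhere (SELinux, for instance) may still report "ok" here.

```shell
# Classify the state of a Docker socket path as seen from inside the container.
# Prints one of: missing, not-a-socket, permission-denied, ok.
check_docker_socket() {
  sock="${1:-/var/run/docker.sock}"
  if [ ! -e "$sock" ]; then
    echo "missing"            # nothing is mounted at this path
  elif [ ! -S "$sock" ]; then
    echo "not-a-socket"       # e.g. a plain file or a directory
  elif [ ! -r "$sock" ] || [ ! -w "$sock" ]; then
    echo "permission-denied"  # file mode or group membership blocks this user
  else
    echo "ok"                 # path and mode look fine; the daemon may still refuse
  fi
}
```

Running `check_docker_socket` with no argument inspects the default path; "missing" usually means the `-v /var/run/docker.sock:/var/run/docker.sock` mount was left out.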

@ephos

ephos commented Mar 1, 2018

@stepro you are correct, I believe my issue was an SELinux problem. I'm able to spin up the containers without a problem now!

@stepro
Member

stepro commented Mar 1, 2018

@ephos glad you got it working!

@promontis

@stepro I'm having the same error when using the Hosted Linux Preview build agent with the Docker Compose task.

2018-03-05T17:12:41.7220507Z ##[error]Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
2018-03-05T17:12:41.7303252Z ##[error]If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.

@stepro
Member

stepro commented Mar 5, 2018

Well it definitely shouldn't be a problem there. @chrisrpatterson, @dakale, could you follow up?

@dakale

dakale commented Mar 5, 2018

@promontis Can you provide any more information about your build definition? I could not reproduce the error using the Docker Compose task on Hosted Linux Preview, running two simple phases that just executed docker-compose version and docker-compose images. I suspect docker-compose images is a good enough test of whether there is a connection to a Docker daemon; perhaps your definition is more complicated?

@obfu5c8

obfu5c8 commented Mar 5, 2018

I haven't got my failing code to hand, but iirc my issue turned out to be a syntax error in the compose file - I think it was an illegal character in the container name - which must have caused an exception that was reported as a connection error rather than a syntax error.

@promontis could you post your compose file - it might be the same root cause here.

If that is the case, I guess the next step would be confirming whether this is a bug inherent to the docker-compose executable in general (i.e. when used outside the vsts agent image).

@promontis

@obfu5c8 that's indeed the case!! I had an illegal character in the container name!

@dakale

dakale commented Mar 5, 2018

@promontis You can open an issue in https://github.com/Microsoft/vsts-tasks/issues if you'd like.

If not, could you provide the character/line that caused the issue so I can report it? In my test, I tried using an illegal field, something like "imag_s" instead of images. That, however, reported a proper error.

@promontis

@dakale I had a dot in the image name. Could you try that?

@dakale

dakale commented Mar 5, 2018

@promontis I tried this docker-compose file:

version: '3'
services:
  redis:
    image: re.dis

I also tried a few variations for the image format such as redis:alpine and an image from a personal Dockerhub repo.

If possible, could you post the image name you used that caused the original error? Or if it's private, what image format you used? i.e., from Compose docs:

image
Specify the image to start the container from. Can either be a repository/tag or a partial image ID.

image: redis
image: ubuntu:14.04
image: tutum/influxdb
image: example-registry.com:4000/postgresql
image: a4bc65fd

@promontis

promontis commented Mar 6, 2018

@dakale I had this:

version: '3'

services:
  Stylister.Constraints.Api:
    image: Stylister.Constraints.Api
    build:
      context: ./Stylister.Constraints.Api/
      dockerfile: Dockerfile
    ports: 
      - "8080:8080"

So it could be either the image name or the service name. Based on your answer it is probably the service name. Long service names are probably not what you want anyhow, because you have to type them in the CLI if you want to deploy manually, so people might not do this that often. Still, I think the error should be more descriptive.
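
The thread never establishes which name was at fault. One untested possibility is the uppercase letters in `image: Stylister.Constraints.Api` rather than the dots, since Docker repository names must be lowercase and the `re.dis` test above worked. The check below is a sketch of that rule for a single path component only (no registry host, no tag); `is_valid_repo_component` is a hypothetical helper name.

```shell
# Succeed if NAME is a valid single path component of a Docker repository
# name: runs of lowercase letters and digits, joined by '.', '_', '__',
# or one or more '-'.
is_valid_repo_component() {
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '^[a-z0-9]+((\.|__|_|-+)[a-z0-9]+)*$'
}
```

With this check, `re.dis` and `k8s-aspnetcore` pass while `Stylister.Constraints.Api` does not, which would be consistent with the lowercase theory but does not prove it.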

@GrahamDSmith

I too have the same problem as OP @marc-mueller. In my case I've been using a VSTS Docker Compose task to build an image from the compose file created by Visual Studio for an ASP.NET Core Web Application on a private Linux build agent I created myself. I've blogged about the setup here and it's all working fine.

I now want to build the image on a private build agent hosted on AKS. As with @marc-mueller I have successfully created the agent, in my case using this (very similar) deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: vsts-agent
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: vsts-agent
        version: "0.1"
    spec:
      containers:
      - name: vsts-agent
        image: microsoft/vsts-agent:ubuntu-16.04-docker-17.12.0-ce
        env:
          - name: VSTS_ACCOUNT
            valueFrom:
              secretKeyRef:
                name: vsts
                key: VSTS_ACCOUNT
          - name: VSTS_TOKEN
            valueFrom:
              secretKeyRef:
                name: vsts
                key: VSTS_TOKEN
          - name: VSTS_POOL
            value: k8s
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-volume
      volumes:
      - name: docker-volume
        hostPath:
          path: /var/run/docker.sock

However, when I now run a build it fails at the Build service images stage with the previously reported error:

Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
/usr/local/bin/docker-compose failed with return code: 1

In contrast to @marc-mueller, if I switch to building the Dockerfile directly using a plain Docker task I get this error:

Error parsing reference: "microsoft/aspnetcore:2.0 AS base" is not a valid repository/tag
/usr/local/bin/docker failed with return code: 1

The Dockerfile is of the multi-stage build type and it looks like this is being rejected, despite being supported by the installed version of docker.

In terms of troubleshooting I've run the VSTS Docker and Docker Compose tasks with basic commands (version, run hello-world and version respectively) and everything works as expected.

@dakale

dakale commented Mar 7, 2018

@GrahamDSmith
Let me quickly follow up on your second point: multi-stage builds don't seem to be supported in AKS. See this issue. If you shell into your agent container on AKS and run docker version, you'll see the Docker daemon on those hosts is only version 1.12.6, and 17.05 is required on both the daemon and the client.
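
As a rough illustration of that version requirement, a version string can be compared against 17.05 like this. The helper name `supports_multistage` is made up for this sketch, and it only looks at the major and minor numbers.

```shell
# Succeed if a Docker version string is 17.05 or newer.
# Accepts forms such as "1.12.6", "1.13.1", or "17.06.0-ce".
supports_multistage() {
  version="${1%%-*}"        # drop a suffix such as "-ce"
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  minor="${minor#0}"        # "05" -> "5" so it compares as a plain integer
  minor="${minor:-0}"
  [ "$major" -gt 17 ] || { [ "$major" -eq 17 ] && [ "$minor" -ge 5 ]; }
}
```

Inside the agent container, the daemon and client versions can be read with `docker version --format '{{.Server.Version}}'` and `docker version --format '{{.Client.Version}}'` and passed to the function one at a time.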

I'll continue investigating the first point. I'm also running some agent containers in AKS, but I'm not mounting the docker socket, so I'll see if I can reproduce your issue when I get a moment (probably tomorrow morning).

In the meantime, I'm curious whether that error is accurate. As seen earlier in the thread, that error cropped up simply because of a syntax error in the compose file. Maybe the issue comes from trying to build images via docker-compose on a host that doesn't support your Dockerfiles? If that's the problem, you could of course convert back to single-stage Dockerfiles for building the service images.

@GrahamDSmith

GrahamDSmith commented Mar 8, 2018

Thanks @dakale for setting me right regarding multi-stage builds - I'd forgotten I needed to look at the k8s layer for the version of docker engine. I'll add my support to those asking for an upgrade.

My compose file is generated by VS 2017 (15.5.6) and very simple so I hope I can rule out a syntax error:

version: '3'

services:
  k8s-aspnetcore:
    image: k8s-aspnetcore
    build:
      context: .
      dockerfile: k8s-aspnetcore/Dockerfile

However I too was wondering if the compose error was a symptom of an invalid Dockerfile for the version of docker running on k8s. I'll investigate and report back.

@GrahamDSmith

@dakale I've now tested my setup with a single-stage Dockerfile and everything works fine, so it looks like in my case the error message was a red herring and the multi-stage Dockerfile was the problem.

I saw I had a cluster upgrade option (1.9.2) so I upgraded but sadly the docker engine version is still only at 1.13.1.

@SteveLasker

While we certainly understand the desire to do builds in your cluster, and the desire to have AKS on a current Docker version, we are preparing to ship a native container build service. Please message me if you'd like to be part of the preview.
Steve

@promontis

@SteveLasker how do we message you? :) I would like to be part of the preview.

@jmezach

jmezach commented Mar 14, 2018

@SteveLasker Me 2 ;).

@GrahamDSmith

@SteveLasker Me also please. I've emailed my contact details to you.

@SteveLasker

@GrahamDSmith @jmezach @promontis and others interested... you can sign up for ACR Preview features here: http://aka.ms/acr/preview/signup
We expect to open up access this week.
Thanks,
Steve

@GrahamDSmith

@SteveLasker Cheers!

@stepro
Member

stepro commented Mar 26, 2018

It looks like everyone's issues on this thread have been resolved, whether by a workaround, by reading how to correctly configure the VSTS agent, or by understanding Docker's confusing error messages for these scenarios. I'm going to close out this issue.

@stepro closed this as completed Mar 26, 2018

@SurushS

SurushS commented Oct 26, 2018

I've recently had the same issue. We've been running our VSTS agents (Microsoft Docker Hub images with Ubuntu) for almost a year now on our AKS cluster (first preview, then GA, now GA with RBAC enabled). We've never changed the deployment YAML, which looks pretty much identical to the one mentioned in the thread. However, since moving to the RBAC-enabled cluster, it seems that every three days or so the agents lose their connection to the Docker daemon on the nodes, with the same error: "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?" This happens in build steps, but also when I use kubectl exec to log into the containers.

The only way I can resolve it is to use the kubectl scale command: scale down to 0, wait for all of the pods to be terminated, and scale back up to the number I had. Then the docker commands work again. Killing and respawning without scaling to 0 does not work. I don't know whether it has something to do with the move to an RBAC-enabled cluster and whether I need to specify additional permissions in the deployment YAML.
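
The scale-to-zero workaround described above might be scripted along these lines. This is a sketch under assumptions, not the poster's actual procedure: `restart_agents` is a hypothetical helper, and the Deployment name, label selector, and replica count are supplied by the caller (for example `vsts-agent`, `app=vsts-agent`, and `2`).

```shell
# Scale a Deployment to zero, wait until its pods are gone, then scale back up.
# Arguments: deployment name, pod label selector, desired replica count.
restart_agents() {
  deploy="$1"; selector="$2"; replicas="$3"
  kubectl scale deployment "$deploy" --replicas=0 || return 1
  # Poll until no pod with the selector is listed any more.
  while [ -n "$(kubectl get pods -l "$selector" -o name)" ]; do
    sleep 2
  done
  kubectl scale deployment "$deploy" --replicas="$replicas"
}
```

Add `-n <namespace>` to the kubectl calls if the agents do not run in the current namespace.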

@dennisvanderpool

> Actually @Sawtaytoes, the command you are using to run the agent locally is giving an expected error because you didn't volume mount the docker socket into the container. See the "docker images" section of this page. @marc-mueller, I'm still trying to reproduce the issue in your case.

Thanks, this was it for me.

docker run -v /var/run/docker.sock:/var/run/docker.sock -e TFS_URL="https://*****.com/tfs" -e VSTS_TOKEN=mq2af6y****bq --rm -it microsoft/vsts-agent:ubuntu-16.04-tfs-2018-u2-docker-18.06.1-ce-standard
