How to debug "Failed to find docker auth for repo"? #7357

Closed
davidr912 opened this issue Mar 16, 2020 · 8 comments

@davidr912
Contributor

Nomad version

Nomad v0.10.4 (f750636ca68e17dcd2445c1ab9c5a34f9ac69345)

Operating system and Environment details

Linux, Ubuntu 18.04

Issue

I'm receiving "Failed to find docker auth for repo" while using:

  • docker-credential-ecr-login
  • A file at /etc/docker-auth.json with credHelpers entries in the format <repo>: ecr-login
  • The following block in client.hcl:
plugin "docker" {
  config {
    auth {
      config = "/etc/docker-auth.json"
    }
  }
}

What I'm mostly looking for is how to debug this. Even at DEBUG log level the server and client do not log anything about what command they are actually running, or the user they run it as.

Is there some way I can get this detail?

(I suspect this detail will help me because executing echo <repo> | docker-credential-ecr-login get as a regular user successfully authenticates.)
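
One way to narrow this down is to rerun the helper with a stripped-down environment, since a daemon started by systemd does not inherit a login shell's variables or dotfiles. The sketch below only approximates that situation and is not a trace of what Nomad does. `run_helper_clean` is a hypothetical function name, and the default PATH and HOME values are assumptions that should be replaced with whatever the service actually uses.

```shell
#!/bin/sh
# Hypothetical helper: run docker-credential-<suffix> with only PATH and HOME
# set, to approximate a process that was not started from a login shell.
# Usage: run_helper_clean <helper-suffix> <registry-host> [PATH] [HOME]
run_helper_clean() {
  helper="docker-credential-$1"
  registry="$2"
  # Assumed defaults; substitute the values your service really runs with.
  clean_path="${3:-/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin}"
  clean_home="${4:-/root}"
  # env -i drops every inherited variable (including any AWS_* ones), so the
  # helper sees nothing except the PATH and HOME given here.
  printf '%s\n' "$registry" |
    env -i PATH="$clean_path" HOME="$clean_home" "$helper" get
}
```

If `run_helper_clean ecr-login XXXXX.eu-west-2.amazonaws.com` fails while the plain pipeline succeeds, the difference most likely lies in PATH, HOME, or the AWS_* variables available to the process.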

@davidr912
Contributor Author

Just for additional context - I'm using a systemd service, and what I'm finding particularly difficult to understand is that if I change my nomad service to do this:

ExecStart=/bin/bash -c "echo 'XXXXX.eu-west-2.amazonaws.com' | docker-credential-ecr-login get"

the output is a valid response:

{"ServerURL":"XXXXX.eu-west-2.amazonaws.com","Username":"AWS","Secret":"<SECRET>"}

How can I see the exact format of what Nomad is actually doing / what environment the job is actually seeing?
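
On Linux, the environment a process was started with can be read from /proc/<pid>/environ, which gives a direct view of what the Nomad agent itself received. The following is a minimal sketch: `show_service_env` is a hypothetical name, the choice to show only PATH, HOME and AWS_* is an assumption about which variables matter here, and reading another user's process normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: print PATH, HOME and AWS_* from the environment a
# running process was started with. AWS_* values are masked so that secrets
# do not end up in a terminal scrollback or a pasted bug report.
# Usage: show_service_env <pid>
show_service_env() {
  pid="$1"
  # /proc/<pid>/environ is NUL-separated; turn it into one variable per line.
  tr '\0' '\n' < "/proc/$pid/environ" |
    grep -E '^(PATH|HOME|AWS_[A-Z0-9_]*)=' |
    sed -E 's/^(AWS_[A-Z0-9_]*)=.*/\1=<set>/'
}
```

Assuming the unit is named nomad, the agent's PID can be obtained with `systemctl show -p MainPID --value nomad` and passed to the function. Tasks launched by the Docker driver run in containers with their own environment, so this shows what the agent sees, not what the job sees.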

@davidr912
Contributor Author

davidr912 commented Mar 18, 2020

My 'fix' for this has been to give /root/ a .aws folder with credentials (and to set $HOME in the systemd service config), but this should not really be necessary if I can specify the AWS environment variables.

Also seems notable that most of the other issues similar to this have just been ignored and closed?

@tgross
Member

tgross commented Mar 18, 2020

Hi @davidr912!

Even at DEBUG log level the server and client do not log anything about what command they are actually running, or the user they run it as
...
How can I see the exact format of what Nomad is actually doing / what environment the job is actually seeing?

Nomad doesn't run Docker commands like a shell; it uses the Docker API. You're seeing the logs that bubble up from that API in the docker task driver. That said, some of the auth helpers do call out to an external command (e.g. authFromHelper).

The auth process goes through some task-specific configuration, then the Docker config, then the helper config (see driver.go#L578). It would probably help if you could provide the /etc/docker-auth.json file and the full error message you're receiving, rather than just the leading part.

More places to debug:

  • Is docker-credential-ecr-login somewhere on Nomad's $PATH?
  • Is dockerd reporting any error messages?
  • Is your AWS audit logging reporting any error messages?
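
The first check in the list above can be sketched as a lookup against an explicit PATH string, which matters because the PATH of an interactive shell may differ from the one the Nomad service runs with. `helper_on_path` is a hypothetical name, and the PATH value passed in is whatever you have established the service uses.

```shell
#!/bin/sh
# Hypothetical helper: report whether docker-credential-<suffix> is an
# executable file in any directory of the given PATH string. The caller's own
# PATH is deliberately ignored. The body runs in a subshell so that the IFS
# and globbing changes do not leak into the calling shell.
# Usage: helper_on_path <helper-suffix> <PATH-string>
helper_on_path() (
  helper="docker-credential-$1"
  set -f      # no filename expansion while splitting the PATH string
  IFS=:
  for dir in $2; do
    if [ -f "$dir/$helper" ] && [ -x "$dir/$helper" ]; then
      printf '%s\n' "$dir/$helper"
      exit 0
    fi
  done
  exit 1
)
```

For example, `helper_on_path ecr-login "/usr/sbin:/usr/bin:/sbin:/bin"` prints the full path of the helper if it is found and returns a non-zero status otherwise.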

Also seems notable that most of the other issues similar to this have just been ignored and closed?

Oh? Sorry about that, can you link the ones you found so I can close them up?

@pySilver

pySilver commented Apr 8, 2020

@davidr912 it's better to expose environment variables in your systemd service for nomad
see: #3526 (comment)
see: #3526 (comment)

@stale

stale bot commented Jul 8, 2020

Hey there

Since this issue hasn't had any activity in a while - we're going to automatically close it in 30 days. If you're still seeing this issue with the latest version of Nomad, please respond here and we'll keep this open and take another look at this.

Thanks!

@reedlaw

reedlaw commented Jan 12, 2021

I am having the same trouble with Nomad v1.0.1 (c9c68aa).

Failed to find docker auth for repo "XXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/image_name": 
docker-credential-ecr-login with input "XXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/image_name" 
failed with stderr: exit status 1

From that message it seems the entire image path is being passed in as the repo. The correct way to call docker-credential-ecr-login should be:

echo "XXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com" | docker-credential-ecr-login get
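
To test that suspicion by hand, the registry host can be split off the image reference and each form fed to the helper in turn. This is a minimal sketch, assuming the reference always begins with a registry host, as ECR references do. `registry_host` is a hypothetical name, and Docker Hub short names such as `redis:7` are not handled.

```shell
#!/bin/sh
# Hypothetical helper: strip everything from the first "/" onward so that only
# the registry host remains. Assumes the reference starts with a registry host.
# Usage: registry_host <image-reference>
registry_host() {
  printf '%s\n' "${1%%/*}"
}
```

Running `registry_host "$image" | docker-credential-ecr-login get` and `echo "$image" | docker-credential-ecr-login get` side by side shows whether the helper accepts both forms or only the bare host.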

The /image_name should not be appended. I have tried every configuration I could think of in the client.hcl and job.nomad files. Everything works fine from the command line (I can do docker pull image). Current job.nomad:

job "fs-example" {
  datacenters = ["dc1"]

  task "fs-example" {
    driver = "docker"

    config {
      image = "XXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/image_name"
    }

    resources {
      cpu    = 500
      memory = 512
    }
  }
}

client.hcl:

client {
  enabled = true
  options = {
    "docker.auth.config" = "/etc/docker/config.json"
    "docker.auth.helper" = "ecr-login"
  }
  servers = ["127.0.0.1:4646"]
}


plugin "docker" {
  config {
    auth {
      config = "/etc/docker/config.json"
      helper = "ecr-login"
    }
  }
}

/etc/docker/config.json:

{
  "credHelpers": {
      "XXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
  }
}

@reedlaw

reedlaw commented Jan 12, 2021

I resolved the issue by removing lines from the systemd service file for nomad. I was attempting to set the AWS variables inside nomad.service like so:

Environment=AWS_ACCESS_KEY_ID="XXXXXXXXXXXXXX"
Environment=AWS_SECRET_ACCESS_KEY="XXXXXXXXXXXXXXXXXXX"

It turns out that is not the correct way to set env vars. The quoting is off as explained in this answer. Setting those in /etc/environment is sufficient.
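
For anyone who prefers to keep the variables in the unit itself, the systemd.exec documentation shows Environment= with the whole NAME=value pair inside the double quotes. The sketch below prints a drop-in in that form. `print_env_dropin` is a hypothetical name, the drop-in path in the usage note is an assumption, and values containing double quotes, backslashes or % would need escaping that this sketch does not attempt.

```shell
#!/bin/sh
# Hypothetical helper: print a systemd drop-in that sets each NAME=VALUE
# argument, quoting the entire assignment as in the systemd.exec examples.
# No escaping is done for quotes, backslashes or % inside the values.
# Usage: print_env_dropin NAME=VALUE [NAME=VALUE ...]
print_env_dropin() {
  printf '[Service]\n'
  for assignment in "$@"; do
    printf 'Environment="%s"\n' "$assignment"
  done
}
```

Assuming the unit is named nomad, the output could be written to a file such as /etc/systemd/system/nomad.service.d/aws.conf, followed by `systemctl daemon-reload` and a restart of the service. Unit files are often world-readable, so a root-only EnvironmentFile= may be a safer home for secrets.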

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2022