Nomad auth for private ECR repo not working #3526

Closed
tarpanpathak opened this issue Nov 9, 2017 · 26 comments
tarpanpathak commented Nov 9, 2017

Nomad version

0.5.6 and 0.7.0

Operating system and Environment details

CentOS 7.3 and CentOS 7.4 (on-premise datacenter)

Issue

Nomad is not picking up the docker-credential-ecr-login credentials. I've followed this documentation: https://www.nomadproject.io/docs/drivers/docker.html#authentication and it is NOT working.

Reproduction steps

  • Create a docker-cfg file with the following contents:
{
    "credHelpers": {
        "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
    },
    "credsStore": "ecr-login"
}
  • Configure the Nomad client to use a helper:
client {
  enabled   = true
  options   = {
    "docker.auth.config"     = "/root/.docker/config.json"
    "docker.auth.helper"     = "ecr-login"
  }
}
  • Use the job file below.

Nomad Server logs (if appropriate)

N/A

Nomad Client logs (if appropriate)

Driver Failure   failed to initialize task "test-api" for alloc "c828eb7b-e396-d947-0d60-f6cf2064120d": Failed to find docker auth for repo "<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api": docker-credential-ecr-login with input "https://<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api" failed with stderr: 2017-11-09T06:22:10Z [ERROR] Error retrieving credentials: NoCredentialProviders: no valid providers in chain. Deprecated.
	For verbose messaging see aws.Config.CredentialsChainVerboseErrors
credentials not found in native keychain 

Job file (if appropriate)

job "test-api" {
  region = "us-west-2"
  datacenters = ["us-west-2"]
  type = "service"

  constraint {
      attribute = "${node.class}"
      value = "test-worker"
  }

  update {
   stagger      = "15s"
   max_parallel = 1
  }

  group "web" {
    # Specify the number of these tasks we want.
    count = 1

    # Create an individual task (unit of work). This particular
    # task utilizes a Docker container to front a web application.
    task "test-api" {
      # Specify the driver to be "docker". Nomad supports
      # multiple drivers.
      driver = "docker"
      # Configuration is specific to each driver.
      config {
        image = "<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api"

        port_map {
            http = 8080
        }
        labels {
            service = "${NOMAD_JOB_NAME}"
        }

        logging {
          type = "syslog"
          config {
            tag = "test-api"
          }
        }
      }

      # The service block tells Nomad how to register this service
      # with Consul for service discovery and monitoring.
      service {
        name = "${JOB}"
        # This tells Consul to monitor the service on the port
        # labeled "http".
        port = "http"

        check {
          type     = "http"
          path     = "/v1/status"
          interval = "20s"
          timeout  = "2s"
        }
      }
      # Specify the maximum resources required to run the job,
      # including CPU, memory, and bandwidth.
      resources {
        cpu    = 300 # MHz
        memory = 2048 # MB

        network {
          mbits = 1
          port "http" {}
        }
      }
    }
  }
}

Note: I've tried this with Nomad version 0.5.6 as well and am receiving the following error:

Driver Failure  failed to initialize task "test-api" for alloc "ed86bb06-29ba-a31b-a006-e8633b075d90": Failed to pull `<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api`: unauthorized: authentication required. 

Am I doing something wrong?

@tarpanpathak tarpanpathak changed the title Nomad auth for ECR repo not working Nomad auth for private ECR repo not working Nov 9, 2017
@schmichael
Member

Is docker-credential-ecr on Nomad's $PATH? Does using "docker.auth.helper" = "ecr" as in the docs work?

@tarpanpathak
Author

Nomad and docker-credential-ecr-login are in /usr/local/bin/.
Tried "docker.auth.helper" = "ecr" and "docker.auth.helper" = "ecr-login", but neither works; both throw the following error:

11/09/17 21:26:27 EST  Driver Failure   failed to initialize task "test-api" for alloc "a22ad474-0a09-24a8-1415-5287d123ddac": Failed to find docker auth for repo "<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api": docker-credential-ecr-login with input "https://<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api" failed with stderr: 2017-11-10T02:26:27Z [ERROR] Error retrieving credentials: NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors
credentials not found in native keychain

FYI, docker pull <XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api works as expected.
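For what it's worth, the helper can be exercised directly, the same way Nomad invokes it — the error output above shows Nomad feeding the registry URL to the helper's get command on stdin. A working setup prints a JSON blob with a ServerURL, Username, and Secret:

```shell
# Invoke the credential helper exactly as Nomad does; success prints
# a JSON credential blob, failure prints the same error Nomad reports.
echo "https://<XYZ>.dkr.ecr.us-west-2.amazonaws.com" | docker-credential-ecr-login get
```

Running this as the same user Nomad runs as (and with the same environment) can show whether the problem is the helper's AWS credential lookup or Nomad's invocation of it.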

@dadgar
Contributor

dadgar commented Nov 10, 2017

@ptarpan Can you try:

{
    "credHelpers": {
        "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
    }
}

And

client {
  enabled   = true
  options   = {
    "docker.auth.config"     = "/root/.docker/config.json"
  }
}

@tarpanpathak
Author

Hey @dadgar,

Same issue:

Time                   Type            Description
11/10/17 15:48:22 EST  Restarting      Task restarting in 18.630682115s
11/10/17 15:48:22 EST  Driver Failure  failed to initialize task "test-api" for alloc "51f5de15-7892-e04a-058f-c01b27ff057e": Failed to pull `<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api`: unauthorized: authentication required
11/10/17 15:48:21 EST  Driver          Downloading image <XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api:latest
11/10/17 15:48:21 EST  Task Setup      Building Task Directory
11/10/17 15:48:21 EST  Received        Task received by client

@jrasell
Member

jrasell commented Nov 14, 2017

If it's any help: I use ECR repos by running a cron script, which works as expected with the following configuration:

Cron:

#!/bin/bash
eval $(/usr/local/bin/aws ecr get-login --region us-east-1)

Nomad Client Config (portion):

"options": { "docker.auth.config": "/root/.docker/config.json" }

IAM Role Permissions:

"ecr:GetAuthorizationToken",
"ecr:DescribeRepositories",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:GetRepositoryPolicy",
"ecr:ListImages",
"ecr:DescribeImages",
"ecr:BatchGetImage"
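For completeness, a hypothetical cron entry for the refresh script above might look like this (the schedule, file name, and path are assumptions, not part of jrasell's setup; ECR authorization tokens expire after 12 hours, so any refresh interval shorter than that works):

```shell
# /etc/cron.d/ecr-login (hypothetical): re-run the login script
# every 6 hours so the Docker credentials never go stale.
0 */6 * * * root /usr/local/bin/ecr-refresh.sh
```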

@tarpanpathak
Author

Thx @jrasell. We use cron with instance roles in AWS, but this is an issue for agents running in a datacenter. One quick question before I continue testing this on-prem: does the eval live in a separate script that cron calls, or does the crontab invoke the AWS CLI directly?

@thequailman

I can confirm this is happening as well. Adding the AWS keys as environment variables in the service unit fixed the issue for me. It seems Nomad isn't reading AWS keys from IAM roles or the ~/.aws/credentials file.

My config files:

/etc/systemd/system/nomad.service

[Unit]
Description=Nomad
Documentation=https://nomadproject.io/docs/

[Service]
Environment=AWS_ACCESS_KEY_ID=<secret>
Environment=AWS_SECRET_ACCESS_KEY=<secret>
Environment=AWS_DEFAULT_REGION=us-east-1
ExecStart=/usr/bin/nomad agent -config /etc/nomad.d
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

/etc/nomad.d/config.json

{
  "client": {
    "enabled": true,
    "options": {
      "docker.auth.config": "/root/.docker/config.json"
    }
  ...
}

/root/.docker/config.json

{
  "credHelpers": {
    "my.ecr.URL.amazonaws.com": "ecr-login"
  }
}

@MatthiasScholz

MatthiasScholz commented Mar 21, 2018

I got this working now on Amazon Linux and nomad 0.7.1:

  1. Put docker-credential-ecr-login in /usr/bin - ensure everyone can execute it!
  2. Put the Docker configuration in /etc/docker/config.json - ensure everyone has read access to the file!
  3. Reference the config.json in the Nomad configuration (default.hcl).
  4. Ensure the instance has read access to ECR via an IAM policy.

Don't be misled by the nomad user account on the instance or the Linux init system running Nomad:

  • Putting the Docker config in the /root/ folder causes permission problems.
  • The file ~/.aws/credentials is not used by the init system.

Docker Config - config.json:

{
  "credHelpers": {
    "my.ecr.URL.amazonaws.com": "ecr-login"
  }
}

Nomad Config - default.hcl:

...
client {
  enabled = true

  options   = {
    "docker.auth.config"     = "/etc/docker/config.json"
    "docker.auth.helper"     = "ecr-login"
  }
}
...
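As a sanity check, the permission requirements in steps 1 and 2 above can be sketched as shell commands. This is a self-contained sketch: DEST and the stub files are stand-ins for /usr/bin, /etc/docker, and the real helper binary, so it can run anywhere.

```shell
set -e
DEST="$(mktemp -d)"   # stand-in for /usr/bin and /etc/docker

# Step 1: the helper binary must be executable by everyone,
# because Nomad may not be running as root.
printf '#!/bin/sh\necho stub\n' > "$DEST/docker-credential-ecr-login"
chmod 0755 "$DEST/docker-credential-ecr-login"

# Step 2: the Docker auth config must be world-readable.
printf '{"credHelpers":{"my.ecr.URL.amazonaws.com":"ecr-login"}}\n' > "$DEST/config.json"
chmod 0644 "$DEST/config.json"

# Step 3: default.hcl then points docker.auth.config at this file.
echo "docker.auth.config -> $DEST/config.json"
```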

@CumpsD

CumpsD commented Apr 16, 2018

Fixed this for us using @MatthiasScholz's remarks, thanks!

@mindnuts

@MatthiasScholz Sorry to resurrect this old thread. May I ask, were you able to pull images from mixed registries? What we are seeing is that, using your config, we can pull ECR images, but pulls of Docker Hub images that do not require auth fail because Nomad tries to run them through the ECR helper:

05/29/18 13:03:12 UTC  Driver Failure  failed to initialize task "mysql" for alloc "73668dda-14c3-a8ec-d421-6ac9de16c42d": Failed to find docker auth for repo "registry.hub.docker.com/library/mysql": docker-credential-ecr-login with input "https://registry.hub.docker.com/library/mysql" failed with stderr: 2018-05-29T13:03:12Z [ERROR] Error parsing the serverURL: https://registry.hub.docker.com/library/mysql, error: docker-credential-ecr-login can only be used with Amazon Elastic Container Registry.

Is there a way to mix ECR private images and Public images?

@MatthiasScholz

MatthiasScholz commented Jun 2, 2018

It is a valid question. We only use ECR, since we want a bit more control over the images used.

If I read the documentation of the AWS ECR helper correctly:

With Docker 1.13.0 or greater, you can configure Docker to use different credential helpers for different registries. To use this credential helper for a specific ECR registry, create a credHelpers section with the URI of your ECR registry

then it would mean mixed image pulling should be supported.

Did you try playing around with the configuration a bit? For example:

{
  "credHelpers": {
    "registry.example.com": "registryhelper",
    "awesomereg.example.org": "hip-star",
    "unicorn.example.io": "vcbait"
  }
}

mentioned in the Docker documentation?

I am not 100% sure it will work out, since there is still the default.hcl configuration mentioned above. I have not tested it yet.
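Concretely, for the ECR-plus-Docker-Hub case in this thread, the sketch would be to scope the helper to the ECR host only, and drop the docker.auth.helper client option, so pulls from non-ECR registries fall through to anonymous access. Untested on my side:

```json
{
  "credHelpers": {
    "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
  }
}
```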

@mindnuts

mindnuts commented Jun 4, 2018

@MatthiasScholz

I solved this temporarily:

config {
        image = "registry.hub.docker.com/library/mysql:5.7.19"
        auth {
           username = "xxxxxxxx"
           password = "yyyyyyy"
        }
        force_pull = true

        volumes = [
          "/opt/mysql:/var/lib/mysql"
        ]
        port_map {
          mysql_port = 3306
        }
 }

It is not ideal but works for now.

@nickethier
Member

This has been fixed in #4266 and released in 0.8.4-rc1. Nomad now correctly uses the AWS ECR helper as configured in your docker configuration.
Apologies for missing this issue in the list of ones this PR fixed.

@momania

momania commented Jun 13, 2018

@nickethier Is there also a summary of how to configure it properly now? I've updated to 0.8.4 and still can't figure it out.
I've got my /root/.docker/config.json as:

{
  "credHelpers": {
    "my.ecr.URL.amazonaws.com": "ecr-login"
  }
}

I also have AWS credentials set up. When I do a manual pull it all works fine, so why doesn't Nomad just pick up these settings? Do I really also need to set the Nomad client properties? And what should they be? (Why are they needed at all, I wonder, but OK.)

Any directions or a working sample would be welcome, as this is all confusing as hell..

@MatthiasScholz

Do not put the setup in the /root/ directory - only the root user has access to it then.
Nomad is not running as root.

Please check my comment above for the correct folders to use for the configuration.

@momania

momania commented Jun 14, 2018

Ok got it in the end.
So I moved the docker config into /etc/docker/ and then added
"docker.auth.config" = "/etc/docker/config.json" to my nomad config.

"docker.auth.helper" = "ecr-login" is not needed as that would probably make all docker pulls use ecr-login.

The missing link in the end was that the init system doesn't use the aws credentials file so you need to add the credential environment variables.

Finally! Thanks for the support 👍

@nattvasan

@ptarpan Can you try:

{
    "credHelpers": {
        "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
    }
}

And

client {
  enabled   = true
  options   = {
    "docker.auth.config"     = "/root/.docker/config.json"
  }
}

Does the client section go in my Nomad client configuration or in the Nomad server configuration stanza? Please clarify.

@nattvasan

Ok got it in the end.
So I moved the docker config into /etc/docker/ and then added
"docker.auth.config" = "/etc/docker/config.json" to my nomad config.

"docker.auth.helper" = "ecr-login" is not needed as that would probably make all docker pulls use ecr-login.

The missing link in the end was that the init system doesn't use the aws credentials file so you need to add the credential environment variables.

Finally! Thanks for the support 👍

Sorry for quoting it. Did you move Docker's config.json on the client machine and update the client.hcl file on the client machine? Please clarify.

@momania

momania commented Oct 12, 2018

@nattvasan
Yes, it's all configured in the Nomad client part of the config. (The server doesn't execute jobs 😉 )

I ended up putting my AWS credentials in an environment file for my systemd setup for Nomad.
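For reference, such an environment-file arrangement might look like this (the file paths here are assumptions):

```ini
# /etc/systemd/system/nomad.service.d/aws.conf (hypothetical drop-in)
[Service]
EnvironmentFile=/etc/nomad.d/nomad.env
```

with /etc/nomad.d/nomad.env holding the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION lines (as in the unit file earlier in this thread), followed by a systemctl daemon-reload and a restart of Nomad. Keeping the secrets in a root-readable file rather than in the unit itself makes them easier to rotate.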

@nattvasan

@momania Yes, I figured that out lately! I'm new to Nomad. Found the issue and it's resolved!

I ended up putting my AWS credentials in an environment file for my systemd setup for Nomad.

Yes, I did the same !

@rodriguezsergio

#3526 (comment) is definitely the way to go.

Thanks.

@ronaldegmar

ronaldegmar commented Jun 24, 2019

I can't make this work for the case where we are assuming a role with docker-credential-ecr-login, as documented here: https://github.com/awslabs/amazon-ecr-credential-helper/issues/34.

"If you are working with an assumed role please set the environment variable: AWS_SDK_LOAD_CONFIG=true also."

I don't know where to put the AWS_SDK_LOAD_CONFIG and AWS_PROFILE env variables. I tried everything:

  1. Setting the env variables in docker systemd.
  2. Setting the env variables in nomad systemd.
  3. Setting the env variables in /etc/profile.d/. This only works on the command line, not with Nomad.
  4. Setting the env variables in the nomad job specification and using them as args in the docker config/args stanza.

@njones

njones commented Jan 29, 2021

From above:

I can confirm this is happening as well. Adding the AWS keys as environment variables in the service unit fixed the issue for me. It seems Nomad isn't reading AWS keys from IAM roles or the ~/.aws/credentials file.

My config files:

/etc/systemd/system/nomad.service

[Unit]
Description=Nomad
Documentation=https://nomadproject.io/docs/

[Service]
Environment=AWS_ACCESS_KEY_ID=<secret>
Environment=AWS_SECRET_ACCESS_KEY=<secret>
Environment=AWS_DEFAULT_REGION=us-east-1
ExecStart=/usr/bin/nomad agent -config /etc/nomad.d
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

The above works, but is not ideal.

What seems to be happening is that Nomad runs in a subshell that doesn't bring in the env vars from the external shell. So if you have your ~/.aws/credentials set up, you can pass in $HOME to point to where ~ is. This way you don't need to pass your keys along each time. Your service file would look something like this (if it's running as root, as expected):

/etc/systemd/system/nomad.service

[Unit]
Description=Nomad
Documentation=https://nomadproject.io/docs/
 
[Service]
Environment=HOME=/root
ExecStart=/bin/sh -c '/usr/bin/nomad agent -config /etc/nomad.d'
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

It seems like this has been suggested before: #7357 (comment)

I missed it without the example.

Note that the current documentation suggests running the client agent as root

https://learn.hashicorp.com/tutorials/nomad/get-started-run?in=nomad/get-started

Note: Typically any agent in client mode must be started with root level privilege. Nomad makes use of 
operating system primitives for resource isolation which require elevated permissions. The agent will 
function as non-root, but certain task drivers will not be available.

@laukaichung

laukaichung commented Mar 5, 2021

After reading the suggestions, I have finally made it work. I am using Nomad on localhost.

I run the client agent as root, but my credentials are stored in my user directory ~/.aws. Since the agent runs as root, I have to create a soft link at /root/.aws pointing to ~/.aws. Also, my credentials have different profiles, so when starting the Nomad agent I need to set the AWS_PROFILE environment variable to the profile authorized to access ECR.

my-config.hcl

plugin "docker" {
  config {
    auth {
      config = "/docker.json"
    }
  }
}

docker.json

{
  "credHelpers": {
    "xxxxx.amazonaws.com": "ecr-login"
  },
  "auths": {
    "https://index.docker.io/v1/": {}
  }
}

The credHelpers entry handles the ECR images; the auths entry handles images from the public Docker registry. (Note that JSON does not allow comments, so these notes cannot live in the file itself.)

My ~/.aws/credentials looks like this:

[default]
aws_access_key_id = xxx
aws_secret_access_key = xxx
[ecr_profile]
aws_access_key_id = xxx
aws_secret_access_key = xxx

And the commands I run:

sudo ln -s /home/user/.aws /root/.aws
sudo VAULT_TOKEN=xxx AWS_PROFILE=ecr_profile nomad agent -dev -log-level INFO -config=my-config.hcl

@danielberndt

Thanks so much @njones!

I was seeing really odd error messages like MissingRegion: could not find region configuration when loading an artifact from s3 like this:

artifact {
  options {
    aws_profile = "my-profile"
  }
  source = "bucket.s3.amazonaws.com/nomad/my-file"
}

Adding Environment=HOME=/root to my /etc/systemd/system/nomad.service indeed solved the problem for me 👍

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 18, 2022