Docker Image Pull fails #4934
Comments
@n-junge Any more diagnostic info you can share? That looks like it took 6 minutes and still timed out, but the log messages are a bit misleading. Please share your job spec file so we can try to reproduce this. |
I noticed that too; it always shows that all layers are pulled. I think Nomad times out while Docker is extracting the layers, but this shouldn't happen. The job file is very simple.
This is the server/client log (extracts):
|
? |
I have experienced this as well. The status shows the same "Pulled N/N ... 0 waiting/0 pulling" message, but it stays like that until it times out. |
Seeing a lot of this in 0.9.0-beta3 (Linux)
|
I see the same, especially on Windows Server 2016.
Rich Jones wrote on Wednesday, February 27, 2019:
Seeing a lot of this in 0.9.0-beta3
2019-02-27T17:34:40Z Driver Failure Failed to pull `myorg/myimage:v1.9.4-dev`: context canceled
2019-02-27T17:34:10Z Driver Docker image pull progress: Pulled 28/28 (9.108 GiB/9.108 GiB) layers: 0 waiting/0 pulling
2019-02-27T17:32:00Z Driver Docker image pull progress: Pulled 27/28 (6.908 GiB/9.108 GiB) layers: 0 waiting/1 pulling - est 36.5s remaining
2019-02-27T17:30:00Z Driver Downloading image
|
In 0.8.x and 0.9.0, if Nomad doesn't receive any updates from the Docker daemon for over two minutes, it will abort and reschedule the task. In general, this isn't a problem, because the Docker daemon is somewhat chatty. I tried to verify this with some very large Docker images on a machine with very low bandwidth, and things worked as expected (in my case, after about 20 minutes, the image finished downloading and the task ran). If any of you could provide the following, it would be helpful for reproducing this and fixing it:
I appreciate your help. |
Is that two minutes configurable? |
Also does that reschedule occur even if the jobspec forbids retrying? |
The two minutes is not configurable right now. It certainly could (and should) be made configurable, at a global level if not at the job level. This error (timeout during pull) is marked as a recoverable error and is subject to the configured restart and reschedule policy. |
This issue will be auto-closed because there hasn't been any activity for a few months. Feel free to open a new one if you still experience this problem 👍 |
@cgbaker Do you know if this is configurable now? |
I just came across this, hopefully this fixes the issue: https://www.nomadproject.io/docs/drivers/docker#pull_activity_timeout |
Nomad version
Nomad v0.8.6 ('ab54ebcfcde062e9482558b7c052702d4cb8aa1b'+CHANGES)
Operating system and Environment details
Windows 10 Enterprise 64-bit
Issue
Nomad Task fails trying to pull Docker image. Some images can be pulled, others cannot.
Example with 2 restarts:
There is no problem pulling the images manually.