Hey, with Nomad 0.8.4 or greater the allocation will get rescheduled and the deployment will continue until the progress deadline is hit. I suggest you update to 0.8.6!
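For anyone landing here, the behavior the comment describes is driven by the job's `update` and `reschedule` stanzas. A minimal sketch of what that might look like on Nomad 0.8.4+ (the job name is taken from the issue; the durations are illustrative, not recommendations):

```hcl
job "your_fancy_service" {
  update {
    max_parallel      = 1
    healthy_deadline  = "5m"
    # If no allocation makes progress within this window,
    # the deployment is marked failed instead of hanging.
    progress_deadline = "10m"
  }

  reschedule {
    # Retry failed allocations on other nodes with backoff.
    delay          = "30s"
    delay_function = "exponential"
    max_delay      = "10m"
    unlimited      = true
  }

  # ... task groups elided ...
}
```

With rescheduling enabled, a client-side failure like "failed to build task dirs" should cause the allocation to be placed elsewhere rather than wedging the deployment.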
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Nomad version
Nomad v0.8.1 (46aa11b)
Operating system and Environment details
Ubuntu 14.04.5 LTS
Issue
When a deployment is in the running state, an allocation that fails due to an error inherent to the client leaves the deployment stuck; not even draining the failed client makes the deployment go on. Manually failing the deployment and triggering it again is required.

In my specific case, this happens because I have some nodes with networking issues that end up with read-only filesystems, which essentially produces this result:

Client Status = failed
Client Description = failed to build task dirs for 'your_fancy_service'

I haven't found anything useful in the logs on the matter.

Does anyone have a clue/tip about what happens here? Could Nomad implement (if it does not already, in some form or shape) some logic to handle these issues more gracefully?
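The manual workaround mentioned above can be done with Nomad's deployment subcommands. A sketch, assuming the stuck deployment's ID has been looked up first (the job file name is hypothetical):

```shell
# Find the stuck deployment for the job.
nomad deployment list

# Mark it as failed so Nomad stops waiting on the dead client.
nomad deployment fail <deployment-id>

# Resubmit the job to trigger a fresh deployment.
nomad job run your_fancy_service.nomad
```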