System job with constraints fails to plan #12748
Comments
I'm facing the same problem (1.2.6):

```
Job: "stage-cron"
Scheduler dry-run:
```

But if I stop the job before submitting a new one, it works as expected:

```
$ nomad job stop stage-cron
$ nomad job plan
...
+/- Job: "stage-cron"
Scheduler dry-run:
```
I have found a temporary workaround. You need to add a 1.1.x server to the cluster and stop-start the 1.2.6 leaders until the 1.1.x server becomes the leader.
Hi @chilloutman! This definitely seems like it could be related to #12016. I'm not going to mark it as a duplicate just in case it's not, but I'll cross-reference here so that whoever tackles that issue will see this as well. I don't have a good workaround for you other than to ignore the warnings (they're warnings and not errors), but I realize that isn't ideal. Just FYI @cr0c0dylus:
This is effectively downgrading Nomad into a mixed-version cluster, which is not supported and highly likely to result in state store corruption. Doing so in order to suppress something that's only a warning is not advised.
Unfortunately, it is not only a warning. It cannot allocate the job at all. Another trick is to change one of the limits in the resources stanza, for example adding +1 to the CPU limit (see the sketch below). But it doesn't work with some of my jobs.
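A minimal sketch of that trick, assuming a task that originally requested 500 MHz of CPU (the values here are illustrative, not from the original comment):

```hcl
# Hypothetical resources stanza. The original job is assumed to request
# cpu = 500; bumping it by 1 makes the submitted job differ from the
# running one, which was reported to let the system job place again.
resources {
  cpu    = 501 # was 500
  memory = 256
}
```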
I wonder if this is related: #11778 (comment). It really looks like some bug in the scheduler that incorrectly fails placement during the node feasibility check. It is almost as if it's not iterating through all nodes, but for some reason returns a placement failure before it has exhausted the full list.
I am also facing this issue and I had to downgrade Nomad.
I'm wondering if this could be the cause: https://github.com/hashicorp/nomad/pull/11111/files#diff-c4e3135b7aa83ba07d59d003a8ab006915207425b8728c4cf070eee20ab9157a. The comment "// track node filtering, to only report an error if all nodes have been filtered" might not be working as intended. Or maybe, instead of only warnings, #11111 ended up causing errors.
Verified we hit this with constraints on 1.2.6 as well. The mitigation was reverting to 1.1.5. I do not know how bugs are prioritized, but this should probably be pretty high.
BTW, it would be great if those warnings could be completely disabled in the config. If I have 50 nodes in the cluster and a constraint that matches 3 of them, what is the sense of seeing "47 Not Scheduled"? System jobs are very useful for scaling in an HA configuration: I don't need to modify the job stanza, just add or remove nodes with a special meta variable (see the sketch below).
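A minimal sketch of that scaling pattern, assuming a hypothetical `role` node meta key and an `edge-proxy` job (none of these names come from the original comment): the system job is constrained on the meta variable, so capacity changes by adding or removing nodes that carry it rather than by editing the job.

```hcl
# Hypothetical system job pinned to nodes whose client config sets
# meta { role = "edge" }. Adding or removing such nodes scales the
# job without touching the job spec.
job "edge-proxy" {
  type = "system"

  constraint {
    attribute = "${meta.role}"
    value     = "edge"
  }

  group "proxy" {
    task "proxy" {
      driver = "docker"

      config {
        image = "nginx:alpine"
      }
    }
  }
}
```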
It's the cause indeed. Reverting that pull request fixed the issue for me on 1.3.1.
Nomad v1.2.9 (86192e4): the problem persists. I still need to stop the 1.2.9 masters in sequence until 1.0.18 becomes the leader and allows deployment.
There may be a fix in 1.3.2, at least it looks that way: https://github.com/hashicorp/nomad/blob/v1.3.2/scheduler/scheduler_system.go#L298
The issue still exists in v1.5.3; we frequently run into this when upgrading system jobs. While the Nomad CLI reports this error, the rollout still actually happens in Nomad.
I am seeing the same behavior as @seanamos in v1.6.3.
The problem continues to occur in v1.7.3.
Can confirm this is still present in Nomad v1.7.7.
I'm seeing this with Nomad 1.8.3, but in my case it's failing with more than just a warning.

I have 12 nodes running:

```
$ nomad node status -quiet
b15f6629-da08-0f17-8058-0a3032a769e1
31090485-dbe1-4b72-00bb-0e1282d82210
dc5177d0-7e07-28c2-8ddc-584be7c66c75
22a71c04-f531-7680-0019-b0e51bf83ba1
be37c669-d199-a716-2866-e4642aec3665
dcf03550-47ef-fe32-cbe1-67b711744608
d0e7cbfc-b934-ba23-54ff-cf38531c355a
3a7e8085-44bc-a150-6a1d-0040353a8528
2e01087a-f3a6-f86d-fd36-a18d86b92da2
8d8ba3e3-c180-9f1e-b2a2-08d42aad4e4d
ffbbb77f-744a-26f2-6a21-bf1d19316865
2f0b6be8-6564-6bbe-d85c-2457e532243f
```

I have a single system job with the following constraint (and no others):

```hcl
constraint {
  attribute = "${node.class}"
  value     = "private-t38"
}
```

Which matches node ffbbb77f:

```
$ nomad node status ffbbb77f
ID = ffbbb77f-744a-26f2-6a21-bf1d19316865
Name = prod-ap-northeast-1-private-t38-i-SOME_INSTANCE_ID
Node Pool = default
Class = private-t38
...
```

When I try to place the job, it fails:

```
$ nomad job run job.nomad
==> 2024-09-16T19:04:26-04:00: Monitoring evaluation "6c4f7100"
2024-09-16T19:04:26-04:00: Evaluation triggered by job "metadataproxy"
2024-09-16T19:04:28-04:00: Evaluation status changed: "pending" -> "complete"
==> 2024-09-16T19:04:28-04:00: Evaluation "6c4f7100" finished with status "complete" but failed to place all allocations:
2024-09-16T19:04:28-04:00: Task Group "app" (failed to place 1 allocation):
* Class "private-nomad": 3 nodes excluded by filter
* Class "private-common": 1 nodes excluded by filter
* Class "public-dt316": 6 nodes excluded by filter
* Constraint "${node.class} = private-t38": 11 nodes excluded by filter
```

The job itself is already running on the 1 node that matches that constraint. Once I stop the job, it can get placed.

One potentially interesting detail is that the job itself hasn't changed between invocations. If I change the job contents somehow, it'll place properly. This only happens when re-applying a job exactly as it already exists. My guess is that it detects that the job hasn't changed, and therefore marks that node as a conflict for some reason, as opposed to "already placed" (dunno if there is a word for that). Not sure if this is exactly the above issue, but happy to dive in further if folks think it's related :)
Nomad version
v1.2.6
(Nomad v1.2.6 has the problem described below, while Nomad v1.1.5 works as expected.)
Operating system and Environment details
Nomad nodes are running Ubuntu. The Docker driver is used for all tasks.
A set of nodes has `node.class` set to `worker`, and there are a few other nodes in the cluster.
Issue
System job with constraints fails to plan.
Reproduction steps
A job with `type = "system"` is used to schedule tasks on the worker nodes, so a constraint on `node.class` is added to the worker group (a sketch follows below).
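A minimal sketch of such a constraint, assuming the worker nodes set `node.class` to `worker` as described above (the exact block from the report is not shown here, so this is a reconstruction):

```hcl
# Assumed constraint on the worker group: restrict placement to nodes
# whose client config sets node_class = "worker".
constraint {
  attribute = "${node.class}"
  value     = "worker"
}
```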
Expected Result
All the worker nodes should run the worker task; all other nodes should not.
Actual Result
This works sometimes, in particular when there are no allocations on the cluster. But running `nomad job plan` after allocations are running displays the following warning:
This should not be a warning, as the allocations match the job definition, considering the constraints. `nomad job run` produces the desired state, and the job status is displayed as "not scheduled" on all non-worker nodes.
Removing the constraints shows no warning, but obviously schedules the worker task on non-worker nodes, which is unwanted.
The only workaround seems to be to ignore the warnings, which defeats the purpose of `nomad job plan`, or to create an entire separate cluster for the workers.
Possibly related: