batch jobs get rescheduled due to certain constraints #6471
thanks, @shantanugadgil ... looking at this.
@shantanugadgil, some clarification, please. When you say the following:
what i'm seeing is that any change to the job file causes the task to run again, but i'm not getting repeated executions of the task without changing the file.
@cgbaker the steps I did are as follows:
Scenario 1:
Scenario 2:
So, I think, adding that particular constraint of
ok, thank you. that's what i thought you were saying. having said that, i wasn't able to reproduce scenario 2; my first
My job status is as expected; one version, one allocation:
Can you paste your server logs?
Until I scrub the server logs ... here are the commands and more outputs ... BTW, this is an upgraded cluster from
can i see the output from
@cgbaker apologies, I missed sending the outputs sooner. here is a brand new batch:
The log file on the server for this run is attached to this post again.
Hi @cgbaker, were you able to reproduce this on a clustered setup? I am in the process of creating a 3-4 node Vagrant-based CentOS 7 setup to see if I can recreate this.
@cgbaker I want to confirm that this is happening on my Vagrant (VirtualBox) setup, which uses CentOS 7 as the OS.
*** The Vagrantfile sets up the server configs fine; the client configs need to be set up after the first "vagrant up".
Do let me know if you are able to reproduce the issue.
N.B. I am no Ruby/Vagrant expert and the Vagrantfile has been put together from various sources. 😎
Cheers,
poke 👉 Just wondering if anyone else was able to repro the issue using the Vagrantfile.compact file (above) or on their own clustered setup using 0.10.1 GA?
hi @shantanugadgil, thank you for the vagrantfile. we haven't replicated it yet, but we've got this on our board for 0.10.x; we won't forget it!
Hey there! Since this issue hasn't had any activity in a while, we're going to automatically close it in 30 days. If you're still seeing this issue with the latest version of Nomad, please respond here and we'll keep this open and take another look at this. Thanks!
Sorry for the silence @shantanugadgil! I just wanted to clarify why this has not been a priority for us: Right now any jobspec change causes a new version of the job to be created and run. The benefit here is that
What is your use case around altering a job, submitting it, and wanting it to not be run again? We could differentiate between "meaningful" and "meaningless" jobspec changes, but any logic we introduce will necessarily be harder to understand than our current naive "if it changed, run it again."
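To make the "naive" comparison concrete, here is a toy sketch of change detection by structural hashing, using the github.com/mitchellh/hashstructure library that also shows up in Nomad's own nomad/structs/diff.go later in this thread. This is not Nomad's actual code path (the server computes a full structural diff); the jobSpec struct and its fields are hypothetical stand-ins for a parsed jobspec.

```go
// Toy illustration of "if anything in the spec changed, run it again".
// NOT Nomad's real comparison -- only shows why any field change,
// however cosmetic, looks like a new version of the job.
package main

import (
	"fmt"

	"github.com/mitchellh/hashstructure"
)

// jobSpec is a hypothetical stand-in for a parsed jobspec.
type jobSpec struct {
	Datacenters []string
	Type        string
	Constraints []string
}

// changed reports whether the structural hashes of the two specs differ.
func changed(old, updated jobSpec) bool {
	oldHash, _ := hashstructure.Hash(old, nil) // errors ignored for brevity
	newHash, _ := hashstructure.Hash(updated, nil)
	return oldHash != newHash
}

func main() {
	a := jobSpec{
		Datacenters: []string{"dc1"},
		Type:        "batch",
		Constraints: []string{"${attr.kernel.name}=linux"},
	}
	b := a
	b.Constraints = append(b.Constraints, "${node.class}=git-builder")

	fmt.Println(changed(a, a)) // false: unchanged spec should be a no-op
	fmt.Println(changed(a, b)) // true: adding a constraint is a real change
}
```

The point is that any field-level difference, meaningful or not, flips the comparison, which is why a changed jobspec always produces a new version and a new run.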
Hi @schmichael, I think my original bug description may have been ambiguous or confusing. The issue I was observing was: when I have two constraints (the kernel linux and node class) and I do nomad run ... multiple times without changing anything, the job gets rescheduled every time. When I add a second constraint, I fully expect the job to re-run the first time, which it does.
Nomad version: Nomad v1.0.1 (c9c68aa)
job "uptime1" {
datacenters = ["dc1"]
type = "batch"
constraint {
attribute = "${attr.kernel.name}"
value = "linux"
}
group "mygroup" {
count = 1
task "mytask" {
driver = "raw_exec"
config {
command = "/usr/bin/uptime"
}
}
}
}

$ nomad run uptime1.nomad
==> Monitoring evaluation "963d04d6"
Evaluation triggered by job "uptime1"
==> Monitoring evaluation "963d04d6"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "963d04d6" finished with status "complete"
$ nomad run uptime1.nomad
==> Monitoring evaluation "4d21fa75"
Evaluation triggered by job "uptime1"
==> Monitoring evaluation "4d21fa75"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "4d21fa75" finished with status "complete"
$ nomad run uptime1.nomad
==> Monitoring evaluation "886272ef"
Evaluation triggered by job "uptime1"
==> Monitoring evaluation "886272ef"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "886272ef" finished with status "complete"
$ nomad run uptime1.nomad
==> Monitoring evaluation "0fdaf19a"
Evaluation triggered by job "uptime1"
==> Monitoring evaluation "0fdaf19a"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "0fdaf19a" finished with status "complete"
$ nomad run uptime1.nomad
==> Monitoring evaluation "6a17a108"
Evaluation triggered by job "uptime1"
==> Monitoring evaluation "6a17a108"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "6a17a108" finished with status "complete" notice: no new allocation, which, I think, is appropriate.
job "uptime2" {
datacenters = ["dc1"]
type = "batch"
constraint {
attribute = "${attr.kernel.name}"
value = "linux"
}
constraint {
attribute = "${node.class}"
value = "git-builder"
}
group "mygroup" {
count = 1
task "mytask" {
driver = "raw_exec"
config {
command = "/usr/bin/uptime"
}
}
}
}

$ nomad run uptime2.nomad
==> Monitoring evaluation "28b3a697"
Evaluation triggered by job "uptime2"
==> Monitoring evaluation "28b3a697"
Allocation "29e59367" created: node "6e2b10f5", group "mygroup"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "28b3a697" finished with status "complete"
$ nomad run uptime2.nomad
==> Monitoring evaluation "5c91d8bf"
Evaluation triggered by job "uptime2"
==> Monitoring evaluation "5c91d8bf"
Allocation "477c5f9c" created: node "abbc3eb4", group "mygroup"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "5c91d8bf" finished with status "complete"
$ nomad run uptime2.nomad
==> Monitoring evaluation "fd6ca454"
Evaluation triggered by job "uptime2"
==> Monitoring evaluation "fd6ca454"
Allocation "6a51dfb8" created: node "abbc3eb4", group "mygroup"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "fd6ca454" finished with status "complete"
$ nomad run uptime2.nomad
==> Monitoring evaluation "e545d228"
Evaluation triggered by job "uptime2"
==> Monitoring evaluation "e545d228"
Allocation "75021be3" created: node "abbc3eb4", group "mygroup"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "e545d228" finished with status "complete"
$ nomad run uptime2.nomad
==> Monitoring evaluation "19dc083b"
Evaluation triggered by job "uptime2"
==> Monitoring evaluation "19dc083b"
Allocation "b4421d2f" created: node "abbc3eb4", group "mygroup"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "19dc083b" finished with status "complete"
notice: new allocation triggered every time, which, I think, is incorrect. btw: for both the
Thanks @shantanugadgil - that makes sense.
Sounds like a bug to me as it breaks the "if nothing changes Nomad won't reschedule" behavior I mentioned above. I'm going to lightly edit the original issue to make the issue+repro+solution easier to find.
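Since the behavior has been hard to reproduce, a programmatic version of the repro can make it easier to loop and compare across clusters. Below is a minimal sketch, assuming the official Go API client (github.com/hashicorp/nomad/api) and the v1 jobspec parser (github.com/hashicorp/nomad/jobspec); the uptime2.nomad path, the 5-second wait, and the two-iteration loop are illustrative choices, not part of the original report.

```go
// Register the unchanged two-constraint job twice and print the allocation
// count after each submission; per the report above, the second count
// unexpectedly grows even though nothing in the jobspec changed.
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/hashicorp/nomad/api"
	"github.com/hashicorp/nomad/jobspec"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig()) // honors NOMAD_ADDR etc.
	if err != nil {
		log.Fatal(err)
	}

	job, err := jobspec.ParseFile("uptime2.nomad") // path is an assumption
	if err != nil {
		log.Fatal(err)
	}

	for i := 1; i <= 2; i++ {
		if _, _, err := client.Jobs().Register(job, nil); err != nil {
			log.Fatal(err)
		}
		time.Sleep(5 * time.Second) // crude wait for scheduling to settle

		allocs, _, err := client.Jobs().Allocations(*job.ID, false, nil)
		if err != nil {
			log.Fatal(err)
		}
		// Expected: the count stays flat on the second, unchanged submit.
		fmt.Printf("after submit #%d: %d allocation(s)\n", i, len(allocs))
	}
}
```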
Hi, so unfortunately I was unable to reproduce this behavior on 1.0.3-dev HEAD or the released 1.0.1, on either a single-node Vagrant cluster or an E2E cluster deployed to AWS. (I wasn't able to get your Vagrant stack up, @shantanugadgil... there's something I'm missing about the context you're running it in, I think, but it was easier for me to just launch a real cluster.) My configuration diff from the E2E cluster:

$ git diff
diff --git a/e2e/terraform/.terraform.lock.hcl b/e2e/terraform/.terraform.lock.hcl
old mode 100755
new mode 100644
diff --git a/e2e/terraform/config/dev-cluster/nomad/client-linux/client.hcl b/e2e/terraform/config/dev-cluster/nomad/client-linux/client.hcl
index 439aa72f3..455c30bb7 100644
--- a/e2e/terraform/config/dev-cluster/nomad/client-linux/client.hcl
+++ b/e2e/terraform/config/dev-cluster/nomad/client-linux/client.hcl
@@ -3,6 +3,8 @@ plugin_dir = "/opt/nomad/plugins"
client {
enabled = true
+ node_class = "test-class"
+
options {
# Allow jobs to run as root
"user.denylist" = "" And the resulting cluster:
The first job which we know should behave correctly:

job "uptime1" {
datacenters = ["dc1"]
type = "batch"
constraint {
attribute = "${attr.kernel.name}"
value = "linux"
}
group "mygroup" {
task "mytask" {
driver = "raw_exec"
config {
command = "/usr/bin/uptime"
}
}
}
}

Run the job and wait for it to complete:
Then run the job again, unchanged, and observe we get no new allocation, just as expected.
So far so good. Now the second job:
Run it and wait for it to complete:
Re-run the job unchanged, and note no new allocations are run:
I've tried a couple different combinations of valid constraints for the cluster and was unable to reproduce. @shantanugadgil I'm sorry this issue got left open so long but I'm not really sure what else to explore here without more info. The log files that were previously provided are all client logs and not server logs, so there might be more info available if we could figure out what the scheduler thinks is going on.
Some differences: (none of this should really matter, but mentioning anyway)
Let me try and capture some server logs and see if I can attach them here.
Forgot to mention, I have since updated to v1.0.2.
Q: what would be a correct way to capture the logs on the server?
That works. Ok, so check out these lines from the log:
Notice how the
Can you provide the
Ok, so the allocation status just shows what appears to be an identical allocation, and the server logs are showing nothing unusual. I looked at your server.hcl and client.hcl files and didn't see anything weird there. The job files you shared are the exact jobs you're running and not a redacted version? I'm worried there's a block that's being treated differently between them that's not in the job I'm running. Maybe the verbose plan will give us some more clues? Can you run
If that doesn't tell us anything, then for a somewhat extreme approach: if you were to patch the job diffing function on the server to dump the diff object to the logs, it might give us more info. But really that should all be in the plan diff.

diff --git a/nomad/structs/diff.go b/nomad/structs/diff.go
index 6a08e7c4e..5f56c52a8 100644
--- a/nomad/structs/diff.go
+++ b/nomad/structs/diff.go
@@ -6,6 +6,7 @@ import (
"sort"
"strings"
+ "github.com/davecgh/go-spew/spew"
"github.com/hashicorp/nomad/helper/flatmap"
"github.com/mitchellh/hashstructure"
)
@@ -166,6 +167,7 @@ func (j *Job) Diff(other *Job, contextual bool) (*JobDiff, error) {
}
}
+ spew.Dump(diff)
return diff, nil
}
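A lighter-weight alternative to patching diff.go is to request the same diff over the API, which is the endpoint nomad job plan uses. Below is a minimal sketch, assuming the Go API client and the v1 jobspec parser; the file name is an example.

```go
// Ask the server for the plan diff against the currently registered job
// (roughly what `nomad job plan -verbose` reports), without touching the
// server binary.
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
	"github.com/hashicorp/nomad/jobspec"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	job, err := jobspec.ParseFile("uptime2.nomad") // example path
	if err != nil {
		log.Fatal(err)
	}

	// diff=true asks the scheduler to report what it thinks changed
	// versus the currently registered version of the job.
	resp, _, err := client.Jobs().Plan(job, true, nil)
	if err != nil {
		log.Fatal(err)
	}

	if resp.Diff != nil {
		fmt.Println("job diff type:", resp.Diff.Type) // "None" means no change detected
		for _, tg := range resp.Diff.TaskGroups {
			fmt.Printf("  group %q: %s\n", tg.Name, tg.Type)
		}
	}
}
```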
@tgross I forgot to answer the questions you asked previously: the uptime2 is the same (minor differences like explicit
Also, when I run the following ...
... I see the monitor logs triggering the following ...
@tgross please ignore my previous conclusion about starting the monitor on the agents... that doesn't seem to be related to anything. 🐠 ... but the following does:
Ok, thanks @shantanugadgil. I'll keep digging and see if I can come up with a possible way this could be triggered.
Going to make sure this gets surfaced for roadmapping but so far there's no reproduction.
Nomad version
Nomad v0.10.0-rc1 (c49bf41)
Operating system and Environment details
Ubuntu 16.04 + updates
Issue
When I have only a single constraint (the kernel linux thing) on a batch job, and I do nomad run ... multiple times, the job is not updated on subsequent runs because nothing has changed. 👍
When I have two constraints (the kernel linux and node class) on a batch job, and I do nomad run ... multiple times, the job is updated on subsequent runs despite nothing changing. 👎
Subsequent nomad run ... invocations on a job file that has not changed should be a no-op (no update, no new alloc, etc.).
Reproduction steps
Run the job mentioned below with the node.class constraint commented out; you'll see that the job's task doesn't execute again, even if you rerun the job. (Do "nomad run uptime.nomad" multiple times.)
Un-comment the node.class constraint and resubmit the job. If you do "nomad run uptime.nomad", you'll observe that the task runs every time, even if there is one-and-only-one machine matching the constraint.
Job file (if appropriate)
Expected behavior
Nomad should understand that the job had completed successfully (exit 0) and not re-run the task.
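One way to make this expected behavior checkable is to assert that repeated submissions of an unchanged job leave the version and allocation counts flat. A small sketch, assuming the Go API client; the "uptime2" job ID refers to the example shared earlier in the thread and is only illustrative.

```go
// Check that re-submitting an unchanged batch job did not create a new job
// version or new allocations.
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	versions, _, _, err := client.Jobs().Versions("uptime2", false, nil)
	if err != nil {
		log.Fatal(err)
	}
	allocs, _, err := client.Jobs().Allocations("uptime2", false, nil)
	if err != nil {
		log.Fatal(err)
	}

	// For an unchanged batch job, both counts should stay flat across
	// repeated `nomad run` invocations.
	fmt.Printf("versions=%d allocations=%d\n", len(versions), len(allocs))
}
```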