runtime error: invalid memory address or nil pointer dereference #19644
After applying the job without errors, I tried yet another upgrade of the servers to 1.7.2 and the issue recurred. I am downgrading the servers to 1.6.5 again and leaving them for now. I shall try to reproduce the issue, maybe tomorrow.
Hi @shantanugadgil, and thanks for raising this issue, which I have been able to reproduce locally running a dev agent. It looks like this occurs due to a specific combination of parameters, which you can see in the example jobspec below. I will work on a fix for this and should have a PR ready shortly.
This bug only impacts the Nomad servers; it does not impact the Nomad clients.
```hcl
terraform {
  required_providers {
    nomad = {
      source  = "hashicorp/nomad"
      version = "= 2.1.0"
    }
  }
}

provider "nomad" {
  address = "http://127.0.0.1:4646"
}

resource "nomad_job" "gh19644" {
  jobspec = file("${path.module}/gh19644.nomad.hcl")
}
```
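A sketch of how this Terraform configuration would be applied (assuming Terraform 1.6.x and a Nomad server reachable at the address above; on an affected 1.7.2 server the apply itself triggers the panic):

```shell
terraform init
terraform apply -auto-approve
```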
```hcl
job "gh19644" {
  type = "system"

  group "cache" {
    max_client_disconnect      = "1h"
    prevent_reschedule_on_lost = true

    task "redis" {
      driver = "docker"

      config {
        image = "redis:7"
      }
    }
  }
}
```

The server panic output:
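The maintainer's note about reproducing with a dev agent suggests a direct CLI reproduction along these lines (a sketch, not from the thread; assumes a local `nomad` 1.7.2 binary and the jobspec above saved as `gh19644.nomad.hcl`):

```shell
# Start a throwaway dev agent (server + client in one process)
nomad agent -dev &
sleep 5

# Submitting the jobspec should trigger the nil pointer panic on 1.7.2
nomad job run gh19644.nomad.hcl
```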
Awesome, thanks. Same for me: the above jobspec causes the server traceback for me too. Looking forward to a new release! 🙂
The above question was more from a validity perspective, since the upgrade docs say that server_version should be >= client_version. Currently, jobs on the affected cluster seem to be running fine.
From a docs perspective, there could be a doc bug here:
Would Nomad 1.7.3 be happening any time soon?
Nomad version
```
Nomad v1.7.2
BuildDate 2023-12-13T19:59:42Z
Revision 64e3dca
```
Operating system and Environment details
Amazon Linux 2 / Amazon Linux 2023
Issue
Reproduction steps
Submitting a particular job via Terraform (1.6.6) and the Terraform Nomad provider (2.1.0) causes the server leader to crash.
Expected Result
This should not happen. Downgrading the servers to 1.6.5 makes the error go away.
Actual Result
The server crashes and leadership changes.
Job file (if appropriate)
Too large and customized to share for now.
Nomad Server logs (if appropriate)
The segfault traceback has been posted above.
Nomad Client logs (if appropriate)
N/A
NOTE: This started occurring recently.
I have already upgraded all the clients to 1.7.2.
How long before I would have to downgrade all clients to 1.6.5 as well? Or can I leave the cluster like this until 1.7.3 (hopefully with fixes)?