Nomad server panics when upgrading to 0.9.2 #5793

Closed
vincenthuynh opened this issue Jun 7, 2019 · 1 comment · Fixed by #5794

Nomad version

v0.9.2

Operating system and Environment details

Debian 9.7

Issue

We were upgrading from 0.8.6 when we hit this error. It occurred after the Nomad servers had been upgraded, just as we began upgrading the Nomad nodes.

The cluster was unrecoverable, so we had to recreate it. This was our dev environment, so we could cope with it.

Reproduction steps

Job file (if appropriate)

Nomad Client logs (if appropriate)

Nomad Server logs (if appropriate)

Jun  7 15:46:13 nmaster-dev-01 nomad[15883]:     2019-06-07T15:46:13.702-0400 [DEBUG] worker: dequeued evaluation: eval_id=909f5f9e-92d6-a4f8-220c-044dd4a2a8e1
Jun  7 15:46:13 nmaster-dev-01 nomad[15883]:     2019-06-07T15:46:13.702-0400 [DEBUG] worker: dequeued evaluation: eval_id=1f0f9e5a-dc82-c0ac-86fe-2cef38c6e1d8
Jun  7 15:46:13 nmaster-dev-01 nomad[15883]:     2019-06-07T15:46:13.702-0400 [DEBUG] worker: dequeued evaluation: eval_id=9e160e3b-bf4f-9ac3-dbf7-b626e0085d5f
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]:     2019-06-07T15:46:14.042-0400 [DEBUG] worker.batch_sched: reconciled current state with desired state: eval_id=1f0f9e5a-dc82-c0ac-86fe-2cef38c6e1d8 job_id=analyst-report-no-email/periodic-1559934300 namespace=default results="Total changes: (place 0) (destructive 0) (inplace 0) (stop 0)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: Desired Changes for "analyst-report-no-email": (place 0) (inplace 0) (destructive 0) (stop 0) (migrate 0) (ignore 0) (canary 0)"
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]:     2019-06-07T15:46:14.043-0400 [DEBUG] worker.batch_sched: setting eval status: eval_id=1f0f9e5a-dc82-c0ac-86fe-2cef38c6e1d8 job_id=analyst-report-no-email/periodic-1559934300 namespace=default status=complete
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]:     2019-06-07T15:46:14.042-0400 [DEBUG] worker.system_sched: reconciled current state with desired state: eval_id=909f5f9e-92d6-a4f8-220c-044dd4a2a8e1 job_id=config namespace=default place=1 update=0 migrate=0 stop=0 ignore=3 lost=1
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: panic: runtime error: index out of range
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: goroutine 86 [running]:
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*Preemptor).PreemptForNetwork(0xc00001d580, 0xc00161bdc0, 0xc00001d550, 0x2, 0xc0003ae830, 0x1)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/preemption.go:306 +0x144f
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*BinPackIterator).Next(0xc0006636c0, 0x40ef9d)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/rank.go:254 +0x161b
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*ScoreNormalizationIterator).Next(0xc00080a880, 0x2431540)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/rank.go:625 +0x38
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*SystemStack).Select(0xc00161bd50, 0xc0002beb00, 0x0, 0x24)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/stack.go:272 +0x186
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*SystemScheduler).computePlacements(0xc00085e6e0, 0xc00080a9e0, 0x1, 0x1, 0xc0008488a8, 0x0)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/system_sched.go:284 +0x746
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*SystemScheduler).computeJobAllocs(0xc00085e6e0, 0xc000d465a0, 0xc00161bd50)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/system_sched.go:262 +0x101d
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*SystemScheduler).process(0xc00085e6e0, 0x1c55c20, 0xc000302f10, 0x1c55c20)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/system_sched.go:128 +0x431
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*SystemScheduler).process-fm(0xc000303450, 0x0, 0x0)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/system_sched.go:74 +0x2a
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.retryMax(0x5, 0xc00001ddf0, 0xc00001de00, 0xb, 0x0)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/util.go:271 +0x40
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/scheduler.(*SystemScheduler).Process(0xc00085e6e0, 0xc000fb4160, 0x245b260, 0xc000365620)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/scheduler/system_sched.go:74 +0x2b6
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/nomad.(*Worker).invokeScheduler(0xc00048cfc0, 0xc00059c870, 0xc000fb4160, 0xc0011b6a20, 0x24, 0x0, 0x0)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/nomad/worker.go:268 +0x357
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: github.com/hashicorp/nomad/nomad.(*Worker).run(0xc00048cfc0)
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/nomad/worker.go:129 +0x2ea
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: created by github.com/hashicorp/nomad/nomad.NewWorker
Jun  7 15:46:14 nmaster-dev-01 nomad[15883]: #011/opt/gopath/src/github.com/hashicorp/nomad/nomad/worker.go:81 +0x14f
Jun  7 15:46:14 nmaster-dev-01 systemd[1]: nomad.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jun  7 15:46:14 nmaster-dev-01 systemd[1]: nomad.service: Unit entered failed state.
Jun  7 15:46:14 nmaster-dev-01 systemd[1]: nomad.service: Failed with result 'exit-code'.
notnoop pushed a commit that referenced this issue Jun 7, 2019
When examining preemption for networks, only consider allocs that have
networks.

Fixes #5793
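
For readers following along, here is a minimal sketch of the kind of guard the commit message describes. It is not the actual Nomad patch: the `Allocation` and `NetworkResource` types and both helper functions are simplified, hypothetical stand-ins. It also assumes, as one plausible reading of the `index out of range` panic in `PreemptForNetwork`, that the code indexed the network list of an allocation that had none.

```go
package main

// NetworkResource and Allocation are simplified, hypothetical stand-ins
// for the scheduler's real types; they are not Nomad's actual structs.
type NetworkResource struct {
	Device string
	MBits  int
}

type Allocation struct {
	ID       string
	Networks []NetworkResource
}

// allocsWithNetworks returns only the allocations that declare at least
// one network, so later code can safely read Networks[0].
func allocsWithNetworks(allocs []Allocation) []Allocation {
	filtered := make([]Allocation, 0, len(allocs))
	for _, a := range allocs {
		if len(a.Networks) == 0 {
			continue // no network resources, so skip this allocation
		}
		filtered = append(filtered, a)
	}
	return filtered
}

// usedMBitsByDevice sums the bandwidth of each allocation's first network,
// grouped by device. Indexing Networks[0] is safe only because the input
// is filtered first; without the filter, an allocation with no networks
// would trigger an index-out-of-range panic.
func usedMBitsByDevice(allocs []Allocation) map[string]int {
	used := make(map[string]int)
	for _, a := range allocsWithNetworks(allocs) {
		n := a.Networks[0]
		used[n.Device] += n.MBits
	}
	return used
}
```

Calling `usedMBitsByDevice` on a slice that mixes allocations with and without networks totals only the ones that have a network and silently skips the rest, instead of panicking.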
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 22, 2022