nomad server panic upon upgrade to 0.5.4 #2266

Closed
BSick7 opened this issue Feb 1, 2017 · 9 comments

@BSick7

BSick7 commented Feb 1, 2017

Nomad version

Output from nomad version

Nomad v0.5.4

Operating system and Environment details

Container Linux by CoreOS stable (1235.6.0)

Issue

Servers 0 and 1 had already been upgraded to 0.5.4, and server 2 was the active leader when it was recreated. Leadership transitioned to server 1.

When bringing up a new server 2 with nomad 0.5.4, a panic occurred (error log below).
Restarting server 2 caused server 0 to panic the same way.
The two servers kept panicking in turn whenever the nomad service attempted to start.

Reproduction steps

Nomad Server logs (if appropriate)

 panic: interface conversion: error is nil, not *structs.RecoverableError
Feb 01 19:18:59 manager-2 nomad[4152]: goroutine 414 [running]:
Feb 01 19:18:59 manager-2 nomad[4152]: panic(0x100c3c0, 0xc422484b80)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/go/src/runtime/panic.go:500 +0x1a1
Feb 01 19:18:59 manager-2 nomad[4152]: github.com/hashicorp/nomad/nomad.(*Node).DeriveVaultToken.func1(0x0, 0x0, 0x0)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/gopath/src/github.com/hashicorp/nomad/nomad/node_endpoint.go:947 +0x1ae
Feb 01 19:18:59 manager-2 nomad[4152]: github.com/hashicorp/nomad/nomad.(*Node).DeriveVaultToken(0xc4203f4cc0, 0xc420caa980, 0xc421215d10, 0x0, 0x0)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/gopath/src/github.com/hashicorp/nomad/nomad/node_endpoint.go:952 +0x207d
Feb 01 19:18:59 manager-2 nomad[4152]: reflect.Value.call(0xc42042cae0, 0xc420024aa8, 0x13, 0x11741a8, 0x4, 0xc422985de8, 0x3, 0x3, 0xc8daaf, 0xc42086d080, ...)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/go/src/reflect/value.go:434 +0x5c8
Feb 01 19:18:59 manager-2 nomad[4152]: reflect.Value.Call(0xc42042cae0, 0xc420024aa8, 0x13, 0xc422985de8, 0x3, 0x3, 0xc421215d10, 0x16, 0xc421b98f40)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/go/src/reflect/value.go:302 +0xa4
Feb 01 19:18:59 manager-2 nomad[4152]: net/rpc.(*service).call(0xc4203f4e80, 0xc4203f4c80, 0xc4226e8fc0, 0xc420422800, 0xc421b98f40, 0x109b440, 0xc420caa980, 0x16, 0x103bcc0, 0xc421215d10, ...)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/go/src/net/rpc/server.go:383 +0x148
Feb 01 19:18:59 manager-2 nomad[4152]: net/rpc.(*Server).ServeRequest(0xc4203f4c80, 0x1925c00, 0xc42086d080, 0x3f800000, 0x0)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/go/src/net/rpc/server.go:498 +0x270
Feb 01 19:18:59 manager-2 nomad[4152]: github.com/hashicorp/nomad/nomad.(*Server).handleNomadConn(0xc42042a4e0, 0x192abc0, 0xc4224e1e10)
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/gopath/src/github.com/hashicorp/nomad/nomad/rpc.go:165 +0x12c
Feb 01 19:18:59 manager-2 nomad[4152]: created by github.com/hashicorp/nomad/nomad.(*Server).handleMultiplex
Feb 01 19:18:59 manager-2 nomad[4152]:         /opt/gopath/src/github.com/hashicorp/nomad/nomad/rpc.go:150 +0x197
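
For readers unfamiliar with this class of panic: the message "interface conversion: error is nil, not *structs.RecoverableError" is what the Go runtime reports when a single-value type assertion is applied to a nil `error`. The sketch below reproduces that message in isolation. It is not the Nomad source; `RecoverableError` here is a hypothetical stand-in for `structs.RecoverableError`, and the helper names are made up for illustration. The comma-ok form of the assertion is one common way to avoid the panic.

```go
package main

import (
	"errors"
	"fmt"
)

// RecoverableError is a hypothetical stand-in for structs.RecoverableError;
// the real Nomad type may look different.
type RecoverableError struct {
	Err         string
	Recoverable bool
}

func (r *RecoverableError) Error() string { return r.Err }

// uncheckedIsRecoverable uses a single-value type assertion. It panics when
// err is nil or holds any concrete type other than *RecoverableError.
func uncheckedIsRecoverable(err error) bool {
	return err.(*RecoverableError).Recoverable
}

// checkedIsRecoverable uses the comma-ok form, which reports failure through
// ok instead of panicking. The nil check guards against a typed nil pointer.
func checkedIsRecoverable(err error) bool {
	r, ok := err.(*RecoverableError)
	return ok && r != nil && r.Recoverable
}

// panicMessage runs f and returns the panic message, or "" if f did not panic.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	var nilErr error

	// The unchecked assertion panics on a nil error.
	fmt.Println(panicMessage(func() { uncheckedIsRecoverable(nilErr) }))

	// The checked assertion handles nil, unrelated, and matching errors.
	fmt.Println(checkedIsRecoverable(nilErr))
	fmt.Println(checkedIsRecoverable(errors.New("plain error")))
	fmt.Println(checkedIsRecoverable(&RecoverableError{Err: "retry", Recoverable: true}))
}
```

The first line printed has the same shape as the panic in the log above, with the package name `main` in place of `structs`.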
@dadgar
Contributor

dadgar commented Feb 1, 2017

Thanks for reporting! Sorry about that; it will get fixed for the next release.

@BSick7
Author

BSick7 commented Feb 1, 2017

Accepting pull requests? This seems like an easy one for me to contribute. :)

@schmichael
Member

@BSick7 thanks for the bug report and fix!

dadgar added a commit that referenced this issue Feb 2, 2017
Prep for 0.5.5 and add changelog entry for #2266
@tugbabodrumlu

Hi,
Will you create a new release soon with the fix?

@schmichael
Member

@tugbabodrumlu Yes, 0.5.5 will be coming soon; probably with an RC first.

@stevenscg

stevenscg commented Feb 10, 2017

@schmichael FYI, I am experiencing a panic in a similar area (DeriveVaultToken.func1, etc) with servers running 0.5.4. But in my case, the cluster was running and we were not adding or removing servers.

I can run several jobs before I get the panic, which seems to affect all of the nomad servers: all 3 server processes exit. The jobs are all using raw_exec.

Nomad 0.5.4
Vault 0.6.5
Consul 0.7.4

There are 3 nomad servers and 1 nomad worker. All CentOS 7 linux and raw_exec driver.

Here is some syslog output from one of the servers:

Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: panic: interface conversion: error is nil, not *structs.RecoverableError
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: goroutine 82512 [running]:
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: panic(0x100c3c0, 0xc420639400)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/go/src/runtime/panic.go:500 +0x1a1
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: github.com/hashicorp/nomad/nomad.(*Node).DeriveVaultToken.func1(0x0, 0x0, 0x0)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/gopath/src/github.com/hashicorp/nomad/nomad/node_endpoint.go:947 +0x1ae
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: github.com/hashicorp/nomad/nomad.(*Node).DeriveVaultToken(0xc4203a6740, 0xc42063d180, 0xc42019de60, 0x0, 0x0)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/gopath/src/github.com/hashicorp/nomad/nomad/node_endpoint.go:952 +0x207d
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: reflect.Value.call(0xc4203abc20, 0xc420020d08, 0x13, 0x11741a8, 0x4, 0xc42079bde8, 0x3, 0x3, 0xc8daaf, 0xc42000c400, ...)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/go/src/reflect/value.go:434 +0x5c8
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: reflect.Value.Call(0xc4203abc20, 0xc420020d08, 0x13, 0xc42079bde8, 0x3, 0x3, 0xc42019de60, 0x16, 0xc4201a0320)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/go/src/reflect/value.go:302 +0xa4
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: net/rpc.(*service).call(0xc4203a6900, 0xc4203a6700, 0xc4201b5ae8, 0xc420398b00, 0xc4201a0320, 0x109b440, 0xc42063d180, 0x16, 0x103bcc0, 0xc42019de60, ...)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/go/src/net/rpc/server.go:383 +0x148
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: net/rpc.(*Server).ServeRequest(0xc4203a6700, 0x1925c00, 0xc42000c400, 0x3f800000, 0x0)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/go/src/net/rpc/server.go:498 +0x270
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: github.com/hashicorp/nomad/nomad.(*Server).handleNomadConn(0xc4203ce340, 0x192abc0, 0xc4203ec5b0)
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/gopath/src/github.com/hashicorp/nomad/nomad/rpc.go:165 +0x12c
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: created by github.com/hashicorp/nomad/nomad.(*Server).handleMultiplex
Feb  9 20:45:46 ip-10-101-25-243 nomad-runner.sh: /opt/gopath/src/github.com/hashicorp/nomad/nomad/rpc.go:150 +0x197
Feb  9 20:45:46 ip-10-101-25-243 systemd: nomad.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Feb  9 20:45:46 ip-10-101-25-243 systemd: Unit nomad.service entered failed state.
Feb  9 20:45:46 ip-10-101-25-243 systemd: nomad.service failed.
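
As background on why one bad request ends the whole server process: in Go, a panic that is not recovered in the goroutine where it occurs terminates the entire program with exit status 2, which matches the `status=2` that systemd reports above. The sketch below shows the general pattern of recovering inside a per-request goroutine. It is an illustration only; `serveGuarded` and `buggyHandler` are hypothetical names, and nothing here describes how Nomad handles connections or how the fix in master works.

```go
package main

import "fmt"

// buggyHandler is a hypothetical request handler that panics for one input,
// standing in for an endpoint with a latent bug.
func buggyHandler(id int) {
	if id == 2 {
		panic("simulated handler bug")
	}
}

// serveGuarded runs handle in its own goroutine and turns a panic into an
// error value, so a single failing request does not end the process.
func serveGuarded(id int, handle func(int)) error {
	errCh := make(chan error, 1)
	go func() {
		defer func() {
			// recover returns nil when handle returned normally.
			if r := recover(); r != nil {
				errCh <- fmt.Errorf("request %d panicked: %v", id, r)
				return
			}
			errCh <- nil
		}()
		handle(id)
	}()
	return <-errCh
}

func main() {
	for id := 1; id <= 3; id++ {
		if err := serveGuarded(id, buggyHandler); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("request", id, "ok")
	}
}
```

Recovering like this keeps the process alive but can also hide real bugs, so whether it is appropriate depends on the service.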

@stevenscg

I am running a build from master and have yet to see the panic from above.

nomad -v
Nomad v0.5.5-dev (a16709ef4360ec4e453ec4560fe1bbebf3cb3be5)

@dadgar
Contributor

dadgar commented Feb 10, 2017

@stevenscg Hey, that has been fixed in master! We will hopefully have 0.5.5 out next week!

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 16, 2022