nomad server panic: runtime error: invalid memory address or nil pointer dereference #4463
Comments
I think I might have it narrowed down to a job in our cluster causing it - but I'm not sure how to delete/kill this job since the servers aren't up long enough for me to stop it.
@dcparker88 I'm looking into this now. What about the job makes you think it's causing it?
I might be way off - but I turned on debug logs, and it lists out jobs but never lists out a batch job that we have. The batch job also seems to be "flapping" - appearing and disappearing in the job status list, etc. This could just be a symptom of the nomad servers constantly restarting, however.
here is a full log coming from a clean start (deleted everything in the nomad data dir and started it fresh)
Looking at the code, it seems to look up a node based on the id of the allocation (nomad/structs/structs.go, line 1431 in 1eedb77). Some inconsistency somewhere; to get your servers (probably) running again you could add a nil check around that lookup.
But maybe hashicorp wants you to try other stuff first. :)
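For illustration, here is a minimal, self-contained Go sketch of the kind of nil guard being suggested. All of the names (Node, Alloc, lookupNode, nodeForAlloc) are hypothetical stand-ins, not the actual code at that line in Nomad; the point is only that a state-store lookup which can return a nil node must be checked before the result is dereferenced:

```go
package main

import (
	"errors"
	"fmt"
)

// Node and Alloc stand in for Nomad's structs.Node and structs.Allocation;
// this whole sketch is illustrative, not the actual Nomad source.
type Node struct {
	ID         string
	Datacenter string
}

type Alloc struct {
	ID     string
	NodeID string
}

// lookupNode mimics a state-store lookup that returns (nil, nil) when the
// node is simply absent, rather than treating absence as an error.
func lookupNode(nodes map[string]*Node, id string) (*Node, error) {
	return nodes[id], nil
}

// nodeForAlloc shows the guard: check the lookup result for nil before
// dereferencing it.
func nodeForAlloc(nodes map[string]*Node, alloc *Alloc) (*Node, error) {
	node, err := lookupNode(nodes, alloc.NodeID)
	if err != nil {
		return nil, err
	}
	if node == nil {
		// Without this branch, any later access such as node.Datacenter
		// panics with "invalid memory address or nil pointer dereference",
		// matching the crash reported in this issue.
		return nil, errors.New("node " + alloc.NodeID + " not found in state store for alloc " + alloc.ID)
	}
	return node, nil
}

func main() {
	nodes := map[string]*Node{"n1": {ID: "n1", Datacenter: "dc1"}}
	if _, err := nodeForAlloc(nodes, &Alloc{ID: "a1", NodeID: "missing"}); err != nil {
		fmt.Println("guarded:", err)
	}
}
```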
@dcparker88 I think I found the bug, but I don't have a workaround for fixing your state yet (and I'm not sure if I will, but I'm trying!). I think a job (maybe the problematic one you mentioned) is trying to get scheduled to a node that for some reason doesn't exist in the state store. You could ultimately fix this by stopping all nomad server nodes, wiping the datadir, and starting them back up. I'll let you know as I get more info.
thanks - resetting the data dir on all my servers did work. I lost all my jobs - but that's ok for now since we can recreate them quickly in terraform.
@dcparker88 glad you're working again and it wasn't too much of an impact; never a route you should have to take, though. The current hypothesis is that it's related to sticky volumes. Did the job you mentioned have a sticky-enabled volume by chance?
one of our jobs does, yes. The one I thought was the cause did not, but again I might be wrong about which job it actually was. The sticky job also has a distinct_hosts constraint turned on.
@dcparker88 Can you please include the job files for the one which requires sticky volumes, and the other job that you thought was suspect, mentioned above?
yeah - here is the relevant group (with the sticky volumes): https://gist.github.com/dcparker88/2f450f8976a43490db0654e738b4e5ba
the one I thought was potentially causing it is here: https://gist.github.com/dcparker88/705effd1b374bfc51399e3c54f25e571
If you have a question, prepend your issue with [question] or preferably use the nomad mailing list. If filing a bug please include the following:
Nomad version
Nomad v0.8.3 (c85483d)
Operating system and Environment details
Linux nomad-97d52edaa6767264 2.6.32-696.30.1.el6.centos.plus.x86_64 #1 SMP Wed May 23 20:32:06 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Issue
Our Nomad cluster went into a weird state over the weekend: all 3 servers started crashing on startup with the panic in the title (runtime error: invalid memory address or nil pointer dereference).
The servers join together in a cluster, and a leader is elected, but the Nomad boxes crash instantly afterward.
peers.json recovery doesn't seem to work either; it crashes with the same error. I am assuming I can fix this by fully cleaning my data-dir and restarting, but ideally we wouldn't need to do that.
Reproduction steps
This is the only time this has happened to us, so I'm not sure what the reproduction steps would be.
Nomad Server logs (if appropriate)
posted above - can post more if needed.
Nomad Client logs (if appropriate)
Job file (if appropriate)