Reproduction steps
Submit a parameterized or periodic job.
Restart the current cluster leader.
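To know which server to restart in the second step, the current leader can be looked up through the HTTP API. The following is a minimal sketch, not part of the original report: it assumes an agent reachable at the default address http://127.0.0.1:4646 and uses the /v1/status/leader endpoint, which returns the leader's RPC address as a JSON string. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a Nomad agent is listening on the default local HTTP address.
NOMAD_ADDR = "http://127.0.0.1:4646"


def leader_host(raw_body):
    """Extract the host part from the JSON string returned by /v1/status/leader."""
    addr = json.loads(raw_body)            # e.g. "10.0.0.5:4647"
    host, _, _port = addr.rpartition(":")  # split off the RPC port
    return host or addr                    # fall back if no port is present


def current_leader(base=NOMAD_ADDR):
    """Ask the agent which server currently holds cluster leadership."""
    with urllib.request.urlopen(base + "/v1/status/leader") as resp:
        return leader_host(resp.read().decode("utf-8"))
```

Calling current_leader() before and after the restart should show the leadership moving to a different server.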
Nomad Server logs (from new leader)
Feb 20 09:39:10 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:10.411107 [ERR] worker: failed to dequeue evaluation: rpc error: eval broker disabled
Feb 20 09:39:10 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:10.411217 [ERR] worker: failed to dequeue evaluation: rpc error: eval broker disabled
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11 [INFO] serf: EventMemberLeave: i-0319fb6222c41dec5.us-east-1 10.x.x.123
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11.135682 [INFO] nomad: removing server i-0319fb6222c41dec5.us-east-1 (Addr: 10.x.x.123:4647) (DC: us-east-1)
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11.137045 [ERR] worker: failed to dequeue evaluation: rpc error: No cluster leader
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11.137171 [ERR] worker: failed to dequeue evaluation: rpc error: No cluster leader
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11 [WARN] raft: Heartbeat timeout from "10.x.x.123:4647" reached, starting election
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11 [INFO] raft: Node at 10.x.y.212:4647 [Candidate] entering Candidate state in term 3443
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11 [INFO] serf: EventMemberJoin: i-0319fb6222c41dec5.us-east-1 10.x.x.123
Feb 20 09:39:11 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:11.858108 [INFO] nomad: adding server i-0319fb6222c41dec5.us-east-1 (Addr: 10.x.x.123:4647) (DC: us-east-1)
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Duplicate RequestVote for same term: 3443
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [WARN] raft: Election timeout reached, restarting election
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Node at 10.x.y.212:4647 [Candidate] entering Candidate state in term 3444
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Election won. Tally: 3
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Node at 10.x.y.212:4647 [Leader] entering Leader state
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Added peer 10.x.z.134:4647, starting replication
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Added peer 10.x.c.234:4647, starting replication
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: Added peer 10.x.z.92:4647, starting replication
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12.953449 [INFO] nomad: cluster leadership acquired
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: pipelining replication to peer {Voter 10.x.c.234:4647 10.x.c.234:4647}
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: pipelining replication to peer {Voter 10.x.z.134:4647 10.x.z.134:4647}
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12.967748 [ERR] worker: failed to dequeue evaluation: eval broker disabled
Feb 20 09:39:12 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:12 [INFO] raft: pipelining replication to peer {Voter 10.x.z.92:4647 10.x.z.92:4647}
Feb 20 09:39:13 ip-10-x-y-212 nomad[1567]: 2018/02/20 09:39:13 [WARN] raft: Rejecting vote request from 10.x.x.123:4647 since we have a leader: 10.x.y.212:4647
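To follow the election in the log excerpt above, the raft state transitions can be filtered out of the surrounding noise. This is a rough sketch that assumes the log line format shown in this report; the function name and regular expression are illustrative only.

```python
import re

# Matches lines such as:
#   raft: Node at 10.x.y.212:4647 [Candidate] entering Candidate state in term 3443
#   raft: Node at 10.x.y.212:4647 [Leader] entering Leader state
RAFT_STATE = re.compile(
    r"raft: Node at (?P<addr>\S+) \[(?P<state>\w+)\] entering \w+ state"
    r"(?: in term (?P<term>\d+))?"
)


def raft_transitions(lines):
    """Return (state, term) pairs for each raft state change found in lines.

    term is None when the log line does not mention one.
    """
    transitions = []
    for line in lines:
        match = RAFT_STATE.search(line)
        if match:
            term = int(match.group("term")) if match.group("term") else None
            transitions.append((match.group("state"), term))
    return transitions
```

Applied to the excerpt above, this would report the node becoming a candidate in terms 3443 and 3444 and then entering the Leader state.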
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Nomad version
Nomad v0.7.1 (0b295d3)
Operating system and Environment details
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Issue
If the Nomad server leader is re-elected (for example, after the leader is restarted), periodic or parameterized batch jobs are transitioned to Queued status. Calling /v1/system/gc and /v1/system/reconcile/summaries doesn't fix the problem. The only way to clean this up is to run nomad stop --purge ${JOB_NAME} and submit the new job.
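The attempted fixes and the workaround described above can be sketched against the HTTP API. This is an illustration under stated assumptions, not a verified fix: it assumes an agent at the default address http://127.0.0.1:4646, and it assumes the stuck state shows up as a non-zero Queued count in the job summary (/v1/job/:job_id/summary), which the report does not confirm. The purge call (DELETE /v1/job/:job_id?purge=true) is assumed to be the API counterpart of the purge stop described in the report. All helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a Nomad agent is listening on the default local HTTP address.
NOMAD_ADDR = "http://127.0.0.1:4646"


def put_request(path, base=NOMAD_ADDR):
    """Build (without sending) a PUT request for a Nomad system endpoint."""
    return urllib.request.Request(base + path, data=b"", method="PUT")


def purge_request(job_id, base=NOMAD_ADDR):
    """Build (without sending) a DELETE request that stops and purges a job."""
    quoted = urllib.parse.quote(job_id, safe="")  # child job IDs may contain "/"
    return urllib.request.Request(f"{base}/v1/job/{quoted}?purge=true", method="DELETE")


def queued_total(summary):
    """Sum the Queued counts over all task groups of a job summary document."""
    groups = summary.get("Summary") or {}  # may be null for some jobs
    return sum(group.get("Queued", 0) for group in groups.values())


def fetch_summary(job_id, base=NOMAD_ADDR):
    """Fetch the job summary as a dict."""
    quoted = urllib.parse.quote(job_id, safe="")
    with urllib.request.urlopen(f"{base}/v1/job/{quoted}/summary") as resp:
        return json.load(resp)


def cleanup(job_id, base=NOMAD_ADDR):
    """Try GC and summary reconciliation first; purge only if still queued.

    Returns True if the job was purged and therefore has to be resubmitted.
    """
    for path in ("/v1/system/gc", "/v1/system/reconcile/summaries"):
        urllib.request.urlopen(put_request(path, base)).close()
    if queued_total(fetch_summary(job_id, base)) > 0:
        urllib.request.urlopen(purge_request(job_id, base)).close()
        return True
    return False
```

The request builders are kept separate from the network calls so they can be inspected without a running cluster. After cleanup() returns True, the job still has to be submitted again, as noted above.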