Nomad services linger with invalid allocIDS #17182
Comments
Possible duplicate of #16762; the script mentioned here: #16762 (comment) solved my issue. Looking into how that works, I think the manual service delete via CLI would have worked as well. It just isn't obvious that the service ID was not
As my current issue is resolved, I propose someone takes a look at the first post and decides whether it has relevant information for fixing the underlying bug. If it doesn't, feel free to close this issue.
Hi @SamMousa! I agree this most likely sounds like another case of #16762.
The IDs for the services are very long (e.g. one running on my machine right now is
As far as this bug goes, can you clarify this bit?
Was the job still present (that is, visible via
After purging the job, the job and service were still visible. The UI showed no allocations for the job, but going to the services for the job showed a service and a nonexistent allocation.
Ok, I'm going to close this as a duplicate of #17079 so that we can centralize our efforts around that. We've got a release coming out very soon with the patch.
Nomad version
Output from
nomad version
Operating system and Environment details
Running Ubuntu 22.04.2 LTS, 3 nodes.
This is not yet a full production cluster, mostly running support workloads where some downtime is acceptable.
Issue
We use Traefik and its Nomad service discovery for routing traffic. Sometimes we notice a bad gateway for a service that, according to Nomad, is running just fine.
Diving into this we tried purging the job from Nomad (with the intention of running it after everything is cleaned up).
After purging the job we noticed the service in Traefik still persisted, so it was time to look a little deeper.
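To make the symptom concrete, here is a minimal sketch of how lingering registrations could be spotted: compare each service registration's allocation ID against the set of allocations that still exist. It assumes the registrations have already been fetched and are plain dictionaries with `ID`, `ServiceName`, and `AllocID` keys; those key names, the sample IDs, and the helper names `find_orphaned_services` and `delete_commands` are illustrative assumptions, not something confirmed in this thread. The generated command follows the `nomad service delete <service name> <service id>` form of the manual CLI delete mentioned in the comments.

```python
from typing import Dict, Iterable, List


def find_orphaned_services(
    registrations: Iterable[Dict[str, str]],
    live_alloc_ids: Iterable[str],
) -> List[Dict[str, str]]:
    """Return registrations whose AllocID is not among the live allocations."""
    live = set(live_alloc_ids)
    return [reg for reg in registrations if reg.get("AllocID") not in live]


def delete_commands(orphans: Iterable[Dict[str, str]]) -> List[str]:
    """Build one manual delete command per orphaned registration (not executed)."""
    return [
        f"nomad service delete {reg['ServiceName']} {reg['ID']}"
        for reg in orphans
    ]


# Hypothetical sample data; real service IDs are much longer.
registrations = [
    {"ID": "svc-id-1", "ServiceName": "web", "AllocID": "alloc-a"},
    {"ID": "svc-id-2", "ServiceName": "web", "AllocID": "alloc-gone"},
]
live_alloc_ids = ["alloc-a"]

orphans = find_orphaned_services(registrations, live_alloc_ids)
for cmd in delete_commands(orphans):
    print(cmd)
```

The sketch only prints the commands rather than running them, so the list can be reviewed before anything is deleted.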
So the situation, summarized as I understand it:
Reproduction steps
Don't know
Nomad Server logs (if appropriate)