
Phantom Job and Service #1794

Closed
likwid opened this issue Oct 6, 2016 · 3 comments
likwid commented Oct 6, 2016


Nomad version

Output from nomad version
Nomad v0.4.0

Operating system and Environment details

Ubuntu Trusty 14.04
Linux ip-10-10-72-34 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Issue

I have a job that uses the docker driver for its task. The job is no longer registered according to Nomad:

ubuntu@ip-10-10-72-34:~$ nomad status
ID                         Type     Priority  Status
catalog-blue               service  50        running
catalog-green              service  50        running
catalog-hub-swagger-blue   service  50        running
catalog-hub-swagger-green  service  50        running
playground-b               service  50        running
playground-g               service  50        running
router                     system   50        running

However, the service still shows up in Consul, and the containers associated with this phantom job start up again whenever I remove them. I don't know how to remove the job from Nomad, because Nomad reports no job by that name running.
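For reference, one way to spot this kind of phantom is to diff the service names Consul knows about against the jobs Nomad reports. A minimal sketch in Python (the service and job lists here are illustrative, not from a live cluster; in practice you would pull them from nomad status and Consul's /v1/catalog/services endpoint):

```python
def find_phantom_services(consul_services, nomad_jobs):
    """Return service names present in Consul that match no Nomad job.

    Assumes the naming convention in this issue, where the Consul service
    name starts with the Nomad job name (e.g. job "swagger-ui-blue"
    registers service "swagger-ui-blue").
    """
    return sorted(
        svc for svc in consul_services
        if not any(svc.startswith(job) for job in nomad_jobs)
    )

# Illustrative data modeled on the `nomad status` output above
nomad_jobs = {"catalog-blue", "catalog-green", "playground-b", "router"}
consul_services = {"catalog-blue", "playground-b", "swagger-ui-blue"}

print(find_phantom_services(consul_services, nomad_jobs))  # ['swagger-ui-blue']
```

Here swagger-ui-blue shows up in Consul but has no corresponding Nomad job, which matches the state described above.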

Reproduction steps

I don't know how I got into this state, so I can't say how to reproduce it.

Nomad Server logs (if appropriate)

I can provide whatever logs are required

Nomad Client logs (if appropriate)

Not sure these are needed, but I can supply them if need be.

Job file (if appropriate)

job "swagger-ui-blue" {
        datacenters = ["prod"]
        region = "us-east-1"
        constraint {
                attribute = "${attr.kernel.name}"
                value = "linux"
        }

        update {
                stagger = "2s"
                max_parallel = 2
        }

        group "web-blue" {
                count = 2
                restart {
                        attempts = 10
                        interval = "5m"
                        delay = "25s"
                        mode = "delay"
                }

                task "swagger-ui-blue" {
                        driver = "docker"
                        config {
                                image = "localhost:5000/swagger-ui-builder"
                                port_map {
                                        http = 8080
                                }
                        }
                        service {
                                name = "swagger-ui-blue"
                                tags = ["global", "web", "blue"]
                                port = "http"
                                check {
                                        type = "http"
                                        interval = "10s"
                                        timeout = "2s"
                                        path = "/"
                                }
                        }
                        resources {
                                cpu = 500 # 500 Mhz
                                memory = 256 # 256MB
                                network {
                                        mbits = 10
                                        port "http" {
                                        }
                                }
                        }
                }
        }
}
dadgar commented Oct 6, 2016

Duplicate of #1524.

The way to get out of that state is to find the machine registering the service and run ps aux | grep nomad on it. Then kill the executor process associated with the alloc ID that should be dead.
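To illustrate that step, here is a hedged sketch of filtering ps aux output for Nomad executor processes tied to a given alloc ID. The process lines and alloc ID below are fabricated for illustration; on a real box you would feed in the actual ps aux output and then kill the reported PIDs yourself:

```python
def find_executor_pids(ps_output, alloc_id):
    """Scan `ps aux`-style output for Nomad executor processes whose
    command line mentions the given allocation ID, returning their PIDs."""
    pids = []
    for line in ps_output.splitlines():
        if "nomad" in line and "executor" in line and alloc_id in line:
            fields = line.split()
            pids.append(int(fields[1]))  # in ps aux output, PID is the second column
    return pids

# Fabricated ps aux lines for illustration only
sample = """\
root    1234  0.1  0.5 ... /usr/local/bin/nomad executor /var/nomad/alloc/3f8e0000/swagger-ui-blue/executor.out
root    1300  0.0  0.2 ... /usr/local/bin/nomad agent -config /etc/nomad
ubuntu  2222  0.0  0.1 ... grep nomad"""

print(find_executor_pids(sample, "3f8e0000"))  # [1234]
# then: kill each reported PID (e.g. `kill 1234`)
```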

@dadgar dadgar closed this as completed Oct 6, 2016
dadgar commented Oct 6, 2016

If that doesn't solve it for you, please post in the other issue.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 19, 2022