Nomad UI Bug: CPU Total Displays as 0 When Task Resources Use cores #24691

Open
ynl opened this issue Dec 17, 2024 · 2 comments

ynl commented Dec 17, 2024

Nomad version

1.9.4-dev

Operating system and Environment details

Linux

Issue

When a task's resources block specifies CPU using cores, the allocation monitoring UI incorrectly displays "0 Total" CPU under the Resources tab.

Reproduction steps

image

Expected Result

Actual Result

Job file (if appropriate)

job "cpu-test" {
  datacenters = ["dc1"]
  type = "service"

  group "test" {
    count = 1

    task "cpu-burn" {
      driver = "docker"

      config {
        image = "alpine:latest"
        command = "/bin/sh"
        args = [
          "-c",
          "for i in $(seq 1 6); do dd if=/dev/zero of=/dev/null & done; sleep 3600"
        ]
      }

      resources {
        cores = 6
        memory = 256
      }
    }
  }
}

Nomad Server logs (if appropriate)

Nomad Client logs (if appropriate)

Hendrik-vandenBoogaard-Intergas commented Dec 23, 2024

We see the same symptom, but we don't specify cores. We do, however, specify cpu, so that if a job goes over the limit we get a Prometheus-based alert that the job is using more than its allocated CPU.

Currently, those jobs hover around 0%, whereas before (Nomad 1.9.3) the values were reported correctly.

Anonymized job file:

job "mqtt-processor" {
    type = "service"
    region = "global"
    datacenters = ["spanned-dc-production"]
    meta { run_uuid = "${uuidv4()}" }

    group "mqtt-processor" {

        count = 1

        network { port "mqtt-metrics-port" { to = 9095 } }
        task "mqtt-processor" {
            driver = "docker"
            config {
                image = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
                force_pull = true
                ports = ["mqtt-metrics-port"]
                dns_servers = ["${attr.unique.network.ip-address}"]
            }

            service {
                name = "mqtt-processor"
                port = "mqtt-metrics-port"
                tags = [ "prometheus_target" ]
                check {
                    type     = "http"
                    path     = "/"
                    interval = "10s"
                    timeout  = "2s"
                }
            }

            env {
                xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
            }

            resources {
                cpu    = 1500
                memory = 128
                memory_max = 512
            }
        }
    }
}

Grafana output of job CPU consumption on 1.9.3 vs 1.9.4:
image

image

image

docker stats also shows that the CPU is much busier:

CONTAINER ID   NAME                                                          CPU %     MEM USAGE / LIMIT   MEM %     NET I/O           BLOCK I/O      PIDS
a37a9e6dd58b   mqtt-processor-b3910bbe-b277-ccce-7d1a-b156ddd04cd2           30.49%    79.34MiB / 512MiB   15.50%    3.03GB / 11.4GB   0B / 0B        14

philrenaud (Contributor) commented

@ynl We have a DefaultResources() block that gets returned to the UI in the event that no resources are defined: https://github.com/hashicorp/nomad/blob/main/api/resources.go#L64-L70

But it looks like the CPU default isn't included if Cores is defined (one precludes the other).
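
To make that concrete, here is a minimal, self-contained sketch of how such defaulting could behave. It is not the code in api/resources.go: the `Resources` struct is trimmed down, `ptr`, `defaultResources`, and `canonicalize` are hypothetical names, and the default values (100 MHz, 300 MB) are assumptions for illustration only.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for a task's resources.
// Pointer fields let us tell "not set in the job file" apart from "set to 0".
type Resources struct {
	CPU      *int // MHz
	Cores    *int
	MemoryMB *int
}

// ptr returns a pointer to i (hypothetical helper).
func ptr(i int) *int { return &i }

// defaultResources plays the role of DefaultResources(); the numbers are
// assumed for this sketch.
func defaultResources() *Resources {
	return &Resources{CPU: ptr(100), Cores: ptr(0), MemoryMB: ptr(300)}
}

// canonicalize fills in unset fields. Assumption: the CPU default is applied
// only when neither cpu nor cores is set, because one precludes the other.
func canonicalize(r *Resources) {
	d := defaultResources()
	if r.Cores == nil {
		r.Cores = d.Cores
		if r.CPU == nil {
			r.CPU = d.CPU
		}
	}
	// Still nil here means cores was set explicitly, so CPU becomes 0.
	if r.CPU == nil {
		r.CPU = ptr(0)
	}
	if r.MemoryMB == nil {
		r.MemoryMB = d.MemoryMB
	}
}

func main() {
	// Mirrors the resources block from the job file in this issue.
	r := &Resources{Cores: ptr(6), MemoryMB: ptr(256)}
	canonicalize(r)
	fmt.Printf("cpu=%d cores=%d memory=%d\n", *r.CPU, *r.Cores, *r.MemoryMB)
	// prints "cpu=0 cores=6 memory=256"
}
```

Under these assumptions, a task that sets only cores reaches the UI with a CPU value of 0, which would explain the "0 Total" display.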

I am thinking about what to show in the UI in this case where the task doesn't have a reserved amount. I could show the CPU's total (like what the client page shows), but that feels a little disingenuous. Might just show it without a divisor. What do you think?
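
As a rough illustration of the "without a divisor" option, here is a small sketch. The UI itself is not written in Go; `formatCPU` is a hypothetical helper, and the label format and sample numbers are made up for this example.

```go
package main

import "fmt"

// formatCPU renders a CPU usage label (hypothetical helper, not UI code).
// When no MHz amount is reserved (for example, the task uses cores), it
// omits the divisor instead of showing "0 Total".
func formatCPU(usedMHz, reservedMHz int) string {
	if reservedMHz <= 0 {
		return fmt.Sprintf("%d MHz", usedMHz)
	}
	pct := float64(usedMHz) / float64(reservedMHz) * 100
	return fmt.Sprintf("%d MHz / %d MHz Total (%.0f%%)", usedMHz, reservedMHz, pct)
}

func main() {
	fmt.Println(formatCPU(4200, 0))   // task with cores: no reserved MHz
	fmt.Println(formatCPU(450, 1500)) // task with cpu = 1500
}
```

The trade-off is that a cores-based task loses the percentage readout, but the label no longer implies a reservation of zero.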

=================

@Hendrik-vandenBoogaard-Intergas I am not sure your comment shows the same issue, though it does appear to show a different bug. I'm going to raise that internally and see if anyone else has noticed a CPU reporting change from 1.9.x.

@philrenaud philrenaud self-assigned this Jan 3, 2025
@philrenaud philrenaud moved this from Needs Triage to Triaging in Nomad - Community Issues Triage Jan 3, 2025
@github-project-automation github-project-automation bot moved this to Backlog in Nomad UI Jan 3, 2025
@philrenaud philrenaud moved this from Backlog to In Progress in Nomad UI Jan 3, 2025