slow client GC prevents new allocs from running (periodic dispatch) #19917

Open
aneustroev opened this issue Feb 8, 2024 · 13 comments · May be fixed by #25123
Assignees: mismithhisler
Labels: hcc/jira, stage/accepted (Confirmed, and intend to work on. No timeline commitment though.), theme/gc, theme/periodic, type/bug

Comments

@aneustroev (Contributor)

Nomad version

Nomad v1.7.4

Operating system and Environment details

Ubuntu

Issue

When I run periodic jobs, the children of the periodic job are not cleaned up by GC until the allocation count reaches gc_max_allocs, and then all new jobs get stuck in the pending state, waiting for the children to be cleaned up.

Also, I can't set gc_max_allocs to a very large value, because each alloc creates two mounts, and with more than ~30k mounts the OS becomes unstable.

Reproduction steps

1. Create a lot of high-frequency periodic jobs (>100).
2. Wait until the number of allocations exceeds gc_max_allocs.
3. Observe that all new allocations are stuck in the pending state.

Expected Result

1. GC removes old allocs based on a TTL,
or
2. GC unmounts the secrets and private mounts based on a TTL.

Actual Result

1. When gc_max_allocs is small, new allocs are created with a delay or never created at all.
2. When gc_max_allocs is large, the OS becomes unstable over time.

Job file (if appropriate)

job "hello-world" {
  # Specifies the datacenter where this job should be run
  # This can be omitted and it will default to ["*"]
  datacenters = ["*"]
  
  type = "batch"
  
  periodic {
    cron             = "*/1 * * * * *"
    prohibit_overlap = true
  }

  # A group defines a series of tasks that should be co-located
  # on the same client (host). All tasks within a group will be
  # placed on the same host.
  group "servers" {

    # Specifies the number of instances of this group that should be running.
    # Use this to scale or parallelize your job.
    # This can be omitted and it will default to 1.
    count = 1

    # Tasks are individual units of work that are run by Nomad.
    task "test" {
      # This task runs a short-lived shell command inside a Docker container
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "/bin/sh"
        # Pass the command to `sh -c` as a single string so the echo actually
        # prints the value (separate args would become positional parameters).
        args    = ["-c", "echo 123423543"]
      }


      # Specify the maximum resources required to run the task
      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}

Nomad Server logs (if appropriate)

All is good.

Nomad Client logs (if appropriate)

Also, I saw strange log lines for dead allocations:

Feb 08 05:04:44 php-2-eu1.adsrv.wtf nomad[2626928]:     2024-02-08T05:04:44.540-0500 [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=06aeb094-0272-1940-fde9-c3a841a50b00 task=php-ai-common-delayed-processing-queue-execute-rabbit-task type=Killing msg="Sent interrupt. Waiting 10s before force killing" failed=false

Why is it killing the task if the allocation is already dead?

@lgfa29 (Contributor) commented Feb 8, 2024

Hi @aneustroev 👋

I'm not sure I fully understood the problem. Is the Nomad GC preventing new allocations from being placed until it completes?

Would you be able to share the output of the status for the periodic job and the allocations when they get into this state?

Thank you!

@aneustroev (Contributor, Author)

> Is the Nomad GC preventing new allocations from being placed until it completes?

Yes.

[image attachment]

Why does Nomad take so long to remove old allocations?

@aneustroev (Contributor, Author)

Sometimes tasks hang in the queue for hours.

@aneustroev (Contributor, Author)

[image attachment: Untitled]

@lgfa29 (Contributor) commented Feb 9, 2024

Does the task have any events? And have you noticed whether this happens on a particular client, or are all allocs pending regardless of where they are scheduled?

Would you be able to share logs from the Nomad client when the problem happens? You can email them to [email protected] with the issue number in the subject.

@aneustroev (Contributor, Author) commented Feb 9, 2024

It doesn't depend on the client. When it happens, I see the following log messages:

nomad[3402844]:[INFO]  client.gc: garbage collecting allocation: alloc_id=1c459fd2-c0e8-d225-5781-96a33c10672d reason="new allocations and over max (200)"
nomad[3402844]:[INFO]  client.alloc_runner.task_runner: Task event: alloc_id=1c459fd2-c0e8-d225-5781-96a33c10672d task=<taskname> type=Killing msg="Sent interrupt. Waiting 10s before force killing" failed=false
nomad[3402844]:[INFO]  client.gc: marking allocation for GC: alloc_id=1c459fd2-c0e8-d225-5781-96a33c10672d

There are no other WARN or ERROR messages.

@aneustroev (Contributor, Author)

As long as the number of allocs stays below gc_max_allocs, everything works as expected.
Maybe the GC is too slow? Or it doesn't clean up allocs as expected.

Server GC settings:

node_gc_threshold = "24h"
job_gc_interval = "5m"
job_gc_threshold = "10m"
eval_gc_threshold = "10m"
batch_eval_gc_threshold = "10m"

Client GC settings:

gc_interval = "1m"
gc_max_allocs = 200
gc_parallel_destroys = 10
gc_inode_usage_threshold = 80
gc_disk_usage_threshold = 90
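
For reference, a minimal sketch of how these settings would sit in the Nomad agent configuration, assuming separate server and client agents (the file names are illustrative; values copied from above):

# server.hcl -- Nomad server agent
server {
  enabled                 = true
  node_gc_threshold       = "24h"
  job_gc_interval         = "5m"
  job_gc_threshold        = "10m"
  eval_gc_threshold       = "10m"
  batch_eval_gc_threshold = "10m"
}

# client.hcl -- Nomad client agent
client {
  enabled                  = true
  gc_interval              = "1m"
  gc_max_allocs            = 200
  gc_parallel_destroys     = 10
  gc_inode_usage_threshold = 80
  gc_disk_usage_threshold  = 90
}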

@lgfa29 (Contributor) commented Feb 13, 2024

I believe Nomad's client GC is non-blocking, so it shouldn't impact new allocations.

Would you be able to drop the client log level to TRACE and share the client logs with us when this problem happens again?

@aneustroev (Contributor, Author)

2024-02-19T03:16:56.261-0500 [TRACE] client.alloc_runner.task_runner: Kill requested: alloc_id=a21ae6da-a252-2f55-a52f-62669c90ca4e task=<taskname>
2024-02-19T03:16:56.261-0500 [TRACE] client.alloc_runner.task_runner: Kill event: alloc_id=a21ae6da-a252-2f55-a52f-62669c90ca4e task=<taskname> event_type=Killing event_reason=""
2024-02-19T03:16:56.261-0500 [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=a21ae6da-a252-2f55-a52f-62669c90ca4e task=<taskname> type=Killing msg="Sent interrupt. Waiting 10s before force killing" failed=false
2024-02-19T03:16:56.262-0500 [INFO]  client.gc: marking allocation for GC: alloc_id=a21ae6da-a252-2f55-a52f-62669c90ca4e
2024-02-19T03:16:56.264-0500 [DEBUG] client.gc: alloc garbage collected: alloc_id=a21ae6da-a252-2f55-a52f-62669c90ca4e
2024-02-19T03:16:56.264-0500 [INFO]  client.gc: garbage collecting allocation: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018 reason="new allocations and over max (100)"
2024-02-19T03:16:56.264-0500 [DEBUG] client.alloc_runner.runner_hook.group_services: delay before killing tasks: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018 group=<taskname> shutdown_delay=30s
2024-02-19T03:17:26.266-0500 [TRACE] client.alloc_runner.task_runner: Kill requested: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018 task=<taskname>
2024-02-19T03:17:26.266-0500 [TRACE] client.alloc_runner.task_runner: Kill event: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018 task=<taskname> event_type=Killing event_reason=""
2024-02-19T03:17:26.266-0500 [INFO]  client.alloc_runner.task_runner: Task event: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018 task=<taskname> type=Killing msg="Sent interrupt. Waiting 10s before force killing" failed=false
2024-02-19T03:17:26.267-0500 [INFO]  client.gc: marking allocation for GC: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018
2024-02-19T03:17:26.269-0500 [DEBUG] client.gc: alloc garbage collected: alloc_id=9d828c6a-9df0-50ce-dce0-a6fdcc5b6018
2024-02-19T03:17:26.269-0500 [INFO]  client.gc: garbage collecting allocation: alloc_id=27225c79-23f9-8bbd-ec89-6015c77d2bbe reason="new allocations and over max (100)"
2024-02-19T03:17:26.269-0500 [DEBUG] client.alloc_runner.runner_hook.group_services: delay before killing tasks: alloc_id=27225c79-23f9-8bbd-ec89-6015c77d2bbe group=<taskname> shutdown_delay=30s

Why does this happen?

client.alloc_runner.runner_hook.group_services: delay before killing tasks: alloc_id=27225c79-23f9-8bbd-ec89-6015c77d2bbe group=<taskname> shutdown_delay=30s

@cl-bvl commented Feb 27, 2024

Some details here #7787 (comment)

@aneustroev (Contributor, Author) commented Apr 3, 2024

Lowering shutdown_delay is a solution.
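
For anyone hitting the same problem, a minimal sketch of that workaround based on the job file above, assuming the affected groups had raised the group-level shutdown_delay to the 30s seen in the client logs; keeping it at the default "0s" (or any short value) means the client GC's group_services hook no longer pauses 30 seconds for every completed allocation it destroys:

job "hello-world" {
  datacenters = ["*"]
  type        = "batch"

  periodic {
    cron             = "*/1 * * * * *"
    prohibit_overlap = true
  }

  group "servers" {
    # The group-level shutdown_delay defaults to "0s". Leaving it there (or
    # lowering it from 30s) removes the per-allocation "delay before killing
    # tasks" pause that serializes client GC in the logs above.
    shutdown_delay = "0s"

    task "test" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "/bin/sh"
        args    = ["-c", "echo 123423543"]
      }
    }
  }
}

shutdown_delay mainly exists to give service deregistration time to propagate before tasks receive the kill signal, so it is rarely needed for short-lived batch jobs such as periodic dispatch children.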

@tgross changed the title from "GC for periodic tasks" to "slow client GC prevents new allocs from running (periodic dispatch)" on Jun 24, 2024
@tgross added the stage/accepted and hcc/jira labels and removed the stage/waiting-reply label on Jun 24, 2024
@tgross moved this from Triaging to Needs Roadmapping in Nomad - Community Issues Triage on Jun 24, 2024
@tgross (Member) commented Jun 24, 2024

Sorry about the delays in returning to this (and the seemingly related #7787). I'm going to mark this as accepted and for roadmapping.

@tgross (Member) commented Feb 11, 2025

@mismithhisler self-assigned this on Feb 14, 2025
@mismithhisler linked pull request #25123 on Feb 14, 2025 that will close this issue