Nomad does not remove cgroups for terminated exec tasks.
As a result, the kernfs_node_cache and task_struct SLAB caches use more and more memory on the host system.
Eventually the host becomes unstable: it runs out of memory, starts to swap, and then page allocation failures occur.
Reproduction steps
1.) Start a batch job via Nomad that:
runs a command that is available in the exec chroot and finishes quickly, e.g. /bin/ls
runs periodically every 1 second (optionally with prohibit_overlap = true)
2.) Monitor the number of cgroups on the system created by Nomad,
e.g. via watch -n 1 'find $(ls /sys/fs/cgroup/*/nomad -d) -type d| wc -l'; the number grows continuously.
Monitor the slab caches via slabtop -s c -d1; the kernfs_node_cache and task_struct caches grow continuously.
Eventually the system runs out of available memory, starts to swap, and page allocation failures occur.
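The monitoring in step 2 can also be scripted so that the growth ends up in a log instead of a watch screen. The following is a minimal sketch, not part of the original report: the function names are hypothetical, the /sys/fs/cgroup/*/nomad glob is taken from the command above, and the memory figure is only a rough estimate (number of objects times object size from /proc/slabinfo, which usually requires root to read).

```shell
#!/bin/sh
# Hypothetical helpers for logging the leak described above.

# Count every directory under the given cgroup roots (roots included).
count_cgroup_dirs() {
    # Unreadable or missing roots are skipped and contribute nothing.
    find "$@" -type d 2>/dev/null | wc -l | tr -d ' '
}

# Print "<unix time> <directory count>" SAMPLES times, INTERVAL seconds apart.
log_cgroup_growth() {
    interval=$1
    samples=$2
    shift 2
    i=0
    while [ "$i" -lt "$samples" ]; do
        printf '%s %s\n' "$(date +%s)" "$(count_cgroup_dirs "$@")"
        i=$((i + 1))
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Report object count and approximate size of the two affected slab caches.
# Columns in /proc/slabinfo: name, active_objs, num_objs, objsize, ...
slab_usage() {
    slabinfo=${1:-/proc/slabinfo}
    awk '$1 == "kernfs_node_cache" || $1 == "task_struct" {
        printf "%s %d objects, %d KiB\n", $1, $3, $3 * $4 / 1024
    }' "$slabinfo"
}
```

For example, log_cgroup_growth 1 60 /sys/fs/cgroup/*/nomad takes one sample per second for a minute, and slab_usage run as root before and after shows whether the two caches grew over the same period.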
@fho anytime! It'll go out in 0.10.3. Thank you so much for reporting it.
For context, Nomad has leaked cgroups since a regression in 0.9.0 :(. If an exec task exited with a zero exit code, Nomad 0.9 didn't clean up its cgroups. Nomad 0.10.2 fixed this in #6722. The systemd cgroup was a special case, though, and we didn't clean it up properly; we addressed that in #6839.
Let us know if you have any questions or further observations!
Nomad version
Reproduced with:
Operating system and Environment details
Reproduced with Linux kernels:
Fix: Remove cgroups when an exec task terminates
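To illustrate what removing a cgroup involves: a cgroup is deleted by calling rmdir on its directory, and the kernel refuses if the cgroup still has processes attached or child cgroups. The sketch below is a manual stopgap for an affected host, not how Nomad implements the fix. The function name is hypothetical, it assumes GNU find, and it could in principle race with a task whose cgroup has just been created but not yet populated.

```shell
#!/bin/sh
# Hypothetical helper: try to rmdir every directory below ROOT, deepest first.
# On cgroupfs, rmdir only succeeds for cgroups without processes or children,
# so cgroups of running tasks are left in place.
remove_empty_cgroups() {
    root=$1
    # -mindepth 1 keeps ROOT itself; -depth visits children before parents.
    find "$root" -mindepth 1 -depth -type d -exec rmdir {} \; 2>/dev/null
    return 0
}
```

It would be run once per hierarchy, e.g. remove_empty_cgroups /sys/fs/cgroup/memory/nomad, where the exact path depends on the cgroup layout of the host.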
Job file (if appropriate)