
Feature: set limits for exec driver #2352

Closed
ashald opened this issue Feb 23, 2017 · 17 comments
Labels
stage/accepted (confirmed, and intend to work on; no timeline commitment), theme/driver/exec, type/enhancement

Comments

@ashald

ashald commented Feb 23, 2017

Is it possible to configure Linux limits (as in https://linux.die.net/man/5/limits.conf) with Nomad when the exec driver is used? It'd be great if it were possible to set them at the task level, or at least at the client config level.
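For reference, limits.conf(5) entries look roughly like this; the nofile/nproc values below are only an illustration:

    # /etc/security/limits.conf (format: <domain> <type> <item> <value>)
    *    soft    nofile    65536
    *    hard    nofile    65536
    *    soft    nproc     4096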

@dadgar
Contributor

dadgar commented Feb 25, 2017

What exactly are you trying to accomplish with that? We generally try to avoid leaking too many low-level details into the job spec!

@ashald
Author

ashald commented Feb 25, 2017

Some processes need more resources (the ones controlled via these limits) than others, so I thought it might be a nice fit for driver-specific options within the job description. OTOH, one may argue that these can be set at the user level and that different users can be used instead.

@dadgar
Contributor

dadgar commented Feb 27, 2017

Which limits are you interested in? Process and fd limits?

@ashald
Author

ashald commented Feb 27, 2017

Yeah, nofile and nproc.

@ashald
Author

ashald commented Mar 5, 2017

Just for the record, Docker has an option to tweak limits:

--ulimit ulimit                         Ulimit options (default [])

If Nomad's exec driver aims to be on par with Docker (at least that is my impression), then it might make sense to implement this feature.
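For illustration, the flag takes soft:hard pairs; the alpine image here is just an arbitrary example:

    docker run --rm --ulimit nofile=65536:65536 alpine sh -c 'ulimit -n'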

@rokka-n

rokka-n commented May 9, 2017

That would be really nice to have, since one misbehaving container can easily bring down a node by leaking connections.

@tantra35
Contributor

Any chance that this will be implemented?

@dadgar
Contributor

dadgar commented Jan 24, 2018

@tantra35 Yes it will be. Not currently slated for a particular release.

@prologic

We are currently blocked by this in our production setup and cannot run Elasticsearch using the exec driver, as ES apparently requires an open-files ulimit (nofile) of at least 65535.

@prologic

OTOH ES's requirement of 65k open fds seems absurd to me :/ (separate issue)

@prologic

Work-around (if running Nomad on a systemd-based system):

Set LimitNOFILE=65536 in the [Service] section of your Nomad systemd unit.
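A minimal sketch, assuming the agent runs under a unit named nomad.service; a drop-in override avoids editing the packaged unit file:

    # /etc/systemd/system/nomad.service.d/limits.conf
    [Service]
    LimitNOFILE=65536

Reload units and restart the agent afterwards (systemctl daemon-reload && systemctl restart nomad); tasks started after that inherit the new limit.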

@prologic

Processes launched by the exec driver (which sets up cgroups through libcontainer/runc) inherit the rlimits of the parent process (in this case, the nomad agent).
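One way to confirm what tasks will inherit is to check the agent's own limits (assuming the agent process is named nomad):

    grep 'open files' /proc/$(pgrep -x nomad | head -n1)/limits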

@mr-karan
Contributor

mr-karan commented Jul 1, 2022

I tried a bunch of workarounds to apply sysctl limits inside an exec task, but none seem to be applied correctly:

Inside the task, /etc/sysctl.conf (which is present inside the exec task, since chroot_env is the default) contains:

net.core.somaxconn=60000

However, it's not "in use":

cat /proc/sys/net/core/somaxconn
4096

Running sysctl -p doesn't help either:

sysctl -p
sysctl: setting key "fs.file-max", ignoring: Read-only file system
sysctl: setting key "net.ipv4.ip_local_port_range", ignoring: Read-only file system
sysctl: setting key "net.netfilter.nf_conntrack_max": Read-only file system
sysctl: setting key "net.ipv4.tcp_syncookies", ignoring: Read-only file system
sysctl: setting key "net.core.somaxconn", ignoring: Read-only file system
sysctl: setting key "net.ipv4.tcp_max_syn_backlog", ignoring: Read-only file system
sysctl: setting key "net.ipv4.tcp_slow_start_after_idle", ignoring: Read-only file system
sysctl: setting key "net.ipv4.tcp_fin_timeout", ignoring: Read-only file system
sysctl: setting key "net.ipv4.tcp_keepalive_time", ignoring: Read-only file system
sysctl: setting key "vm.overcommit_memory", ignoring: Read-only file system

However, I was able to set a higher max-open-files limit for the process by adjusting LimitNOFILE in the systemd service:

sed -i 's/LimitNOFILE=65536/LimitNOFILE=900000/g' /lib/systemd/system/nomad.service

So, now ulimit correctly shows the max number of open files for that process:

ulimit -n
900000

But the other sysctl settings have no effect for an alloc that is already running until I restart the Nomad agent:

cat /proc/sys/net/core/somaxconn
4096

On restarting the agent:

sudo systemctl restart nomad
cat /proc/sys/net/core/somaxconn
60000


This seems like a weird edge case where Nomad applies the limits inside the task only after a restart.


Another workaround I tried was to mount /proc/sys from the host inside chroot_env, but even that had no effect:

      # Kernel Params
      "/proc/sys/fs/file-max" = "/proc/sys/fs/file-max"
      "/proc/sys/net/ipv4/ip_local_port_range" = "/proc/sys/net/ipv4/ip_local_port_range"
      "/proc/sys/net/netfilter/nf_conntrack_max" = "/proc/sys/net/netfilter/nf_conntrack_max" 
      "/proc/sys/net/ipv4/tcp_syncookies" = "/proc/sys/net/ipv4/tcp_syncookies"
      "/proc/sys/net/core/somaxconn" = "/proc/sys/net/core/somaxconn" 
      "/proc/sys/net/ipv4/tcp_max_syn_backlog" = "/proc/sys/net/ipv4/tcp_max_syn_backlog"
      "/proc/sys/net/ipv4/tcp_slow_start_after_idle" = "/proc/sys/net/ipv4/tcp_slow_start_after_idle"
      "/proc/sys/net/ipv4/tcp_fin_timeout" = "/proc/sys/net/ipv4/tcp_fin_timeout"
      "/proc/sys/net/ipv4/tcp_keepalive_time" = "/proc/sys/net/ipv4/tcp_keepalive_time"
      "/proc/sys/vm/overcommit_memory" = "/proc/sys/vm/overcommit_memory"

Is there any way to signal the exec driver to load sysctl kernel params before the task gets started? It seems like a huge bummer for deploying high-throughput apps on exec.

@tgross
Member

tgross commented Sep 30, 2022

👋 I'm going to close out this issue because, frankly, we're not likely to implement limits in the exec driver at this point without a full refresh of the driver, which we've been chatting about internally. When we open an issue around an exec driver refresh, we'll be sure to keep this problem in mind.

@mr-karan, as for what you're seeing: note that Nomad's exec driver doesn't set limits at all; that's actually what this ancient feature request is all about! The limits are being set elsewhere on your system, usually inherited from the systemd service unit. I suspect what's happening here is that the executor is getting reparented to PID 1 (systemd) when you're restarting the Nomad client, and systemd is then applying the default limits there. But if you dig into that some more and need more help, please feel free to open a new issue.

@tgross closed this as not planned on Sep 30, 2022
@prologic

Hmmm, it's not that hard to write the code to exec the process in a new cgroup and apply some of the same limits you can apply in other types of Nomad jobs.

@tgross
Member

tgross commented Oct 1, 2022

It's not a matter of difficulty; it's a matter of piling more features onto the exec driver in its current form, which we're not intending to do.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Jan 29, 2023