[FEATURE] Change RealMemory of compute nodes to match total instance type memory #283
Comments
I fully endorse this proposal. I already do this with my submission scripts or the way I specify the submission. This proposal means that a request of --mem=8g will get an 8 GB machine, i.e. PC (or Slurm) sees all 8 GB. I'm all for this.
The default is to set RealMemory to 95% of the instance memory, which makes it difficult to request memory correctly: you could submit a job that requests 4 GB and wind up running on an 8 GB machine. This change just makes it more intuitive. Resolves #283
Turned out to be a super easy change and now things work the way that I expect. You still might want to use the memory based partitions to make sure that a job doesn't land on a larger machine.
Is your feature request related to a problem? Please describe.
By default, PC sets the RealMemory of a compute node to 95% of the total instance type memory.
This is because not 100% of the memory is available to jobs.
When slurmd starts, it reports the amount of free memory and if it is less than the configured RealMemory then it flags an error and marks the node as Drain.
This behavior is modified by the following Slurm config parameter:
SlurmctldParameters=node_reg_mem_percent=75
This allows the node to register with the controller as long as the available memory is at least 75% of the configured RealMemory.
This works because slurmd will not configure a cgroup for a job that exceeds the amount of available real memory.
So even if the scheduler allocates a job that uses more than the available memory, the job will not be allowed to cause an OOM situation and crash the instance.
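The registration rule described above can be sketched as a simplified model. This is not Slurm's actual code: the function name `can_register`, the default of 100, and the reported value of 7800 MiB are assumptions made for illustration.

```python
def can_register(reported_mib: int, real_memory_mib: int,
                 node_reg_mem_percent: int = 100) -> bool:
    """Simplified model of the registration check (assumed, not Slurm source).

    The node registers if the memory it reports is at least
    node_reg_mem_percent percent of the configured RealMemory.
    """
    return reported_mib * 100 >= real_memory_mib * node_reg_mem_percent


real_memory_mib = 8 * 1024   # RealMemory configured as 100% of an 8 GB instance
reported_mib = 7800          # hypothetical value reported by slurmd

# Strict check: the node reports less than RealMemory, so it would be drained.
print(can_register(reported_mib, real_memory_mib))        # False

# With node_reg_mem_percent=75 the same node registers successfully.
print(can_register(reported_mib, real_memory_mib, 75))    # True
```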
The current configuration results in unintuitive behavior and waste of memory.
Let's consider an instance with 8 GB of real memory.
PC configures the RealMemory as 8 * 1024 * 0.95 = 7782 MiB = 7.6 GiB.
However, you'd kind of expect to be able to fit two 4 GB jobs on that machine.
But to do that today you'd have to request 3.8 GB for each job or else they won't fit.
If you specify 4 GB then the first job will start an 8 GB instance, not a 4 GB instance, and reserve 4 GB for the job.
The scheduler will see the instance as having only 3.6 GB memory free for additional jobs.
So, the next 4 GB job will start another 8 GB machine.
This uses double the compute resources that would be expected.
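The arithmetic in this example can be checked with a short sketch. The helper names `jobs_per_node` and `instances_needed` are hypothetical, and the model only considers memory, ignoring CPUs and other resources.

```python
import math


def jobs_per_node(real_memory_mib: int, job_mib: int) -> int:
    """Number of jobs of a given memory request that fit in RealMemory."""
    return real_memory_mib // job_mib


def instances_needed(num_jobs: int, real_memory_mib: int, job_mib: int) -> int:
    """Instances required to run num_jobs identical jobs (memory only)."""
    return math.ceil(num_jobs / jobs_per_node(real_memory_mib, job_mib))


total_mib = 8 * 1024                      # 8 GB instance
job_mib = 4 * 1024                        # each job requests 4 GB

real_memory_95 = int(total_mib * 0.95)    # current default: 7782 MiB
real_memory_100 = total_mib               # proposed: 8192 MiB

print(real_memory_95)                                   # 7782
print(instances_needed(2, real_memory_95, job_mib))     # 2
print(instances_needed(2, real_memory_100, job_mib))    # 1
```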
What I propose is to configure each compute resource as having 100% RealMemory and set node_reg_mem_percent to a value that allows the compute nodes to register successfully.
This should allow jobs to use round numbers for memory requests and allow more efficient instance utilization.
Job memory requests have to exceed actual memory requirements to prevent jobs from running out of memory,
so the fact that available memory is less than the configured RealMemory shouldn't be an issue.
Note that this is already the case anyway.
If anything, this should make more memory available to jobs.