Codify processor RAM usage into nomad job specs #54
I emailed Steven Piccolo (the creator of SCAN.UPC) about this a while ago. This was his response:
On Linux, you can use a command like this:
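Presumably this is the same one-liner quoted again in a later comment; it reports a command's peak memory usage ("Maximum resident set size") in KB:

```bash
/usr/bin/time -v $command 2>&1 >/dev/null | grep Maximum | awk '{print $NF}'
```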
Can tag @srp33 too. 😄
I've updated the title to reflect that this should probably address multiple processor types. @kurtwheeler can you go back through and note what additional testing/experimentation, if any, needs to be done?
Thanks for updating the title; this should in fact address multiple processor types. I do not have a definitive answer as to the full extent of testing we _should_ do, but I can provide a lot of information that will hopefully facilitate a discussion that gets us to a good enough solution. The reason is that this is partially an optimization and partially extremely necessary for things to run smoothly.

Let me start by explaining the extremely necessary part. Nomad jobs have memory specifications on them, which Nomad uses to bin-pack jobs onto instances that have the resources available to run them. This is a nice feature because it lets us put as many jobs onto a machine as possible without running out of memory and having to switch to swap, which would be an enormous performance degradation. However, this feature also has a downside: Nomad disables swap for those containers and will not let them exceed their memory requirement. If a job does, Nomad kills it. The key takeaway is that we have to tell Nomad how much memory to allocate for a job, and if the job exceeds that amount of RAM it will be killed. Therefore, underestimating the amount of memory to specify for a type of job (possibly done on a per-processor basis, but even different platforms can have different memory requirements, as demonstrated above) means that any jobs which exceed it will never complete until we notice and raise the memory limit. So we need a pessimistic estimation of how much memory each type of job can take, by which I mean an estimate that is probably higher than the highest amount of memory that type of job will ever use. That's the extremely necessary part of this issue.

The part of this issue that relates to optimization is the granularity of the job types and the accuracy of our estimates. The more granular our job types are, the more "personalized" they will be to the requirements of the jobs which fall under each type. For example, consider the numbers Dr. Piccolo provided above for SCAN. He broke them out by platform type and found that while U133 only needs a little more than one GB of RAM, HTA 2 arrays need close to 12. This means that if we use processor type as the way to differentiate job types, we would have to assume that all SCAN jobs need 12 GB of RAM (although we'd probably want to pad that by 1.5x or so to account for the fact that Dr. Piccolo probably did not happen to run the sample with the very highest memory usage for that platform). So we would allocate 12 GB of RAM for every SCAN job when the average job probably uses less than 4 GB (a rough estimate based on the other platforms he provided data for). Doing things that way therefore has the potential to increase the cost of processing by ~3x!!! This is not a minor optimization.

However, there are a lot of ways we can tackle this, which is why I'm writing all of this out instead of just providing a definitive answer. I have a few rough ideas to start with, but I think that as a team we could potentially come up with even better ones:
Finally, it appears to be possible to rebuild Nomad so that it accepts a memory limit which isn't used for bin packing. This would allow the memory limit used for allocation planning and the actual hard memory limit on the Docker container to be different. I would say that rebuilding Nomad is definitely not the best long-term solution, but it may be a reasonable way to do the initial million samples without sinking an immense amount of time into optimization. We could potentially even keep track of the peak memory usage while running those, to give us data to inform a more robust solution later.

So, to finally directly address @jaclyn-taroni's question of "note what additional testing/experimentation, if any, needs to be done": it depends how we decide to specify job types. That being said, the following will output the max memory usage (in KB) of a command:

```
/usr/bin/time -v $command 2>&1 >/dev/null | grep Maximum | awk '{print $NF}'
```

Really, I think CPU is not that big of a deal given that memory is the constraining factor for us. We probably won't run into that issue as much, so we can probably just specify fairly low CPU numbers so Nomad basically ignores them. However, I have put some thought into this, so I will record it below. Feel free to stop reading here.

Testing CPU is more difficult, since a command like the one above only captures memory usage. @Miserlou suggested that the best way to determine CPU requirements for a certain process is to limit how much CPU it can use and see where performance drops off. Since these processors are memory-bound, they should have relatively constant performance until they don't have enough CPU power, at which point their performance will drop off drastically. It appears that this could be done using cpulimit, although that tool enforces the CPU limit using SIGSTOP and SIGCONT, so I'm not actually sure it would end up having the desired effect after all.
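As a rough sketch of that experiment (not a tested recipe): assuming `cpulimit` is installed and `PROCESSOR_CMD` is a placeholder for the real processor invocation, something like the following could sweep CPU limits and show where runtime falls off a cliff. Per the caveat above, the SIGSTOP/SIGCONT approach may distort wall-clock timings.

```bash
# Hypothetical sketch: sweep CPU limits and watch where runtime degrades.
# PROCESSOR_CMD is a placeholder for the actual processor invocation.
PROCESSOR_CMD="Rscript run_scan_upc.R sample.CEL"

for limit in 25 50 100 200 400; do            # percent of a single core
    start=$(date +%s)
    cpulimit -l "$limit" $PROCESSOR_CMD > /dev/null 2>&1
    end=$(date +%s)
    echo "cpu_limit=${limit}% runtime=$((end - start))s"
done
```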
So I discussed this some with @cgreene and a few relevant things came up:
Therefore it should be possible to loop through these two lists:
For each item of each list, query for a few samples (maybe 5 or 6), record the peak memory usage for processing each one, average them together, and use that as an estimate of the RAM needs for that platform or organism (a rough sketch of that loop is included below). Therefore two things need to happen for this issue to be resolved:
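Here is a minimal sketch of that estimation loop, assuming hypothetical helpers `query_samples.sh` (returns a handful of sample IDs for a platform or organism) and `process_sample.sh` (runs the relevant processor on one sample), and reusing the `/usr/bin/time` trick from above:

```bash
# Hypothetical sketch of the estimation loop described above.
# platforms.txt, query_samples.sh and process_sample.sh are placeholders.
N_SAMPLES=5

while read -r platform; do
    total_kb=0
    for sample in $(./query_samples.sh "$platform" "$N_SAMPLES"); do
        peak_kb=$(/usr/bin/time -v ./process_sample.sh "$sample" 2>&1 >/dev/null \
                    | grep Maximum | awk '{print $NF}')
        total_kb=$((total_kb + peak_kb))
    done
    echo "$platform avg_peak_kb=$((total_kb / N_SAMPLES))"
done < platforms.txt   # repeat with an organisms list for the RNA-seq-style processors
```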
Noting here that MultiQC can also use a large amount of memory (or at least more than we currently allocate it, as I had a few of these in the logs):
I've submitted a PR to get better telemetry out of Nomad. We should be able to use this to reevaluate our job requirements after we run a staging crunch.
Context
#55 creates different Nomad job specifications for different processor job types. One benefit of this is that we can specify the resource requirements (probably just RAM/CPU) for each job type so that Nomad can schedule the work in a (hopefully) intelligent way.
Problem or idea
Assuming this works well, we'll want to do this for all job types. However we should make sure that Nomad does in fact do a good job of scheduling work before we do this for everything. Therefore we should start with just one, so why not SCAN.UPC?
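One low-effort way to sanity-check the scheduling (a sketch of an approach, not an established workflow) is to eyeball placements and resource utilization with the Nomad CLI once a few jobs are running; the job name below is a placeholder:

```bash
nomad status                      # list registered jobs
nomad job status SCAN_PROCESSOR   # placeholder job name; shows its allocations
nomad alloc status <alloc-id>     # per-allocation resource utilization
nomad node status <node-id>       # what has been allocated on a client node
```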
Solution or next step
We need to test out SCAN.UPC on a variety of file sizes and see what a reasonable upper bound for RAM and CPU is. Once we've determined these, they should be encoded into the Nomad job specification for SCAN.UPC jobs created in #55.
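A minimal sketch of what that testing could look like, assuming a directory of CEL files of varying sizes and a hypothetical `run_scan_upc.R` wrapper for the SCAN.UPC invocation:

```bash
# Hypothetical sketch: record peak memory for SCAN.UPC across input files of
# varying sizes, then pad the observed maximum to pick a job spec limit.
max_kb=0
for cel in test_data/*.CEL; do
    peak_kb=$(/usr/bin/time -v Rscript run_scan_upc.R "$cel" 2>&1 >/dev/null \
                | grep Maximum | awk '{print $NF}')
    echo "$cel peak_kb=$peak_kb"
    (( peak_kb > max_kb )) && max_kb=$peak_kb
done
# Pad by 1.5x (per the discussion above) and convert KB -> MB, since Nomad's
# resources stanza takes memory in MB.
echo "suggested memory limit (MB): $(( max_kb * 3 / 2 / 1024 ))"
```

The resulting number (plus whatever padding we settle on) would then be the value to encode into the memory field of the SCAN.UPC job specification from #55.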