Less intelligent mode for GPU allocation #11

Open · kouyk opened this issue Nov 8, 2021 · 7 comments
Labels: enhancement (New feature or request)

@kouyk commented Nov 8, 2021

Let's say I have 2 GPUs that are shared with others; I would like to allocate a single job to a single GPU.

Using the --gpus option requires that a GPU is considered free, but setting the right free percentage might be tricky. The -g flag ignores the free requirement, but consecutive jobs assigned to the same GPU will start as long as there are available slots. The high-level view is that there would be a single slot for each GPU, and a job would run on a GPU only as long as the current user does not have a process running on it.

Essentially I want to be able to just specify the number of GPUs needed by a job, and task-spooler will allocate the GPUs based on whether there are any running jobs on them, regardless of memory usage. It is a hybrid mode between the automatic and manual allocation.
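
To make it concrete, the selection rule I have in mind is roughly the sketch below, using nvidia-smi to treat a GPU as free only when it hosts no compute process at all. The function name is just for illustration; this is not ts code:

```python
# Rough sketch of the allocation rule: a GPU counts as "free" only if it
# has no running compute processes, regardless of how much memory is in use.
import subprocess

def gpus_without_processes():
    # Index and UUID of every GPU on the machine.
    gpus = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,uuid", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip().splitlines()
    # UUIDs of GPUs that currently host at least one compute process.
    busy = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=gpu_uuid", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    free = []
    for line in gpus:
        index, uuid = [field.strip() for field in line.split(",")]
        if uuid not in busy:
            free.append(int(index))
    return free

print(gpus_without_processes())  # e.g. [1] if GPU 0 already runs a job
```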

What I am currently doing is creating two different ts servers that use different TMPDIR values and using the -g flag to force a single GPU for jobs submitted to a given server, which isn't ideal and somewhat defeats the purpose of ts.

By the way, could there be a configuration file that permanently sets the env vars? It would be great if things like the GPU wait could be set permanently as well.

@justanhduc (Owner)
Hi @kouyk. Let me clarify my understanding a bit.

> Let's say I have 2 GPUs that are shared with others; I would like to allocate a single job to a single GPU.
>
> Using the --gpus option requires that a GPU is considered free, but setting the right free percentage might be tricky. The -g flag ignores the free requirement, but consecutive jobs assigned to the same GPU will start as long as there are available slots. The high-level view is that there would be a single slot for each GPU, and a job would run on a GPU only as long as the current user does not have a process running on it.
>
> Essentially I want to be able to just specify the number of GPUs needed by a job, and task-spooler will allocate the GPUs based on whether there are any running jobs on them, regardless of memory usage. It is a hybrid mode between the automatic and manual allocation.

So you have 2 GPUs, and there might be some other people using them. That's why -G and setting a free percentage don't work, because you don't know how much memory will be used by other users' processes, but you still want to run your process anyway? I would advise against running like this, as it may crash both your and the other users' processes.

It is possible to determine free GPUs based on processes (I have to check again whether it is possible based on processes from a certain user, though). Are you sure you want to do this given the risk of crashing not only your own but also other people's jobs?
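
For reference, attributing GPU compute processes to users looks doable on Linux with something like the sketch below (nvidia-smi PIDs mapped to the owner of /proc/<pid>; illustration only, not how it would be implemented inside ts):

```python
# Sketch: map each GPU's compute processes to the users that own them,
# by looking up the owner of /proc/<pid> for every PID nvidia-smi reports.
import os
import pwd
import subprocess

def gpu_process_owners():
    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=gpu_uuid,pid", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    owners = {}  # gpu_uuid -> set of user names
    for line in out.splitlines():
        if not line:
            continue
        uuid, pid = [field.strip() for field in line.split(",")]
        try:
            uid = os.stat(f"/proc/{pid}").st_uid
            user = pwd.getpwuid(uid).pw_name
        except (FileNotFoundError, KeyError):
            continue  # process already exited, or uid has no passwd entry
        owners.setdefault(uuid, set()).add(user)
    return owners
```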

> By the way, could there be a configuration file that permanently sets the env vars?

The idea of a configuration file is cool. Thanks for the suggestion.

> It would be great if things like the GPU wait could be set permanently as well.

GPU wait time will be removed in the next release due to an internal change in the way GPUs are assigned to jobs. Specifically, the server will be in charge of assigning GPUs to clients, instead of letting clients choose as in the current version.

@justanhduc added the enhancement label Nov 9, 2021
@kouyk (Author) commented Nov 9, 2021

> So you have 2 GPUs, and there might be some other people using them. That's why -G and setting a free percentage don't work, because you don't know how much memory will be used by other users' processes, but you still want to run your process anyway? I would advise against running like this, as it may crash both your and the other users' processes.
>
> It is possible to determine free GPUs based on processes (I have to check again whether it is possible based on processes from a certain user, though). Are you sure you want to do this given the risk of crashing not only your own but also other people's jobs?

Yup, I understand the risk. My view is that since ts already has the -g option, the same risk is already present. An idea came to me as I thought about the crashing issue: since ts monitors the exit code, would it be useful to have a "queue and retry n times" option to deal with insufficient memory? The situation is quite dynamic, as neither my jobs nor others' have fixed memory usage; without a reliable way to calculate this, the best alternative seems to be trial and error.
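
As a rough illustration of what I mean, a standalone wrapper could look like the sketch below (the built-in version would presumably requeue the job through ts instead):

```python
# Sketch of the "retry n times" idea as an external wrapper: rerun the
# command if it exits with a non-zero status (e.g. after a CUDA OOM),
# with a pause between attempts.
import subprocess
import sys
import time

def run_with_retries(cmd, retries=3, delay=60):
    for attempt in range(1, retries + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return 0
        print(f"attempt {attempt}/{retries} failed with code {result.returncode}",
              file=sys.stderr)
        time.sleep(delay)
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_with_retries(sys.argv[1:]))
```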

> GPU wait time will be removed in the next release due to an internal change in the way GPUs are assigned to jobs. Specifically, the server will be in charge of assigning GPUs to clients, instead of letting clients choose as in the current version.

Oh, so it won't be possible to choose a specific GPU anymore?

@justanhduc (Owner)

>> So you have 2 GPUs, and there might be some other people using them. That's why -G and setting a free percentage don't work, because you don't know how much memory will be used by other users' processes, but you still want to run your process anyway? I would advise against running like this, as it may crash both your and the other users' processes.
>>
>> It is possible to determine free GPUs based on processes (I have to check again whether it is possible based on processes from a certain user, though). Are you sure you want to do this given the risk of crashing not only your own but also other people's jobs?
>
> Yup, I understand the risk. My view is that since ts already has the -g option, the same risk is already present. An idea came to me as I thought about the crashing issue: since ts monitors the exit code, would it be useful to have a "queue and retry n times" option to deal with insufficient memory? The situation is quite dynamic, as neither my jobs nor others' have fixed memory usage; without a reliable way to calculate this, the best alternative seems to be trial and error.

IMO, when a user uses -g, they understand what would happen in terms of memory usage, etc. On the other hand, letting ts choose among occupied GPUs is somewhat of a Russian roulette.

Anyway, we can have an option that tells ts to choose GPUs based on processes. Would you like to make a PR?

The idea of retrying a process is cool btw! This is not difficult as far as I can tell now.

> GPU wait time will be removed in the next release due to an internal change in the way GPUs are assigned to jobs. Specifically, the server will be in charge of assigning GPUs to clients, instead of letting clients choose as in the current version.

> Oh, so it won't be possible to choose a specific GPU anymore?

Yes, you can still use -g like before. Basically nothing will change from the user's viewpoint, except that the two flags --set_gpu_wait and --get_gpu_wait will be removed.

@kouyk (Author) commented Nov 11, 2021

> Anyway, we can have an option that tells ts to choose GPUs based on processes. Would you like to make a PR?

I would love to make a PR; however, since I don't have an Nvidia card on my personal device, it is rather challenging for me to test and debug the issues on a remote node where I don't have root privileges. After my current commitments are over, I might look into contributing as well :)

@yanggthomas

Hi, I would like to know the current status of the set_gpu_wait flag. Is there any other mechanism that avoids crashes due to insufficient memory? Or how can I force one job per GPU?

@justanhduc (Owner)

Hey @yanggthomas. Due to some internal changes, set_gpu_wait has been deprecated for some time and is now just a no-op kept for compatibility. ts never selects a "working" GPU to run a job unless the GPU ID is specified explicitly. Whether a GPU is free is determined by the percentage of free memory, which can be seen and set via get_gpu_free_perc and set_gpu_free_perc. By default, if more than 90% of the GPU memory is free, the GPU is eligible to run a queued job. Let me know if there's anything unclear.
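
In other words, the eligibility check is roughly equivalent to the following sketch (Python only for illustration; ts implements this in C):

```python
# Sketch of the free-memory rule: a GPU can take a queued job when its
# free memory fraction exceeds the configured threshold (90% by default).
import subprocess

def eligible_gpus(free_perc=90):
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.free,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    eligible = []
    for line in out.splitlines():
        index, free_mib, total_mib = [int(f.strip()) for f in line.split(",")]
        if 100.0 * free_mib / total_mib > free_perc:
            eligible.append(index)
    return eligible
```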

@lucasb-eyer

Hi, could we please have a way for ts to consider scheduling multiple jobs on a single GPU automatically?

I understand you didn't like the "free memory" heuristic because of job init time.

May I suggest a "set_max_jobs_per_gpu" kind of flag? If I know that my GPU can fit 3 of my jobs, then I can just set this and things will work, and I'll use my GPUs better.
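
As a sketch of the policy I mean (illustrative names only, no such flag exists today): a GPU stays schedulable while it hosts fewer than a configured number of compute processes.

```python
# Sketch: count running compute processes per GPU and treat a GPU as
# schedulable while it hosts fewer than max_jobs of them.
import subprocess
from collections import Counter

def schedulable_gpus(max_jobs=3):
    gpus = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,uuid", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip().splitlines()
    busy = Counter(subprocess.run(
        ["nvidia-smi", "--query-compute-apps=gpu_uuid", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.split())
    result = []
    for line in gpus:
        index, uuid = [field.strip() for field in line.split(",")]
        if busy[uuid] < max_jobs:
            result.append(int(index))
    return result
```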
