Need to increase shm size for ilab launcher script #721
Comments
relyt0925
added a commit
to relyt0925/ai-lab-recipes
that referenced
this issue
Aug 4, 2024
Include ILAB_GLOBAL_CONFIG, VLLM_LOGGING_LEVEL, and NCCL_DEBUG as environment variables when starting the ilab container. Also add shared memory size of 10G to enable vllm execution. Resolves: containers#721
relyt0925
added a commit
to relyt0925/ai-lab-recipes
that referenced
this issue
Aug 4, 2024
Include ILAB_GLOBAL_CONFIG, VLLM_LOGGING_LEVEL, and NCCL_DEBUG as environment variables when starting the ilab container. Also add shared memory size of 10G to enable vllm execution. Resolves: containers#721 Signed-off-by: Tyler Lisowski <[email protected]>
jhutar
pushed a commit
to jhutar/ai-lab-recipes
that referenced
this issue
Aug 5, 2024
…r vllm to 10GB Include ILAB_GLOBAL_CONFIG, VLLM_LOGGING_LEVEL, and NCCL_DEBUG as environment variables when starting the ilab container. Also add shared memory size of 10G to enable vllm execution. Resolves: containers#721 Signed-off-by: Tyler Lisowski <[email protected]>
Currently, the ilab wrapper script at https://github.com/containers/ai-lab-recipes/blob/main/training/ilab-wrapper/ilab does not set the container's shm size to 10 GB. The requirements for running deepspeed and vllm note that a shm size of 10 GB is necessary for full-scale model inference and training.
Related issue: vllm-project/vllm#1710. Note also that the earlier scripts that launched vllm directly set a shm size of 10 GB.
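To make the request concrete, here is a minimal Python sketch of the argument list a launcher could pass to the container runtime. It is not the wrapper's actual code. It assumes the wrapper starts the container with `podman run`. The `build_run_command` helper and the image name are hypothetical, while the environment variable names come from the commit message referenced in this issue.

```python
import os
import shlex

# Variable names taken from the commit message referenced in this issue.
PASSTHROUGH_ENV = ("ILAB_GLOBAL_CONFIG", "VLLM_LOGGING_LEVEL", "NCCL_DEBUG")


def build_run_command(image, shm_size="10g", environ=None):
    """Return a `podman run` argument list with a larger /dev/shm.

    Hypothetical helper: the real wrapper is a shell script and may
    assemble its command differently.
    """
    environ = os.environ if environ is None else environ
    # --shm-size sets the size of /dev/shm inside the container.
    cmd = ["podman", "run", "--rm", "-it", "--shm-size", shm_size]
    # Forward only the variables that are actually set on the host.
    for name in PASSTHROUGH_ENV:
        if name in environ:
            cmd += ["--env", f"{name}={environ[name]}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # The image name is a placeholder, not the real ilab image.
    print(shlex.join(build_run_command("example.invalid/ilab:latest")))
```

Building the command as a list rather than a single string avoids quoting problems if a forwarded value contains spaces.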