[Bug]: `pt_main_thread` processes are not killed after main process is killed in MP distributed executor backend
#6766
Labels
bug
Your current environment
🐛 Describe the bug
I am trying to understand vLLM's workflow for distributed serving via multiprocessing. The original setup deploys a model with tensor parallel size = 2 through Triton Inference Server, with `distributed_executor_backend: mp`. Inference works well, but when the server shuts down, 2 `pt_main_thread` processes are not killed and their status is `State: S (sleeping)`.
The closest reproducer outside of Triton is this:
And the workflow is the following:
As in the Triton setup, the 2 processes above remain in the sleeping state according to `cat /proc/_PID_/status`.
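For reference, the same check can be done from Python. This is a minimal sketch assuming a Linux host with `/proc` mounted. The helper names `parse_proc_state` and `read_proc_state` are hypothetical and not part of vLLM or Triton.

```python
import os
from pathlib import Path
from typing import Optional


def parse_proc_state(status_text: str) -> Optional[str]:
    """Return the value of the 'State:' line from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # Example line: "State:\tS (sleeping)"
            return line.split(":", 1)[1].strip()
    return None


def read_proc_state(pid: int) -> Optional[str]:
    """Read the state of `pid` (Linux only); None if it cannot be read."""
    try:
        return parse_proc_state(Path(f"/proc/{pid}/status").read_text())
    except OSError:
        # Process already exited, /proc is unavailable, or access was denied.
        return None


if __name__ == "__main__":
    # The current PID is only a stand-in; substitute the PID of a
    # leftover pt_main_thread process.
    print(read_proc_state(os.getpid()))
```

A return value of `None` for a leftover worker PID means the process is gone or unreadable. A value such as `S (sleeping)` means it is still alive.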
Any insights on vLLM's distributed serving with multiprocessing are greatly appreciated.