[bug] CPU load when idle #177
Comments
When the number of cores is increased further, the number of cores with high CPU load also increases. So it seems the MPI broadcast https://github.com/pyiron/pylammpsmpi/blob/main/pylammpsmpi/mpi/lmpmpi.py#L488 is waiting for the socket to receive information https://github.com/pyiron/pympipool/blob/main/pympipool/shared/communication.py#L140
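To illustrate why a waiting process can show up with 100% CPU load, here is a minimal standard-library sketch that compares a busy-wait with a blocking wait. It assumes that the MPI library polls inside `bcast` instead of sleeping, which is common but not confirmed in this thread; `cpu_fraction`, `spin_wait` and `blocking_wait` are hypothetical helper names, not part of pympipool or mpi4py.

```python
import threading
import time


def cpu_fraction(wait, duration=0.2):
    """Return CPU time divided by wall-clock time while wait(event) runs.

    The event is set by a timer after `duration` seconds, so both wait
    strategies are blocked for roughly the same wall-clock time.
    """
    event = threading.Event()
    timer = threading.Timer(duration, event.set)
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    timer.start()
    wait(event)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    timer.join()
    return cpu / wall


def spin_wait(event):
    # Busy-wait: keep checking the flag without giving up the CPU.
    while not event.is_set():
        pass


def blocking_wait(event):
    # Blocking wait: the thread sleeps until the event is set.
    event.wait()


if __name__ == "__main__":
    print("spin wait CPU fraction:    ", cpu_fraction(spin_wait))
    print("blocking wait CPU fraction:", cpu_fraction(blocking_wait))
```

On an otherwise idle machine the spinning variant should report a fraction close to one core, while the blocking variant should stay near zero, which matches the pattern of one busy core per waiting rank.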
The CPU load is related to the MPI broadcast. Here is a reduced example.

Script:

import sys
from mpi4py import MPI
from pympipool.shared.communication import (
    interface_connect,
    interface_send,
    interface_receive,
)
from pympipool.shared.backend import parse_arguments


def main(argument_lst=None):
    if argument_lst is None:
        argument_lst = sys.argv
    argument_dict = parse_arguments(argument_lst=argument_lst)
    # Only rank 0 talks to the controlling process over the socket.
    if MPI.COMM_WORLD.rank == 0:
        context, socket = interface_connect(
            host=argument_dict["host"], port=argument_dict["zmqport"]
        )
    else:
        context, socket = None, None
    while True:
        if MPI.COMM_WORLD.rank == 0:
            input_dict = interface_receive(socket=socket)
        else:
            input_dict = None
        # All other ranks wait here until rank 0 has received a message.
        input_dict = MPI.COMM_WORLD.bcast(input_dict, root=0)
        if MPI.COMM_WORLD.rank == 0 and input_dict is not None:
            interface_send(socket=socket, result_dict={"result": input_dict})


if __name__ == "__main__":
    main(argument_lst=sys.argv)

Jupyter notebook to control the script:

import os
from pympipool import interface_bootup, interface_send, interface_receive

interface = interface_bootup(
    command_lst=["python", os.path.join(os.path.abspath("."), "reply.py")],
    cwd=None,
    cores=8,
    gpus_per_core=0,
    oversubscribe=False,
    enable_flux_backend=False,
    enable_slurm_backend=False,
    queue_adapter=None,
    queue_type=None,
    queue_adapter_kwargs=None,
)
interface.send_and_receive_dict(input_dict={"a": 1})

With this code 8 cores remain busy. In contrast, when the reply script is replaced with:

import sys
from pympipool.shared.communication import (
    interface_connect,
    interface_send,
    interface_receive,
)
from pympipool.shared.backend import parse_arguments


def main(argument_lst=None):
    if argument_lst is None:
        argument_lst = sys.argv
    argument_dict = parse_arguments(argument_lst=argument_lst)
    context, socket = interface_connect(
        host=argument_dict["host"], port=argument_dict["zmqport"]
    )
    while True:
        input_dict = interface_receive(socket=socket)
        interface_send(socket=socket, result_dict={"result": input_dict})


if __name__ == "__main__":
    main(argument_lst=sys.argv)

And the number of cores is reduced to
@pmrv I moved the issue to
If you are testing with OpenMPI, you might have to set
The debugging is simplified by #178
Maybe it is related to mpi4py/mpi4py#468
This fix was added in #279
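For readers who hit the same symptom, one common way to avoid the busy broadcast is to let the waiting ranks poll for a message and sleep between polls. The following is a minimal sketch of that idea and not necessarily what was merged in #279; `sleep_until` and `bcast_with_sleep` are hypothetical helper names, the tag value and polling interval are arbitrary assumptions, and `comm` is expected to be an mpi4py communicator such as `MPI.COMM_WORLD`.

```python
import time


def sleep_until(is_ready, interval=0.01):
    """Poll is_ready() and sleep between polls instead of spinning.

    Returns the number of polls that came back negative.
    """
    polls = 0
    while not is_ready():
        polls += 1
        time.sleep(interval)
    return polls


def bcast_with_sleep(comm, obj=None, root=0, tag=177, interval=0.01):
    """Hypothetical stand-in for comm.bcast(obj, root=root).

    The root sends the object to every other rank with point-to-point
    messages. The other ranks check comm.Iprobe() and sleep until a
    message from the root is pending, then receive it.
    """
    if comm.Get_rank() == root:
        for rank in range(comm.Get_size()):
            if rank != root:
                comm.send(obj, dest=rank, tag=tag)
        return obj
    sleep_until(lambda: comm.Iprobe(source=root, tag=tag), interval=interval)
    return comm.recv(source=root, tag=tag)
```

In the reduced example above, the call `MPI.COMM_WORLD.bcast(input_dict, root=0)` would become `bcast_with_sleep(MPI.COMM_WORLD, input_dict, root=0)`. The trade-off is a latency of up to one polling interval per message, and the root sends to each rank in turn rather than using the library's optimized broadcast.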
It works for a single core:
but calling:
results in one process with 100% CPU load.
Found by @pmrv