[Bug] Too many open files #26

tomviering opened this issue Aug 2, 2024 · 0 comments
I am using pynisher to run many experiments, but after a while my job crashes with errors about too many open files. I found a workaround: running `ulimit -n 8000` on the (Linux) machine before starting the experiments. However, I am not sure this will hold up for a very long run, since the descriptors appear to keep accumulating. Below is a minimal example that reproduces the problem:

from pynisher import limit
import tqdm

def get_entry_learner():
	pass

def main():
	timelimit = 1
	get_entry_learner_pynisher = limit(get_entry_learner, wall_time=(timelimit, "h"))
	for i in tqdm.tqdm(range(0, 1000)):
		get_entry_learner_pynisher()

if __name__ == '__main__':
	main()

On my machine (Linux) this crashes after around 500 iterations, with both `context='fork'` and `context='spawn'` (I have not tried the others). The errors are:
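To confirm that descriptors are leaking (rather than the limit simply being too low), one can count the process's open file descriptors between iterations. This is a Linux-only sketch; the helper name `open_fd_count` is mine, not part of pynisher:

```python
import os

def open_fd_count() -> int:
    """Count this process's open file descriptors via /proc (Linux-only)."""
    return len(os.listdir("/proc/self/fd"))

# Usage sketch: if the count grows monotonically across pynisher calls,
# something is leaking a pipe or file per invocation, e.g.:
#   before = open_fd_count()
#   get_entry_learner_pynisher()
#   print("leaked this iteration:", open_fd_count() - before)
```

If the count climbs by a fixed amount per call, raising `ulimit -n` only postpones the crash.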

With `context='fork'` I get the following error:

    working on problem 506
    /home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/joblib/externals/loky/backend/context.py:136: UserWarning: Could not find the number of physical cores for the following reason:
    [Errno 24] Too many open files
    Returning the number of logical cores instead. You can silence this warning by setting LOKY_MAX_CPU_COUNT to the number of cores you want to use.
      warnings.warn(
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/joblib/externals/loky/backend/context.py", line 250, in _count_physical_cores
        cpu_info = subprocess.run(
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 503, in run
        with Popen(*popenargs, **kwargs) as process:
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 971, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 1762, in _execute_child
        errpipe_read, errpipe_write = os.pipe()
    score 1.00
    working on problem 507
    /home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/joblib/externals/loky/backend/context.py:136: UserWarning: Could not find the number of physical cores for the following reason:
    [Errno 24] Too many open files
    Returning the number of logical cores instead. You can silence this warning by setting LOKY_MAX_CPU_COUNT to the number of cores you want to use.
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/joblib/externals/loky/backend/context.py", line 250, in _count_physical_cores
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 503, in run
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 837, in __init__
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 1643, in _get_handles
    OSError: Traceback (most recent call last):
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/pynisher/limiters/limiter.py", line 143, in __call__
      File "/home/tjviering/lcdb11_new/lcdb1.1/analysis_tom/problempynisher.py", line 8, in get_entry_learner
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/base.py", line 764, in score
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/neighbors/_classification.py", line 259, in predict
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/neighbors/_classification.py", line 366, in predict_proba
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/neighbors/_base.py", line 850, in kneighbors
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/metrics/_pairwise_distances_reduction/_dispatcher.py", line 278, in compute
      File "sklearn/metrics/_pairwise_distances_reduction/_argkmin.pyx", line 90, in sklearn.metrics._pairwise_distances_reduction._argkmin.ArgKmin64.compute
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/utils/fixes.py", line 90, in threadpool_limits
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/sklearn/utils/fixes.py", line 84, in _get_threadpool_controller
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/threadpoolctl.py", line 818, in __init__
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/threadpoolctl.py", line 976, in _load_libraries
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/threadpoolctl.py", line 988, in _find_libraries_with_dl_iterate_phdr
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/threadpoolctl.py", line 1227, in _get_libc
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/ctypes/util.py", line 330, in find_library
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/ctypes/util.py", line 116, in _findLib_gcc
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/tempfile.py", line 575, in NamedTemporaryFile
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/tempfile.py", line 572, in opener
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/tempfile.py", line 256, in _mkstemp_inner
    OSError: [Errno 24] Too many open files: '/tmp/tmpmjfqowrx'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/home/tjviering/lcdb11_new/lcdb1.1/analysis_tom/problempynisher.py", line 26, in <module>
        test()
      File "/home/tjviering/lcdb11_new/lcdb1.1/analysis_tom/problempynisher.py", line 22, in test
        temp = get_entry_learner_pynisher(X, y)
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/pynisher/pynisher.py", line 525, in __call__
        return self._handle_return(result=result, err=err, tb=tb)
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/pynisher/pynisher.py", line 635, in _handle_return
        raise err from err.__class__(tb)
    OSError: [Errno 24] Too many open files: '/tmp/tmpmjfqowrx'

With `context='spawn'` I get the following error:

    Traceback (most recent call last):
      File "/home/tjviering/lcdb11_new/lcdb1.1/analysis_tom/problempynisher.py", line 29, in <module>
        test()
      File "/home/tjviering/lcdb11_new/lcdb1.1/analysis_tom/problempynisher.py", line 25, in test
        temp = get_entry_learner_pynisher(X, y)
      File "/home/tjviering/.pyenv/versions/lcdb11/lib/python3.10/site-packages/pynisher/pynisher.py", line 379, in __call__
        subprocess.start()
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/process.py", line 121, in start
        self._popen = self._Popen(self)
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/context.py", line 288, in _Popen
        return Popen(process_obj)
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 32, in __init__
        super().__init__(process_obj)
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/popen_fork.py", line 19, in __init__
        self._launch(process_obj)
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/popen_spawn_posix.py", line 58, in _launch
        self.pid = util.spawnv_passfds(spawn.get_executable(),
      File "/home/tjviering/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/util.py", line 450, in spawnv_passfds
        errpipe_read, errpipe_write = os.pipe()
    OSError: [Errno 24] Too many open files
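For completeness, the `ulimit -n` workaround can also be applied from inside the script via the standard-library `resource` module. A minimal sketch (Unix-only; this raises the soft limit to the hard limit, which only delays the crash if descriptors keep leaking):

```python
import resource

# RLIMIT_NOFILE governs the number of open file descriptors;
# the soft limit is what `ulimit -n` changes. A process may raise
# its soft limit up to the hard limit without privileges.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```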