Add num_processes to reader.train() to configure multiprocessing #271

Merged
merged 1 commit into from
Jul 29, 2020


tholor
Member

@tholor tholor commented Jul 28, 2020

In some cases (e.g. debugging or fine-tuning on small datasets) it can be useful to disable multiprocessing.
Let's add an argument to reader.train() that allows this.

Usage:

num_processes = 0 or 1 --> Plain, single Python process
num_processes = 12 --> Multiprocessing with 12 processes
num_processes = None --> Multiprocessing with "number of CPU cores - 1" processes
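
For illustration, the three cases above could be resolved roughly like this. This is only a sketch of the semantics, not the code from this PR: `plan_workers` and its `cpu_count` parameter are hypothetical names, and a return value of 0 is an assumed convention for "no pool, run in the calling process".

```python
import multiprocessing as mp
from typing import Optional


def plan_workers(num_processes: Optional[int], cpu_count: Optional[int] = None) -> int:
    """Hypothetical helper: map num_processes to a worker count.

    Returns 0 when no multiprocessing pool should be created
    (assumed convention for this sketch), otherwise the pool size.
    """
    if cpu_count is None:
        cpu_count = mp.cpu_count()

    if num_processes is None:
        # Default: leave one core free for the main process.
        num_processes = cpu_count - 1
    elif num_processes < 0:
        raise ValueError("num_processes must be None or >= 0")

    # 0 or 1 means a plain, single Python process, so no pool is needed.
    # On a single-core machine the default also ends up here.
    if num_processes <= 1:
        return 0
    return num_processes


if __name__ == "__main__":
    for requested in (0, 1, 12, None):
        print(requested, "->", plan_workers(requested, cpu_count=8))
```

Passing `cpu_count` explicitly only makes the default case reproducible in the example; in practice it would be read from the machine.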

Related to #268

@tholor tholor requested a review from Timoeller July 28, 2020 12:55
Contributor

@Timoeller Timoeller left a comment

Looking good.

I changed the PR message, because no multiprocessing pool is initialized when max_processes is either 0 or 1.

I also made the use of num_processes in the FARM Inferencer and max_processes in the FARM DataSilo more consistent in deepset-ai/FARM#480

@tholor tholor merged commit abec1be into master Jul 29, 2020
@julian-risch julian-risch deleted the add_multiproc_arg_reader_train branch November 15, 2021 07:06