Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to prepare for forking? #65

Open
hvds opened this issue Apr 23, 2022 · 2 comments
Open

How to prepare for forking? #65

hvds opened this issue Apr 23, 2022 · 2 comments

Comments

@hvds
Copy link
Contributor

hvds commented Apr 23, 2022

I'm writing code to fork multiple processes, each of which will test one number - anything from a primality test to a full factorization. Other than loading MPU and MPUG, what should I do in the parent to ensure that tables of primes and the like are appropriately cached before forking? I suspect I should at least be calling prime_precalc(), but I don't know how to estimate or discover how many primes it will need to factorize numbers in a given range.

I don't see anything in MPU-0.73 or MPUG-0.52 docs about this, apologies if I've missed something.

The details: I have an array of subrefs @check that typically check is_prime($value) or is_semiprime($value). I'll essentially be replacing this fragment of my code:

  return 1 if all { $_->() } @check;

with something like:

  for my $check (@check) {
    if (my $pid = fork) {
      $waitpids{$pid} = 1;
    } else {
      exit $check->() ? 0 : 1;
    }
  }
  # now look for exit codes via waitpid()
  # as soon as one exits non-zero we can kill the rest

.. because while some of the checks return in under a second, others are taking hours or days. This way, my hope is that it'll run as fast as the fastest (failing) check rather than having to do them in order. The numbers under consideration will initially be in the 80-105 digit range, and I estimate that I may need to check up to 230 million sets before finding the solution I'm looking for. (If that one works, I have another batch of 140-digit numbers to check.)

I'm hoping that fork() will prove lightweight enough to make this efficient. If not, an alternative would be to fork a pool of longer-lived servers that accept a request but can be interrupted with a signal: less process churn, but much fiddlier to code to ensure no signals are lost.

@mohawk2
Copy link

mohawk2 commented May 8, 2022

I have an aspiration that this library be available by some means with PDL (as mentioned at https://perlmonks.org/?node_id=11143660).

If it were, the use of POSIX threads (routinely available in PDL) would be an even lighter-weight solution to this problem, though there isn't yet handling for the sort of fail/succeed-fast envisaged here. If there were, that would involve real multi-processing coordination, which would require some thinking. That would probably help in utilising GPUs, so I will link PDLPorters/pdl#349

@hvds
Copy link
Contributor Author

hvds commented May 3, 2023

FWIW I gave up on doing this with forking Perl code, instead implemented C code inspired by danaj/Math-Prime-Util-GMP#30 that tests a set of numbers in parallel (testing whether divisors(n_i) == t_i for all i), doing one step of the factorization work at a time: see tau_multi_prep() and tau_multi_run() in https://github.com/hvds/divrep/blob/master/coultau.c; it would still be useful for future reference to have some docs around the original question though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants