Many students were asking how to parallelize grid search #16

Open
3 tasks
cimentadaj opened this issue Jul 7, 2020 · 2 comments

Comments

@cimentadaj
Owner

  • Figure out which parallel interface tidymodels uses.

  • Can we define the parallel backend, together with the number of cores, outside the tidyflow? Or do we have to pass anything through the tidyflow to tidymodels as arguments?

  • Make sure we can test that the models are actually being run in parallel. Don't just run the model and assume it worked.

  • One approach is to make each cross-validation resample from tidymodels verbose: if several resamples print output out of sequential order, they are being run in parallel.
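A complementary check that does not depend on reading interleaved output is to ask each iteration which process ran it. The sketch below is only an illustration under assumptions: it uses the foreach/doParallel backend, an arbitrary 2 workers, and a toy loop in place of the real resampling. It shows that the registered backend dispatches work to separate processes, not that a particular tidyflow fit uses it.

```r
library(doParallel)  # also attaches foreach and parallel

# Assumption: 2 workers, chosen only for illustration.
cl <- parallel::makePSOCKcluster(2)
doParallel::registerDoParallel(cl)

main_pid <- Sys.getpid()

# Each iteration returns the process ID of the worker that executed it.
worker_pids <- foreach::foreach(i = 1:8, .combine = c) %dopar% Sys.getpid()

parallel::stopCluster(cl)

# If the backend is active, no iteration ran in the main R session.
ran_on_workers <- !(main_pid %in% worker_pids)
n_distinct_workers <- length(unique(worker_pids))
```

If `ran_on_workers` is `TRUE` and `n_distinct_workers` is greater than 1, the iterations were spread over separate worker processes.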

@cimentadaj
Owner Author

We're overthinking this. Following the tune vignette from tidymodels (https://tune.tidymodels.org/articles/extras/optimizations.html), we just have to register the parallel backend before running the tidyflow. We can add a small vignette with parallelization instructions.
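A minimal sketch of what those instructions could look like, assuming the foreach/doParallel backend. The worker count of 2 is arbitrary, and the tidyflow itself is left as a placeholder comment because nothing has to be passed to it:

```r
library(doParallel)  # also attaches foreach and parallel

# Assumption: 2 workers; pick a number suited to the machine.
cl <- parallel::makePSOCKcluster(2)
doParallel::registerDoParallel(cl)

# Confirm that a parallel backend is registered before fitting.
n_workers <- foreach::getDoParWorkers()
backend_name <- foreach::getDoParName()

# ... build and fit the tidyflow here; no extra arguments are needed ...

# Release the workers once the fit is done.
parallel::stopCluster(cl)
```

Checking `n_workers` before fitting is a cheap guard: a value of 1 would mean no parallel backend was picked up.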

@cimentadaj
Owner Author

In light of tidymodels/tune#275, it would be useful to add a test that checks that seeded results replicate when run in parallel. I believe this should work out of the box, since the seed is set before any step in tidyflow runs, but we should confirm it in the tests.
