Sensible Default values? #12
Exactly how many instructions a given chunk of code takes is going to vary, but you can look at a piece of code and make a decent estimate. So for
Then I guess that the inner loop is a compare, branch, load, multiply, and a store. If I had something like:
then I just need to estimate what value of (end-begin) I think will result
As you vary K from K=1 to n/K=1 you usually find there is quite a large region
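As a rough illustration of the estimate described above, here is a minimal sketch that turns a guessed per-iteration instruction count into a minimum value of (end-begin). The function name `min_chunk_iterations` and the `target_instr` parameter are hypothetical and not part of the coursework code; the figure of five instructions per iteration is just the compare, branch, load, multiply, store guess from the comment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: smallest (end-begin) whose estimated cost reaches a target.
// instr_per_iter: rough guess at instructions per inner-loop iteration,
//                 e.g. 5 for compare, branch, load, multiply, store.
// target_instr:   how many instructions one chunk should contain (an assumption
//                 you pick, e.g. from a rule of thumb or from timing).
std::size_t min_chunk_iterations(std::size_t instr_per_iter, std::size_t target_instr)
{
    assert(instr_per_iter > 0);
    // Ceiling division, so the chunk meets or exceeds the target.
    return (target_instr + instr_per_iter - 1) / instr_per_iter;
}
```

For example, `min_chunk_iterations(5, 10000)` gives 2000 iterations per chunk. The estimate is only as good as the instruction guess, so it is a starting point for experiments rather than a final answer.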
"You should think about what "sensible default" means: the TBB user guide gives some guidance in the form of a "rule of thumb". Your rule of thumb should probably take into account the size of the inner loop: remember that the work is O(n^2), and the work within each original parfor iteration is O(n). But on the other side, you always want enough tasks to keep the processors occupied (no matter how many there are), so you can't set the chunk size to K=n."
The guide suggests timing tasks to ensure they run for at least several thousand clock cycles. Presumably that wasn't what we were meant to do?
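One possible reading of the quoted rule of thumb, sketched under assumptions: pick K large enough that a chunk of K outer iterations (each doing roughly n inner steps) reaches some target amount of work, but cap K so that there are still plenty of chunks to hand out. The name `default_chunk_size` and the defaults `target_work = 10000` and `min_tasks = 256` are hypothetical placeholders, not values taken from the coursework or the TBB guide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical default for the chunk size K over n outer iterations,
// where each outer iteration does about n inner steps.
std::size_t default_chunk_size(std::size_t n,
                               std::size_t target_work = 10000, // assumed work per chunk
                               std::size_t min_tasks = 256)     // assumed minimum chunk count
{
    assert(n > 0);
    // Work bound: K outer iterations should cover about target_work inner steps.
    std::size_t k_work = (target_work + n - 1) / n;
    // Occupancy bound: keep at least min_tasks chunks when n allows it.
    std::size_t k_tasks = std::max<std::size_t>(1, n / min_tasks);
    // When the two bounds conflict, favour keeping the processors occupied.
    return std::max<std::size_t>(1, std::min(k_work, k_tasks));
}
```

With these placeholder values, n=4096 gives K=3 (the work bound wins) and n=1024 gives K=4 (the occupancy bound wins). Note that the result never approaches K=n, which matches the warning in the quoted text.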
And later on:
"As before, if HPCE_FFT_LOOP_K is not set, choose a sensible default based on your analysis of the scaling with n, and/or experiments. Though remember, it should be a sensible default for all machines (even those with 64 cores)."
I'm not sure how to evaluate the performance in order to tune these 'sensible defaults' for all kinds of machines.
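For the "if HPCE_FFT_LOOP_K is not set" part of the quoted requirement, a minimal sketch of the lookup might be the following. The helper name `loop_k_or_default` is hypothetical, and treating unparsable or zero values as "not set" is an assumption rather than something the specification states.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: returns HPCE_FFT_LOOP_K if it is set to a positive
// decimal integer, otherwise the supplied fallback.
std::size_t loop_k_or_default(std::size_t fallback)
{
    const char *s = std::getenv("HPCE_FFT_LOOP_K");
    if (s == nullptr) return fallback;                 // variable not set
    if (*s < '0' || *s > '9') return fallback;         // reject signs, spaces, empty string
    char *end = nullptr;
    unsigned long long v = std::strtoull(s, &end, 10);
    if (*end != '\0' || v == 0) return fallback;       // trailing junk or zero (assumed invalid)
    return static_cast<std::size_t>(v);
}
```

The fallback argument is where a computed default would go, so that setting the variable while sweeping K in experiments overrides it without recompiling.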