OpenMP WIP #338
void initialize_random_kit(unsigned long seed)
- def montecarlo_radial1d(model, int_type_t virtual_packet_flag=0):
+ def montecarlo_radial1d(model, int_type_t virtual_packet_flag=0, int nthreads=4):
nthreads is still hardcoded here?
No, you just have to specify some default for a keyword argument. The actual value is taken from the YAML configuration and defaults to one.
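A minimal sketch of the lookup described in this reply, assuming the YAML file has already been parsed into a plain dict. The function name `resolve_nthreads` and the `montecarlo` / `nthreads` keys are hypothetical and are not taken from the TARDIS code base:

```python
# Hypothetical sketch: the names and config keys here are illustrative
# assumptions, not the actual TARDIS configuration schema.
DEFAULT_NTHREADS = 1


def resolve_nthreads(config):
    """Return the thread count from a parsed config mapping, defaulting to 1."""
    section = config.get("montecarlo", {})
    nthreads = int(section.get("nthreads", DEFAULT_NTHREADS))
    if nthreads < 1:
        raise ValueError("nthreads must be a positive integer")
    return nthreads


# The signature default is then only a placeholder; the caller passes the
# resolved value explicitly, e.g.
#     montecarlo_radial1d(model, nthreads=resolve_nthreads(config))
```

With this arrangement the value written in the function signature never takes effect as long as the caller always passes the configured thread count.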
@orbitfold - any idea why that fails now?
Hi @wkerzendorf, @orbitfold, @ssim, I did a quick strong scaling test on one of the servers here at MPA, using the tardis_example setup. These are the results:
Looks like progress, sorry I never got into looking into this, I had
This looks like something is still not quite right since the spectra don't
What is the NUMA structure of the machine you are running on? But this is
Important --- is this just a strong scaling test of the transport loop, or
Rollin
On Wed, Jul 1, 2015 at 9:19 AM, unoebauer wrote:
Dr. R. C. Thomas
Hi @rcthomas, I did an overall timing of the TARDIS calculation. Since only the transport loop, i.e. the C part, is parallelised, these timings also contain the runtime of the serial parts.
Hi @rcthomas, @wkerzendorf, @orbitfold, @ssim, I also tested a different setup, in which the MC transport parts of TARDIS take up a much larger percentage of the overall runtime. In this case, the maximum speed-up that I can get is higher than in the simple tardis_example test, in which the transport step is relatively cheap.
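The dependence of the achievable speed-up on the share of runtime spent in the parallelised transport loop is what Amdahl's law predicts. A minimal sketch follows; the parallel fractions used in the comments are made-up illustrations, not measurements from these tests:

```python
def amdahl_speedup(parallel_fraction, nthreads):
    """Ideal speed-up when only `parallel_fraction` of the runtime scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if nthreads < 1:
        raise ValueError("nthreads must be a positive integer")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nthreads)


def max_speedup(parallel_fraction):
    """Upper bound on the speed-up for an unlimited number of threads."""
    serial_fraction = 1.0 - parallel_fraction
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


# Illustrative only: if transport were half of the runtime, the ceiling
# would be a factor of 2 no matter how many threads are used.
# max_speedup(0.5) == 2.0
```

A run in which the serial parts (such as the plasma update) make up a smaller share of the total therefore has a higher ceiling, independent of how well the transport loop itself scales.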
@rcthomas what @unoebauer is saying is correct. We will still need to parallelize the plasma calculation. We will do this as soon as @aoifeboyle finishes the restructuring project (well, currently I'm the limiting factor). So I don't know how bad the atomics are - we were also thinking of having the estimators as shared memory. Maybe that will help.
@orbitfold on
@ssim We (@unoebauer and others) are thinking of merging this, but were wondering if there might be some systematic trend in the spectra. Can you have a look?
@unoebauer @wkerzendorf It might be useful to do this with the "detailed" J-estimators switched on and with many packets, and then look in detail at those estimators. Looking at the spectra above, I would guess that the differences are perhaps consistent with MC noise, but it's hard to be sure. It would be good to compare the values of the estimators themselves, since these are the quantities derived directly from the MC simulations (which then go into the plasma update etc. and ultimately affect the spectrum). Can we do that? I agree that it would be good to know whether the scaling above is limited by the plasma updates. I expect that it is. Avoiding the atomic/critical statements by using a reduction as @rcthomas suggests could improve the scaling if we're limited by those - but perhaps this is just the plasma part.
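For reference, the reduction pattern mentioned here can be sketched as follows: each worker accumulates into its own private estimator array, and the arrays are summed once at the end, so no atomic update is needed inside the packet loop. This is a toy illustration in Python, not the TARDIS C code. `transport_chunk`, `run_with_reduction` and `N_SHELLS` are hypothetical names, and the "transport" is just random numbers. Because of CPython's global interpreter lock it shows the pattern only, not a real speed-up:

```python
import random
from concurrent.futures import ThreadPoolExecutor

N_SHELLS = 4  # hypothetical number of shells in the toy model


def transport_chunk(seed, n_packets):
    """Process a chunk of packets, accumulating into a private array."""
    rng = random.Random(seed)   # one independent RNG per worker
    local = [0.0] * N_SHELLS    # private estimator array, never shared
    for _ in range(n_packets):
        shell = rng.randrange(N_SHELLS)
        local[shell] += rng.random()  # stand-in for an estimator update
    return local


def run_with_reduction(n_packets, nthreads):
    """Split the packets over workers, then sum the private arrays."""
    base = n_packets // nthreads
    sizes = [base] * nthreads
    sizes[-1] += n_packets - base * nthreads  # remainder goes to the last worker
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        partials = list(pool.map(transport_chunk, range(nthreads), sizes))
    # Reduction step: element-wise sum in a fixed worker order.
    return [sum(part[i] for part in partials) for i in range(N_SHELLS)]
```

Since each worker has its own seed and the partial arrays are combined in a fixed order, repeated runs with the same thread count give identical estimators, which also makes comparisons between runs easier.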
@ssim @rcthomas In the little scaling testing that @wkerzendorf and I did with and without atomic statements, removing the atomics improved performance only by a tiny amount.
@ssim @unoebauer @rcthomas I feel this can be merged as a currently experimental feature of TARDIS. It is not enabled by default and the
None here.
Rollin
On Sun, Jul 5, 2015 at 10:14 AM, Wolfgang Kerzendorf <
Dr. R. C. Thomas
Fine with me too - @wkerzendorf merge at will!
@unoebauer I can revert the merge if there are serious drawbacks. Let me know.