Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMP WIP #338

Merged
merged 18 commits into from
Jul 6, 2015
Merged

OpenMP WIP #338

merged 18 commits into from
Jul 6, 2015

Conversation

orbitfold
Copy link
Contributor

No description provided.

void initialize_random_kit(unsigned long seed)



def montecarlo_radial1d(model, int_type_t virtual_packet_flag=0):
def montecarlo_radial1d(model, int_type_t virtual_packet_flag=0, int nthreads=4):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nthreads is still hardcoded here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you just have to specify something for a keyword argument. It will take it from yaml and will default to one.

@wkerzendorf
Copy link
Member

@orbitfold - any idea why that fails now?

@unoebauer
Copy link
Contributor

Hi @wkerzendorf, @orbitfold, @ssim

I did a quick strong scaling test on one of the servers here at MPA, using the tardis_example setup. These are the results:
tardis_example_spectra_lnx19
tardis_example_spectra_cmp_strongscaling_lnx19

tardis_example_strongscaling_lnx19

@rcthomas
Copy link

rcthomas commented Jul 1, 2015

Looks like progress, sorry I never got into looking into this, I had
trouble getting a development env set up.

This looks like something is still not quite right since the spectra don't
agree?

What is the NUMA structure of the machine you are running on? But this is
way suboptimal for an MC code. Could the calculations be done as a
reduction instead of with critical/atomics?

Important --- is this just a strong scaling test of the transport loop, or
does it include overheads (ie, it is just a timing of an entire run)?

Rollin

On Wed, Jul 1, 2015 at 9:19 AM, unoebauer [email protected] wrote:

Hi @wkerzendorf https://github.com/wkerzendorf, @orbitfold
https://github.com/orbitfold, @ssim https://github.com/ssim

I did a quick strong scaling test on one of the servers here at MPA and
these are the results:

[image: tardis_example_spectra_lnx19]
https://cloud.githubusercontent.com/assets/6351165/8455365/5ea8fd5e-2004-11e5-9e45-2062c390bed5.png

[image: tardis_example_strongscaling_lnx19]
https://cloud.githubusercontent.com/assets/6351165/8455366/5ea96ee2-2004-11e5-8877-e8ed516199fd.png


Reply to this email directly or view it on GitHub
#338 (comment).

Dr. R. C. Thomas
Computational Cosmology Center
Lawrence Berkeley National Lab
1 Cyclotron Road Mail Stop 50B4206
Berkeley, California 94720-8151
510.486.4697

@unoebauer
Copy link
Contributor

Hi @rcthomas,

I did an overall timing of the tardis calculation. Since only the transport loop, i.e. the C part, is parallelised, these timings also contain the runtime of the serial parts.

@unoebauer
Copy link
Contributor

Hi @rcthomas, @wkerzendorf @orbitfold @ssim

I also tested a different setup, in which the MC transport parts of Tardis use up a much larger percentage of the overall runtime. In this case, the maximum speed-up that I can get is higher than in the simple tardis example test, in which the transport step is relatively cheap.

tardis_tomography_example_spectra_lnx19
tardis_tomography_example_strongscaling_lnx19

@unoebauer unoebauer closed this Jul 1, 2015
@wkerzendorf
Copy link
Member

@rcthomas what unoebauer is saying is correct. We will still need to parallelize the plasma calculation. We will do this as soon as @aoifeboyle finishes the restructuring project (well currently I'm the limiting factor).

So I don't know how bad the atomics are - we were also thinking of having the estimators as shared memory. Maybe that will help.

@wkerzendorf wkerzendorf reopened this Jul 1, 2015
@wkerzendorf
Copy link
Member

@orbitfold on .travis.yml can you add the --no_openmp flag?

@wkerzendorf
Copy link
Member

@ssim we (@unoebauer and others) are thinking of merging this, but were wondering if there might be some systematic trend in the spectra. can you have a look?

@ssim
Copy link
Contributor

ssim commented Jul 1, 2015

@unoebauer @wkerzendorf I might be useful to do this with the "detailed" J-estimators switched on and with many packets, and then look in detail at those estimators - looking at the spectra above I would guess that perhaps the difference are consistent with MC noise but it's hard to be sure. It would be good to compare the values of estimators themselves since these are the directly derived quantities from the MC simulations (which then go into the plasma update etc. and ultimately affect the spectrum). Can we do that?

I agree that it would be good to know whether the scaling above is limited by the plasma updates. I expect that it is. Avoiding the atomic/critical statements by using a reduction as @rcthomas suggests could improve the scaling if we're limited by those - but perhaps this is just the plasma part.

@orbitfold
Copy link
Contributor Author

@ssim @rcthomas In what little scaling testing I and @wkerzendorf did with and without atomic statements the removal of atomic improves performance only by a tiny amount.

@wkerzendorf
Copy link
Member

@ssim @unoebauer @rcthomas I feel this can be merged as a currently experimental feature of TARDIS. It is not enabled by default and the --with-openmp needs to be specifically set. Are there any objections?

@rcthomas
Copy link

rcthomas commented Jul 5, 2015

None here.

Rollin

On Sun, Jul 5, 2015 at 10:14 AM, Wolfgang Kerzendorf <
[email protected]> wrote:

@ssim https://github.com/ssim @unoebauer https://github.com/unoebauer
@rcthomas https://github.com/rcthomas I feel this can be merged as a
currently experimental feature of TARDIS. It is not enabled by default and
the --with-openmp needs to be specifically set. Are there any objections?


Reply to this email directly or view it on GitHub
#338 (comment).

Dr. R. C. Thomas
Computational Cosmology Center
Lawrence Berkeley National Lab
1 Cyclotron Road Mail Stop 50B4206
Berkeley, California 94720-8151
510.486.4697

@ssim
Copy link
Contributor

ssim commented Jul 6, 2015

Fine with me too - @wkerzendorf merge at will!

wkerzendorf added a commit that referenced this pull request Jul 6, 2015
@wkerzendorf wkerzendorf merged commit aa7da17 into tardis-sn:master Jul 6, 2015
@wkerzendorf
Copy link
Member

@unoebauer I can revert the merge if there are serious drawbacks. Let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants