Simplify tiling in plasma deposition #1093

@AlexanderSinn AlexanderSinn commented Apr 3, 2024

This PR removes the temporary density arrays that were used for the plasma current deposition. Instead, thread safety is ensured by splitting the tiles into four groups such that tiles within a group don’t overlap, and depositing one group at a time. Shown below is the chi array after the first group of tiles was deposited. This cleans up the code, since the temporary densities no longer have to be allocated and managed, and gives a small performance improvement because the lockAdd to the main array is no longer necessary.

    for (int tile_perm_x=0; tile_perm_x<2; ++tile_perm_x) {
        for (int tile_perm_y=0; tile_perm_y<2; ++tile_perm_y) {
    #pragma omp parallel for collapse(2) if(do_tiling)
            for (int itilex=tile_perm_x; itilex<ntilex; itilex+=2) {
                for (int itiley=tile_perm_y; itiley<ntiley; itiley+=2) {

                    // the index is transposed to be the same as in amrex::DenseBins::build
                    const int tile_index = itilex * ntiley + itiley;

                    // Deposit one tile at tile_index

                }
            }
        }
    }
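
To illustrate why the four-group split avoids write conflicts, here is a minimal serial sketch. It is not HiPACE code: the function `max_writers_per_cell`, its parameters, and the footprint model (each tile writes to its own cells plus a margin of `halo` cells on every side) are assumptions made only for this example. The real footprint depends on details of the deposition that are not stated in this PR. The function counts, for each group, how many tiles of that group write to the same cell and returns the largest count found. A result of 1 means no two tiles of the same group touch the same cell.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not part of HiPACE).
// Tiles are visited in stride*stride groups, as in the loop nest above
// (which uses stride == 2). Each tile is assumed to write to its own
// tile_size x tile_size cells plus `halo` cells on every side, clipped
// to the domain. Returns the largest number of same-group tiles that
// write to any single cell.
int max_writers_per_cell (int ntilex, int ntiley, int tile_size, int halo, int stride) {
    const int nx = ntilex * tile_size;
    const int ny = ntiley * tile_size;
    int worst = 0;

    for (int tile_perm_x = 0; tile_perm_x < stride; ++tile_perm_x) {
        for (int tile_perm_y = 0; tile_perm_y < stride; ++tile_perm_y) {
            // number of tiles of the current group writing to each cell
            std::vector<int> writers(static_cast<std::size_t>(nx) * ny, 0);

            for (int itilex = tile_perm_x; itilex < ntilex; itilex += stride) {
                for (int itiley = tile_perm_y; itiley < ntiley; itiley += stride) {
                    // assumed footprint of this tile, clipped to the domain
                    const int x_lo = std::max(itilex * tile_size - halo, 0);
                    const int x_hi = std::min((itilex + 1) * tile_size + halo, nx);
                    const int y_lo = std::max(itiley * tile_size - halo, 0);
                    const int y_hi = std::min((itiley + 1) * tile_size + halo, ny);

                    for (int x = x_lo; x < x_hi; ++x) {
                        for (int y = y_lo; y < y_hi; ++y) {
                            const std::size_t idx = static_cast<std::size_t>(x) * ny + y;
                            worst = std::max(worst, ++writers[idx]);
                        }
                    }
                }
            }
        }
    }
    return worst;
}
```

Under this model, `max_writers_per_cell(5, 7, 8, 4, 2)` returns 1, because same-group tiles are separated by a full tile and the margins do not meet as long as `2 * halo <= tile_size`. With `stride` set to 1, all tiles form a single group and neighbouring footprints overlap, which is the situation that would otherwise call for temporary arrays or atomic adds.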

[Image: chi array after the first group of tiles was deposited]

Performance for a 2047*2047*300 grid with exactly one tile per thread on dual 48-core CPUs:

[Image: perf_no_add]

  • Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
  • Tested (describe the tests in the PR description)
  • Runs on GPU (basic: the code compiles and runs well with the new module)
  • Contains an automated test (checksum and/or comparison with theory)
  • Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
  • Constified (All that can be const is const)
  • Code is clean (no unwanted comments)
  • Style and code conventions listed at the bottom of https://github.com/Hi-PACE/hipace are respected
  • Proper label and GitHub project, if applicable

@AlexanderSinn added the "component: plasma" and "cleaning" labels Apr 3, 2024
@AlexanderSinn changed the title from "[WIP] Simplify tiling in plasma deposition" to "Simplify tiling in plasma deposition" Apr 4, 2024
@AlexanderSinn added the "performance" label Apr 4, 2024
@MaxThevenet MaxThevenet left a comment

Nice, thanks for this PR!

@MaxThevenet MaxThevenet merged commit 2e2e592 into Hi-PACE:development Apr 12, 2024
10 checks passed