{P} CPU optimization (part 1) #35

Open
3 tasks
FlorianDeconinck opened this issue Mar 4, 2024 · 0 comments
Labels
Team/Project Parent/Sink task for the all team (cross-repository)

FlorianDeconinck commented Mar 4, 2024

As of W09.24, CPU performance is 2.5x slower than the original Fortran.

The dynamical core should be the first target of optimization:

  • Flip layout to K-axis as first stride
  • The K loop needs to be aggregated across all stencils and moved up, so that we have one K loop and many I/J loops
  • Cache coherency work
  • Better localization of variables
  • Vectorization
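
A minimal sketch of the first two items, in plain Python for readability. Everything here is hypothetical: `strides_for`, `make_field`, and the two toy "stencils" are illustrative names, not code from the dynamical core. It assumes that "K-axis as first stride" means K becomes the unit-stride (contiguous) axis, and that the fused stencils have no dependency between K levels.

```python
# Illustrative only: function names and the two toy "stencils" are hypothetical.

def strides_for(shape, order):
    """Element strides when axes are stored from slowest (order[0]) to fastest (order[-1])."""
    strides, step = {}, 1
    for axis in reversed(order):
        strides[axis] = step
        step *= shape[axis]
    return strides


def make_field(ni, nj, nk, fill=0.0):
    """Nested-list field indexed as field[i][j][k]."""
    return [[[fill] * nk for _ in range(nj)] for _ in range(ni)]


def per_stencil_k_loops(field, ni, nj, nk):
    """Each stencil carries its own K loop."""
    out = make_field(ni, nj, nk)
    for k in range(nk):  # stencil A: scale
        for i in range(ni):
            for j in range(nj):
                out[i][j][k] = 2.0 * field[i][j][k]
    for k in range(nk):  # stencil B: offset
        for i in range(ni):
            for j in range(nj):
                out[i][j][k] += 1.0
    return out


def single_k_loop(field, ni, nj, nk):
    """One outer K loop containing every I/J stencil.

    Only valid here because neither stencil reads a different K level.
    """
    out = make_field(ni, nj, nk)
    for k in range(nk):
        for i in range(ni):  # stencil A: scale
            for j in range(nj):
                out[i][j][k] = 2.0 * field[i][j][k]
        for i in range(ni):  # stencil B: offset
            for j in range(nj):
                out[i][j][k] += 1.0
    return out
```

With a shape of `{"I": 4, "J": 3, "K": 2}`, `strides_for(shape, ("I", "J", "K"))` gives K a stride of 1, while `strides_for(shape, ("K", "J", "I"))` gives I a stride of 1. The two loop structures compute the same result; whether the fused form is actually faster depends on the layout and cache behavior and would need to be measured.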

Other identified issues are:

  • OpenMP parallelism for models that run threaded exists, but it should be "automatic" depending on user configuration
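
One possible reading of "automatic", sketched below under stated assumptions: the configuration keys `num_threads` and `threaded` and the function `resolve_num_threads` are hypothetical and do not exist in the code base. `OMP_NUM_THREADS` is the standard OpenMP environment variable; only its first entry is used if it holds a comma-separated list.

```python
import os


def resolve_num_threads(config, environ=None):
    """Pick a thread count from user configuration, then the environment.

    `config` is a plain dict with hypothetical keys:
      - "num_threads": explicit thread count, takes priority when present
      - "threaded": whether the model runs threaded at all
    """
    environ = os.environ if environ is None else environ

    requested = config.get("num_threads")
    if requested is not None:
        return max(1, int(requested))

    if not config.get("threaded", False):
        return 1  # non-threaded run: stay serial

    # Fall back to the standard OpenMP variable, first nesting level only.
    first = environ.get("OMP_NUM_THREADS", "").split(",")[0].strip()
    if first.isdigit() and int(first) > 0:
        return int(first)

    # Last resort: whatever the machine reports.
    return os.cpu_count() or 1
```

For example, `resolve_num_threads({"threaded": True}, {"OMP_NUM_THREADS": "8,2"})` returns 8, and `resolve_num_threads({}, {"OMP_NUM_THREADS": "8"})` returns 1 because the run is not marked as threaded.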

DoD: Reach 20/30% of Fortran. Flag any further improvements for a second round of optimizations.

@FlorianDeconinck FlorianDeconinck changed the title CPU optimization CPU optimization (part 1) Mar 4, 2024
@FlorianDeconinck FlorianDeconinck added the Team/Project Parent/Sink task for the all team (cross-repository) label Mar 4, 2024
@FlorianDeconinck FlorianDeconinck changed the title CPU optimization (part 1) [NDSL] CPU optimization (part 1) Mar 5, 2024
@FlorianDeconinck FlorianDeconinck changed the title [NDSL] CPU optimization (part 1) {P} CPU optimization (part 1) Oct 31, 2024