
Feature/DD #1447

Open · wants to merge 120 commits into develop
Conversation

@sbacchio (Member) commented Mar 19, 2024

This is a first PR towards enabling domain decomposition (DD) features in QUDA.
The goal of this PR is to enable a red-black decomposition for the Dirac operator.

Remarks

  • For now, we require the blocks to fit exactly within the local lattice; generalization is left to a future PR.
  • We focus only on the application of the Dirac operator. Optimization of BLAS functions is left to a future PR.

Design

Under domain decomposition, the Dirac operator assumes a block structure, e.g.

$$ D = \begin{bmatrix} D_{rr} & D_{rb} \\ D_{br} & D_{bb} \end{bmatrix} $$
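
Written out explicitly, a single application of $D$ therefore splits into four block applications, one per pair of domains:

$$ \begin{bmatrix} y_r \\ y_b \end{bmatrix} = \begin{bmatrix} D_{rr} & D_{rb} \\ D_{br} & D_{bb} \end{bmatrix} \begin{bmatrix} x_r \\ x_b \end{bmatrix} \quad\Longleftrightarrow\quad y_r = D_{rr} x_r + D_{rb} x_b , \qquad y_b = D_{br} x_r + D_{bb} x_b $$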

Thus we need a simple strategy for implementing all possible block operators and their application to a field.
Our strategy is to attach the domain-decomposition specification directly to the vector (spinor) field.
E.g. the application of $D_{rb}$, i.e. $y = D_{rb} x = P_r D P_b x$, can be expressed by marking the input vector as "black", i.e. $x_b = P_b x$, and the output vector as "red", i.e. $y_r = P_r y$. In pseudocode:

x.dd_black_active(); // restrict the input to the black blocks, x_b = P_b x
y.dd_red_active();   // restrict the output to the red blocks, y_r = P_r y
applyD(y, x);        // y_r = D_rb x_b

The application of D is then made DD-aware, so that it acts only on the active input/output points.
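
For illustration, a full application of $D$ could then be assembled from the four block applications using only this interface. The sketch below is not code from this PR: the temporary field and the final accumulation (blas::xpy) are assumptions, and it relies on applyD leaving inactive output points untouched.

// Sketch only: y = D x assembled from the four DD blocks
ColorSpinorField tmp(y);                                     // scratch field with the same layout as y (assumed)

x.dd_red_active();   y.dd_red_active();     applyD(y, x);    // red points of y:     D_rr x_r
x.dd_red_active();   y.dd_black_active();   applyD(y, x);    // black points of y:   D_br x_r
x.dd_black_active(); tmp.dd_red_active();   applyD(tmp, x);  // red points of tmp:   D_rb x_b
x.dd_black_active(); tmp.dd_black_active(); applyD(tmp, x);  // black points of tmp: D_bb x_b

blas::xpy(tmp, y);                                           // y += tmp, combining both contributions per parity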

Summary of changes:

  • DDParam is added as a property of lattice fields (used only for ColorSpinorFields at the moment); a rough sketch of what such metadata might contain follows this list.
  • ... TODO
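
For orientation only, the per-field DD metadata could look roughly like the sketch below; the member names are illustrative guesses, not the actual definition added in this PR.

// Hypothetical sketch of the DD metadata attached to a lattice field
struct DDParam {
  bool dd_enabled = false;          // whether this field participates in DD
  int block_dim[4] = {0, 0, 0, 0};  // DD block size in each direction (x, y, z, t)
  bool red_active = true;           // whether points in "red" blocks are active
  bool black_active = true;         // whether points in "black" blocks are active
};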

TODO list:

  • Add more comments and function documentation
  • Add checks for parameters, e.g. that the block size divides the local lattice size (a possible check is sketched after this list)
  • Add calculation of the first block parity (use global coordinates)
  • Test application of individual pieces, i.e. D_rr, D_rb, D_br, D_bb
  • Test all operators
  • Test that performance without DD is not affected (i.e. compared to current develop)
  • Test MR solver and usage in MG as smoother
  • Properly disable comms when not needed (e.g. block fits local lattice)
  • Improve performance by unrolling threads block-wise
  • Disable usage of DD for all PC Mat
  • ...
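
Regarding the parameter checks mentioned above, a possible divisibility check could look like the following sketch (the names X and dd.block_dim are assumptions, not identifiers from this PR):

// Each block dimension must evenly divide the local lattice extent in that direction
for (int d = 0; d < 4; d++) {
  if (dd.block_dim[d] <= 0 || X[d] % dd.block_dim[d] != 0)
    errorQuda("DD block size %d does not divide local lattice size %d in direction %d", dd.block_dim[d], X[d], d);
}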

Current issues:

  • Tests fail if the block size is odd in the x direction for the PC operator, e.g. dslash_test --xdim 6 --ydim 4 --zdim 4 --tdim 4 --dd-block-size 3 2 2 2 --dd-red-black=true --test 0. The non-PC operator works (e.g. --test 2), as does an odd block size in any other direction. For now, an odd block size in the x direction is not allowed.
  • Tests fail if export QUDA_REORDER_LOCATION=CPU is set (dslash_test failing with QUDA_REORDER_LOCATION=CPU #1466).
  • Tests fail for MatPC with Caught signal 8 (Floating point exception: integer divide by zero); see e.g. dslash_test --xdim 8 --ydim 8 --zdim 8 --tdim 8 --dd-red-black=true --test 1. For now, PC operators are not tested.

@sbacchio sbacchio requested a review from maddyscientist May 8, 2024 07:34
@sbacchio sbacchio marked this pull request as ready for review November 1, 2024 08:59
@sbacchio sbacchio requested review from a team as code owners November 1, 2024 08:59
@maddyscientist (Member)

I'm starting to do some testing on this PR, and I see that not all Dirac operators have the file-level parallelization. Is this something you can do? See the trace below: for a full build of QUDA, these files dominate the compilation time and make it considerably longer. At the same time, it's clear how much faster the split Dirac operator files compile on a multi-core system 😄

[image: compilation-time trace of a full QUDA build]

@pittlerf (Contributor) commented Nov 5, 2024

> I'm starting to do some testing on this PR, and I see that not all Dirac operators have the file-level parallelization. Is this something you can do? See the trace below: for a full build of QUDA, these files dominate the compilation time and make it considerably longer. At the same time, it's clear how much faster the split Dirac operator files compile on a multi-core system 😄

Hi, yes, sorry, I will do the remaining ones.

@maddyscientist (Member)

I have fixed the failing staggered dslash tests (they were caused by the long-link field being erroneously created when using regular, unimproved, staggered fermions).

@sbacchio (Member, Author) commented Nov 6, 2024

Hi Kate, that's an impressive trace :) Ferenc will work on the others soon. About the tests, I still see some staggered tests failing (invert and eigensolve), I guess for similar reasons. Maybe it's better if you have a look at those too :)

@sbacchio changed the title from Feature/DD (WIP) to Feature/DD on Nov 6, 2024
@maddyscientist (Member)

Looking great with the latest pushes. Just dslash_twisted_mass_preconditioned.cu left, I think.

[image: compilation-time trace]

@sbacchio (Member, Author) commented Nov 7, 2024

Great! And I see that all 7 checks now pass :) Thanks @pittlerf!
@maddyscientist, do you also have a trace of the compilation time of the current develop branch? Just to appreciate the overall improvement :)
