Gauge fixing, pure gauge and optimized gauge I/O routines #253

maddyscientist · 2015-05-22T02:43:17Z

This pull request will add the following features to QUDA:

- Gauge fixing
- Pure gauge generation (overrelaxation and heatbath algorithms)
- Added gauge fixing to interface_quda.cpp and milc_interface.cpp
- Optimized gauge::FloatNOrder::load and gauge::FloatNOrder::save routines (vectorized I/O)
- Modified unitarize_links_quda.cu in order to support link unitarization for 12/8 parameters
- Modified copy_gauge_extended.cu in order to support copy from extended to regular gauge
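
For readers unfamiliar with the 12-parameter format mentioned in the list above: only the first two rows of each SU(3) link are stored, and the third row is rebuilt as the complex conjugate of their cross product. A minimal host-side C++ sketch of that reconstruction is given here. The names `Row`, `Matrix`, `reconstructThirdRow` and `reconstruct12` are illustrative and are not taken from QUDA, and the 8-parameter case is not shown.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <complex>

using Complex = std::complex<double>;
using Row = std::array<Complex, 3>;
using Matrix = std::array<Row, 3>;

// Hermitian inner product of two rows: sum_i a[i] * conj(b[i]).
inline Complex innerProduct(const Row &a, const Row &b) {
  Complex sum(0.0, 0.0);
  for (int i = 0; i < 3; ++i) sum += a[i] * std::conj(b[i]);
  return sum;
}

// Rebuild the third row of an SU(3) matrix from its first two rows:
// row2 = conj(row0 x row1), the complex-conjugated cross product.
inline Row reconstructThirdRow(const Row &a, const Row &b) {
  return { std::conj(a[1] * b[2] - a[2] * b[1]),
           std::conj(a[2] * b[0] - a[0] * b[2]),
           std::conj(a[0] * b[1] - a[1] * b[0]) };
}

// Expand 12 stored reals (two complex rows) into the full 18-real matrix.
// The tolerance below is an illustrative choice for double precision.
inline Matrix reconstruct12(const Row &row0, const Row &row1) {
  assert(std::abs(innerProduct(row0, row1)) < 1e-10); // rows must be orthogonal
  return { row0, row1, reconstructThirdRow(row0, row1) };
}
```

Storing 12 instead of 18 reals trades a few extra flops for less memory traffic, which is why a unitarization routine has to know which format it is reading and writing.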

Gauge fixing files:
lib: gauge_fix_ovr_extra.cu, gauge_fix_fft.cu, gauge_fix_ovr_extra.h, gauge_fix_ovr.cu, gauge_fix_ovr_hit_devf.cuh, CUFFT_Plans.h
For pure gauge config generation:
pgauge_exchange.cu, pgauge_init.cu, pgauge_heatbath.cu, pgauge_plaquette.cu, random.cu

This pull request replaces #252.

maddyscientist · 2015-05-22T04:59:45Z

Some benchmarks for copyGauge, taken at V=16^4 (old -> new)

double -> single
- 18 -> 18 (139 -> 176 GB/s)
- 12 -> 12 (118 -> 175 GB/s)
- 8 -> 8 (86 -> 150 GB/s)
single -> half
- 18 -> 18 (121 -> 247 GB/s)
- 12 -> 12 (82 -> 213 GB/s)
- 8 -> 8 (55 -> 113 GB/s)
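
As a rough guide to reading these figures, the sketch below shows one way an effective bandwidth for a gauge copy could be computed. It assumes that every real number is read once at the input precision and written once at the output precision, that there are 4 links per site, and that padding and ghost zones are ignored. The benchmark may count bytes differently, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes moved when copying a gauge field: every real is read once at the
// input precision and written once at the output precision.
// Assumption: 4 link directions per site, no padding or ghost zones.
inline std::size_t copyGaugeBytes(std::size_t sites, int reconstruct,
                                  std::size_t inBytes, std::size_t outBytes) {
  assert(reconstruct == 18 || reconstruct == 12 || reconstruct == 8);
  const std::size_t linksPerSite = 4;
  return sites * linksPerSite * static_cast<std::size_t>(reconstruct) *
         (inBytes + outBytes);
}

// Effective bandwidth in GB/s (1 GB = 1e9 bytes) for a measured kernel time.
inline double effectiveBandwidthGBs(std::size_t bytes, double seconds) {
  assert(seconds > 0.0);
  return static_cast<double>(bytes) / seconds / 1e9;
}
```

Under these assumptions, a V=16^4 copy from 8-byte to 4-byte reals with 18 reals per link moves 56,623,104 bytes, so a rate of 176 GB/s would correspond to a kernel time of roughly 0.3 ms.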

While some of the half precision numbers look a bit too fast (L2 cache between runs perhaps?), the speedup is undeniable. It looks like all kernels that use the new gauge::FloatNOrder accessor are much faster. This really shows the strength of using generic accessor code: optimizing the accessor gives a speedup across the board.
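
To illustrate why an accessor optimization propagates to every kernel, here is a minimal host-side C++ sketch. `ChunkedAccessor` and `copySites` are hypothetical stand-ins, not QUDA's gauge::FloatNOrder, and the chunked layout is an assumption made for illustration. The copy routine only ever calls `load` and `save`, so replacing the inner loops with N-wide vector loads and stores on a GPU would speed up every routine written this way.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a FloatN-ordered accessor (not QUDA's actual class).
// A site's `length` reals are split into chunks of N; chunk c of site x lives at
// data[(c * stride + x) * N], so neighbouring sites read neighbouring chunks.
template <typename Float, int length, int N>
struct ChunkedAccessor {
  static_assert(length % N == 0, "length must be a multiple of the chunk size");
  Float *data;
  std::size_t stride; // number of sites (plus any padding)

  void load(Float v[length], std::size_t x) const {
    assert(x < stride);
    for (int c = 0; c < length / N; ++c) {
      const Float *src = data + (c * stride + x) * N;
      // On a GPU this inner loop would be a single N-wide vector load.
      for (int j = 0; j < N; ++j) v[c * N + j] = src[j];
    }
  }

  void save(const Float v[length], std::size_t x) const {
    assert(x < stride);
    for (int c = 0; c < length / N; ++c) {
      Float *dst = data + (c * stride + x) * N;
      // On a GPU this inner loop would be a single N-wide vector store.
      for (int j = 0; j < N; ++j) dst[j] = v[c * N + j];
    }
  }
};

// Generic copy written only against load/save, so it inherits any
// improvement made inside the accessor.
template <typename FloatOut, typename FloatIn, int length, typename Out, typename In>
void copySites(Out out, In in, std::size_t sites) {
  for (std::size_t x = 0; x < sites; ++x) {
    FloatIn vin[length];
    FloatOut vout[length];
    in.load(vin, x);
    for (int i = 0; i < length; ++i) vout[i] = static_cast<FloatOut>(vin[i]);
    out.save(vout, x);
  }
}
```

A call such as `copySites<float, double, 18>(out, in, sites)` converts precision while copying. The accessor types are deduced from the arguments.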

maddyscientist · 2015-05-22T07:31:12Z

I've added a new flag --enable-gauge-alg to enable these new algorithms. This is very much needed as the gauge fixing code takes a long time to compile.

Nuno, having now had a cursory look at the gauge fixing and overrelaxation codes, I can see there is a lot of code and very few comments explaining what is happening. Also, the indentation is inconsistent with the rest of QUDA, which mostly uses 2-space indents. It would be nice to get this fixed before it is merged into develop (I know there are many other parts of QUDA that have similar issues as well).

nmrcardoso · 2015-05-22T13:36:11Z

Thanks a lot, Mike, for sharing the benchmarks and for adding this configuration option.

I will try to address all these issues today, since tomorrow I'll be
traveling.

It would also be nice to have a formatter.
Some weeks ago I looked at several formatters,
https://github.com/lattice/quda/wiki/agenda-call-2015-05-07
and the most promising seems to be uncrustify. We only need to set up a
configuration file and then run it on every source file with a simple
script.

mathiaswagner · 2015-05-22T14:08:38Z

Thanks for reminding me of the tools. I have created #254 to remind us to define some formatting guidelines that we can then also use for tools.

Nuno, will you create an issue to remind us of the missing MPI support if we want to use FFT gauge fixing w/ multi GPUs?

It would be good to update the README file to document this restriction and (optionally, if possible) catch it in the configuration process or at compilation time rather than at runtime (as we do for the asqtad force, which also does not support multi-GPU).
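
One possible shape for a compile-time check is sketched below in C++. The macro `SKETCH_MULTI_GPU` and the function `gaugeFixFFTSketch` are hypothetical placeholders and do not correspond to QUDA's actual build macros or API. The idea is only that an unsupported build fails when the routine is instantiated, rather than aborting when it is first called.

```cpp
#include <cassert>

// Hypothetical build macro, standing in for whatever the configure script
// defines for multi-GPU builds. Uncomment to simulate such a build.
// #define SKETCH_MULTI_GPU

#ifdef SKETCH_MULTI_GPU
constexpr bool fftGaugeFixAvailable = false;
#else
constexpr bool fftGaugeFixAvailable = true;
#endif

// The static_assert depends on a template parameter, so it only fires when
// the function is actually used in a build where the feature is unavailable.
template <bool available = fftGaugeFixAvailable>
int gaugeFixFFTSketch(int maxIterations) {
  static_assert(available,
                "FFT gauge fixing is single-GPU only; use overrelaxation in multi-GPU builds");
  assert(maxIterations > 0);
  return maxIterations; // placeholder: a real routine would run the fixing loop
}
```

With the macro defined, any translation unit that calls `gaugeFixFFTSketch` stops compiling with the message above, while code that never calls it still builds.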

nmrcardoso · 2015-05-22T14:52:59Z

I created a reminder issue for the gauge fixing with FFTs.
If you could add this to the configure file, that would be great,
since I've never written a configure file.

Can you point me to a source file that follows the proper standard style,
so I can apply the same style to the gauge fixing code?

maddyscientist · 2015-05-22T19:27:25Z

@nmrcardoso I've created a quick guide on how to update the configure / makefile here: https://github.com/lattice/quda/wiki/Adding-new-QUDA-features

maddyscientist · 2015-05-22T19:29:19Z

I've just pushed a change to this branch that enables the vectorization for the ghost I/O routines as well. This brings a performance boost to these also, though not as significant as with the bulk I/O routines: the data volumes are much smaller here, so they are more latency bound. Nevertheless, this should give a nice boost to any extended gauge routines.

nmrcardoso · 2015-05-22T20:16:52Z

Thanks a lot, Mike, the guide is very helpful.
I added a warning message when using --enable-gauge-alg and --enable-multi-gpu
"Gauge fixing with FFTs only supported for single-GPU. Use gauge fixing with overrelaxation in multi-GPU mode."
to configure.ac and updated configure.

maddyscientist · 2015-05-22T20:21:50Z

Glad it helps. Going forward, I'd like to instill a policy that whenever anyone asks a question on QUDA (like how to add a feature, make a change, or a question on some parameter or whatever), instead of writing an email response, someone spends 5 minutes more updating wiki pages and / or doxygen to answer the question. This way we'll get much better documentation and the same questions will stop being asked :)

nmrcardoso and others added 9 commits March 20, 2015 18:51

Code from master branch with FloatNOrder modified and including the g…

144d45e

…auge fixing code

Code from master branch with FloatNOrder modified and including the g…

555e0e1

…auge fixing code

Code from master branch with FloatNOrder modified and including the g…

fca8ac6

…auge fixing code

Modified gauge fixing code using FFTs in order to use less memory

6b1e744

Modified gauge fixing code using FFTs in order to use less memory

ebb7565

Modified gauge fixing code using FFTs in order to use less memory

72df100

Fix single-GPU compilation for testing/su3_testing.cpp

b4aa0f5

Fix link errors when GPU_UNITARIZE is not set.

7b3d0df

Merge branch 'develop' into feature/gauge-fix

205b88d

maddyscientist changed the title ~~Feature/gauge fix~~ Add gauge fixing, pure gauge and optimized gauge I/O routines May 22, 2015

maddyscientist changed the title ~~Add gauge fixing, pure gauge and optimized gauge I/O routines~~ Gauge fixing, pure gauge and optimized gauge I/O routines May 22, 2015

maddyscientist mentioned this pull request May 22, 2015

New FloatNOrder, gauge fixing and some pure gauge gauge configuration routines #252

Closed

Added support for half-precision types in vectorized gauge::FloadNOrd…

5db6fe3

…er::load and gauge::FloatNOrder::save.

maddyscientist added feature optimization labels May 22, 2015

maddyscientist added this to the QUDA 0.8 milestone May 22, 2015

Added updated register traits for vectorized short copy specializations.

d4d5426

Gauge fixing and pure gauge algorithms are now enabled with the --ena…

71dac73

…ble-gauge-alg flag.

Added vectorized memory I/O for ghost routines in gauge::FloatNOrder.

91e718e

nmrcardoso added 2 commits May 22, 2015 15:04

Update configure.ac

933e8a7

Added warning msg when using --enable-gauge-alg and --enable-multi-gpu "Gauge fixing with FFTs only supported for single-GPU. Use gauge fixing with overrelaxation in multi-GPU mode."

Update configure

4f6dd35

Added warning msg when using --enable-gauge-alg and --enable-multi-gpu "Gauge fixing with FFTs only supported for single-GPU. Use gauge fixing with overrelaxation in multi-GPU mode."

Mathias Wagner and others added 25 commits July 27, 2015 19:21

acknowledge command line options in gauge_alg_test

c19d2cc

Added loop unroll for computeValue kernel which gives a speedup.

0fa3833

Added generic library to QUDA, which provides generic support for __l…

738e9ce

…dg and __shfl instructions. Modified the LICENSE to include the NVIDIA license.

gauge_field::FloatNOrder can now use __ldg loads. Generally improves …

aa7049a

…performance across the board, but some regressions at 12/8 reconstruct so left switched off for now (USE_LDG macro in include/gauge_field_order.h).

Merge branch 'feature/gauge-fix' of github.com:lattice/quda into feat…

cf96f2e

…ure/gauge-fix

Applied gauge_mapper to heatbath code to reduce compilation time.

45a0b19

Removed legacy unitarization routine, and generalized replacement to …

c85dad3

…deal with separate input/output fields with potentially differing reconstruction types. Renamed hisq_links_quda.h to more appropriate unitarization_links.h.

potential fix for MPI issues with gauge_alg_test

53308ef

Merge branch 'feature/gauge-fix' of https://github.com/lattice/quda i…

bf6d3cb

…nto feature/gauge-fix

Merge remote-tracking branch 'origin/develop' into feature/gauge-fix

88ae420

Fixed bug in setting dslash_type in staggered_dslash_test.

64a94d2

Reduced compilation time for field-strength tensor using gauge_mapper.

d5fc364

modified gauge_alg_test to use generation before every test

dbfcb30

Merge branch 'feature/gauge-fix' of https://github.com/lattice/quda i…

72b1cb2

…nto feature/gauge-fix

Moved replicated atomicAdd(double*,double) definition to new header f…

d5db5c1

…ile atomic.cuh.

Update gauge_fix_fft.cu

96e4271

Added reunitarization flops and bytes

Update gauge_fix_ovr.cu

27df381

Added reunitarization flop and byte count to the performance results

Update unitarize_links_quda.cu

c2a8cdb

Added flops and bytes count

Delete su3_testing.cpp

d67e395

Added tune option to su3_test.

3f8b873

Clean up of replicated indexing functions, moving them into index/ind…

6b05307

…ex_helper.cuh.

Updated Makefile for recent header additions.

8202051

Update Makefile

8883381

deleted su3_testing

Merge remote-tracking branch 'origin/develop' into feature/gauge-fix

d6d6e28

Merge branch 'feature/gauge-fix' of https://github.com/lattice/quda i…

112f05d

…nto feature/gauge-fix

maddyscientist added a commit that referenced this pull request Jul 29, 2015

Merge pull request #253 from lattice/feature/gauge-fix

7a91e92

Gauge fixing, pure gauge and optimized gauge I/O routines

maddyscientist merged commit 7a91e92 into develop Jul 29, 2015

maddyscientist deleted the feature/gauge-fix branch July 29, 2015 19:22

This was referenced Jul 29, 2015

Hypercubic RNG #29

Closed

Move atomic functions for double out of gauge_plaq.cu #309

Closed
