QUDA Quick Start Guide
mikeaclark edited this page May 12, 2015 · 16 revisions
To aid performance modelling and debugging, communication can be switched on in a given dimension even when that dimension is actually local to a given GPU. The command-line flag --partition N enables this, where N is a 4-bit number whose bits 0, 1, 2, 3 switch communication on or off in dimensions x, y, z, t, respectively. For example:
dslash_test --partition 1 ## enable x dimension communication
dslash_test --partition 6 ## enable y and z dimension communication
dslash_test --partition 15 ## enable full communication
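The mapping between the value of N and the dimensions it enables can be checked with a short script. This is a minimal sketch, not part of QUDA: the helper names decode_partition and encode_partition are hypothetical, and the only assumption taken from the text above is that bits 0, 1, 2, 3 correspond to x, y, z, t.

```python
# Illustrative helpers (not part of QUDA): translate between the
# --partition value N and the dimensions whose communication it enables.
# Assumes bit 0 -> x, bit 1 -> y, bit 2 -> z, bit 3 -> t, as described above.

DIMENSIONS = ("x", "y", "z", "t")


def decode_partition(n):
    """Return the list of dimensions enabled by the 4-bit mask n."""
    if not 0 <= n <= 15:
        raise ValueError("partition mask must be a 4-bit number (0-15)")
    return [dim for bit, dim in enumerate(DIMENSIONS) if n & (1 << bit)]


def encode_partition(dims):
    """Return the 4-bit mask that enables communication in the given dimensions."""
    mask = 0
    for dim in dims:
        mask |= 1 << DIMENSIONS.index(dim)
    return mask


if __name__ == "__main__":
    for n in (1, 6, 15):
        print(n, decode_partition(n))
```

For instance, encode_partition(["x", "t"]) gives the value to pass to --partition in order to enable communication in the x and t dimensions only.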
QUDA has two specific debugging modes: HOST_DEBUG and DEVICE_DEBUG.
- HOST_DEBUG compiles all host code using the -g flag and ensures that all CUDA error reporting is done synchronously (i.e., the GPU and CPU are synchronized prior to fetching the error state). For most debugging, HOST_DEBUG is all that should be needed, since most bugs tend to be in CPU code. Enabling HOST_DEBUG has a noticeable performance impact, at the 20-50% level, with the penalty being greater at smaller local volumes.
- DEVICE_DEBUG compiles all GPU kernels using the -G flag. This provides accurate line reporting in cuda-gdb and cuda-memcheck. Enabling it carries a huge performance penalty, at the 100x level.