Gauge fixing, pure gauge and optimized gauge I/O routines #253
…er::load and gauge::FloatNOrder::save.
Some benchmarks for copyGauge, taken at V=16^4 (old -> new)
While some of the half precision numbers look a bit too fast (L2 cache between runs perhaps?), the speedup is undeniable. It looks like all kernels that use the new gauge::FloatNOrder accessor are much faster. This really shows the strength of generic accessor code: optimizing the accessor gives a speedup across the board.
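To illustrate why an accessor optimization propagates to every kernel, here is a minimal C++ sketch. It is not QUDA's actual gauge::FloatNOrder: the names FloatNAccessor, Vec and copy_field, and the exact memory layout, are assumptions made for illustration. Each link is stored as chunks of N reals, and load/save move one whole chunk at a time, which is the CPU stand-in for a float2/float4 vectorized access on the GPU.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a FloatN-ordered gauge accessor (not QUDA's class).
// Length = reals per link, N = reals moved per vector access.
template <typename Float, int N, int Length>
struct FloatNAccessor {
  static_assert(Length % N == 0, "Length must be a multiple of N");
  using real = Float;
  static constexpr int length = Length;
  static constexpr int chunks = Length / N;

  // One chunk of N reals, aligned so it can be moved as a single unit.
  struct alignas(N * sizeof(Float)) Vec { Float v[N]; };

  std::vector<Vec> data;  // assumed layout: data[chunk * volume + site]
  std::size_t volume;

  explicit FloatNAccessor(std::size_t volume_)
    : data(chunks * volume_), volume(volume_) {}

  // Read one link, one whole chunk per memory access.
  void load(std::array<Float, Length> &out, std::size_t site) const {
    assert(site < volume);
    for (int c = 0; c < chunks; c++) {
      const Vec tmp = data[c * volume + site];
      std::memcpy(&out[c * N], tmp.v, sizeof(tmp.v));
    }
  }

  // Write one link, one whole chunk per memory access.
  void save(const std::array<Float, Length> &in, std::size_t site) {
    assert(site < volume);
    for (int c = 0; c < chunks; c++) {
      Vec tmp;
      std::memcpy(tmp.v, &in[c * N], sizeof(tmp.v));
      data[c * volume + site] = tmp;
    }
  }
};

// Generic "kernel": it only talks to load/save, so any improvement inside
// the accessor is picked up here without changing this code.
template <typename OutAcc, typename InAcc>
void copy_field(OutAcc &out, const InAcc &in) {
  static_assert(OutAcc::length == InAcc::length, "link sizes must match");
  std::array<typename InAcc::real, InAcc::length> link;
  for (std::size_t s = 0; s < in.volume; s++) {
    in.load(link, s);
    out.save(link, s);
  }
}
```

In this sketch copy_field can convert between a field stored with N=2 and one stored with N=4 without knowing either layout, which is the property that lets one accessor change speed up all kernels built on it.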
…ble-gauge-alg flag.
I've added a new flag. Nuno, having now had a cursory look at the gauge fixing and overrelaxation codes, I can see there is a lot of code and very few comments explaining what is happening. Also, the indentation is inconsistent with the rest of QUDA, which mostly uses 2-space indents. It would be nice to get this fixed before it is merged into develop (I know there are many other parts of QUDA that have similar issues as well).
Thanks a lot, Mike, for sharing the benchmarks and for this configuration option. I will try to address all these issues today, since tomorrow I'll be … It would also be nice to have a formatter.
Thanks for reminding me of the tools. I have created #254 to remind us to define some formatting guidelines that we can then also use for tools. Nuno, will you create an issue to remind us of the missing MPI support if we want to use FFT gauge fixing with multiple GPUs? It would be good to update the README file to mention this restriction and (optionally, if possible) catch this in the configuration process or at compilation time rather than at runtime (like we do for the asqtad force, which also does not support multi-GPU).
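As a rough sketch of what catching this at compilation time could look like in C++: the macro name MULTI_GPU and the names GaugeFixMethod, supported and run_gauge_fix below are assumptions for illustration, not QUDA's actual build flags or API. The idea is that requesting the FFT method in a multi-GPU build fails to compile, while overrelaxation stays available.

```cpp
// Hypothetical build flag: assume the build system defines MULTI_GPU
// for multi-GPU builds (the name is an assumption).
#ifdef MULTI_GPU
constexpr bool multi_gpu_build = true;
#else
constexpr bool multi_gpu_build = false;
#endif

enum class GaugeFixMethod { FFT, Overrelaxation };

// FFT gauge fixing is only allowed in single-GPU builds.
template <GaugeFixMethod M>
constexpr bool supported() {
  return M != GaugeFixMethod::FFT || !multi_gpu_build;
}

// Hypothetical entry point: instantiating an unsupported method is a
// compile-time error instead of a runtime failure.
template <GaugeFixMethod M>
const char *run_gauge_fix() {
  static_assert(supported<M>(),
                "Gauge fixing with FFTs only supported for single-GPU. "
                "Use gauge fixing with overrelaxation in multi-GPU mode.");
  return M == GaugeFixMethod::FFT ? "FFT" : "overrelaxation";
}
```

With this shape, a call such as run_gauge_fix<GaugeFixMethod::FFT>() would be rejected by the compiler only when MULTI_GPU is defined, so single-GPU builds are unaffected.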
I created a reminder issue for gauge fixing with FFTs. Can you point me to a source file that follows the proper standard style?
@nmrcardoso I've created a quick guide on how to update the configure / makefile here: https://github.com/lattice/quda/wiki/Adding-new-QUDA-features
I've just pushed a change to this branch that enables vectorization for the ghost I/O routines as well. This brings a performance boost to these too, though not as significant as with the bulk I/O routines: the data volumes are much smaller here, so they are more latency bound. Nevertheless, this should give a nice boost to any extended gauge routines.
Added warning msg when using --enable-gauge-alg and --enable-multi-gpu "Gauge fixing with FFTs only supported for single-GPU. Use gauge fixing with overrelaxation in multi-GPU mode."
Thanks a lot Mike, the guide is very helpful.
Glad it helps. Going forward, I'd like to instill a policy that whenever anyone asks a question on QUDA (like how to add a feature, make a change or a question on some parameter or whatever), that instead of someone writing an email response they spend 5 minutes more updating wiki pages and / or doxygen to answer the question. This way we'll get much better documentation and the same questions will stop being asked :)
…dg and __shfl instructions. Modified the LICENSE to include the NVIDIA license.
…performance across the board, but some regressions at 12/8 reconstruct so left switched off for now (USE_LDG macro in include/gauge_field_order.h).
…deal with separate input/output fields with potentially differing reconstruction types. Renamed hisq_links_quda.h to more appropriate unitarization_links.h.
…nto feature/gauge-fix
…nto feature/gauge-fix
Added reunitarization flops and bytes
Added reunitarization flop and byte count to the performance results
Added flops and bytes count
deleted su3_testing
…nto feature/gauge-fix
Gauge fixing, pure gauge and optimized gauge I/O routines
This pull request will add the following features to QUDA:
gauge::FloatNOrder::load and gauge::FloatNOrder::save routines (vectorized I/O)
Gauge fixing files:
lib: gauge_fix_ovr_extra.cu, gauge_fix_fft.cu, gauge_fix_ovr_extra.h, gauge_fix_ovr.cu, gauge_fix_ovr_hit_devf.cuh, CUFFT_Plans.h
For pure gauge config generation:
pgauge_exchange.cu, pgauge_init.cu, pgauge_heatbath.cu, pgauge_plaquette.cu, random.cu
This pull request replaces #252.