Mixed precision preview #44
Comments
Comment by: prckent It would be helpful to describe where mixed precision is implemented, what benefits have been seen, what testing has been done, what the limitations are, etc.
Comment by: prckent What is the recompute frequency? What is being recomputed? The cofactor/inverse Slater matrices? If so, this needs to be consistent with the GPU version, or the GPU version needs to be updated to match.
Comment by: prckent I see the recompute is indeed the Slater matrix recompute. Can we call the parameter blocks_between_slater_matrix_recompute? It is longer, but more obvious in what it does. Other ideas are welcome.
Comment by: ye-luo Direct comparison between double precision calculation with single precision spline and mixed precision calculation.
Comment by: ye-luo The recompute is implemented so that it propagates from TrialWaveFunction to each individual component. The default does nothing, as for the Jastrows. SD has a specialization that recomputes in DP. This is how both the CPU and GPU codes implement it. For this reason, I'd like to call it blocks_between_recompute rather than blocks_between_slater_matrix_recompute, to keep the logic consistent. Also, users don't need to know exactly what is recomputed.
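The propagation described in the comment above can be sketched roughly as follows. This is a minimal illustration, not QMCPACK's actual code: the class names (`WaveFunctionComponent`, `JastrowLike`, `SlaterDetLike`, `TrialWaveFunctionLike`), the `run_blocks` driver, and the convention that a value of 0 disables recomputation are all assumptions made for the example. Only the parameter name `blocks_between_recompute` comes from the thread.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class: by default a component has nothing to recompute.
struct WaveFunctionComponent {
  virtual ~WaveFunctionComponent() = default;
  virtual void recompute() {}  // default no-op, e.g. for Jastrow factors
};

// Hypothetical Jastrow-like component: inherits the no-op.
struct JastrowLike : WaveFunctionComponent {};

// Hypothetical Slater-determinant-like component: overrides recompute().
struct SlaterDetLike : WaveFunctionComponent {
  int recompute_count = 0;
  void recompute() override {
    // Stand-in for rebuilding the inverse matrices in double precision.
    ++recompute_count;
  }
};

// Hypothetical trial wave function: forwards recompute() to every component.
struct TrialWaveFunctionLike {
  std::vector<std::unique_ptr<WaveFunctionComponent>> components;
  void recompute() {
    for (auto& c : components) c->recompute();
  }
};

// Hypothetical driver loop: recompute once every blocks_between_recompute
// blocks. Treating 0 as "never recompute" is an assumption of this sketch.
inline int run_blocks(TrialWaveFunctionLike& psi, int num_blocks,
                      int blocks_between_recompute) {
  assert(blocks_between_recompute >= 0);
  int calls = 0;
  for (int block = 1; block <= num_blocks; ++block) {
    // ... Monte Carlo steps for this block would go here ...
    if (blocks_between_recompute > 0 && block % blocks_between_recompute == 0) {
      psi.recompute();
      ++calls;
    }
  }
  return calls;
}
```

With this layout the driver only ever talks to the trial wave function, which matches the argument that the parameter name need not mention the Slater matrix.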
Comment by: prckent Understood. [ This does not necessarily need fixing now, but we really ought to check the differences found after recomputing. This will become increasingly problematic with larger runs. We might also have to fix our "log" handling, which does not appear to have the best numerics as written. ]
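A check of the kind suggested above might look like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, and a running sum of log ratios accumulated in single precision stands in for a quantity carried through incremental updates, while the same sum in double precision stands in for the recomputed value.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Relative difference between a value carried through incremental updates
// and the same quantity after a fresh recompute (hypothetical helper).
inline double relative_difference(double updated, double recomputed) {
  assert(std::isfinite(recomputed));
  const double denom = std::abs(recomputed);
  return denom > 0.0 ? std::abs(updated - recomputed) / denom
                     : std::abs(updated);
}

// Running log value accumulated in single precision (stand-in for the
// incrementally updated quantity). Ratios are assumed to be positive.
inline float accumulate_log_single(const std::vector<double>& ratios) {
  float log_value = 0.0f;
  for (double r : ratios) log_value += std::log(static_cast<float>(r));
  return log_value;
}

// The same quantity accumulated in double precision (stand-in for the
// value obtained after recomputing).
inline double accumulate_log_double(const std::vector<double>& ratios) {
  double log_value = 0.0;
  for (double r : ratios) log_value += std::log(r);
  return log_value;
}
```

Logging `relative_difference(accumulate_log_single(r), accumulate_log_double(r))` at each recompute would show how the drift grows with the number of updates between recomputes.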
Comment by: ye-luo I have aligned the recompute behaviours between CPU and GPU codes. The fix is trivial.
Comment by: prckent A note will be needed in the manual...
Comment by: markdewing How did you get the SP results? I tried compiling with OHMMS_PRECISION set to 'float', and it failed to compile (both the mixed_precision branch and trunk).
Comment by: ye-luo Do not touch OHMMS_PRECISION. Add -D MIXED_PRECISION=1 to your cmake command line.
Comment by: markdewing In the initial text, there are results for DP, SP, and MP (and MP-nocompute). I assume DP is the original code (double precision), MP is mixed precision (-D MIXED_PRECISION=1), and SP is single precision? Where did the SP results come from? (Or do DP/SP refer only to the orbital spline precision?)
Comment by: markdewing For consistency, the flag name should start with QMC_ (QMC_MIXED_PRECISION, to match QMC_MPI and QMC_COMPLEX). There seem to be two ways of thinking about mixed precision.
Given that enabling mixed precision sets the base precision to 'float', my guess is that option 1 might be the better way to think about it. For increased clarity, the two precision values should be set in the same place. Currently, OHMMS_PRECISION is adjusted in CMakeLists.txt, and the full precision value is hard-coded to 'double' in coulomb_types.h. Maybe add an OHMMS_PRECISION_FULL variable? The type variables for the two precision values (at least for Coulomb interactions) are pRealType and mRealType. What do the 'p' and 'm' prefixes mean?
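One way to keep both precision choices in a single place is sketched below. It is an illustration only: `PrecisionTraits`, `RealType` as used here, `FullPrecRealType`, and `sum_full_precision` are hypothetical names chosen for the example, and a template parameter stands in for the build-time flag so the sketch compiles without any CMake setup.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical traits type: both the base precision and the full precision
// are defined together, selected by one switch.
template <bool MixedPrecision>
struct PrecisionTraits {
  using RealType = double;          // base precision
  using FullPrecRealType = double;  // precision for sensitive accumulations
};

// Mixed precision: the base precision drops to float, full stays double.
template <>
struct PrecisionTraits<true> {
  using RealType = float;
  using FullPrecRealType = double;
};

// Example use: sum base-precision values into a full-precision accumulator.
template <bool MixedPrecision>
typename PrecisionTraits<MixedPrecision>::FullPrecRealType sum_full_precision(
    const std::vector<typename PrecisionTraits<MixedPrecision>::RealType>& values) {
  typename PrecisionTraits<MixedPrecision>::FullPrecRealType total = 0;
  for (auto v : values) total += v;
  return total;
}
```

Code that needs the full-precision type, such as the Coulomb interaction mentioned above, would then name `FullPrecRealType` instead of hard-coding `double`.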
Comment by: ye-luo Yes. It is better to add QMC_. I will do it for MIXED_PRECISION. I'm also considering controlling CUDA_PRECISION via MIXED_PRECISION. The CPU code has option 1. This discrepancy should be unified, probably by adopting option 2, in the new code rather than in the current QMCPACK. In the Coulomb case, Mark, could you please check your tests and replace the hard-coded double with RealType?
Comment by: prckent This is a great improvement. For non-experts we should give a guide to which variables are in which precision in the output.
Comment by: ye-luo Did some further tidying up.
Comment by: ye-luo
|
OK to close? There are some ongoing issues with some of the mixed precision tests, covered by issue #46.
parameters have default arguments and you're not disclosing them for now, then you can document the class as if they didn't exit: b5595c94f1 * multi::size_t (usually signed size): did you mean "usually unsigned size"? Why usually? What does this definition depend on? 6afc364788 * class multi::subarray<T, D, P = T*, ...>: no explanation on what T, D and P are, plus no explanation on the extra arguments (...) bb9d90dcaa more cpplint accdd3207a remove whitespace exception for cpplint 24b052439d Merge branch 'iwyu-with-error' into 'master' b1d28ba8e2 remove nolint for cpplint 5849265809 Merge branch 'not-accept-alt-tokens' into 'master' 5ef4187fc7 nvhpc gcc pragma 9d757c2580 Update CPPLINT.cfg 8f2dc28cb7 more pragmas f3d1c727dc use modern syntax 3257c48f2c shorten line ab5f799261 add error code to iwyu 5a61cdede7 Update NO_UNIQUE_ADDRESS.hpp e355ae8300 Merge branch 'blas-iwyu' into 'master' 71c6cddf4e fix syrk 643a9264ea Merge branch 'blas-iwyu' into 'master' 0068103d7d Merge branch 'master' into 'blas-iwyu' d23347a657 finish iwyu for blas adaptors 9eb114e43d Merge branch 'dot_equal' into 'master' e4ccf7fa13 more iwyu for blas adaptor d457d80d9c float-equal 3c9c2414b2 more warns affdd06e1e msvc warnings a8054603a2 msvc pop 19118c687d Update reextent.cpp 88c1d73698 Merge branch 'msvc-warnings-more' into 'master' 2284fb5c4b static_cast integer afc224b5bb msvc warnings c5b820719b Merge branch 'msvc-warnings-more' into 'master' 0d775fcc1d some msvc warnings 2a0a5787a6 more exceptions to warnings msvc d6015583aa Merge branch 'restore-blas-test' into 'master' 2b2440c50e Merge branch 'array_ptr-default' into 'master' 0d1ce348a2 get cmake 3.18 d2ae47a369 more exceptions warnings for windows 9a5d531c42 Merge branch 'm-msvc-w2' into 'master' 0b05323627 Merge branch 'mute-msvc-warnings' into 'master' e55d75862b Update .gitlab-ci.yml dd48d2c00d Merge branch 'remove-all-float-equals' into 'master' 90738bdf68 revert to debian stable 0b983c0b06 add non-shared to arm 3ce51969ed try 
interruptible 38b510c605 getrf fixes for macros 2959fec45f restore testing blas adaptor c58d3f1627 another msvc exception 7d288e4410 remove float equals operations f87eacd12f msvc warnings 394b88d1da warnings not as errors in msvc 5fd74018c5 Wall 3e6a9b6bf3 Wall msvc dbf4342441 W3 e6458aa3aa last iwyu be6268728e Merge branch 'iwyu-ci-windows' into 'master' be0803da64 restore pmr 5a0fd436a8 windows Exceptions d33a75174b more iwyu d2630c8f21 warnings for msvc 3af8142aad Merge branch 'more-iwyu-llvm' into 'master' 01caf55cd3 more nolint 3d8dfba669 format ab9e480a2e remove dup includes 07d21bbe46 using apple iwyu 967ac044db Merge branch 'iwyu-hang' into 'master' 2e588d8a93 remove merge rules in ci b0cf445081 double -> int 594d9aac02 remove ASAN var 4a80821b48 iwyu keep d0178537c0 comment span 8a1b89b883 conditional 124a895adc eeeeeeso 71d22ad073 finsih iwyu 5a4470cea3 iwyu finish c54bb1343e workflow rules f4fa132dbc finish iwyu dc0ba0e8ad use padding for msvc 3c24ba2bfc iwyu 6997428af0 iwyu ed7c6d71b7 ci 43c76526fa add rules ff09add192 remove needs b125c4c3f3 iwyu e6826ce97f windows only on mr 6bf32d6126 fix windows d5f2f4043e more iwwyu ddaf10df41 parallel build 645ac07960 more iwyu 7dce12bd12 iwyu 037eb88fae more iwyu 386d101ef4 more iwyu 8d322ad93b Merge branch 'iwyu-20240616' into 'master' 5b21ae3328 Merge branch 'more-iwyu-bt-fixture' into 'master' bd88062d98 Merge branch 'possible-solution-msvc-1914' into 'master' 7c56e739c9 assign iwuy 684e662fd8 resiliant old version of clang eb2728ab98 more iwyu c579275555 use localdir 13be9232c2 iwyu eef1849605 chage tmp path 67dd7954c8 fix typo f7d958ddba iwyu f5ae62df80 fix iota 8eddcb328d fix msvc conversion 0216626c0b iwyu d00ffba303 iwyu 792415d9be fix conversion 70d5195f12 iwyu 318def61e6 iwyu 0446ec688f iwyu 0cee4dff2a vs2019 257081bc40 use wget 318a6911e7 msvc 2015 fixed 8085656533 remove useless conversion a6550cc8be for 32 bit b8414ecacb msvc2015 5d1054fd4e more iwyu 4e7e7f7f02 more excpetions for iwyu c7215804d1 
iwyu 610b826de7 more iwyu 48edee0d59 more iwyu 46e1d44e9f Merge branch 'more-boost-test-iwyu' into 'master' 1db7bd388a more paren 60758327eb new del mismatch bbf0e95c25 asan clang remove check_init e27f798cce use paren 0e712a5d40 iwyu 596e4f6ea2 choco 64 3586299e83 more iwyu 25b30b6fc9 iwyu for utility 8a2e186e92 Merge branch 'more-boost-test-iwyu' into 'master' b22abb19e5 comment 2 windows 8d82692950 remove needs msvc ci 214a0fd1d2 what? 877c285d20 remove dep c59b8d8229 iwyu for boost test 895f26a058 run windows only on MR 665e6c505b Merge branch 'msvc22' into 'master' 03dd88d6bc add msvc22 a1894aafed Update concepts.cpp f12fa5d876 Update concepts.cpp 4aec5cc2ea Merge branch 'remove-checks-17' into 'master' 2170b17fcd Merge branch 'fix-gcc-github' into 'master' abb4f06cf6 Merge branch 'paren-for-msvc-q' into 'master' 0aaa3290f0 Merge branch 'master' into 'remove-checks-17' fc23343119 fix conversions 52ab8db81a Merge branch 'clangsani' into 'master' 6ba1fe7dd8 Merge branch 'remove-cpp17-test' into 'master' 3681851c56 Merge branch 'vs2017ci' into 'master' cadf80e683 fix paren ec996de903 Update README.md 62801efc4b remove checks 17 4c2d4aba10 Update utility.hpp a06889b3b3 paren 2562361893 revert version 203f704b8a fix asan e22534d54a correct libs 56741af05a add libasan 5ef74a096d 2017 ci 0596b3c85f Merge branch 'iwyu-windows' into 'master' d62850349b apt clang, not clang++ 665b6de14d add sani d2e13724b3 clang sanitizer 8e7da8c151 simplify windows afe69dbccf remove macos e55373ba07 less code 3c35d6ea3f finish windows ci 7255b24c0f add boost path 4720eec81b Merge branch 'more-iwyu-pragma-export' into 'master' 73df974b01 try different macos tag b55237f7e0 comment macos 5bcd9c69d3 last try with boost 1.74 ef8728da1a 1.74 dcb74c670f 1.81 8435dd462b remove -A x64 0db8744743 fix typo 5279b809d2 iwyu 36b14b904b dynamic with 1.42 4ebe02766b use boost 1.42 28e4ae2711 remove static from cmake line 4e5e735672 force boost static 258a722c59 choco static 18be8a2a83 add dotnet 3.5 
395195f3db using vcpkg 92eb4fa238 auto -y choco 75a89615f0 upgrade choco 06280d2f0e install boost in vs ci 3a255b2b1c run cmake directly ad95ad7bb6 add -y for choclatey 5d23cec51c more iwyu 3a5a2bae1f install vs in windows ci 27777e5636 reactivate clang-macos ci 88d0b0ab25 add windows saas 8bd9aec3c9 add windows shared 746d41e1ed fix fancy ref iwyu 16b7005c0b Merge branch 'more-iwyu-pragma-export' into 'master' 2b34644dbb conditionally include pmr 4a028b5672 more iwyu 3d3858bdce iwuy and float conversion 13dee28516 use angle bracket a4fc7ce0f3 Merge branch 'pdimov-double-negation-fix' into 'master' fb9cf97db3 pdimov-double-negation-fix ab2577846f Merge branch 'use-g++-with-cuda12.5' into 'master' 01f206bd63 fix typo c06cd19e72 correct type dc0be2feda Merge branch 'increase-coverage-static_array_allocator' into 'master' b8c22e2af7 Update .gitlab-ci.yml 73ef30e78d Merge branch 'add-125-cuda-ci' into 'master' e5f6608341 Update .gitlab-ci.yml b6eab10dca Merge branch 'master' into 'increase-coverage-static_array_allocator' b5c407714a make host calling host device a warning 38f9619100 Merge branch 'meshgrid-ranges' into 'master' 562d8f051e workaround for cuda 11.8 b2713cee3a Merge branch 'master' into 'add-125-cuda-ci' d7a9467b1f Merge branch 'thrust-iwyu' into 'master' eaeab51cea detect ranges_repeat f20b90ac9a implement meshgrid using ranges 2971b14fac add apt-get update f75d7dc808 Merge branch 'fix-ranges-copy_n' into 'master' 4bb1f157d0 comment range test bad219e25d format test 681c3f9792 remove thrust dep 161cb6d9e2 ranges compat 0de338346b Update README.md f1ff1bca4b grid mesh 3e06306285 Merge branch 'fix-windows-init-list' into 'master' c048cc6776 fix type for windows acfc147f36 fixes for cuda 12.5 0c653d8639 add 12.5 cuda ci 4820aa1352 test string for cuda 6fa8c2f755 restore comment 7ab27de4ec Merge branch 'increase-coverage-static_array_allocator' into 'master' 76b164f930 fix string 6ecc3f3abe Merge branch 'remove-header-stack-allocator' into 'master' c7004c9b18 
add cuda 12.0.1 CI 5a9a99f438 comment inl include d8d8e2e8b3 increase coverage cf7039d6d6 fix headers fb14874a1e rm unused variables d1575ebf36 add common cmds cff764c58a Merge branch 'remove-header-stack-allocator' into 'master' d2669aa36c cleanup headers in detail 758fb3ee6f improve includes a311bd6019 remove some headers d4f2015a28 Merge branch 'iwyu-fixes' into 'master' 05987405cf use deprecated only for clang 7887fba5a6 more rules 74b61df58f diag deprecated for BMA concepts 82843d99a7 more iwyu rules 79f794eb07 Merge branch 'macos-bt-fix' into 'master' b7a0c2e4ad Merge branch 'iwyu-del-headers' into 'master' 5e96262699 remove rangev3 adaptor 748606c8bb delete separate pmr headers 67e6716549 Merge branch 'master' into 'iwyu-fixes' f9e86a0d50 Merge branch 'use-bookworm-image' into 'master' e6e73c8b66 workaround for MSVC 294845fe36 Merge branch 'iwyu-fixes' into 'master' c130b4f078 add iwyu ad14e1f499 Update .gitlab-ci.yml 5b49540006 update pre-push 206af93c2a workarounds for boost test in macos 0e80dead17 fixes for iwyu 59a1d8f813 Update .gitlab-ci.yml 34d20f324b Update .gitlab-ci.yml d8971d7028 Merge branch 'do-not-include-global-path' into 'master' f36ddc8111 remove redundant test 87c3058ecd add multi dep ec3cf80f32 avoid global path 2928ea3128 Merge branch 'add-ci-nvhpc-24-5' into 'master' 73bd2e7f14 Merge branch 'even-better-constructor' into 'master' c51e833d32 Update README.md 7321bcb897 test nvhpc 24.5 and cuda 12.4 9991dd2e8e remove duplicated constructor 82add032ab Merge branch 'circle-finish' into 'master' f9692d6d6e circle g++-12 f4dfbb3ee7 circle-203 c87c6e0daa 201 ef30f72f45 circle ci 3fee2ae754 remove all circle ifdefs 120fd4503b Merge branch 'fix-constructor-for-circle' into 'master' 9ba688f050 Merge pull request QMCPACK#46 from correaa/windows-boost-serialization d71189c0c7 Merge branch 'add-arm-ci' into 'master' ca6a853505 add timer and serialization to windows github ci 9a8aa2b935 add blas, use system boost for arm af3628268d remove multilib 
02397723ae arm64v8 00ff7e6232 add arm e6a31ae751 better constructor 1133cc1e26 circle 203 63514569d7 get newer cmake for ubuntu 20.04 44d0dd9b7e Update .gitlab-ci.yml 7508b30a71 Update .gitlab-ci.yml aed80c5fdf Update .gitlab-ci.yml cab326e511 use circle 203 ce79176224 add circle 203 174269a035 Merge pull request QMCPACK#45 from correaa/correaa-patch-2 6195e55038 Update cmake.yml 8bf6a42fc1 Update cmake.yml a49a9f2e1a Merge pull request QMCPACK#44 from correaa/vcpkg-test-only 023ca2fb72 Update cmake.yml git-subtree-dir: external_codes/boost_multi/multi git-subtree-split: c9bf1402b66026eff7fb1e278bbf8e5b598672b6
Reported by: ye-luo
Hi all,
Mixed precision has been added to QMCPACK.
Add -D QMC_MIXED_PRECISION=1 to activate it.
Single precision is set as the base precision. Double precision is set as the full precision.
Single precision is used almost everywhere, including particle/lattice coordinates, distance tables, wave functions (SPO, determinants, Jastrows), and Hamiltonians.
To retain accuracy, many reductions (estimators for energy components, gradient/Laplacian of the WF), Coulomb/pseudopotential initialization, and random-walk trajectory updates are done in DP.
Uniform and Gaussian RNGs are always in DP. This is not necessarily needed for accuracy, but it is useful for checking MP against DP.
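To illustrate why the reductions are kept in DP, here is a minimal sketch that averages single-precision samples with either a single-precision or a double-precision accumulator. The aliases `BaseReal` and `FullReal` and both function names are hypothetical and chosen for this example only; they are not taken from the QMCPACK sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical aliases for illustration; not QMCPACK's actual type names.
using BaseReal = float;   // "base" precision: used for most per-particle data
using FullReal = double;  // "full" precision: used for reductions

// Average the samples with a base-precision accumulator.
// Rounding errors build up as the running sum grows.
BaseReal mean_in_base(const std::vector<BaseReal>& samples) {
  assert(!samples.empty());
  BaseReal acc = 0.0f;
  for (BaseReal s : samples) acc += s;
  return acc / static_cast<BaseReal>(samples.size());
}

// Average the same base-precision samples with a full-precision accumulator.
// The inputs stay in single precision; only the reduction is promoted.
FullReal mean_in_full(const std::vector<BaseReal>& samples) {
  assert(!samples.empty());
  FullReal acc = 0.0;
  for (BaseReal s : samples) acc += static_cast<FullReal>(s);
  return acc / static_cast<FullReal>(samples.size());
}
```

For a long run of identical samples such as one million copies of `0.1f`, the single-precision average drifts visibly away from 0.1, while the double-precision average stays close to it. That is the kind of error the DP reductions are meant to avoid.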
A recompute step is introduced that rebuilds the inverse matrix for the determinants from scratch, with the inversion done in DP.
1
By default, the recompute happens at the end of every block, as in the GPU code.
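The recompute described above can be sketched as follows. This is only an illustration under stated assumptions, not the QMCPACK implementation: `recompute_due` and `invert_via_double` are hypothetical names, the matrix is a plain row-major `std::vector<float>`, and the rule that a non-positive interval disables the recompute is an assumption made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical schedule: true when the inverse should be rebuilt at the end
// of block `block_index` (0-based). An interval of 1 means every block.
bool recompute_due(int block_index, int blocks_between_recompute) {
  if (blocks_between_recompute <= 0) return false;  // assumption: disables recompute
  return (block_index + 1) % blocks_between_recompute == 0;
}

// Invert a row-major n x n single-precision matrix from scratch.
// The Gauss-Jordan elimination (with partial pivoting) runs in double
// precision; only the final result is rounded back to single precision.
std::vector<float> invert_via_double(const std::vector<float>& a, std::size_t n) {
  assert(a.size() == n * n);
  std::vector<double> m(a.begin(), a.end());  // promote the input to double
  std::vector<double> inv(n * n, 0.0);
  for (std::size_t i = 0; i < n; ++i) inv[i * n + i] = 1.0;  // start from identity

  for (std::size_t col = 0; col < n; ++col) {
    // Choose the row with the largest entry in this column as the pivot.
    std::size_t piv = col;
    for (std::size_t r = col + 1; r < n; ++r)
      if (std::fabs(m[r * n + col]) > std::fabs(m[piv * n + col])) piv = r;
    assert(m[piv * n + col] != 0.0);  // the matrix must not be singular
    if (piv != col) {
      for (std::size_t k = 0; k < n; ++k) {
        std::swap(m[piv * n + k], m[col * n + k]);
        std::swap(inv[piv * n + k], inv[col * n + k]);
      }
    }
    // Scale the pivot row so the pivot becomes 1.
    const double d = m[col * n + col];
    for (std::size_t k = 0; k < n; ++k) {
      m[col * n + k] /= d;
      inv[col * n + k] /= d;
    }
    // Eliminate this column from every other row.
    for (std::size_t r = 0; r < n; ++r) {
      if (r == col) continue;
      const double f = m[r * n + col];
      for (std::size_t k = 0; k < n; ++k) {
        m[r * n + k] -= f * m[col * n + k];
        inv[r * n + k] -= f * inv[col * n + k];
      }
    }
  }
  return std::vector<float>(inv.begin(), inv.end());  // demote the result
}
```

In a driver loop one would call `recompute_due(block, interval)` at the end of each block and, when it returns true, replace the incrementally updated single-precision inverse with the output of `invert_via_double`. Rebuilding from the original matrix discards whatever rounding error the single-precision updates have accumulated since the last recompute.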
The short tests in the test suite all pass.
The mixed precision code has been tested mainly on solids, with both real and complex builds, in VMC, VMC+drift, and DMC runs.
The tests use an SD+J1+J2 wavefunction.
Certain parts of WF optimization need DP; this is not fixed yet.
DP is a fully DP calculation, SP is the DP code with SP splines, and MP is a mostly SP calculation.
VMC runs:
                LocalEnergy                 Variance                  ratio
no drift
DP              -953.463912 +/- 0.000598    19.293480 +/- 0.013491    0.0202
SP              -953.465124 +/- 0.000534    19.286142 +/- 0.005167    0.0202
MP-nocompute    -953.463879 +/- 0.000757    19.287460 +/- 0.006794    0.0202
MP              -953.464368 +/- 0.000717    19.296810 +/- 0.009718    0.0202
with drift
DP              -953.464105 +/- 0.000571    19.285185 +/- 0.010345    0.0202
SP              -953.464476 +/- 0.000586    19.288203 +/- 0.007079    0.0202
MP-nocompute    -953.464267 +/- 0.000510    19.293128 +/- 0.011464    0.0202
MP              -953.464501 +/- 0.000635    19.291002 +/- 0.006508    0.0202
DMC runs:
tw_id       energy       error      tw_x    tw_y    tw_z    kpoint_id   weight
tw0         -955.0420    0.0028     -0.25    0.25    0.25   3           0.5000000
tw1         -955.0408    0.0021     -0.25   -0.25    0.25   2           1.0000000
tw2         -955.0375    0.0025     -0.25   -0.25   -0.25   1           0.5000000
all_tw      -955.04027   0.00141
12/ncell    -79.58669    0.00012

tw_id       energy       error      tw_x    tw_y    tw_z    kpoint_id   weight
tw0         -955.0374    0.0027     -0.25    0.25    0.25   3           0.5000000
tw1         -955.0423    0.0026     -0.25   -0.25    0.25   2           1.0000000
tw2         -955.0344    0.0021     -0.25   -0.25   -0.25   1           0.5000000
all_tw      -955.03910   0.00156
12/ncell    -79.58659    0.00013
The DMC results are consistent to within 0.1 mHa per formula unit.
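One way to quantify this comparison is to express the difference between two energies in units of their combined error bar. The helper below is a hypothetical sketch, not part of QMCPACK, and it assumes the two runs are statistically independent with roughly Gaussian errors.

```cpp
#include <cassert>
#include <cmath>

// Difference between two independent estimates, in units of the combined
// standard error sqrt(err1^2 + err2^2). Values well below about 2 indicate
// that the two results agree within statistical noise.
double sigma_distance(double e1, double err1, double e2, double err2) {
  assert(err1 > 0.0 || err2 > 0.0);  // at least one error bar must be nonzero
  return std::fabs(e1 - e2) / std::sqrt(err1 * err1 + err2 * err2);
}
```

Applied to the two 12/ncell values above, `sigma_distance(-79.58669, 0.00012, -79.58659, 0.00013)` gives a difference of 0.00010 Ha against a combined error of about 0.00018 Ha, that is, roughly 0.6 combined standard errors.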
                     LocalEnergy                  Variance                   ratio
cpu-MP-recompute4    -6513.765656 +/- 0.005351    171.717760 +/- 1.120446    0.0264
cpu-MP               -6513.767989 +/- 0.006393    170.280852 +/- 0.323286    0.0261
cpu-SP               -6513.758155 +/- 0.007937    170.783247 +/- 0.306188    0.0262
cpu-DP               -6513.767100 +/- 0.007118    170.458162 +/- 0.217447    0.0262
gpu                  -6513.756115 +/- 0.009468    170.240843 +/- 0.190607    0.0261
In brief, accuracy is not compromised by the mixed precision code.
In fact, reducing the recompute frequency doesn't seem to hurt the accuracy at all.
By default, mixed precision is switched off and the code should behave like the trunk.
Please provide feedback under this post.
Ye