exa #25
Conversation
@jbcaillau Note that ExaModels works with any GPU backend (NVIDIA CUDA, Intel, Apple, ...).
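For context, a minimal sketch of building an ExaModels instance on a GPU backend. The backend object comes from KernelAbstractions.jl (here the NVIDIA one; other vendors have analogous backends); the toy variables/constraints below are illustrative, not the Goddard model, and the exact `ExaCore` signature may differ across ExaModels versions.

```julia
using ExaModels, CUDA

backend = CUDABackend()            # NVIDIA; swap in the Intel/Apple equivalent as needed
core = ExaCore(Float64; backend)   # model data is then stored on the device

x = variable(core, 10; start = fill(0.5, 10))
objective(core, (x[i] - 2)^2 for i in 1:10)
constraint(core, x[i] + x[i+1] for i in 1:9; lcon = 0.0, ucon = 1.0)

model = ExaModel(core)             # ready for a GPU-capable solver such as MadNLP
```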
@amontoison @frapac @0Yassine0 run of goddard-exa2.jl on a cluster at Inria:

```
N = 100
exa0: 311.847 ms (25632 allocations: 240.12 MiB)   # 90 iterations
exa1: 343.495 ms (50469 allocations: 242.28 MiB)   # 90 iterations
exa2: 129.811 ms (313799 allocations: 8.74 MiB)    # 20 iterations

N = 500
exa0: 734.530 ms (10977 allocations: 1.92 GiB)     # 38 iterations
exa1: 733.023 ms (21471 allocations: 1.92 GiB)     # 38 iterations
exa2: 628.971 ms (1183995 allocations: 32.42 MiB)  # 67 iterations

N = 1000
exa0: 1.748 s (14564 allocations: 9.66 GiB)        # 50 iterations
exa1: 1.602 s (28779 allocations: 9.66 GiB)        # 50 iterations
exa2: 648.181 ms (1238880 allocations: 34.74 MiB)  # 73 iterations

N = 5000
exa0: 6.547 s (9913 allocations: 64.37 GiB)        # 31 iterations
exa1: 6.637 s (41064 allocations: 64.37 GiB)       # 31 iterations
exa2: 686.667 ms (658404 allocations: 22.20 MiB)   # 34 iterations
```
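Timings in the "xxx ms (n allocations: m MiB)" format above are typically produced with BenchmarkTools.jl. A hedged sketch of such a benchmark loop, where `solve_goddard` is a hypothetical wrapper around the goddard-exa2.jl solve call:

```julia
using BenchmarkTools

for N in (100, 500, 1000, 5000)
    println("N = ", N)
    # @btime reports the minimum time plus allocation count and volume,
    # matching the lines quoted in this thread
    @btime solve_goddard($N)
end
```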
@amontoison @frapac @0Yassine0 run of goddard-exa2.jl using a single V100 GPU (🙏🏽 JM Lacroix from LJAD); number of iterations still to be checked:

```
N = 100
exa0: 122.578 ms (22124 allocations: 209.36 MiB)
exa1: 126.011 ms (43646 allocations: 211.22 MiB)
exa2: 86.335 ms (259116 allocations: 7.47 MiB)

N = 500
exa0: 241.734 ms (9332 allocations: 1.62 GiB)
exa1: 242.094 ms (18284 allocations: 1.62 GiB)
exa2: 137.797 ms (384012 allocations: 11.25 MiB)

N = 1000
exa0: 677.497 ms (13432 allocations: 8.91 GiB)
exa1: 678.521 ms (26570 allocations: 8.91 GiB)
exa2: 1.003 s (2398734 allocations: 68.44 MiB)

N = 2000
exa0: 1.829 s (15500 allocations: 42.40 GiB)
exa1: 1.833 s (30228 allocations: 42.40 GiB)
exa2: 505.739 ms (779640 allocations: 23.40 MiB)

N = 5000
exa0: 2.309 s (7886 allocations: 50.31 GiB)
exa1: 2.319 s (16848 allocations: 50.31 GiB)
exa2: 502.286 ms (469136 allocations: 17.29 MiB)
```
@jbcaillau Can you print the number of iterations? It's quite good; it seems you always get a speed-up on GPU.
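A hedged sketch of how the iteration count can be retrieved after a MadNLP solve: `madnlp` returns execution statistics whose `iter` field holds the number of interior-point iterations (field name per `MadNLPExecutionStats`; check against your MadNLP version). `model` stands for an ExaModel built as in goddard-exa2.jl.

```julia
using MadNLP

# quiet solve, then report iterations alongside the timing
stats = madnlp(model; print_level = MadNLP.ERROR)
println("iterations: ", stats.iter)
```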
Test in single precision on GPU (Inria nef cluster):

```
N = 100
exa0: 72.682 ms (3175 allocations: 80.58 MiB)
exa1: 75.944 ms (12425 allocations: 81.17 MiB)
exa2: 146.233 ms (294589 allocations: 7.29 MiB)

N = 500
exa0: 405.774 ms (3603 allocations: 1.67 GiB)
exa1: 411.145 ms (13185 allocations: 1.67 GiB)
exa2: 227.096 ms (312666 allocations: 8.08 MiB)

N = 1000
exa0: 1.522 s (5222 allocations: 10.00 GiB)
exa1: 1.485 s (20226 allocations: 10.00 GiB)
exa2: 310.898 ms (343664 allocations: 9.14 MiB)

N = 2000
exa0: 1.750 s (3524 allocations: 23.32 GiB)
exa1: 1.760 s (13628 allocations: 23.32 GiB)
exa2: 385.701 ms (287417 allocations: 8.55 MiB)

N = 5000
exa0: 3.579 s (2746 allocations: 40.32 GiB)
exa1: 3.578 s (10400 allocations: 40.32 GiB)
exa2: 785.430 ms (359051 allocations: 12.60 MiB)

N = 8000
exa0: 8.069 s (3166 allocations: 48.59 GiB)
exa1: 8.031 s (12220 allocations: 48.59 GiB)
exa2: 1.366 s (423244 allocations: 16.09 MiB)

N = 10000
exa0: 9.788 s (2956 allocations: 44.69 GiB)
exa1: 9.880 s (11310 allocations: 44.69 GiB)
exa2: 2.044 s (499806 allocations: 19.33 MiB)
```
@jbcaillau For your information, we can now also solve ExaModels instances in quadruple precision (using Quadmath.jl).
Hi @frapac, thanks for the information. The tests above are in single precision only (Inria cluster); we still need to re-run them in double precision on GPUs (LJAD cluster) 🤞🏾
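A hedged sketch of how the precision is selected: the arithmetic type is an argument of `ExaCore`, so the same model can be built in single, double, or (via Quadmath.jl, as @frapac notes) quadruple precision. Solver and GPU-backend support must be checked separately for each type (Float32 on GPU; Float128 typically CPU-only).

```julia
using ExaModels, Quadmath

core32  = ExaCore(Float32)    # single precision (the Inria nef runs above)
core64  = ExaCore(Float64)    # double precision (to re-run on the LJAD cluster)
core128 = ExaCore(Float128)   # quadruple precision via Quadmath.jl
```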
Follow-up here: #26
@0Yassine0 test on the Goddard case
@frapac the point with ExaModels is to use (i) multi-threaded CPUs and/or (ii) GPUs, as documented here. Do you have a configuration available to properly test the code with respect to (i) and (ii)?
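A hedged sketch of the two configurations in question, both selected through the KernelAbstractions backend passed to `ExaCore`: start Julia with `julia -t auto` for (i). The backend-constructor names are assumptions to verify against the ExaModels documentation.

```julia
using ExaModels
using KernelAbstractions: CPU
using CUDA: CUDABackend

core_cpu = ExaCore(Float64; backend = CPU())           # (i) multi-threaded CPU
core_gpu = ExaCore(Float64; backend = CUDABackend())   # (ii) NVIDIA GPU
```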