Hoist eltype calls in matrix multiplication #1880

Merged (1 commit) on Jul 10, 2024

@charleskawczynski charleskawczynski commented Jul 10, 2024

This PR hoists eltype calls in the matrix multiplication.
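As a rough illustration of what "hoisting" means here, the sketch below computes an element type once before a loop instead of re-deriving it on every iteration. The function names `dot_unhoisted` and `dot_hoisted` are hypothetical and are not taken from ClimaCore; the kernel is a plain row-times-column product, not the actual MatrixFields code. For plain arrays the compiler will typically fold the repeated `eltype` calls away, so any benefit is expected only when the element type is expensive to infer, which is an assumption about the nested field types involved in this PR.

```julia
# Hypothetical sketch, not the ClimaCore implementation.

# Before: the promoted element type is recomputed inside the loop body.
function dot_unhoisted(a::AbstractVector, b::AbstractVector)
    acc = zero(promote_type(eltype(a), eltype(b)))
    for i in eachindex(a, b)
        acc += convert(promote_type(eltype(a), eltype(b)), a[i] * b[i])
    end
    return acc
end

# After: the promoted element type is computed once, outside the loop.
function dot_hoisted(a::AbstractVector, b::AbstractVector)
    T = promote_type(eltype(a), eltype(b))  # hoisted eltype calls
    acc = zero(T)
    for i in eachindex(a, b)
        acc += convert(T, a[i] * b[i])
    end
    return acc
end
```

Both versions return the same value; only the placement of the type computation differs.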

main

julia> using Revise; include(joinpath("test", "MatrixFields", "matrix_field_broadcasting.jl"))
[ Info: getidx times max(interior,left,right) = (3 nanoseconds,3 nanoseconds,3 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 nanoseconds,4 nanoseconds,4 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 nanoseconds,4 nanoseconds,5 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 nanoseconds,4 nanoseconds,4 nanoseconds)
[ Info: getidx times max(interior,left,right) = (7 nanoseconds,7 nanoseconds,7 nanoseconds)
[ Info: getidx times max(interior,left,right) = (5 nanoseconds,5 nanoseconds,5 nanoseconds)
[ Info: getidx times max(interior,left,right) = (14 nanoseconds,14 nanoseconds,14 nanoseconds)
[ Info: getidx times max(interior,left,right) = (22 nanoseconds,23 nanoseconds,23 nanoseconds)
[ Info: getidx times max(interior,left,right) = (19 nanoseconds,19 nanoseconds,19 nanoseconds)
[ Info: getidx times max(interior,left,right) = (35 nanoseconds,36 nanoseconds,36 nanoseconds)
[ Info: getidx times max(interior,left,right) = (51 nanoseconds,43 nanoseconds,46 nanoseconds)
[ Info: getidx times max(interior,left,right) = (16 nanoseconds,17 nanoseconds,17 nanoseconds)
[ Info: getidx times max(interior,left,right) = (232 nanoseconds,297 nanoseconds,298 nanoseconds)
[ Info: getidx times max(interior,left,right) = (260 nanoseconds,334 nanoseconds,376 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 microseconds, 197 nanoseconds,5 microseconds, 570 nanoseconds,5 microseconds, 584 nanoseconds)
[ Info: getidx times max(interior,left,right) = (107 nanoseconds,156 nanoseconds,152 nanoseconds)
Test Summary:                    | Pass  Total     Time
Scalar Matrix Field Broadcasting |  112    112  5m23.0s
[ Info: getidx times max(interior,left,right) = (11 nanoseconds,11 nanoseconds,11 nanoseconds)
[ Info: getidx times max(interior,left,right) = (93 nanoseconds,96 nanoseconds,120 nanoseconds)
[ Info: getidx times max(interior,left,right) = (48 nanoseconds,48 nanoseconds,48 nanoseconds)
[ Info: getidx times max(interior,left,right) = (70 nanoseconds,77 nanoseconds,77 nanoseconds)
Test Summary:                        | Pass  Total     Time
Non-scalar Matrix Field Broadcasting |   28     28  1m07.7s

this PR

julia> using Revise; include(joinpath("test", "MatrixFields", "matrix_field_broadcasting.jl"))
[ Info: getidx times max(interior,left,right) = (3 nanoseconds,3 nanoseconds,3 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 nanoseconds,4 nanoseconds,4 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 nanoseconds,4 nanoseconds,4 nanoseconds)
[ Info: getidx times max(interior,left,right) = (4 nanoseconds,4 nanoseconds,4 nanoseconds)
[ Info: getidx times max(interior,left,right) = (7 nanoseconds,7 nanoseconds,7 nanoseconds)
[ Info: getidx times max(interior,left,right) = (5 nanoseconds,5 nanoseconds,5 nanoseconds)
[ Info: getidx times max(interior,left,right) = (14 nanoseconds,15 nanoseconds,14 nanoseconds)
[ Info: getidx times max(interior,left,right) = (22 nanoseconds,23 nanoseconds,23 nanoseconds)
[ Info: getidx times max(interior,left,right) = (19 nanoseconds,19 nanoseconds,19 nanoseconds)
[ Info: getidx times max(interior,left,right) = (35 nanoseconds,36 nanoseconds,36 nanoseconds)
[ Info: getidx times max(interior,left,right) = (15 nanoseconds,16 nanoseconds,16 nanoseconds)
[ Info: getidx times max(interior,left,right) = (15 nanoseconds,16 nanoseconds,16 nanoseconds)
[ Info: getidx times max(interior,left,right) = (57 nanoseconds,82 nanoseconds,93 nanoseconds)
[ Info: getidx times max(interior,left,right) = (134 nanoseconds,163 nanoseconds,182 nanoseconds)
[ Info: getidx times max(interior,left,right) = (723 nanoseconds,784 nanoseconds,829 nanoseconds)
[ Info: getidx times max(interior,left,right) = (97 nanoseconds,129 nanoseconds,130 nanoseconds)
Test Summary:                    | Pass  Total     Time
Scalar Matrix Field Broadcasting |  112    112  5m36.8s
[ Info: getidx times max(interior,left,right) = (11 nanoseconds,11 nanoseconds,11 nanoseconds)
[ Info: getidx times max(interior,left,right) = (95 nanoseconds,110 nanoseconds,99 nanoseconds)
[ Info: getidx times max(interior,left,right) = (48 nanoseconds,49 nanoseconds,49 nanoseconds)
[ Info: getidx times max(interior,left,right) = (70 nanoseconds,78 nanoseconds,78 nanoseconds)
Test Summary:                        | Pass  Total     Time
Non-scalar Matrix Field Broadcasting |   28     28  1m12.2s

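So, scalar tests 11, 13, 14, and 15 are significantly faster with this PR. For example, the interior getidx time for test 15 drops from about 4.2 microseconds to 723 nanoseconds.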
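To put numbers on the comparison, the snippet below computes the speedup for each of those four tests. It uses the interior getidx times copied from the two logs above and assumes the Info lines appear in test order; the names `main_ns`, `pr_ns`, and `speedup` are illustrative only.

```julia
# Interior getidx times in nanoseconds, copied from the logs above.
# Test 15 on main is 4 microseconds, 197 nanoseconds = 4197 ns.
main_ns = Dict(11 => 51, 13 => 232, 14 => 260, 15 => 4197)
pr_ns = Dict(11 => 15, 13 => 57, 14 => 134, 15 => 723)

# Ratio of main time to PR time for a given scalar test number.
speedup(test) = main_ns[test] / pr_ns[test]

for test in sort(collect(keys(main_ns)))
    println("scalar test ", test, ": ", round(speedup(test); digits = 1), "x faster")
end
```

This gives speedups of roughly 3.4x, 4.1x, 1.9x, and 5.8x for tests 11, 13, 14, and 15 respectively.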

The flame graph for test 15 shows a step towards #1871, but some time is still spent in eltype, which seems odd.

[Screenshot, 2024-07-10 11:02:08 AM; image not reproduced]

@charleskawczynski charleskawczynski merged commit 424397f into main Jul 10, 2024
16 of 19 checks passed
@charleskawczynski charleskawczynski deleted the ck/matfield_getidx_bm_15 branch July 10, 2024 16:13