curved ec performance #688
The curvilinear meshes are still more expensive than the Cartesian one:
julia> using BenchmarkTools, Trixi
julia> begin # RHS
           redirect_stdout(devnull) do
               trixi_include(joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_ec.jl"))
           end
           u_ode = copy(sol.u[end])
           du_ode = similar(u_ode)
           @benchmark Trixi.rhs!($du_ode, $u_ode, $semi, $0.0)
       end
BechmarkTools.Trial: 1569 samples with 1 evaluations.
Range (min … max): 2.988 ms … 5.868 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 3.081 ms ┊ GC (median): 0.00%
Time (mean ± σ): 3.182 ms ± 287.739 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▃█▅▅▅▄▄▄▂▃▁▁▁▁▁▁
████████████████████▇▇█▇▅▇▆▆▆▅▅▆▆▆▄▁▆▁▆▅▆▆▅▆▆▆▅▆▅▅▅▅▅▅▁▄▁▄▅ █
2.99 ms Histogram: log(frequency) by time 4.3 ms <
Memory estimate: 384 bytes, allocs estimate: 6.
julia> begin # RHS
           redirect_stdout(devnull) do
               trixi_include(joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_ec.jl"),
                             mesh=StructuredMesh((8, 8, 8), float.(coordinates_min), float.(coordinates_max)))
           end
           u_ode = copy(sol.u[end])
           du_ode = similar(u_ode)
           @benchmark Trixi.rhs!($du_ode, $u_ode, $semi, $0.0)
       end
BechmarkTools.Trial: 1151 samples with 1 evaluations.
Range (min … max): 3.634 ms … 5.902 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.320 ms ┊ GC (median): 0.00%
Time (mean ± σ): 4.330 ms ± 260.993 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▁▁▂▆▅▆▄█▆▄▂▅▄▃▅▇▄▆▂▂
▃▂▂▁▄▄▃▂▄▅▅▄▆▆██▇███████████████████████▇▇▆▆▇▅▄▅▃▃▃▃▄▃▃▃▁▂▃ ▅
3.63 ms Histogram: frequency by time 530 ms <
Memory estimate: 368 bytes, allocs estimate: 6.
julia> begin # RHS
           redirect_stdout(devnull) do
               trixi_include(joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_ec.jl"),
                             mesh=P4estMesh((1, 1, 1), polydeg=3, initial_refinement_level=3; coordinates_min, coordinates_max))
           end
           u_ode = copy(sol.u[end])
           du_ode = similar(u_ode)
           @benchmark Trixi.rhs!($du_ode, $u_ode, $semi, $0.0)
       end
BechmarkTools.Trial: 765 samples with 1 evaluations.
Range (min … max): 6.338 ms … 9.456 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 6.392 ms ┊ GC (median): 0.00%
Time (mean ± σ): 6.535 ms ± 332.859 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█▇▅▃▃▃▂ ▂ ▁▁▁
█████████▇████▇▆▆▇▇▄█▆▆▇▅▇▇▅█▄▆▄▄▅▄▆▄▁▁▁▄▄▄▁▇▆▄▄▆▅▁▁▁▄▁▁▄▄▄ ▇
6.34 ms Histogram: log(frequency) by time 7.92 ms <
Memory estimate: 736 bytes, allocs estimate: 11.
Note that the
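To put the three medians above side by side, here is a small sketch that computes the cost of each run relative to the Cartesian `TreeMesh` baseline. The numbers are copied from the benchmark output above; the names `median_ms` and `relative_cost` are made up for illustration and are not part of Trixi or BenchmarkTools.

```julia
# Median RHS times in ms, copied from the three benchmark runs above.
median_ms = (TreeMesh = 3.081, StructuredMesh = 4.320, P4estMesh = 6.392)

# Cost of each run relative to the Cartesian TreeMesh baseline.
# `relative_cost` is a hypothetical name used only in this sketch.
relative_cost = map(t -> t / median_ms.TreeMesh, median_ms)

for (mesh, ratio) in pairs(relative_cost)
    println(rpad(string(mesh), 16), round(ratio, digits = 2), "x")
end
```

With these medians, the `StructuredMesh` run costs about 1.4 times and the `P4estMesh` run about 2.07 times as much as the `TreeMesh` run.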
LGTM! However, I did not check whether the changes to the curved part make sense numerically. This would be better handled by @andrewwinters5000, if necessary.
Thanks for testing this out!
Thanks for testing this @ranocha! Having the volume flux able to accept the pre-averaged contravariant vectors directly makes things cleaner and actually somewhat similar to how FLUXO does it ;)
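The idea of passing a pre-averaged contravariant vector to the flux can be illustrated with a toy example. This is only a sketch under stated assumptions: `toy_flux`, the velocity `a`, and the metric values `Ja_i` and `Ja_j` are hypothetical and stand in for the real Euler fluxes and mesh metrics; the actual Trixi implementation is not reproduced here.

```julia
# Hypothetical toy problem: 2D linear advection with constant velocity `a`.
const a = (1.0, 0.5)

# Central two-point flux in a Cartesian direction (1 = x, 2 = y).
toy_flux(u_ll, u_rr, orientation::Integer) = 0.5 * (u_ll + u_rr) * a[orientation]

# Same flux, evaluated directly in an arbitrary (non-normalized) direction,
# e.g. an averaged contravariant vector on a curved mesh.
function toy_flux(u_ll, u_rr, normal_direction::NTuple{2, Float64})
    a_normal = a[1] * normal_direction[1] + a[2] * normal_direction[2]
    return 0.5 * (u_ll + u_rr) * a_normal
end

# Made-up contravariant vectors at two nodes and their average.
Ja_i = (1.2, -0.3)
Ja_j = (0.8, 0.1)
Ja_avg = 0.5 .* (Ja_i .+ Ja_j)

u_ll, u_rr = 2.0, 4.0

# Variant 1: one flux call per Cartesian direction, combined afterwards.
flux_componentwise = Ja_avg[1] * toy_flux(u_ll, u_rr, 1) +
                     Ja_avg[2] * toy_flux(u_ll, u_rr, 2)

# Variant 2: a single flux call that takes the averaged vector directly.
flux_direct = toy_flux(u_ll, u_rr, Ja_avg)
```

Both variants give the same value here because the flux is linear in the direction. For a flux with expensive shared terms (means of the left and right states, for example), the single call would evaluate those terms once instead of once per Cartesian direction.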
Codecov Report
@@ Coverage Diff @@
## main #688 +/- ##
==========================================
+ Coverage 93.61% 93.63% +0.02%
==========================================
Files 171 171
Lines 16482 16535 +53
==========================================
+ Hits 15429 15482 +53
Misses 1053 1053
Flux differencing on curved meshes sucked since the resulting EC methods were only 25% faster than Fluxo 😉 To the rescue!
This speeds up the RHS evaluation in `joinpath(examples_dir(), "structured_3d_dgsem", "elixir_euler_ec.jl")` by 1/3 for `flux_ranocha` and 1/4 for `flux_shima_etal`. Thus, our EC methods should be roughly twice as fast as Fluxo on curved meshes.

Teaser: Using `julia --check-bounds=no --threads=1`, I get [benchmark output not preserved] on `main` and [benchmark output not preserved] from this PR.
Benchmarks on Rocinante are running. I will add them later when they are finished (in a few hours?).
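
The "roughly twice as fast" estimate can be retraced with a short calculation. This sketch assumes that "speeds up by 1/3" means the RHS time drops by one third, and that "25% faster than Fluxo" means a runtime ratio of 5/4 before this PR; both readings are assumptions, and the names below are made up for illustration.

```julia
# Assumed runtime ratio (Fluxo time / Trixi time) before this PR: 25% faster.
ratio_before = 5 // 4

# If the runtime drops by `reduction`, the speed grows by 1 / (1 - reduction).
speed_gain(reduction) = inv(1 - reduction)

# Assumed runtime reductions from the PR description.
ratio_ranocha = ratio_before * speed_gain(1 // 3)     # flux_ranocha
ratio_shima_etal = ratio_before * speed_gain(1 // 4)  # flux_shima_etal

println("flux_ranocha:    ", float(ratio_ranocha))
println("flux_shima_etal: ", float(ratio_shima_etal))
```

Under these assumptions the ratios come out as 15/8 (1.875) for `flux_ranocha` and 5/3 (about 1.67) for `flux_shima_etal`, which is consistent with "roughly twice as fast".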