prolong2mortars! with P4estMesh allocates sometimes #628
Comments
I have observed similar things. Introducing our own |
Is there any workaround for now? |
Not really... |
Request my review on your related PR when you feel it is ready, and I can take a look at speeding up the critical parts. |
Some analysis using Julia v1.6.2:
julia> using ProfileView, Trixi
julia> trixi_include(joinpath(examples_dir(), "p4est_2d_dgsem", "elixir_advection_nonconforming_unstructured.jl"),
           save_solution=TrivialCallback(), save_restart=TrivialCallback()) # second run, after compilation
[...]
 ───────────────────────────────────────────────────────────────────────────────
            Trixi.jl                    Time                   Allocations
                                ──────────────────────   ───────────────────────
        Tot / % measured:            26.5ms / 96.9%           14.7MiB / 100%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ───────────────────────────────────────────────────────────────────────────────
 rhs!                     126   23.6ms  92.1%   188μs   14.3MiB  97.5%   116KiB
   prolong2mortars        126   14.1ms  54.8%   112μs   14.2MiB  97.1%   116KiB
   volume integral        126   2.67ms  10.4%  21.2μs     0.00B  0.00%    0.00B
   ~rhs!~                 126   2.45ms  9.55%  19.4μs   56.1KiB  0.37%     456B
   interface flux         126   1.29ms  5.02%  10.2μs     0.00B  0.00%    0.00B
   mortar flux            126   1.21ms  4.74%  9.64μs     0.00B  0.00%    0.00B
   boundary flux          126    553μs  2.16%  4.39μs     0.00B  0.00%    0.00B
   prolong2interfaces     126    528μs  2.06%  4.19μs     0.00B  0.00%    0.00B
   surface integral       126    418μs  1.63%  3.32μs     0.00B  0.00%    0.00B
   Jacobian               126    263μs  1.03%  2.09μs     0.00B  0.00%    0.00B
   reset ∂u/∂t            126    101μs  0.40%   804ns     0.00B  0.00%    0.00B
   prolong2boundaries     126   80.2μs  0.31%   636ns     0.00B  0.00%    0.00B
   source terms           126   2.22μs  0.01%  17.7ns     0.00B  0.00%    0.00B
 analyze solution           2   1.65ms  6.44%   826μs    382KiB  2.55%   191KiB
 calculate dt              26    378μs  1.47%  14.5μs     0.00B  0.00%    0.00B
 ───────────────────────────────────────────────────────────────────────────────
Thus, we see allocations in prolong2mortars (14.2MiB, 97.1% of the total), which usually indicate a type instability.
However, the output of @code_warntype looks fine. |
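For readers who want to reproduce this kind of check outside of Trixi, here is a minimal sketch of how a type instability shows up as allocations. The kernels `sum_stable` and `sum_unstable` and the helper `measure_allocations` are hypothetical stand-ins invented for this illustration; they are not Trixi functions and say nothing about what actually happens inside prolong2mortars!.

```julia
# Hypothetical stand-in kernels (NOT Trixi code).

# Type-stable: the accumulator is always a Float64.
function sum_stable(src::Vector{Float64})
    acc = 0.0
    for x in src
        acc += x
    end
    return acc
end

# Type-unstable: elements are `Any`, so the type of `acc` cannot be
# inferred, every `+` is dispatched at run time, and results are boxed.
function sum_unstable(src::Vector{Any})
    acc = 0.0
    for x in src
        acc += x
    end
    return acc
end

# Measure the bytes allocated by one call, after a warm-up call so that
# compilation is not included in the measurement.
function measure_allocations(kernel, data)
    kernel(data)
    return @allocated kernel(data)
end

stable_data = rand(1000)
unstable_data = Any[rand() for _ in 1:1000]

println("stable:   ", measure_allocations(sum_stable, stable_data), " bytes")
println("unstable: ", measure_allocations(sum_unstable, unstable_data), " bytes")
```

In a case like this, `@code_warntype sum_unstable(unstable_data)` flags `acc` as `Any`, so both tools agree. The situation reported here is harder, since allocations show up while @code_warntype looks clean.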
Very impressive analysis, thanks for sharing your step-by-step process in such detail! What do you conclude we can do about it? |
The following has been done on the efaulhaber:p4est-non-conforming branch, which will be merged in #618.
When I start the REPL and execute
[...]
the timers report:
[...]
The benchmark code
[...]
reports:
[...]
Now, I go to solvers/dg_p4est/dg_2d.jl and remove @threaded from [...] and execute the example again.
Then, I add @threaded again, so the code is identical to the one I benchmarked before.
Running the exact same benchmarks with the exact same code again now yields:
[...]
and
[...]
What is going on here?
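One way to narrow this down might be to compare a serial and a threaded loop in isolation. The sketch below uses `Threads.@threads` from Julia's standard library rather than Trixi's @threaded macro, and the functions `scale_serial!`, `scale_threaded!`, and `allocations_of` are hypothetical names made up for this example. It only shows how to separate the allocations coming from the threading machinery from those of the loop body; it is not meant as an explanation of the behavior described above.

```julia
# Hypothetical stand-in loops (NOT Trixi code).

# Plain serial loop over preallocated arrays.
function scale_serial!(dest, src)
    for i in eachindex(dest, src)
        dest[i] = 2 * src[i]
    end
    return nothing
end

# Same loop body, but distributed over threads. `Threads.@threads`
# creates tasks on every call, which itself allocates some memory.
function scale_threaded!(dest, src)
    Threads.@threads for i in eachindex(dest, src)
        dest[i] = 2 * src[i]
    end
    return nothing
end

# Bytes allocated by one call, measured after a warm-up call.
function allocations_of(kernel!, dest, src)
    kernel!(dest, src)
    return @allocated kernel!(dest, src)
end

src = rand(10_000)
dest = similar(src)

println("threads:  ", Threads.nthreads())
println("serial:   ", allocations_of(scale_serial!, dest, src), " bytes")
println("threaded: ", allocations_of(scale_threaded!, dest, src), " bytes")
```

If the threaded variant reports only a small, roughly constant number of bytes per call, the threading overhead alone would not account for allocations on the order of 116KiB per call as in the timer output above.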