-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
slow builds with Coq 8.19 and/or dune 3.14.0 #1866
Comments
Cost of sort polymorphism. |
@SkySkimmer Thanks for explaining. Is that expected to be in the 5-8% range? Are there plans to improve this? |
Regarding the speed of Dune, I've definitely noticed a regression in the past few versions, but I am not certain the cause of it. I haven't been involved with working on Dune as of late, so I am not certain what the problem is. Regarding how Dune chooses jobs, this is up to the scheduler of Dune and is not something that is easily tweaked externally AFAIK. There is an option to dump a flame graph of the build detailed here: https://dune.readthedocs.io/en/stable/advanced/profiling-dune.html this should allow you to quickly find bottlenecks and see where the build is spending the most time. cc @ejgallego for the performance regression in Dune. |
I also tested -j1 builds with make 4.3, and Coq 8.19 is 3.2% slower than Coq 8.18. These timings are more stable, so this is probably a more accurate measure of the difference. My -j8 times above with make show 8.19 7.7% slower, but I think one of those 8.18 runs got especially lucky. Further tests show a difference of more like 5%, still more than the -j1 difference. |
Thanks for the numbers @jdchristensen , they are very useful! I'll suggest we handle the two performance problems here as different issues, as they really are. Coq performance regression will likely be improved upstream, but indeed, there is an expected slowdown for HoTT. Here HoTT is a bit unlucky I do believe as it is more impacted than other developments. Regarding the regression in Dune, that's a bit more worrying and deserve investigation. @jdchristensen , what are the times for the "zero" build on different dune versions? [by zero build we mean doing |
@ejgallego I timed
(I added make to the list, but I know that it's not a fair comparison, as dune is doing more sophisticated checks, not just checking timestamps.) I used those |
@ejgallego Is it possible to change dune versions in my ocaml switch without having to recompile coq each time? That would make it faster for me to try 3.12 and 3.13, in case that helps. OTOH, you can probably check out the Coq-HoTT library and reproduce these numbers. It has no extra dependencies and builds with "dune build". |
@SkySkimmer Another comment is that the HoTT library doesn't use |
no. |
Yes, it is possible to use any dune regarless of what you have in your opam switch. Dune itself is self-contained, so it doesn't depend on what is on the switch, except for reading the list of installed packages. So for example you can compile dune in Important note tho: due to versioning, it is hard to use an older dune with a switch with the newer dune; while dune works, it will complain about install dune packages being newer than what it knows. So indeed, it is best to work on a switch that has the lowest dune version you plan to test. |
@jdchristensen , we have identified two performance regressions, indeed there was one from 3.11 to 3.14 and another one long time ago, when we switched dune to a single coqdep call. Zero build times for me are now under 0.2 seconds, with the following PR applied: There is another PR that fixes the regression from 3.11 to 3.14, but actually, that regression is not observable after that one is applied. |
If you are curious about testing, that can be done with: $ cd dune_git_working_dir
$ hub checkout https://github.com/ocaml/dune/pull/10116
$ make clean && make release
$ DUNE_BIN=$(pwd)/_build/install/default/bin/dune
$ cd hott_working_dir
$ $DUNE_BIN build # etc... |
@ejgallego This is wonderful! On my machine, dune now takes 0.16s for a zero build, even faster than make! The -j8 times are also much improved. I've updated the tables:
BTW, my custom build of dune reports its version as |
@jdchristensen , that's great! Sorry for that bug, these kind of bugs (missing a memoized table) are actually pretty hard to track / test for; there are not so many parts of dune that do this, as usually things are cached in files (for that Dune provides a much better hardened API) Indeed, when I decided to start working on Coq + Dune, I had made some benchs, and Dune always outperformed make, so I'm glad to see that's still the case in a bug-free setup. Again thanks for the very important work of performance testing dune and make.
I think that dune will use |
@ejgallego Yes, I can confirm that |
Thanks for confirming this, I did open the issue myself, as to provide a bit more info (this is due to a change in Dune's release process) Here it is ocaml/dune#10141 Thanks again @jdchristensen ! |
IMHO this issue can be closed, dune releases every 6 weeks, so the fix should arrive to the released package pretty soon. |
Two of the three slowdowns are fixed, so I'll close this. Thanks for your quick fixes, @ejgallego! |
Indeed the Coq slowdown won't have an easy fix, but hopefully other improvements in 8.20 will offset this bad case for HoTT. |
Timing comparisons building Coq-HoTT commit 2673d0f from Feb 19 using -j8. This builds the whole library, including Classes, contrib and test. In all cases, coq is built with
--packages="ocaml-variants.4.14.1+options,ocaml-option-flambda"
. The time shown is the average of the best two runs out of four.Observations:
Coq 8.19 is slower than Coq 8.18.
Dune 3.14.0 is slower than dune 3.11.1.
Dune is slower than make. This is not new, and is mostly due to the 1-2s that dune spends at the start hashing things. (Can this be parallelized to use multiple cores? Can some targets be started while hashing is still being done?)
Does anyone have any theories about 1 or 2?
(Related to speed, a wishlist item of mine is for dune to choose the target to process next based on the total number of dependencies a target has (recursively), which should improve parallelism.)
The text was updated successfully, but these errors were encountered: