-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-39517: [C++] Disable parallelism for jemalloc external project #39522
GH-39517: [C++] Disable parallelism for jemalloc external project #39522
Conversation
|
@github-actions crossbow submit -g r |
Revision: 5f6051d Submitted crossbow builds: ursacomputing/crossbow @ actions-e361533807 |
The r-binary-packages, devdocs, backwards comp and other builds where previously failing due to this issue and now pass. |
set(JEMALLOC_BUILD_COMMAND ${MAKE} ${MAKE_BUILD_ARGS}) | ||
# Paralleism for Make fails with CMake > 3.28 see #39517 | ||
if(${CMAKE_GENERATOR} MATCHES "Makefiles") | ||
list(APPEND JEMALLOC_BUILD_COMMAND "-j1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to use -j1
only for jemalloc?
It seems that this problem may be happen with other software. So how about setting MAKE_BUILD_ARGS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem has not occurred with any other dependency neither on CI nor in my own testing (or after the fix). For the smaller dependencies it might not make a difference but with the number of larger dependencies we have forcing the all to use -j1 if there is no actual issue seems like it would slow build times with no gain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a strong opinion but I think that we should choose safer option (disable make
's parallel build for all dependencies) here.
Because we have a workaround (use -GNinja
) for the parallel build problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only have 2 cores available on cran (policy), no ninja (afaik) and our build times are already being criticized so....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have only 2 cores, we should not use parallel build for dependencies.
If we use parallel build, it will use 2 more cores in total and slow down total build time...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: We use make
only for jemalloc, bzip2 and UCX.
if(NOT ((_VERSION_ENTRY MATCHES "^[^ #][A-Za-z0-9-_]+_VERSION=") | ||
OR (_VERSION_ENTRY MATCHES "^[^ #][A-Za-z0-9-_]+_CHECKSUM="))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are they needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were made by the cmake linter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow. What CMake linter did you use? And what the error message from the CMake linter?
It seems that this diff changes matched pattern. (I think that it doesn't have any problem in our use case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
archery lint --fix --cmake-format
^^ but I am happy to remove the change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm weird, it doesn't produce output if I leave the --fix flag off. I will revert the change to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
if(NOT ((_VERSION_ENTRY MATCHES "^[^ #][A-Za-z0-9-_]+_VERSION=") | ||
OR (_VERSION_ENTRY MATCHES "^[^ #][A-Za-z0-9-_]+_CHECKSUM="))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow. What CMake linter did you use? And what the error message from the CMake linter?
It seems that this diff changes matched pattern. (I think that it doesn't have any problem in our use case.)
set(JEMALLOC_BUILD_COMMAND ${MAKE} ${MAKE_BUILD_ARGS}) | ||
# Paralleism for Make fails with CMake > 3.28 see #39517 | ||
if(${CMAKE_GENERATOR} MATCHES "Makefiles") | ||
list(APPEND JEMALLOC_BUILD_COMMAND "-j1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a strong opinion but I think that we should choose safer option (disable make
's parallel build for all dependencies) here.
Because we have a workaround (use -GNinja
) for the parallel build problem.
This reverts commit 5f6051d.
…9522) ### Rationale for this change With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'. ### What changes are included in this PR? For a sequential build for jemalloc by setting -j1. ### Are these changes tested? CI ### Are there any user-facing changes? No. * Closes: #39517 Authored-by: Jacob Wujciak-Jens <[email protected]> Signed-off-by: Jacob Wujciak-Jens <[email protected]>
Isn't this making builds slower? Can it be circumvented to only CMake 3.28? |
…9522) ### Rationale for this change With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'. ### What changes are included in this PR? For a sequential build for jemalloc by setting -j1. ### Are these changes tested? CI ### Are there any user-facing changes? No. * Closes: #39517 Authored-by: Jacob Wujciak-Jens <[email protected]> Signed-off-by: Jacob Wujciak-Jens <[email protected]>
Ping @raulcd @assignUser ^^ |
Tiny bit maybe but unlikely as the jemalloc build in itself is pretty quick + we only set the one build to -j1 so the cmake can still build other objects/external prjects in parallel.
That would be possible via https://cmake.org/cmake/help/latest/variable/CMAKE_VERSION.html but also it only applies when using the Make Generator and most dev/ci are likely using Ninja...
No but I have that as a todo. I haven't found a working reprex yet, I guess it needs the many parallel external projects we have... |
My experience is that the jemalloc build blocks all other steps when building Arrow. I am not sure why that is (perhaps a bug in our dependency tree?); I tried outputting a dot file of the build graph using Ninja but I failed graphing the file afterwards... |
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 5acf67c. There were 22 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them. |
I build jemalloc with our release configure command and
So it seems that make defaults to a mostly single core build anyway, this just removes issues we had. Explicitly passing -j16 to the jemalloc build interestingly also fixes the issue and does not show the warning mentioned in #2779. It might be enough to remove that change/conditional. I have tested this on ubuntu 20.04 with cmake 3.16 and it also works there. (I think cmake decoupled external projects from the main make by adding a cmake layer inbetween at some point prior to 3.16 thus getting rid of the sub-make issues). |
It seems that we solved this by GH-5281. (But only with Ninja.) |
Well, I use Ninja. |
…ct (apache#39522) ### Rationale for this change With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'. ### What changes are included in this PR? For a sequential build for jemalloc by setting -j1. ### Are these changes tested? CI ### Are there any user-facing changes? No. * Closes: apache#39517 Authored-by: Jacob Wujciak-Jens <[email protected]> Signed-off-by: Jacob Wujciak-Jens <[email protected]>
…ct (apache#39522) ### Rationale for this change With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'. ### What changes are included in this PR? For a sequential build for jemalloc by setting -j1. ### Are these changes tested? CI ### Are there any user-facing changes? No. * Closes: apache#39517 Authored-by: Jacob Wujciak-Jens <[email protected]> Signed-off-by: Jacob Wujciak-Jens <[email protected]>
…ct (apache#39522) ### Rationale for this change With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'. ### What changes are included in this PR? For a sequential build for jemalloc by setting -j1. ### Are these changes tested? CI ### Are there any user-facing changes? No. * Closes: apache#39517 Authored-by: Jacob Wujciak-Jens <[email protected]> Signed-off-by: Jacob Wujciak-Jens <[email protected]>
Rationale for this change
With CMake > 3.28 the generated Makefile fails on the jemalloc_ep due to 'bad file descriptor'.
What changes are included in this PR?
For a sequential build for jemalloc by setting -j1.
Are these changes tested?
CI
Are there any user-facing changes?
No.