Bazel is slow on Docker Mac #7290
Comments
Our team has experienced the same problem, which causes CPU- or RAM-intensive tests to time out or take an excessive amount of time. One way we found to mitigate it (aside from the general Mac performance issue) is to restrict the number of parallel actions that Bazel executes. Our understanding is that by default Bazel tries to execute as many parallel actions as there are logical processors available, and tries to estimate the RAM needed per action based on a percentage (ram_utilization_factor) of the system's total memory. However, Bazel's estimates of how much memory and CPU each action requires are not perfect, which seems to cause CPU- or RAM-intensive actions to run in parallel when they shouldn't. Because they then have to share resources with other actions, performance degrades. I'd suggest reducing the number of parallel actions using --jobs (--jobs=4 gives good results in our experience) and maybe lowering the ram_utilization_factor setting (which defaults to 67).
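A minimal sketch of the mitigation described above, assuming the build is driven from a Python wrapper script. The helper name `bazel_build_command` and the target `//...` are hypothetical; `--jobs` and `--ram_utilization_factor` are the flags named in the comment, and the value 50 is an arbitrary illustration, not a recommendation.

```python
import shlex


def bazel_build_command(target, jobs=4, ram_utilization_factor=None):
    """Return the argv for a Bazel build with capped parallelism.

    Hypothetical helper: it only assembles the command line and does not run it.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    argv = ["bazel", "build", f"--jobs={jobs}"]
    if ram_utilization_factor is not None:
        # Percentage of system RAM Bazel may plan for; left at Bazel's default when None.
        argv.append(f"--ram_utilization_factor={ram_utilization_factor}")
    argv.append(target)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Bazel.
    print(shlex.join(bazel_build_command("//...", jobs=4, ram_utilization_factor=50)))
```

The same flags could equally be put in a `.bazelrc` file; the wrapper form just makes it easy to vary the values per machine.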
I think @cdlcs's observation points in the right direction, but it's not clear why that helps. Bazel already auto-detects how many resources it should use and limits itself to that. Do the containers maybe not report their limits properly to Bazel? Or are they so heavyweight that we need to account for their overhead? I have very limited knowledge of how Docker on Mac works, but given that the info I can find mentions Linux VMs and a file system helper, the slowdown is not surprising. Also, why is this a Bazel problem and not a Docker problem? If the same tool works fast outside a container and not inside, the problem is the container, not the tool. It's possible we can make tweaks to Bazel to make the problem less pronounced, but ultimately it's not Bazel's fault if the containerization solution is slow.
It's not necessarily a container bug just because a tool works differently in and out of the container. An example is a bug in the JDK where The bug is addressed in Oracle JDK 8: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6515172 and OpenJDK 9: https://bugs.openjdk.java.net/browse/JDK-6515172 and backported to OpenJDK 8: https://bugs.openjdk.java.net/browse/JDK-8185179 As far as I can see, the 'proper' fix only appears in OpenJDK 10: https://bugs.openjdk.java.net/browse/JDK-8146115 - this fixes CPU detection when run in a cgroup using CPU quotas, as well as CPU sets. The interesting thing in this case is that the problem appears only in Docker for Mac, not in stock Docker on Linux. The combination of VM plus Docker container on Mac would have to result in differently detected resources from just a Docker container on Linux.
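To make the difference between CPU sets and CPU quotas concrete, here is a rough sketch of how a tool could estimate the CPUs actually available to it inside a container. It assumes Linux with cgroup v1 mounted at `/sys/fs/cgroup/cpu` (the paths differ under cgroup v2). The function names are hypothetical, and this is not how Bazel or the JDK implement detection.

```python
import math
import os


def cpus_from_quota(quota_us, period_us):
    """CPUs allowed by a CFS quota, or None when no quota is set (quota <= 0)."""
    if quota_us <= 0 or period_us <= 0:
        return None
    return max(1, math.ceil(quota_us / period_us))


def effective_cpus(visible_cpus, quota_us=-1, period_us=100000):
    """Smaller of the visible CPU count and the quota-derived limit."""
    limit = cpus_from_quota(quota_us, period_us)
    return visible_cpus if limit is None else min(visible_cpus, limit)


def read_int(path, default):
    """Read an integer from a cgroup file, falling back when it is absent."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default


if __name__ == "__main__":
    # sched_getaffinity honours CPU sets on Linux; cpu_count does not.
    if hasattr(os, "sched_getaffinity"):
        visible = len(os.sched_getaffinity(0))
    else:
        visible = os.cpu_count() or 1
    base = "/sys/fs/cgroup/cpu"  # assumed cgroup v1 mount point
    quota = read_int(f"{base}/cpu.cfs_quota_us", -1)
    period = read_int(f"{base}/cpu.cfs_period_us", 100000)
    print("effective CPUs:", effective_cpus(visible, quota, period))
```

Running this inside the Docker for Mac container and inside a container on Linux, and comparing the result with the parallelism Bazel actually uses, would be one way to check whether resource detection differs between the two setups.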
Apart from Bazel being slow in Docker, we sometimes see Bazel hang when running bazel query or bazel build //some/package:app_deploy.jar. The process hangs and the only way to end it is to kill the bazel process.
Attaching an strace of bazel query 'kind("binary rule", //...)', where the process gets stuck and the only way to end it is to kill it. The current workaround is to run bazel shutdown before each Bazel command when executing several in sequence (in a script); somehow it helps.
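The shutdown workaround mentioned above could be scripted roughly as follows. This is only a sketch: `with_shutdown` and `run_sequence` are hypothetical names, and the two commands in the example are the ones mentioned in this thread. It does not explain or fix the hang; it only automates stopping the Bazel server before each command.

```python
import subprocess


def with_shutdown(commands):
    """Yield each command preceded by `bazel shutdown`."""
    for command in commands:
        yield ["bazel", "shutdown"]
        yield list(command)


def run_sequence(commands, runner=subprocess.run):
    """Run the commands in order, shutting the Bazel server down before each."""
    for argv in with_shutdown(commands):
        runner(argv, check=True)


if __name__ == "__main__":
    planned = [
        ["bazel", "query", 'kind("binary rule", //...)'],
        ["bazel", "build", "//some/package:app_deploy.jar"],
    ]
    # Dry run: print the sequence rather than invoking Bazel.
    for argv in with_shutdown(planned):
        print(" ".join(argv))
```

Calling `run_sequence(planned)` would execute the sequence for real. The `runner` parameter exists only so the ordering can be checked without Bazel installed.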
If my reading is correct, this is the same as #3886 (Bazel not tuning its resource usage appropriately when running inside a container).
Description of the problem / feature request:
Bazel build in Docker container on Mac is about 120% slower than the same build on a host machine (Mac).
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Bring up container with Bazel installed and do a bazel build.
What operating system are you running Bazel on?
Bazel is running in a Debian container using Docker for Mac.
What's the output of bazel info release?
release 0.21.0
Have you found anything relevant by searching the web?
I have tried profiling the build and found that the init phase is considerably slower than on the host machine, where it's practically instant (~12ms), and that execution time is about 70% higher than the 29s measured on the host machine.
Can someone explain what happens during the init phase, and what can be done to speed up the execution phase of the build? Is this an issue of slow file access, CPU, or something else?