arm64 ubi-quarkus-graalvmce-builder-image is very slow #260
@ryanemerson is this happening with the Mandrel-based builder images too? cc @Karm
We're using Quarkus …. I tried the Mandrel-based images and the build time is now down to ~2.5 hours, so there's some improvement.
Hi @ryanemerson, could you please also provide some more info on your setup? E.g. what are the hardware specs of the amd64 machine you are using for the builds, and what are those of the arm64 machine you are comparing with?
Did you also try building only for arm64? What are the results?
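For reference, a minimal sketch of how the two platforms could be timed in isolation, assuming a Buildx-based multi-arch workflow (image tags and build context are placeholders, not the actual Infinispan setup):

```bash
# Build each platform separately so the timings can be compared in isolation.
# Tags and context are placeholders for the actual build setup.
time docker buildx build --platform linux/arm64 -t myapp:arm64 --load .
time docker buildx build --platform linux/amd64 -t myapp:amd64 --load .
```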
Hardware is a …. Exact workflow: …
Yes. An …
Thank you for the extra information @ryanemerson. We will try to replicate and investigate the issue.
As a first step, I tried reproducing the issue on a local AMD64 machine (using …).
Some interesting differences between the arm64 runs:
I will investigate further...
After some more experimentation it looks like the slowdown is related to the initial heap size. Setting …. The issue seems related to oracle/graal#6432.

@ryanemerson could you please give this a try, while I try to better understand why that's happening? As the initial heap size, please use something a bit higher than the peak RSS you get when building with Mandrel 22.3.
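A minimal sketch of how the builder JVM's initial heap size could be raised in a Quarkus build: `-J` forwards the option to the JVM that runs `native-image`, and `10g` is a placeholder to be replaced with something above the observed peak RSS:

```bash
# Forward -Xms to the JVM that runs native-image; 10g is a placeholder value.
./mvnw package -Dnative \
  -Dquarkus.native.additional-build-args=-J-Xms10g
```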
Thanks for looking into this. The most recent builder images have significantly reduced the latencies we were experiencing when I first created this issue; however, the total build time is still almost double what we experienced with …. Adding …
What is the Peak RSS reported when building with 22.3-java17 without using this option?
So the actual issue is that oracle/graal#6432 is setting …. @ryanemerson, may I ask you to give this a go with …? cc @fniephaus
Sure np.
GCTime=99 is mostly for latency and leads to the build process quickly using as much memory as it is allowed to use (bigger peak RSS). With GCTime=9, we tweak the GC more towards throughput, allowing it to spend more time cleaning up while not allocating more memory (lower peak RSS). I haven't seen any actual build output in this issue, but "very slow" sounds to me like the app simply requires more memory to be built with GraalVM. How much memory/CPU is the build process allowed to use? A build time of ~50 min could mean that 7GB is simply not enough.
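Assuming the knob discussed here is HotSpot's `GCTimeRatio` flag on the JVM that runs `native-image` (an assumption based on the values quoted above), the two trade-offs could be selected like this in a Quarkus build:

```bash
# GCTimeRatio=9: allow the GC to spend more time collecting, keeping
# peak RSS lower (the direction the discussion above attributes to
# oracle/graal#6432).
./mvnw package -Dnative \
  -Dquarkus.native.additional-build-args=-J-XX:GCTimeRatio=9

# GCTimeRatio=99: the heap grows instead of being collected, raising
# peak RSS but potentially speeding up the build.
./mvnw package -Dnative \
  -Dquarkus.native.additional-build-args=-J-XX:GCTimeRatio=99
```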
Setting GCTime=99 made no noticeable difference to build time.
We had Xmx set to 8g both before and after the increased latency was observed between the two different builder images. I increased this to 16g and the build time remains the same.
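For context, a hedged sketch of how that heap cap is typically set in a Quarkus build (assuming the standard Quarkus native-image option; the value matches the 16g mentioned above):

```bash
# Cap the heap of the JVM that runs native-image at 16g.
./mvnw package -Dnative \
  -Dquarkus.native.native-image-xmx=16g
```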
@ryanemerson can you share the output of native-image when building with 23.1 (…)?

And just to make sure: are you using Quarkus 3.6.0 in both cases? If not, you might be hitting quarkusio/quarkus#38683 (although if that was the case it shouldn't show up only on aarch64).

A reproducer could also be handy if you can share one.
Are you sure that you also bumped the Xmx value? If you did, it seems memory is not the bottleneck; maybe it's CPU. You could try increasing the number of cores available in your container, e.g. as sketched below.
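As a quick sanity check, the builder image can be run directly with explicit resource limits (the limits below are placeholder values, and this assumes the builder image keeps `native-image` as its entrypoint, so `--version` just confirms the container starts under those limits):

```bash
# Placeholder CPU/memory limits; adjust to match the build environment.
docker run --rm --cpus=8 --memory=16g \
  quay.io/quarkus/ubi-quarkus-graalvmce-builder-image:jdk-21 --version
```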
I've created a standalone reproducer to simplify things: https://github.com/ryanemerson/quarkus-arm64-slow-reproducer

Here's the output for building with the two different builder images, on the same machine using the same args with Quarkus 3.7.3:
You can see that the total build time for …
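For anyone following along, the two builds can be reproduced with something along these lines (the Dockerfile paths are the ones in the reproducer repo; the image tags are placeholders):

```bash
# Time a native build with each builder image from the reproducer.
time docker build -f src/main/docker/Dockerfile.22.3-java17 -t repro:22.3 .
time docker build -f src/main/docker/Dockerfile.jdk-21      -t repro:jdk-21 .
```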
Thanks for the reproducer and the extra info, @ryanemerson. At first sight it still looks like a GC-related issue to me:
I will try the reproducer and have another look next week.
@ryanemerson thanks again for the reproducer and output results. I was finally able to see what's wrong. After a closer inspection of the logs I noticed that the build is actually running on …. This led me to have a second look at your Dockerfiles and the images they use.
At this point it might be worth mentioning that the latest builder image for Java 17 is tagged with ….

Applying the following patch to the reproducer I am getting more consistent results (the …):

```diff
diff --git a/src/main/docker/Dockerfile.22.3-java17 b/src/main/docker/Dockerfile.22.3-java17
index d108905..b6c718f 100644
--- a/src/main/docker/Dockerfile.22.3-java17
+++ b/src/main/docker/Dockerfile.22.3-java17
@@ -1,4 +1,4 @@
-FROM quay.io/quarkus/ubi-quarkus-native-image:22.3-java17 as build
+FROM quay.io/quarkus/ubi-quarkus-mandrel-builder-image:22.3-java17 as build
 COPY --chown=quarkus:quarkus mvnw /code/mvnw
 COPY --chown=quarkus:quarkus .mvn /code/.mvn
 COPY --chown=quarkus:quarkus pom.xml /code/
diff --git a/src/main/docker/Dockerfile.jdk-21 b/src/main/docker/Dockerfile.jdk-21
index b5ab2a8..b7c63a9 100644
--- a/src/main/docker/Dockerfile.jdk-21
+++ b/src/main/docker/Dockerfile.jdk-21
@@ -1,4 +1,4 @@
-FROM quay.io/quarkus/ubi-quarkus-graalvmce-builder-image:jdk-21 as build
+FROM quay.io/quarkus/ubi-quarkus-mandrel-builder-image:jdk-21 as build
 COPY --chown=quarkus:quarkus mvnw /code/mvnw
 COPY --chown=quarkus:quarkus .mvn /code/.mvn
 COPY --chown=quarkus:quarkus pom.xml /code/
```

As a follow-up question, I am curious whether you actually test the images you build with ….

I am closing this issue as it's actually not an issue with the images themselves. For the record, I am adding the build outputs I get from the correct images below:

22.3-java17
jdk-21
Well I feel dumb 😅 We don't have any automated testing for our arm images; they're provided on a best-effort basis for community users, which is why this wasn't detected. It seems nobody is actually using these images. Thanks for looking into this @zakkak, much appreciated.
Is there still an issue on the Native Image side? #260 (comment) is somewhat expected, and #260 (comment) shows the result: while the build takes ~1 min longer on JDK 17, it only needs 2.38GB of memory as opposed to 6.60GB, even on a machine with 75.6% of 30.60GB of memory available.
I think not.
True, but whether that's good or bad really depends on the use case. I have opened quarkusio/quarkus#38968 to give some options to Quarkus users; perhaps it would make sense to implement something similar directly in GraalVM.
The Infinispan project previously used `quay.io/quarkus/ubi-quarkus-native-image:22.3-java17` as a builder image to create various native components for both `arm64` and `amd64` architectures. The total time taken for all of our images was ~30 mins.

In order to use the latest GraalVM JDK 21 distribution, I have updated the builder image to be based upon `quay.io/quarkus/ubi-quarkus-graalvmce-builder-image:jdk-21`. However, this has dramatically slowed down our image build time, with all our images now taking ~4 hours. Upon further investigation, it seems this is specifically caused by the `arm64` builder image, as only building for `amd64` brings the build time back down to ~16 mins.

Has anything changed between `ubi-quarkus-native-image` and `ubi-quarkus-graalvmce-builder-image` that could explain this increased build time, or is the culprit more likely to be GraalVM itself?