[GR-51307] Unable to collect GC data with NotificationEmitter in native build #8237
Comments
Hi @viniciusxyz as already mentioned in the previous ticket,
We are aware of that, but this is currently not a priority for us to fix.
Maybe this is something @roberttoyonaga would like to look into?
Hi @fniephaus. I'm happy to look into this eventually, if nobody else picks this up. However, it probably won't be on my to-do list for some time.
Cool, thanks!
Typo? 😆 We're not going to complain if it's done asap
Oops 😆 I mean it will probably remain on my to-do list for a while
I understand this point perfectly. I just found it strange that the issue was marked as completed. Even as a backlog item, I believe it should stay open, because GC problems happen quite frequently and losing traceability of them is currently a drawback of native compilation. That said, I completely understand that it is not an immediate priority.
🙋 We're also interested in having that added. In the meantime, did anybody already reach out to Micrometer? Update: I've now created a PR to detect and log that: micrometer-metrics/micrometer#5149
Very interesting! I hadn't paid attention to this detail in Micrometer. I'm going to download the code and run some tests. For my part, I didn't know there was this other way, so I didn't open an issue there.
Well, it will not really help with the issue that there will be no metrics. But you'd at least get a log message explaining why.
I see. I thought there was some fallback, but if there really isn't, then contacting the Micrometer team probably won't help much, since the problem is the lack of notifications in native images. It's quite frustrating to have to make monitoring worse to take advantage of the benefits of native images, but as far as I understand this is a necessary trade-off at the moment.
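For readers wondering what a fallback could look like in principle: one option that does not depend on notifications is to poll the cumulative counters on `GarbageCollectorMXBean`. The sketch below is only an illustration. The class name `GcCounterPoller` and the helper `delta` are hypothetical, this is not something Micrometer is known to do, and this thread does not confirm whether these counters are populated in a native image. It also yields only totals per polling interval, not the per-collection details that notifications carry.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: polls cumulative GC counters instead of listening for notifications.
class GcCounterPoller {

    // Difference between two cumulative readings.
    // The MXBean getters return -1 when the value is undefined, so treat that as "no change".
    static long delta(long previous, long current) {
        if (previous < 0 || current < 0) {
            return 0;
        }
        return Math.max(0, current - previous);
    }

    public static void main(String[] args) throws InterruptedException {
        List<GarbageCollectorMXBean> beans = ManagementFactory.getGarbageCollectorMXBeans();
        long[] lastCount = new long[beans.size()];
        long[] lastTime = new long[beans.size()];
        for (int i = 0; i < beans.size(); i++) {
            lastCount[i] = beans.get(i).getCollectionCount();
            lastTime[i] = beans.get(i).getCollectionTime();
        }

        // Three polling rounds; a real exporter would poll on its scrape interval instead.
        for (int round = 0; round < 3; round++) {
            System.gc(); // only a request; the runtime may ignore it
            Thread.sleep(500);
            for (int i = 0; i < beans.size(); i++) {
                GarbageCollectorMXBean bean = beans.get(i);
                long count = bean.getCollectionCount();
                long time = bean.getCollectionTime();
                System.out.printf("%s: +%d collections, +%d ms%n",
                        bean.getName(), delta(lastCount[i], count), delta(lastTime[i], time));
                lastCount[i] = count;
                lastTime[i] = time;
            }
        }
    }
}
```

Because the counters are cumulative, short individual pauses are averaged out within each polling interval, which is why this cannot fully replace notification-based metrics.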
Hi @viniciusxyz and @codesimplicity! I've submitted a PR adding support for GC notifications (only for serial GC for now). If you have time, please let me know if this solution works for your use case. Thanks!
@roberttoyonaga As soon as possible I will try to understand how to build GraalVM locally so I can test your implementation, and I will report back. I really appreciate the work.
No problem! Here are some brief steps to get you started with building GraalVM: You'll need mx. And the latest labsjdk release. Put mx on the path and set java home to labsjdk: Once GraalVM is built, you can use The first few minutes of this video does a good job of explaining this as well: https://youtu.be/3Gh0cz3vjG8?feature=shared&t=202 |
@roberttoyonaga Unfortunately I had some problems getting this to work on Windows. I'll try again tomorrow using Linux.
@roberttoyonaga I'm having some problems getting the test done. I can compile Graal using mx --env ce build, but from what I understand from the documentation the result doesn't include the native-image tool, and mx native-image apparently does not exist. As a last resort I tried running mx build --all, but I still have the same problem. Can you please guide me on how to create the native-image binary?
@viniciusxyz maybe try |
@roberttoyonaga By following the steps I managed to compile it, but the result was the same as when I opened the issue. I'm using the right branch, as you can see below: But the result of the compilation is still using scavenger, so I must still be doing something wrong, but I don't know what... The steps I did were:
export PATH=/home/vvsantos/mx:$PATH
export JAVA_HOME=/home/vvsantos/labsjdk-ce-24-jvmci-b01-debug
mx --dynamicimports /substratevm build
cd ../substratevm
mx build
javac MainNotificationEmitter.java
mx native-image MainNotificationEmitter
./mainnotificationemitter

The class used for testing was this:

import com.sun.management.GarbageCollectionNotificationInfo;
import javax.management.NotificationEmitter;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class MainNotificationEmitter {
    public static void main(String[] args) throws InterruptedException {
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (!(gcBean instanceof NotificationEmitter notificationEmitter)) {
                continue;
            }
            System.out.println("NotificationEmitter " + gcBean.getName());
            notificationEmitter.addNotificationListener((notification, handback) -> {
                var type = notification.getType();
                System.out.printf("Notification issued -> %s\n", type);
            }, notification -> notification.getType()
                    .equals(GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION), null);
        }
        System.out.println("Notifier added");
        System.gc();
        Thread.sleep(3_000);
        System.out.println("End");
    }
}

Can you identify if I'm doing something wrong in this flow?
Hi @viniciusxyz you need to use the build options shown in the PR description when you execute the Native Image build. In short, you need to use As a side note: To use any of the monitoring/serviceability features in Native Image, you need to pass If you have any other questions, please don't hesitate to ask! |
@roberttoyonaga I read this yesterday and had already forgotten about it today :) I managed to test it. I just found it strange that there was an extra notification after the main code had finished executing. When we build with mx, is the native-image binary available in the labsjdk bin? I ask because I would like to do a complete test by compiling an application using Maven and reachability-metadata (I need it to compile some things) and deploying it to Kubernetes to validate whether the Micrometer metrics work correctly with this implementation.
Oh yes, that's a small bug. When the system gets torn down, it emits another notification. I'll fix that and push an update.
No, it's not in labsjdk. You can find the latest build in the graal repository here: graal/vm/latest_graalvm_home
@roberttoyonaga I started the integration tests. I have an application that uses Micronaut together with Micrometer, which uses JVM metrics to publish metrics to Prometheus. Initially, NotificationEmitter seems to be working well, since I can now see the pauses associated with the GC. Strangely, the metrics related to heap memory broke, but perhaps that has nothing to do with the change you made. Another point I found strange was that the GC pauses only kept increasing. I will start the application with the GC monitoring flags and compare the data to validate whether they are correct, and I will give you more complete feedback as soon as possible. Below is the image with the GC time metric, which is super important for us and which now exists thanks to your change :)
Hi @viniciusxyz
What do you mean by this? Did the notifications stop containing the memory usage info? Or did the memory usages reported look strange? I made a very simple test app here: https://gist.github.com/roberttoyonaga/44bfe8cbaa809c102fa5d9cc959f2997 |
@roberttoyonaga I don't know if it was your change, but there seems to have been a change in behavior in the memory metrics related to MemoryPoolMXBean. The screenshot below shows the behavior in JDK 21 LTS, which I generally use:
JDK 24 with your PR:
Code for test: https://gist.github.com/viniciusxyz/c0883ec5a499351225bebf9a8ccbc5ea
For now I'm using the memory metrics provided by Micrometer, so I'm testing that the resources needed by these 3 classes work: JvmHeapPressureMetrics
As for measuring GC times, everything really seems to be OK. I made a comparison using -XX:+PrintGC and there doesn't seem to be anything abnormal.
@roberttoyonaga I did a new test and it seems that the disappearance of this metric has no relation to your commit: after checking out the previous commit and redoing the build, I noticed the same problem. Why this works in Java 21 but not in Java 24, I don't know. As for the GC metrics themselves, everything seems to be correct. Test with commit a8efbf7c2981
Hi @viniciusxyz this is actually expected. Please see the comment here. The max sizes are always -1 in JDK 24 because of this PR #6930
@roberttoyonaga I understand. Some of our graphs use this maximum to show, as a percentage, how much of the heap is in use at the moment, but this can be removed without any problems. Thank you for the clarification.
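For dashboards that compute a heap percentage, a defensive approach is to treat a maximum of -1 as "undefined" (which is what `MemoryUsage.getMax()` documents) and fall back to absolute bytes. The following is a minimal sketch under that assumption; the class name `HeapUsagePercent` and the helper `percentUsed` are hypothetical and not part of Micrometer or GraalVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.OptionalDouble;

// Hypothetical helper: computes a usage percentage only when the pool reports a maximum.
class HeapUsagePercent {

    // Returns an empty result when max is undefined (-1) or otherwise not positive.
    static OptionalDouble percentUsed(long used, long max) {
        if (max <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(100.0 * used / max);
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            OptionalDouble percent = percentUsed(usage.getUsed(), usage.getMax());
            if (percent.isPresent()) {
                System.out.printf("%s: %.1f%% used%n", pool.getName(), percent.getAsDouble());
            } else {
                // No maximum reported: show absolute bytes instead of a percentage.
                System.out.printf("%s: %d bytes used (max undefined)%n", pool.getName(), usage.getUsed());
            }
        }
    }
}
```

With this guard, the same code can run on builds that report a maximum and on builds that report -1, without producing negative or meaningless percentages.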
A favor I ask of the team
This is actually the second time I have opened this issue; the first was #7803, opened on November 11, 2023,
but it was closed, maybe because I couldn't explain it clearly. Please be aware that this issue is not about compilation problems but about the lack of data from NotificationEmitter at runtime for collecting GC metrics. This occurs with both G1GC and SerialGC, in both GraalVM CE and Oracle GraalVM.
Describe the problem
I'm developing applications and I'm missing some statistics when using Micrometer, specifically the details and duration of GC pauses. I started inspecting the library code and noticed that the actual problem seems to be that NotificationEmitter is not emitting notification events, and this only happens in the native image. These metrics are very important for us to be able to put the applications into production, so I would like help solving this problem.
Steps to reproduce the issue
mvn clean package -Pnative
./target/main-notification-emitter
When running the native image, only the log line for adding the notifier is displayed; when running on HotSpot, a log line from the NotificationEmitter event is also displayed when System.gc() is called.
Describe GraalVM and its environment:
More details
Screenshot of execution on HotSpot:
Screenshot of execution with native compilation:
All compilation configuration is in the pom file of the project passed in the example
To show that the problem does not depend on whether it is GraalVM CE or Oracle GraalVM, here is a screenshot with the same behavior:
Currently, as far as I know, there are two main ways to expose garbage collector timings while the application is running so that we can view them continuously in Kubernetes. The first is a javaagent that exports this information to a provider such as Prometheus; the other is adding a library that sends these metrics to one of these providers. As far as I've seen, both depend on NotificationEmitter for updates to these metrics. Without this improvement, applications that monitor the GC via Prometheus + Micrometer, for example, will be left without monitoring data for GC times.
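To illustrate why such tools depend on the notifications, here is a rough sketch of how a listener can turn each GC notification into a collection count and a duration total using the standard `com.sun.management` API. It is not Micrometer's actual code; the class name `GcNotificationRecorder` and its fields are hypothetical, and `GcInfo.getDuration()` is the elapsed time of the collection, which is not necessarily a stop-the-world pause for every collector.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import com.sun.management.GcInfo;

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.openmbean.CompositeData;

// Hypothetical recorder: accumulates totals that a metrics library could export.
class GcNotificationRecorder {
    static final AtomicLong collections = new AtomicLong();
    static final AtomicLong totalDurationMillis = new AtomicLong();

    // Registers a listener on every GC bean that can emit notifications.
    // Returns the number of beans a listener was attached to.
    static int register() {
        int registered = 0;
        for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gcBean instanceof NotificationEmitter emitter) {
                emitter.addNotificationListener((notification, handback) -> record(notification), null, null);
                registered++;
            }
        }
        return registered;
    }

    // Decodes a GC notification and updates the totals; other notification types are ignored.
    static void record(Notification notification) {
        if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals(notification.getType())) {
            return;
        }
        GarbageCollectionNotificationInfo info =
                GarbageCollectionNotificationInfo.from((CompositeData) notification.getUserData());
        GcInfo gcInfo = info.getGcInfo();
        collections.incrementAndGet();
        totalDurationMillis.addAndGet(gcInfo.getDuration());
        System.out.printf("%s (%s, cause: %s) took %d ms%n",
                info.getGcName(), info.getGcAction(), info.getGcCause(), gcInfo.getDuration());
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Listeners registered: " + register());
        System.gc(); // only a request; the runtime may ignore it
        Thread.sleep(1_000); // notifications are delivered asynchronously
        System.out.printf("collections=%d totalDurationMillis=%d%n",
                collections.get(), totalDurationMillis.get());
    }
}
```

If no notification is ever delivered, as described in this issue for native images, the totals simply stay at zero, which is exactly the missing-metrics symptom seen in the dashboards.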
Demo of information visualization in grafana
Without native compilation
With native compilation
@wirthi @kassifar I reopened the issue, if any details were not clear please let me know