Out of memory after a number of redeployments #4098
Comments
How do you know this is caused by Payara and not your application? Have you looked at what objects are causing the OOME?
Closing as not responsive.
Could we reopen this issue? I have been curious and tried https://github.com/sgflt/payara-test-case on Payara 5.194. A memory leak is clearly visible after many redeployments: most of the created instances come from Felix, as java.util.HashMap$Node referenced by
Also, classes seem to be loaded permanently, as the count of loaded classes is constantly growing.
I did our usual classloader leak test. Deployed the application, heap dumped the server, and looked for EarClassLoader and WebAppClassLoader instances for the application. Undeployed the application, forced GC a few times, then heap dumped again, and the relevant classloaders were gone. So I don't think we have a classloader leak. You need to ensure there is a full GC to unload classes, so I'm not sure whether that has happened in the image above.
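A quick way to check whether classes actually unload after undeploy and a full GC is to watch the JVM's class-loading counters. Below is a minimal sketch using the standard java.lang.management API; the class name, loop count, and sampling interval are illustrative only and not part of Payara (in practice you would read the same MBean over JMX against the running server).

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassCountWatcher {

    public static void main(String[] args) throws InterruptedException {
        ClassLoadingMXBean classLoading = ManagementFactory.getClassLoadingMXBean();

        // Sample the counters across redeploy cycles. If the loaded-class count
        // never falls back after undeploy plus a full GC, that points to a
        // classloader leak.
        for (int i = 0; i < 10; i++) {
            // Suggest a full GC so classes from unreferenced classloaders can unload.
            System.gc();

            System.out.printf("loaded=%d totalLoaded=%d unloaded=%d%n",
                    classLoading.getLoadedClassCount(),
                    classLoading.getTotalLoadedClassCount(),
                    classLoading.getUnloadedClassCount());

            Thread.sleep(5_000);
        }
    }
}
```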
Linking with #5063, which may reveal the cause. (Tested on 5.2020.7 and still getting OOM after a number of redeployments.)
The leak is probably caused by the ServiceLocatorImpl with the name "__HK2_Generated_0".
@sgflt, since you are looking at it, any chance you can pinpoint the source of the leak in non-generated code?
Yes, another candidate is org.glassfish.deployment.admin.DeployCommand. Its retained size is about 230 MB.
Something is not deleting a reference to the class loader, and that is what's causing the redeploy leak; it needs to be pinpointed and fixed. Anything else would be a small leak.
Tracked the GC root to archiveMetaData in com.sun.enterprise.deploy.shared.FileArchive#440 [GC root - Java frame]
There is a reference cycle: FileArchive.archiveMetaData -> DeployCommand.report -> PropsFileActionReporter.subAction -> DeployCommandSupplementalInfo.dc -> DeploymentContextImpl.source -> back to FileArchive.archiveMetadata again.
I was able to cut out some fields. The leak is now slower, but still present. All the places I fixed used nested classes.
And the trail leads back to ServiceLocatorImpl again. It is now stored in a static field in Globals, with a telling comment: "Very sensitive class, anything stored here cannot be garbage collected".
I think Globals is the main cause. All the places I have fixed were services instantiated by the ServiceLocator.
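To make the point about Globals concrete, the problematic pattern is roughly the one sketched below. This is a simplified illustration, not the actual org.glassfish.internal.api.Globals source; the class and method names are made up for the example.

```java
import org.glassfish.hk2.api.ServiceLocator;

// Simplified sketch: anything reachable from a static field lives as long as
// the class that declares it, so every descriptor and service registered with
// this locator (and the classloaders of their classes) stays reachable for the
// lifetime of the server unless it is explicitly unregistered.
public final class GlobalsSketch {

    private static volatile ServiceLocator defaultServiceLocator;

    private GlobalsSketch() {
    }

    public static void setDefaultServiceLocator(ServiceLocator locator) {
        defaultServiceLocator = locator;
    }

    public static ServiceLocator getDefaultServiceLocator() {
        return defaultServiceLocator;
    }
}
```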
Yes. Each redeploy puts some descriptors into the default service locator, but undeploy neither calls shutdown nor unregisters its own descriptors from the default service locator, so the count of initialized services grows without bound. Insertion stacktrace:
Unfortunately, I am not sure you are barking up the right tree here.
I wouldn't be surprised if there were multiple causes of different leaks. I am going to measure the change to confirm that the big leak is fixed and the small one is still present.
* #4098 Reduced too broad scope of variable
* #4098 Moved field to local variable - this could be the first cause of the leak
* #4098 Refactored inner and anonymous classes to nested static classes - anonymous and inner classes hold an implicit reference to the parent - this was probably the second cause of the leak
* #4098 Fixed code consistency
* Fix more class loader leaks by:
  - making sure server Threads and Timers do not inherit the app's context class loaders
  - making sure the app's security contexts don't get propagated to server threads and timers
  Added correct ear classes to class loader leak tests

Co-authored-by: lprimak <[email protected]>
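The "implicit reference to parent" item above refers to a standard Java pitfall, sketched below with made-up class names. A non-static inner or anonymous class silently captures its enclosing instance, so handing such an object to anything long-lived in the server keeps the whole deployment reachable; a static nested class does not.

```java
public class DeployedComponent {

    private final byte[] largeState = new byte[64 * 1024 * 1024];

    // Leak-prone: the anonymous Runnable keeps an implicit reference to the
    // enclosing DeployedComponent. If it is handed to a long-lived server
    // thread, timer, or registry, the outer object (and its classloader)
    // stays reachable after undeploy.
    public Runnable leakyTask() {
        return new Runnable() {
            @Override
            public void run() {
                System.out.println("running inside " + DeployedComponent.this);
            }
        };
    }

    // Safer: a static nested class has no hidden reference to the outer
    // instance; it holds only what is passed to it explicitly.
    static final class SafeTask implements Runnable {
        private final String label;

        SafeTask(String label) {
            this.label = label;
        }

        @Override
        public void run() {
            System.out.println("running " + label);
        }
    }

    public Runnable safeTask() {
        return new SafeTask("safe");
    }
}
```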
FYI I tried to track this down today, and services are actually unregistered from the default service locator. This was all part of the fix to this issue.
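To illustrate the bookkeeping involved, here is a rough sketch of registering descriptors with an HK2 ServiceLocator on deploy and removing them again on undeploy. MyAppService and the locator name are hypothetical, and this is not the actual Payara deployment code, just the shape of the cleanup described above.

```java
import java.util.List;

import org.glassfish.hk2.api.ActiveDescriptor;
import org.glassfish.hk2.api.ServiceLocator;
import org.glassfish.hk2.api.ServiceLocatorFactory;
import org.glassfish.hk2.utilities.BuilderHelper;
import org.glassfish.hk2.utilities.ServiceLocatorUtilities;

public class DescriptorCleanupSketch {

    // Hypothetical application service, used only for illustration.
    public static class MyAppService {
    }

    public static void main(String[] args) {
        ServiceLocator locator =
                ServiceLocatorFactory.getInstance().create("demo-locator");

        // "Deploy": register descriptors with the locator and keep the handles.
        List<ActiveDescriptor<?>> added =
                ServiceLocatorUtilities.addClasses(locator, MyAppService.class);

        // The application looks services up while it runs.
        MyAppService service = locator.getService(MyAppService.class);
        System.out.println("got " + service);

        // "Undeploy": remove exactly the descriptors this deployment added.
        // Skipping this step leaves them reachable from the locator for the
        // lifetime of the server, which is the unbounded growth described above.
        for (ActiveDescriptor<?> descriptor : added) {
            ServiceLocatorUtilities.removeOneDescriptor(locator, descriptor);
        }

        // Or remove everything advertising a given contract:
        ServiceLocatorUtilities.removeFilter(
                locator, BuilderHelper.createContractFilter(MyAppService.class.getName()));

        // A locator owned solely by the deployment can simply be shut down.
        locator.shutdown();
    }
}
```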
* [FISH-1018] Out of memory redeploy leaks (#5081)
  * #4098 Reduced too broad scope of variable
  * #4098 Moved field to local variable - this could be the first cause of the leak
  * #4098 Refactored inner and anonymous classes to nested static classes - anonymous and inner classes hold an implicit reference to the parent - this was probably the second cause of the leak
  * #4098 Fixed code consistency
  * Fix more class loader leaks by:
    - making sure server Threads and Timers do not inherit the app's context class loaders
    - making sure the app's security contexts don't get propagated to server threads and timers
    Added correct ear classes to class loader leak tests
  Co-authored-by: lprimak <[email protected]>
* [FISH-1018] found more leaks and more reliable leak test (#5102)
  * found more leaks and more reliable leak test
  * bump jakarta.el to -p3 patch
  * tyrus patched update

Co-authored-by: Lukáš Kvídera <[email protected]>
Lately we are getting memory leak errors and heap space errors in our server log. The heap space problem prevents me from deploying. Will a newer version of Payara solve this? I'm currently on Payara Server 5.194 #badassfish (build 327). Thanks.
Yes. The latest Payara Enterprise and Community releases will solve most of these issues. While there are still small leaks, the large ones are absent.
Thanks. Is there any way to temporarily reset the server and the memory until we get a chance to upgrade?
UPDATE: Confirmed. A restart of the web instances prior to deployment will clear the memory and prevent the heap space error. This is a good temporary solution until the Payara upgrade can be done.
Description
On our preprod environment, after a couple of dozen redeployments the instances run out of memory and have to be killed manually from the OS. The problem is reproducible on two separate environments.
Expected Outcome
No out of memory error.
Current Outcome
Steps to reproduce (Only for bug reports)
Start the deployment group, perform multiple redeployments of your application, and at some point the error will occur.
Environment