[Bug] [zeta] java.lang.OutOfMemoryError: Metaspace #4915
Comments
I also encountered this problem. Is there a simple solution for it? @chaorongzhi
Sorry, I don't have a solution at the moment; I'm still trying to solve it.
@liugddx Hi, I have reproduced the bug, but I don't know how to solve it. Can you give me some advice?
I got the dump file and am analyzing it.
Maybe you can adjust the JVM parameters.
@chaorongzhi This may be a bug in the Zeta engine. I found that its history service never deletes finished jobs!
You're right. Have you been running a lot of jobs?
Yes, I run 11 batch tasks per minute. The Metaspace size does not drop after a full GC, and classes are rarely unloaded.
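For anyone who wants to confirm the same symptom (Metaspace not shrinking, few classes unloaded), here is a minimal sketch using the standard `java.lang.management` beans. The class name `MetaspaceProbe` is made up for illustration, and the pool name "Metaspace" is an assumption that holds on HotSpot JVMs (Java 8+); other JVMs may name the pool differently.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Hypothetical helper, not part of SeaTunnel.
class MetaspaceProbe {

    /** Used Metaspace in bytes, or -1 if this JVM has no pool named "Metaspace". */
    static long usedMetaspaceBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return pool.getUsage().getUsed();
            }
        }
        return -1L;
    }

    /** One-line snapshot: Metaspace usage plus loaded/unloaded class counters. */
    static String snapshot() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        return String.format(
                "metaspaceUsed=%d loaded=%d unloaded=%d",
                usedMetaspaceBytes(),
                classes.getLoadedClassCount(),
                classes.getUnloadedClassCount());
    }

    public static void main(String[] args) {
        // Print once; in a real check this would be logged periodically
        // (for example before and after a batch of jobs) and compared.
        System.out.println(snapshot());
    }
}
```

If `loaded` keeps climbing across jobs while `unloaded` stays close to zero, that matches the behaviour described above.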
Can this be solved by caching the SeaTunnelChildFirstClassLoader instead of re-creating a SeaTunnelChildFirstClassLoader instance each time?
It should be possible. Can you submit a PR to fix this problem?
Sure, I'll try.
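To make the caching idea proposed above concrete, here is a rough sketch. It is not the SeaTunnel implementation: `ClassLoaderCache` is a hypothetical name, a plain `URLClassLoader` stands in for SeaTunnelChildFirstClassLoader, and keying the cache on the ordered list of jar paths is an assumption.

```java
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: reuse one classloader per distinct jar list
// instead of creating a new loader for every submitted job.
class ClassLoaderCache {

    // Key: immutable copy of the jar path list (order matters, as on a classpath).
    private final Map<List<String>, ClassLoader> cache = new ConcurrentHashMap<>();

    ClassLoader get(List<String> jarPaths) {
        List<String> key = Collections.unmodifiableList(new ArrayList<>(jarPaths));
        // computeIfAbsent creates the loader at most once per key, even under concurrency.
        return cache.computeIfAbsent(key, ClassLoaderCache::newLoader);
    }

    int size() {
        return cache.size();
    }

    // URLClassLoader is a stand-in for the engine's child-first loader.
    private static ClassLoader newLoader(List<String> jarPaths) {
        URL[] urls = new URL[jarPaths.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = Paths.get(jarPaths.get(i)).toUri().toURL();
            } catch (MalformedURLException e) {
                throw new UncheckedIOException(e);
            }
        }
        return new URLClassLoader(urls, ClassLoaderCache.class.getClassLoader());
    }
}
```

The trade-off is that cached loaders, and every class they loaded, stay in Metaspace for as long as the cache holds them. In exchange, the number of loaders is bounded by the number of distinct jar lists rather than by the number of jobs.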
I removed the SeaTunnelChildFirstClassLoader class and put all the plug-ins in the lib directory; Metaspace still rose, only more slowly. I have also fixed the problem of JobHistoryService caching data for too long. But it still feels like the Metaspace growth is not solved. I am now somewhat inclined to suspect the Zeta engine's use of Hazelcast serialization and deserialization. It could be that the Zeta engine is not using it properly, or it could be Hazelcast itself.
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.
I'm on version 2.3.2 and hit the same problem. Was it fixed in version 2.3.3?
@chaorongzhi Is there any progress on this issue?
Nobody is working on this problem.
(In reply to the message of 12/27/2023 15:23: "The problem still exists in 2.3.3.")
Excuse me, can I just configure G1 in jvm_options in 2.3.3-release? @liugddx
1.version @liugddx Sir, what should I do? Can you give me some advice?
@wu-a-ge @chaorongzhi Hi, I am running into the same issue with 2.3.3. I also tried removing the SeaTunnelChildFirstClassLoader class, but it does not seem to work. Could you please give me some instructions to solve the issue?
I checked the log, and the classloader-creation log line is not there. Line 63 in 21a4593
That means no new classloader was created on the server side. Could you provide the full log from after the server started? There should be more than one log file. Or start in debug mode. @W-dragan
metaSpace.txt Running 22 CDC tasks simultaneously also produces java.lang.OutOfMemoryError: Metaspace. The configuration is very simple and should be easy to reproduce: just MySQL-CDC to PG.
Maybe you can desensitize the data first.
That's a lot of tasks if the heap size is 3g. Do the 22 CDC tasks use the same source and the same sink? What's the parallelism value?
@Hisoka-X Yes, it's the same source and sink, but split into 22 tasks, each with a parallelism of 1: for example, source mysqltable1 to sink pgtable1, through source mysqltable22 to sink pgtable22.
Please provide the complete desensitized logs with cache-mode turned on. Thanks.
org.apache.seatunnel.engine.server.service.classloader.DefaultClassLoaderService#getClassLoader 2024-03-07 17:00:35,684 INFO org.apache.seatunnel.engine.common.loader.ClassLoaderUtil - recycle classloader org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader@7dc18237 Suspected to be recyclable
@W-dragan How do you submit the job: HTTP or shell?
@Hisoka-X HTTP
Oh, I see. This is a bug in HTTP job submission. In Line 169 in 943bd48
cc @liugddx
errorlog.txt Line 127 in 7c0ea2e
It looks like the node obtained is not the master node, and there is no subsequent master-node check, which causes the problem. @Hisoka-X
My fault, I didn't check whether seaTunnelServer is null.
All nodes have a SeaTunnel server; the only difference is whether the node is the master or not.
Lines 105 to 107 in 7c0ea2e
But this method returns null if the node is not the master, and I remember it was written that way intentionally to solve #6217.
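The failure mode discussed above (a lookup that returns null on non-master nodes, with a caller that never checks) can be illustrated as follows. All names here (`MasterGuardSketch`, `CoordinatorService`, `coordinatorOrNull`, `handleSubmit`) are hypothetical and do not come from the SeaTunnel code base, and forwarding to the master is only one possible way to handle the non-master case.

```java
// Hypothetical illustration of guarding a "null when not master" lookup.
class MasterGuardSketch {

    /** Stand-in for the server-side service that only the master node exposes. */
    static final class CoordinatorService {
        String submit(String job) {
            return "accepted:" + job;
        }
    }

    /** Mirrors the behaviour described above: null when the node is not the master. */
    static CoordinatorService coordinatorOrNull(boolean isMaster) {
        return isMaster ? new CoordinatorService() : null;
    }

    /** Handles the non-master case explicitly instead of dereferencing null. */
    static String handleSubmit(boolean isMaster, String job) {
        CoordinatorService service = coordinatorOrNull(isMaster);
        if (service == null) {
            // Assumption: a non-master node hands the request over to the master.
            return "forward-to-master:" + job;
        }
        return service.submit(job);
    }
}
```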
Test #6492 |
Lines 138 to 142 in 7c0ea2e
I made the changes according to #6492, but if that is the approach, I think the line of code I marked should also be changed, and similar logic in the RestHttpGetCommandProcessor class may need to be changed in the same way.
I unified the method, please check again.
After verifying, though, I found a phenomenon that may be related to cluster deployment. I repeatedly submitted two types of tasks over HTTP, one batch and one CDC. In the end all tasks completed, and the CDC task was terminated by calling stop. In theory there should be only two classloaders, and they should be released once the tasks complete. However, after all tasks had ended, there were still three instances of SeaTunnelChildFirstClassLoader. Repeated submissions no longer generate new SeaTunnelChildFirstClassLoader instances, so the OOM problem should be solved to some extent, but this phenomenon still confuses me.
No, a job currently has more than one type of classloader. It includes the source classloader, the sink classloader, and
Yes. In cache mode, we keep the classloaders in memory so we can reuse them in the future.
What happened
I run about 15 batch synchronization tasks per minute. With MaxMetaspaceSize=2g added to jvm_options, an OOM appears after about 1.5 hours of running.
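When reproducing this, it can help to confirm that the Metaspace limit was actually picked up by the running JVM. The sketch below uses the standard `RuntimeMXBean`; the class name `JvmFlagCheck` and the helper `findFlag` are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: report whether -XX:MaxMetaspaceSize reached the JVM.
class JvmFlagCheck {

    /** Returns the first JVM argument starting with the given prefix, or null if none. */
    static String findFlag(List<String> jvmArgs, String prefix) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main method).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String flag = findFlag(jvmArgs, "-XX:MaxMetaspaceSize=");
        System.out.println(flag == null
                ? "MaxMetaspaceSize not set (HotSpot leaves Metaspace unbounded by default)"
                : flag);
    }
}
```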
SeaTunnel Version
2.3.1
SeaTunnel Config
Running Command
Error Exception
Flink or Spark Version
none
Java or Scala Version
1.8