Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix schedule task memory leak in ManagedLedgerFactoryImpl #13134

Conversation

hezhangjian
Copy link
Member

Motivation

The memory is leak due to the ManagerLedgerImpl can't be garbage collected.

Modifications

Before remove the managedLedgerImpl from map, call close() to cacel the schdule job first.

Documentation

Check the box below and label this PR (if you have committer privilege).

Need to update docs?

  • no-need-doc

Fix a memory leak bug

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 4, 2021
@hezhangjian
Copy link
Member Author

/pulsarbot run-failure-checks

@lhotari
Copy link
Member

lhotari commented Dec 4, 2021

The memory is leak due to the ManagerLedgerImpl can't be garbage collected.

Can you share more details about the symptoms? Which Pulsar versions are impacted?

@hezhangjian
Copy link
Member Author

hezhangjian commented Dec 4, 2021

@lhotari
The managedledgerImpl were removed from map, but the schedule tasks didn't stop, So their memory can't be released, leading to a memory leak.

In current branch we maintained, version 2.8 and version 2.9 are impacted.

@hezhangjian
Copy link
Member Author

@codelipenghui @hangc0276 @eolivelli PTAL

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari
Copy link
Member

lhotari commented Dec 4, 2021

Good catch @Shoothzj

@@ -353,6 +353,17 @@ public void asyncOpen(final String name, final ManagedLedgerConfig config, final
log.warn("[{}] Attempted to open ledger in {} state. Removing from the map to recreate it",
name, l.getState());
ledgers.remove(name, existingFuture);
l.asyncClose(new CloseCallback() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if the managedLedger in Fenced or Closed state, call asyncClose just

public synchronized void asyncClose(final CloseCallback callback, final Object ctx) {
State state = STATE_UPDATER.get(this);
if (state == State.Fenced) {
factory.close(this);
callback.closeFailed(new ManagedLedgerFencedException(), ctx);
return;
} else if (state == State.Closed) {
if (log.isDebugEnabled()) {
log.debug("[{}] Ignoring request to close a closed managed ledger", name);
}
callback.closeComplete(ctx);
return;
}

So, should we just call ManagedLedgerFactoryImpl::close ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I guess calling asyncClose is sufficient. However there seems to be a bug in the asyncClose method when the ledger is fenced. The method call to cancel the scheduled tasks is missing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@hezhangjian hezhangjian marked this pull request as draft December 5, 2021 02:42
@hezhangjian hezhangjian force-pushed the memory-leak-managed-ledger-factory-impl branch from badb576 to f40684c Compare December 5, 2021 02:49
@@ -1192,6 +1192,7 @@ private long consumedLedgerSize(long ledgerSize, long ledgerEntries, long consum
@Override
public synchronized void asyncTerminate(TerminateCallback callback, Object ctx) {
if (state == State.Fenced) {
cancelScheduledTasks();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that similar addition would be needed to asyncClose when the ledger is fenced

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, yesterday,I found It has been fixed in #12565, it's better to cancel the schedule task when the ledger is fenced. I will test if it fix the memomry leak.

@hezhangjian hezhangjian closed this Dec 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants