Beacon node active handles after close sequence is completed #5642
Comments
Another issue was reported on Discord: the BN process is not shutting down, but based on the logs (beacon-2023-06-25.log) it is hard to tell what the issue is. We might have to call process.exit explicitly after the beacon node has closed if this issue can't be resolved.
The issue seems to be fixed by the upgrade of libp2p to 0.45.9 in 7280234. The PR that fixed the issue:
Reopening, as the issue does not yet seem to be resolved. It looks like we are pinging peers after sending goodbye, or still dialing peers while disconnecting.
Based on the current network close sequence, it looks like we close the peer manager after disconnecting peers, which could explain this behavior (lodestar/packages/beacon-node/src/network/core/networkCore.ts, lines 253 to 256 in 85ff3cf), as the intervals and event listeners are only removed after closing (lodestar/packages/beacon-node/src/network/peers/peerManager.ts, lines 203 to 212 in 85ff3cf).
Changing the order of the closing sequence will likely fix the issue, but extensive testing is required as this issue is hard to reproduce.
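To illustrate the reordering idea only, here is a minimal sketch. `SketchPeerManager` and `closeNetwork` are hypothetical stand-ins, not Lodestar's actual classes or close sequence. The point is that the ping interval is cleared before any peer is disconnected, so no ping can fire during or after the goodbye phase.

```typescript
// Hypothetical stand-in for a peer manager that pings peers on an interval.
class SketchPeerManager {
  private timer: ReturnType<typeof setInterval> | null = null;
  readonly pingsSent: string[] = [];

  constructor(private readonly peers: Set<string>) {}

  start(intervalMs: number): void {
    this.timer = setInterval(() => this.pingAll(), intervalMs);
  }

  pingAll(): void {
    for (const peer of this.peers) this.pingsSent.push(peer);
  }

  // Clearing the interval removes the handle that would keep pinging peers.
  close(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  get isRunning(): boolean {
    return this.timer !== null;
  }
}

// Assumed ordering: stop the peer manager first, then disconnect peers.
async function closeNetwork(
  peerManager: SketchPeerManager,
  disconnectAll: () => Promise<void>,
  log: string[]
): Promise<void> {
  peerManager.close();
  log.push("peerManager closed");
  await disconnectAll();
  log.push("peers disconnected");
}
```

With this ordering, `disconnectAll` always runs while the peer manager is already stopped. Whether that is sufficient in practice depends on what libp2p itself still has open.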
Proposed solution #5642 (comment) does not resolve the issue. The problem seems to be with libp2p, which in the end must take care of closing all connections and removing TCP listeners. There are several closed but also open issues regarding connections not being closed properly.
This comment libp2p/js-libp2p#436 (comment) summarizes open tasks, but there has been no progress in a while. For now, we just have to exit the process explicitly until the upstream issues are fixed.
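A minimal sketch of such an explicit exit, assuming it is scheduled once the close sequence has finished. `scheduleForcedExit` and its parameters are hypothetical names for illustration, not Lodestar code. The timer is unref'd so that it does not itself keep the event loop alive: if nothing is leaking, the process exits on its own before the timer fires.

```typescript
import { setTimeout as setNodeTimeout } from "node:timers";

// Hypothetical helper: force an exit if the process is still alive after graceMs.
// The exit function is injectable so the behavior can be exercised without exiting.
function scheduleForcedExit(
  graceMs: number,
  exit: (code: number) => void = (code) => process.exit(code)
): ReturnType<typeof setNodeTimeout> {
  const timer = setNodeTimeout(() => exit(0), graceMs);
  // unref() means this timer alone will not prevent a clean shutdown.
  timer.unref();
  return timer;
}
```

A call such as `scheduleForcedExit(10_000)` after the close sequence would only trigger when leftover handles are still holding the process open; the 10 second grace period is an arbitrary example value.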
A potential fix has been merged to unstable. It is hard to verify whether this actually fixed the issue, because it is not really reproducible and happens rarely. Two things have to be done to confirm a fix
Note: there is a chance that the process would also hang with
@nflaig I think this issue is resolved now?
I don't think so, did libp2p release a fix for this? Since we switched to running libp2p in a worker we get a different issue #5775, but this one should be reproducible by setting
Describe the bug
In some rare cases, when the node has been running for a longer time, it does not exit when receiving an exit signal.
Active handles
The Sockets that show
135.181.2.45:9000 -> undefined:undefined
can probably be ignored, as those are also there if the beacon node exits correctly.
Debug logs
What looks really suspicious is that we are dialing peers after the network is already closed
Expected behavior
The beacon node should always exit the process after the close sequence succeeds. No explicit process.exit should be required.
Steps to reproduce
Run beacon node for a while with the following command (or similar)
CTRL + C to exit gracefully and observe that it is not shutting down even though close sequence is executed successfully.
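To observe what is still keeping the process alive after the close sequence, one option is to summarize the active resources reported by Node.js. The sketch below uses `process.getActiveResourcesInfo()`, which is available in recent Node.js versions and documented as experimental; `summarizeActiveResources` is a hypothetical helper name, and this is not necessarily how the handle list in this report was produced.

```typescript
// Count active resources by type (for example "Timeout" or "TCPSocketWrap").
function summarizeActiveResources(): Map<string, number> {
  const counts = new Map<string, number>();
  for (const kind of process.getActiveResourcesInfo()) {
    counts.set(kind, (counts.get(kind) ?? 0) + 1);
  }
  return counts;
}
```

Logging this summary right after the close sequence completes gives a rough picture of whether timers or sockets are left over, although it does not show which component created them.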
Operating system
Linux
Lodestar version or commit hash
unstable (bf58427)