AeronArchive does not survive disconnects #427
Comments
Yes, you are correct that the error reporting should be better in such cases. I'm going to revisit this code.
…connection drops unexpectedly. Issue #427.
No memory leak is occurring. The Publication was becoming not connected and then connecting again with a new session. I've tightened up the logic to ensure the session is aborted if either the inbound image or the outbound publication drops. When an inbound request arrives and no return path can be determined, sending back an error is tricky. @tmontgomery and I are looking at a number of ways to address this. One simple way would be to have a commonly known error stream. An alternative is the concept of a full-duplex Aeron-style socket, which is in development for other uses and which this could benefit from. For now, if you get an error then reconnect.
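As a rough illustration of the "if you get an error then reconnect" advice, here is a minimal sketch of a wrapper that closes the failed client and builds a fresh one before retrying once. `ArchiveClient`, `request`, and `ReconnectingClientSketch` are hypothetical stand-ins invented for this example, not the real `AeronArchive` API, and the sketch assumes the request is safe to repeat.

```java
import java.util.function.Supplier;

/** Sketch only: ArchiveClient is a hypothetical stand-in, not the real AeronArchive API. */
final class ReconnectingClientSketch {
    /** Hypothetical minimal client: one request method plus close(). */
    interface ArchiveClient extends AutoCloseable {
        /** May throw a RuntimeException on timeout or a broken connection. */
        long request(String command);

        @Override
        void close();
    }

    private final Supplier<ArchiveClient> connector;
    private ArchiveClient client;
    private int connectCount;

    ReconnectingClientSketch(Supplier<ArchiveClient> connector) {
        this.connector = connector;
    }

    /** Sends the command, reconnecting and retrying once if the first attempt fails. */
    long request(String command) {
        if (client == null) {
            client = connect();
        }
        try {
            return client.request(command);
        } catch (RuntimeException firstFailure) {
            // Close the old client before replacing it so its resources are released.
            try {
                client.close();
            } catch (RuntimeException ignored) {
                // The old connection is already considered broken.
            }
            client = connect();
            // Single retry: a second failure propagates to the caller.
            return client.request(command);
        }
    }

    /** Number of clients created so far, including reconnects. */
    int connectCount() {
        return connectCount;
    }

    private ArchiveClient connect() {
        connectCount++;
        return connector.get();
    }
}
```

Closing the old client before creating its replacement is the point of the sketch: the caller never holds two live clients at the same time.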
…ng on the deadline expiring. Issue #427.
I've pushed a change that detects the connection is broken and gives a more informative message.
I think a memory leak can actually occur in a scenario where you drop your AeronArchive for some reason (the app crashed or you consider it disconnected) and create a new one with the same response channel and stream, while the old ControlSession in the Archive is still alive.
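The scenario can be pictured with a toy model. This is not Archive code: `ControlSessionLeakSketch` and all of its methods are hypothetical names made up for illustration. It assumes a session counts as alive for as long as something is subscribed on its response channel and stream, and that every connect adds a session unconditionally.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: not the Archive implementation; all names are hypothetical. */
final class ControlSessionLeakSketch {
    /** Response endpoints (channel plus stream id) that currently have a live subscriber. */
    private final Set<String> liveSubscribers = new HashSet<>();

    /** Response endpoint of each control session the toy archive still holds. */
    private final List<String> sessions = new ArrayList<>();

    /** A client subscribes on its response endpoint and a new session is always created. */
    void connect(String responseEndpoint) {
        liveSubscribers.add(responseEndpoint);
        sessions.add(responseEndpoint);
    }

    /** The last subscriber on the endpoint goes away. */
    void subscriberGone(String responseEndpoint) {
        liveSubscribers.remove(responseEndpoint);
    }

    /** Drops only the sessions whose response endpoint has no subscriber left. */
    void reapDeadSessions() {
        sessions.removeIf(endpoint -> !liveSubscribers.contains(endpoint));
    }

    int sessionCount() {
        return sessions.size();
    }
}
```

In this model, a client that reconnects three times on the same endpoint leaves three sessions behind after `reapDeadSessions()`, because the newest subscriber keeps the older sessions looking connected. They are reclaimed only once `subscriberGone` has been called for that endpoint.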
…nect response is not missed. Issue #427.
…e so that it cannot be reclaimed by a restarting client on the same channel and stream id on the same driver. Issue #427.
I've addressed those two potential situations.
AeronArchive does not survive long disconnects > 5s.
More importantly, there is no way to know whether the Archive's ControlSession is dead.
The only symptom will be all AeronArchive methods throwing TimeoutException, which is ambiguous. It may indicate either that the connection is still considered connected by Aeron but is actually broken, or that the Archive's ControlSession is dead.
One could just recreate AeronArchive on each TimeoutException, but then you can get a memory leak: currently, if you send a CONNECT message, a new ControlSession is unconditionally created (line), but the previous one is still considered alive because its Publication is in the connected state (line).
The following test reflects the described case: