JDK client can hang when try to reconnect in bad network connection conditions #646

Closed
greenmancm opened this issue Oct 30, 2018 · 2 comments

Comments

@greenmancm

greenmancm commented Oct 30, 2018

When using a secure connection (wss) over a bad network connection, the client connection thread can hang indefinitely in JdkClientContainer's connectSynchronously() method, because it receives neither the completed() nor the failed() callback.
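To illustrate the failure mode described above, here is a minimal, self-contained sketch of a synchronous connect that waits for a completion callback. It is not Tyrus code: the Handler and Transport interfaces and the connectAndWait() helper are hypothetical stand-ins, and the timeout is added only so the demo terminates. With an unbounded wait, the "silent" transport below would block the calling thread forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class BlockingConnectSketch {

    /** Hypothetical stand-in for a connect completion handler (not the Tyrus type). */
    interface Handler {
        void completed();
        void failed(Throwable t);
    }

    /** Hypothetical transport that starts a connection attempt and reports via the handler. */
    interface Transport {
        void connect(Handler handler);
    }

    /**
     * Starts a connection attempt and waits for a callback.
     * Returns "completed" or "failed" if a callback arrived in time,
     * "timeout" if neither arrived, "interrupted" if the wait was interrupted.
     */
    static String connectAndWait(Transport transport, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>();
        transport.connect(new Handler() {
            @Override public void completed() { outcome.set("completed"); done.countDown(); }
            @Override public void failed(Throwable t) { outcome.set("failed"); done.countDown(); }
        });
        try {
            // An unbounded done.await() here would never return if the
            // transport drops the attempt without invoking either callback.
            boolean signalled = done.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return signalled ? outcome.get() : "timeout";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        Transport silent = handler -> { /* connection closed, no callback fired */ };
        Transport failing = handler -> handler.failed(new java.io.IOException("reset"));
        System.out.println(connectAndWait(silent, 200));
        System.out.println(connectAndWait(failing, 200));
    }
}
```

The first call shows what the waiting thread sees when the callback is lost; the second shows that reporting the failure releases it immediately.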

There is a good way to reproduce the issue:

  1. Use the client to establish a secure WebSocket connection.
  2. Run sudo tcpkill -i ${networkInterfaceId} -9 port ${httpsPort} for about 2 minutes. The client will make several reconnect attempts, then hang. I used the Java Mission Control tool to observe Tyrus's threads: a new thread was spawned for every connection attempt, until the Tyrus connection thread got stuck and no new threads were spawned.
  3. Stop tcpkill to restore the network. The thread remains stuck.

There is a workaround for this bug. However, I don't fully understand how it may affect other use cases.
From what I observed, when the bug is reproduced, the processConnectionClosed() method is called in ClientFilter while wsConnection == null.
If I explicitly call connectCompletionHandler.failed(null); at that point, the deadlock doesn't occur.
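The shape of this workaround can be sketched as follows. This is a simplified model and not the actual ClientFilter source: the FilterSketch class, the ConnectHandler interface, and the handshakeCompleted() method are hypothetical, and only the names processConnectionClosed(), wsConnection, and connectCompletionHandler are taken from the description above.

```java
import java.util.concurrent.atomic.AtomicReference;

public class CloseBeforeHandshakeSketch {

    /** Hypothetical stand-in for the connect completion handler. */
    interface ConnectHandler {
        void completed();
        void failed(Throwable t);
    }

    /** Simplified model of a client filter; not the real Tyrus ClientFilter. */
    static class FilterSketch {
        private final ConnectHandler connectCompletionHandler;
        private Object wsConnection; // stays null until the handshake completes

        FilterSketch(ConnectHandler connectCompletionHandler) {
            this.connectCompletionHandler = connectCompletionHandler;
        }

        /** Hypothetical hook invoked once the WebSocket handshake succeeds. */
        void handshakeCompleted(Object connection) {
            this.wsConnection = connection;
            connectCompletionHandler.completed();
        }

        void processConnectionClosed() {
            if (wsConnection == null) {
                // The connection was closed before the handshake finished.
                // Report the failure so the thread waiting on connect is released.
                connectCompletionHandler.failed(null);
                return;
            }
            // Normal close handling for an established connection would go here.
        }
    }

    /** Builds a handler that records which callback was invoked. */
    static ConnectHandler recording(AtomicReference<String> outcome) {
        return new ConnectHandler() {
            @Override public void completed() { outcome.set("completed"); }
            @Override public void failed(Throwable t) { outcome.set("failed"); }
        };
    }

    /** Simulates a close that arrives before the handshake completes. */
    static String closeBeforeHandshake() {
        AtomicReference<String> outcome = new AtomicReference<>("none");
        new FilterSketch(recording(outcome)).processConnectionClosed();
        return outcome.get();
    }

    /** Simulates a close that arrives after a successful handshake. */
    static String closeAfterHandshake() {
        AtomicReference<String> outcome = new AtomicReference<>("none");
        FilterSketch filter = new FilterSketch(recording(outcome));
        filter.handshakeCompleted(new Object());
        filter.processConnectionClosed();
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(closeBeforeHandshake());
        System.out.println(closeAfterHandshake());
    }
}
```

In this model the extra failed() call only fires when no connection was ever established, so a close on an already established connection does not overwrite the earlier completed() result. Whether the same holds for every path in the real filter is exactly the open question raised above.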

This issue is probably a duplicate of #594.

greenmancm added a commit to greenmancm/tyrus that referenced this issue Oct 30, 2018
wgolyakov added a commit to wgolyakov/tyrus that referenced this issue Mar 4, 2021
When using secure connection (wss) while in the conditions of a bad
network connection, the client connection thread can hang in
JdkClientContainer's connectSynchronously() method, because it never
receives neither completed() nor failed() callback.
Missing failed() call added for fix.

Bug: eclipse-ee4j#646
Signed-off-by: Vladimir Golyakov <[email protected]>
wgolyakov added a commit to wgolyakov/tyrus that referenced this issue Mar 5, 2021
wgolyakov added a commit to wgolyakov/tyrus that referenced this issue Mar 5, 2021
wgolyakov added a commit to wgolyakov/tyrus that referenced this issue Mar 5, 2021
@lprimak
Contributor

lprimak commented May 12, 2021

This issue can be closed

@lprimak
Contributor

lprimak commented Oct 27, 2023

@jansupol this issue can be closed
