-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RiakClient shutdown() returns a future that never completes, JVM does not exit #706
Comments
I tried adding
In RiakCluster.java, it did not fix the problem, so something else is wrong and my analysis is flawed, but I'm still hanging on shutdown about half the time. |
Based on a few more break points in my debugger, this is the line that hangs and never returns It doesn't throw, it just never comes back. |
@philip-doctor I'm not sure whether it is your case or not, but here is how we avoid 'stuck on exit' in spark connector https://github.com/basho/spark-riak-connector/blob/master/connector/src/main/scala/com/basho/riak/spark/rdd/connector/SessionCache.scala#L121 |
@srgg thanks for the tip, I tried it (kotlin code for reference)
I seem to still hang, I think it's something with this line here going awry but I'm not sure what's exactly going wrong here yet. I'll see if I can take more time to dig.
|
RiakClient shutdown() returns a future that never completes, JVM does not exit
Context
I'm writing a simple ETL app, on conclusion my application is not exiting. I am calling shutdown() on my connection and then blocking .get(), but it never returns.
I have attached my debugger to it and stepped through the code, here's what I see happening
Expected Behavior
shutdown completes, program terminates
Actual Behavior
blocks forever on the shutdown future
Possible Fix
I think all you have to do is to catch the IllegalStateExceptions and continue iteration over each node in your list and call shutdown().
Steps to Reproduce
This does not reproduce cleanly, I suspect there's something racey, I'm using Kotlin, but I'm doing nothing fancy here:
This is what I think is happening, let's take a dive....
https://github.com/basho/riak-java-client/blob/riak-client-2.1.1/src/main/java/com/basho/riak/client/core/RiakCluster.java#L630
Here we get a list of nodes, what state are those nodes in?
https://github.com/basho/riak-java-client/blob/riak-client-2.1.1/src/main/java/com/basho/riak/client/core/RiakCluster.java#L408
stateCheck(State.CREATED, State.RUNNING, State.SHUTTING_DOWN, State.QUEUING);
Okay cool, so we get that list of nodes and then call shutdown on them over here:
https://github.com/basho/riak-java-client/blob/riak-client-2.1.1/src/main/java/com/basho/riak/client/core/RiakNode.java#L303
That does a state check
stateCheck(State.RUNNING, State.HEALTH_CHECKING);
So if the state is currently State.SHUTTING_DOWN then you get the node in your node list, then it throws an IllegalStateException. What happens then? Well the Cluster Shutdown action is taking place on a thread in the ExecutorService pool, it's never caught so it just kills the thread.
Because of this we never get our countdown latch to 0, because the thread that's supposed to shut everything down in the background got killed.
Because of that we block forever.
I think all you have to do is to catch the IllegalStateExceptions and continue iteration over each node in your list and call shutdown().
Context
I want my program to terminate normally and not leak connection
Your Environment
The text was updated successfully, but these errors were encountered: