Connection management improvements. #886

Closed
btimby wants to merge 2 commits

Conversation

@btimby commented Aug 2, 2017

This PR addresses #306 and #824. We ran into these issues (many connections in CLOSE_WAIT state, and too many open connections following a load spike).

Specifically, I modified ConnectionPool.get_connection() so that it retrieves the connection at the head of the list of available connections. release() still appends to the tail, so all connections will be utilized in round-robin fashion. This change was made to support the other change to get_connection() which verifies a connection's "liveness". If a connection is not live, it is disconnected, and the next available is similarly tested. If no connections are usable, then a new one is created.
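
A minimal sketch of that flow, assuming simplified pool internals (the method and helper names here, including the PING-based `_is_usable`, are illustrative and not the exact redis-py code):

```python
class ConnectionPool:
    """Simplified pool illustrating head-pop, round-robin reuse and a liveness check."""

    def __init__(self, connection_class, max_connections=None, **connection_kwargs):
        self.connection_class = connection_class
        self.connection_kwargs = connection_kwargs
        self.max_connections = max_connections or 2 ** 31
        self._available_connections = []   # FIFO: head is the oldest released connection
        self._in_use_connections = set()

    def get_connection(self, command_name, *keys, **options):
        while self._available_connections:
            # Pop from the head so every pooled connection is exercised in
            # round-robin order instead of the same one being reused forever.
            connection = self._available_connections.pop(0)
            if self._is_usable(connection):
                self._in_use_connections.add(connection)
                return connection
            connection.disconnect()            # dead connection: drop it, try the next
        connection = self.make_connection()    # nothing usable left: open a new one
        self._in_use_connections.add(connection)
        return connection

    def release(self, connection):
        self._in_use_connections.discard(connection)
        self._available_connections.append(connection)  # append to the tail

    def make_connection(self):
        # Count live connections directly rather than keeping a separate counter.
        total = len(self._available_connections) + len(self._in_use_connections)
        if total >= self.max_connections:
            raise ConnectionError("Too many connections")
        return self.connection_class(**self.connection_kwargs)

    def _is_usable(self, connection):
        # Illustrative liveness test: a cheap PING round trip.
        try:
            connection.send_command('PING')
            return connection.read_response() in (b'PONG', 'PONG')
        except Exception:
            return False
```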

I also modified the Connection.disconnect() method so that close() is still called even if shutdown() raises an error (due to CLOSE_WAIT or another socket state). I am not sure this is strictly necessary, but it seemed safer to ensure close() is always called before the reference to the socket is dropped by setting self._sock = None.
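
Roughly, as a sketch of the intent rather than the exact patch, assuming self._sock holds the raw socket:

```python
import socket

def disconnect(self):
    """Close the connection, tolerating sockets already half-closed by the peer."""
    if self._sock is None:
        return
    try:
        self._sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        # shutdown() can fail on a socket in CLOSE_WAIT (or other states);
        # we still want close() to run so the descriptor is released.
        pass
    finally:
        try:
            self._sock.close()
        except OSError:
            pass
        self._sock = None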

@taylor-cedar

I am running into issues with connections becoming idle and timing out. This would be very helpful in fixing those issues.

@nickwilliams-eventbrite

+1, could use this.

@lambdaq commented Dec 22, 2017

+1 plz merge

@taylor-cedar

This PR has a couple of major issues. I attempted to use it in production and ran into some bugs.

The first is at https://github.com/smartfile/redis-py/blob/f9348968dec581830a32e8469228c74ce443bd88/redis/connection.py#L1010: self._created_connections will eventually exceed self.max_connections, because the connection count is not reset when a dead connection is refreshed.

The second is at https://github.com/smartfile/redis-py/blob/f9348968dec581830a32e8469228c74ce443bd88/redis/connection.py#L687: socket.TCP_INFO is not available on macOS, so this line raises an exception.
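
For reference, a guard of roughly this shape avoids the macOS problem (this is a generic, illustrative sketch, not taken from either branch): socket.TCP_INFO is Linux-only, so the kernel-level state check needs a fallback on platforms that lack it.

```python
import socket
import struct

TCP_ESTABLISHED = 1  # tcpi_state value for an established connection (Linux)

def tcp_state_is_established(sock):
    """Best-effort kernel-level check of the TCP state.

    On platforms without socket.TCP_INFO (e.g. macOS), fall back to
    assuming the socket is fine and rely on a later PING to catch
    dead connections.
    """
    if not hasattr(socket, 'TCP_INFO'):
        return True
    try:
        info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
        tcpi_state = struct.unpack('B', info[:1])[0]
        return tcpi_state == TCP_ESTABLISHED
    except OSError:
        return False
```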

I updated the code and have been using it in production for 2 months without issue. Here is the diff for the improved version.

smartfile/redis-py@master...cedar-team:master

@ssjunior

+1

@btimby (Author) commented Jan 4, 2018

@taylor-cedar Good changes! Counting the connections rather than relying on a counter is better than the original method. Also, I work exclusively with Linux, so I am unable to test on any other platform.

I do have a question about line 990 in your diff, however: connection should never be in self._available_connections, since it is popped on line 979, right? The only way that if statement could evaluate to True is if the connection were duplicated within the list, which would indicate some other issue, no? Did you run into this?

@taylor-cedar

@btimby You are correct about the self._available_connections check; it is unnecessary.

@btimby (Author) commented Jan 5, 2018

@taylor-cedar OK, makes sense. Seems like we would want your changes merged into this PR. Do you think you can make a PR against my fork? I can merge them in and issue another PR or update this one.

I could try to add your changes myself, but I would not be able to test the Mac-specific parts, and I would probably mess it up.

AlwxSin added a commit to AlwxSin/redis-py that referenced this pull request Feb 1, 2018
@treychadwell

For limiting connections, have you guys had any luck using the BlockingConnectionPool class? http://redis-py.readthedocs.io/en/latest/#redis.BlockingConnectionPool
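
For anyone following along, BlockingConnectionPool caps the number of connections and makes callers wait for a free one instead of opening more. A minimal usage sketch (parameter values are just examples):

```python
import redis

# Cap the pool at 50 connections; a caller that cannot get one within
# 20 seconds raises a ConnectionError instead of opening connection #51.
pool = redis.BlockingConnectionPool(host='localhost', port=6379,
                                    max_connections=50, timeout=20)
client = redis.Redis(connection_pool=pool)
client.ping()
```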

@btimby (Author) commented Mar 13, 2018

@TCTChadwell for me it is not about limiting connections, but about handling timeouts properly. The default pool reuses the same connections over and over. When load spikes, new connections are created to handle it (OK). But when load drops, those connections sit idle and are eventually timed out by the server. The connection pool never notices and never closes them (it does not attempt to use them again), so they remain in CLOSE_WAIT (not OK).

If load increases again, these stale connections throw application errors when they are reused, because the pool does not check a connection before returning it for use.

This change causes all connections in the pool to be used in round-robin fashion, which keeps timeouts at bay and surfaces errors. It also culls extras and performs a liveness test (ping) before returning a connection to the application. Broken connections are closed and replaced (errors that arise from CLOSE_WAIT while closing are handled so the connection is fully closed).
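
As a rough illustration of why a CLOSE_WAIT connection can be detected cheaply (a generic sketch, not the PR's exact check): a socket the server has already closed becomes readable with zero bytes pending, so peeking at it before handing it out reveals the problem.

```python
import select
import socket

def peer_has_closed(sock):
    """Heuristic: detect a socket whose peer has already closed (CLOSE_WAIT).

    If select() reports the socket readable but a peeked recv() returns b'',
    the server hung up; any command sent on this socket would fail.
    """
    try:
        readable, _, _ = select.select([sock], [], [], 0)
        if not readable:
            return False
        return sock.recv(1, socket.MSG_PEEK) == b''
    except OSError:
        return True  # the socket is unusable either way
```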

I don't need to limit connections; my Redis server can handle the load. The problem is that, by default, this pool does not manage its connections properly.

@wsantos commented May 3, 2018

What do we need to do to get this merged? In the meantime, @taylor-cedar, would you mind rebasing your fork?

@taylor-cedar commented May 3, 2018

@wsantos I can rebase my fork, but it seems like this library is abandoned. Nothing has changed since November. If @andymccurdy or anyone else with write access has any interest in merging this PR, I will be happy to do the work necessary.

@kfrendrich

+1

@Ondrysak

+1

@andymccurdy (Contributor)

I've just released 3.2.0, which attempts to address the underlying problem in a different way. I tried to use the code in this PR but couldn't get it to accurately identify the state of the connection.
