Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socket.reconnectStrategy short circuits if no error listener is added to redis client. #1993

Closed
sam-perez opened this issue Feb 18, 2022 · 8 comments
Labels

Comments

@sam-perez
Copy link

We've been noticing some connection issues in production and have decided to experiment with retrying connections via socket.reconnectStrategy. After adding in a simple retry strategy, I wrote a test to ensure that it was behaving the way I expected. To my surprise, the reconnectStrategy was only being invoked once and not the several times that I had expected.

After diving into the source code, I eventually realized it's because I wasn't registering a listener for the 'error' event on the redis client.

My investigation lead me down the following steps.

Retry logic entrypoint in redis socket:
https://github.com/redis/node-redis/blob/master/packages/client/lib/client/socket.ts#L100

Line that throws on first error in retry logic:
https://github.com/redis/node-redis/blob/master/packages/client/lib/client/socket.ts#L155

Listener of error event in retry logic of redis socket:
https://github.com/redis/node-redis/blob/master/packages/client/lib/client/index.ts#L277

Once I got here, I was expecting there to be another listener somewhere that was throwing. To my surprise, after debugging and inspecting this.listeners('error') on the instance of the RedisClient, I realized that there were no error event listeners on the RedisClient. After some quick internet searches, I found the following:

https://nodejs.org/api/events.html#error-events

"If an EventEmitter does not have at least one listener registered for the 'error' event, and an 'error' event is emitted, the error is thrown."

So in order to get retrying to work without short-circuiting after the first error, we have to register an error event listener to prevent the calls to this.emit('error', err) from throwing errors.

Example.

const redis = require("redis");

const main = async () => {
  const observeReconnectStrategyBehavior = async (addErrorListener) => {
    let reconnectInvocationCounter = 0;

    const client = redis.createClient({
      socket: {
        reconnectStrategy: (retries) => {
          reconnectInvocationCounter++;

          if (retries < 5) {
            return 0;
          }

          return new Error("No more retries remaining, giving up.");
        }
      },
      url: `redis://broken_url:1234`
    });

    if (addErrorListener) {
      client.on('error', () => {})
    }

    await client.connect().catch(() => {});

    console.log(
        "Add error listener:",
        addErrorListener,
        "Reconnect strategy invocations:",
        reconnectInvocationCounter
    );
  };

  await observeReconnectStrategyBehavior(true).catch(()=>{});
  await observeReconnectStrategyBehavior(false).catch(()=>{});
};

main().then(() => console.log("Test run complete."));

Example output.

Add error listener: true Reconnect strategy invocations: 6
Add error listener: false Reconnect strategy invocations: 1
Test run complete.

Is this the behavior as intended? This is not mentioned anywhere in the docs, and it was confusing to me. I would have expected my reconnect strategy to work without requiring additional configuration, especially configuration that lives outside of the options object passed to the call to createClient.

Environment:

  • Node.js Version: v14.17.3
  • Redis Server Version: irrelevant
  • Node Redis Version: 4.0.3
  • Platform: irrelevant
@sam-perez sam-perez added the Bug label Feb 18, 2022
@rpong
Copy link

rpong commented Mar 16, 2022

we're experiencing the same issue, and this one seems to be related: #2032

we're stuck with v3 for now, although v4 would work, connection hiccups would affect our availability and v4 reconnect/recovery seems unreliable. hoping this gets fixed soon.

@JMTK
Copy link

JMTK commented Mar 17, 2022

omg thank you, i was so confused on why it wasn't reconnecting

@LucidThought
Copy link

I was using [email protected] and I had this problem (with an error listener defined). I was using redis clients for an adapter in my socket.io server project, and I thought (incorrectly) that the error was in my configuration of the @socket.io/redis-adapter package. My subClient reconnected as expected, but the pubClient passed the 'reconnecting' and 'connect' events without ever entering the 'ready' hook.

Downgrading redis to 3.1.2 fixed my problem immediately.

@MCApollo
Copy link

MCApollo commented Mar 27, 2022

+1 bump

2022-03-27T08:25:13.089346+00:00 app[web.1]: node:events:504
2022-03-27T08:25:13.089372+00:00 app[web.1]: throw er; // Unhandled 'error' event
2022-03-27T08:25:13.089372+00:00 app[web.1]: ^
2022-03-27T08:25:13.089373+00:00 app[web.1]: 
2022-03-27T08:25:13.089373+00:00 app[web.1]: SocketClosedUnexpectedlyError: Socket closed unexpectedly
2022-03-27T08:25:13.089374+00:00 app[web.1]: at TLSSocket.<anonymous> (/app/node_modules/@node-redis/client/dist/lib/client/socket.js:182:118)
2022-03-27T08:25:13.089375+00:00 app[web.1]: at Object.onceWrapper (node:events:646:26)
2022-03-27T08:25:13.089375+00:00 app[web.1]: at TLSSocket.emit (node:events:538:35)
2022-03-27T08:25:13.089376+00:00 app[web.1]: at node:net:687:12
2022-03-27T08:25:13.089376+00:00 app[web.1]: at TCP.done (node:_tls_wrap:580:7)
2022-03-27T08:25:13.089377+00:00 app[web.1]: Emitted 'error' event on Commander instance at:
2022-03-27T08:25:13.089377+00:00 app[web.1]: at RedisSocket.<anonymous> (/app/node_modules/@node-redis/client/dist/lib/client/index.js:339:14)
2022-03-27T08:25:13.089377+00:00 app[web.1]: at RedisSocket.emit (node:events:526:28)
2022-03-27T08:25:13.089378+00:00 app[web.1]: at RedisSocket._RedisSocket_onSocketError (/app/node_modules/@node-redis/client/dist/lib/client/socket.js:205:10)
2022-03-27T08:25:13.089379+00:00 app[web.1]: at TLSSocket.<anonymous> (/app/node_modules/@node-redis/client/dist/lib/client/socket.js:182:107)
2022-03-27T08:25:13.089379+00:00 app[web.1]: at Object.onceWrapper (node:events:646:26)
2022-03-27T08:25:13.089379+00:00 app[web.1]: [... lines matching original stack trace ...]
2022-03-27T08:25:13.089380+00:00 app[web.1]: at TCP.done (node:_tls_wrap:580:7)
2022-03-27T08:25:13.219235+00:00 heroku[web.1]: Process exited with status 1
2022-03-27T08:25:13.300972+00:00 heroku[web.1]: State changed from up to crashed
2022-03-27T08:25:33.000000+00:00 app[heroku-redis]

Edit:

.once('error', (err: Error) => this.#onSocketError(err))

Just skimming: I'm thinking whoever wrote that line meant to put on instead once for emitting a error;
retryConnection is recursive & will try to reconnect forever with a exponential timeout, it's firing once and the error is being throw the second time.

If that's not intended, I would bail/stop retrying after the X'th retry; I'll fork & PR a line change because this can't be fixed by defining a error handler.

Edit 2:

client.on('error', (err) => console.log('Redis Client Error', err));

socket.on('error', () => {

Or I'm a idiot and this is 100% intended, both the README.md and socket.spec.ts show examples of handling the error; The this keyword makes it slightly confusing, but the error gets raised to RedisClient where it's emitted again.

@niksy
Copy link

niksy commented May 19, 2023

we're stuck with v3 for now, although v4 would work, connection hiccups would affect our availability and v4 reconnect/recovery seems unreliable. hoping this gets fixed soon.

Unfortunately, this is the approach we will also be using since constant socket closing event can’t be something we can rely in production.

@rpong have you managed to resolve this issue or you ended up using v4 although you still experience this issue?

@leibale
Copy link
Collaborator

leibale commented May 21, 2023

This is a Node.js "feature"/design decision (and not something specific to "node-redis") - if an EventEmitter emits an error and there are no listeners - the process crashes. See here and here.

edit: we can change the event to "silent error" or something like that when there are no listeners, but this is not how things "should work" in the "node.js world"..

@leibale leibale closed this as completed May 21, 2023
@niksy
Copy link

niksy commented May 22, 2023

@leibale I understand that it’s how event emitter works, but it’s a little unintuinitive since this didn’t happen in v3. Could you maybe tell us why it happens now? Is there some core architectural change in how node-redis works so this happens?

@leibale
Copy link
Collaborator

leibale commented May 23, 2023

@niksy in v3 the client is hiding (not emitting) the error...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants