Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis cluster improvements #419

Merged

Conversation

Ladicek
Copy link
Contributor

@Ladicek Ladicek commented Nov 22, 2023

Several commits, best reviewed in isolation:

  • show all errors when connecting to Redis cluster fails
  • use a subclass of ConnectException on connection failure in cluster/replication client
  • fix caching of hash slot assignment in cluster client
  • improve some code comments slightly
  • treat sharded pub/sub commands as pub/sub commands
  • improve how documentation links are generated in RedisAPI

@Ladicek Ladicek requested a review from vietj November 22, 2023 12:21
@Ladicek
Copy link
Contributor Author

Ladicek commented Nov 22, 2023

@Ladicek Ladicek changed the title show all errors when connecting to Redis cluster fails Redis cluster improvements Nov 28, 2023
Copy link

@sfali16 sfali16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look good.

Some comments (1 and 2 are stylistic, and 3 might affect retries for non-cluster client connections):

  1. Should the commit comment explaining the use of a class dervied from ConnectException be in the javadocs of one of the connect methods or the RedisClusterClient itself?
  2. Should we encapsulate this logic of using a RedisConnectException in its own function to better clarify the logic, and for future editors to reuse to do the right thing?

onGotSlots.handle(Future.failedFuture(new RedisConnectException("Cannot connect to any of the provided endpoints")));
to

onGotSlots.handle( failWithRetryableException( "Cannot connect to any of the provided endpoints"); 
// .. some other code 
/**
 * Return future that resolves with a RedisConnectException, an instance of ConnectException that will signal to clients that they can retry the connection
 */
public Future<RedisConnectException> failWithRetryableException( message ) { 
    return Future.failedFuture( new RedisConnectException( message ));
}
  1. Finally should the same logic to wrap failures in a RedisConnectException apply to the RedisBaseClient ? You may have a good reason for this - but why is it only applicable to the cluster or replication clients?

@Ladicek
Copy link
Contributor Author

Ladicek commented Nov 29, 2023

I could add something like RedisConnectException.failedFuture(String) and add a comment on that method. That's a good point.

It is required to use a ConnectException explicitly for the cluster and replication clients, because they fail the connecting process using a custom logic. The other clients propagate the original underlying exceptions directly, so they should be fine (I tested with the standalone client, but didn't bother checking with sentinel).

@Ladicek Ladicek force-pushed the show-all-errors-on-cluster-connection-failure branch from 3ca16c6 to 2452b51 Compare November 29, 2023 08:54
@Ladicek
Copy link
Contributor Author

Ladicek commented Nov 29, 2023

I didn't do RedisConnectException.failedFuture, but I added a comment directly on RedisConnectException.

This improves user's ability to diagnose errors when connecting
to a Redis cluster fails. Instead of bare

```
io.vertx.core.impl.NoStackTraceThrowable: Failed to connect to all nodes of the cluster
```

users will see the actual problems. For example:

```
io.vertx.core.impl.NoStackTraceThrowable: Failed to connect to all nodes of the cluster
- io.vertx.core.http.ConnectionPoolTooBusyException: Connection pool reached max wait queue size of 24
```
…eplication client

This is to let users detect connection error uniformly by inspecting
the class of the exception. The standalone client directly exposes
the underlying exception, which is a subclass of `ConnectException`
(provided by Netty), so this commit makes sure the clustered and
replicated clients do the same (albeit with a different subclass
of `ConnectException`).
The cluster client caches the hash slot assignment for a brief period
of time (1 second by default). This miniature cache is expired by
a simple Vert.x time, but that timer is only scheduled when the hash
slot assignment was obtained _successfully_. When an error occurs
during `CLUSTER SLOTS`, the cache is never expired and all subsequent
connection attempts fail with the cached exception. This commit fixes
that by scheduling the cache expiration also in case of an error.
This commit also regenerates all the commands and `RedisAPI`,
based on Redis 7.0.12.
@Ladicek Ladicek force-pushed the show-all-errors-on-cluster-connection-failure branch from 2452b51 to 0258b4d Compare December 5, 2023 14:01
Copy link
Contributor

@vietj vietj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM good work

@Ladicek Ladicek merged commit 998ca95 into vert-x3:master Dec 6, 2023
3 checks passed
@Ladicek Ladicek deleted the show-all-errors-on-cluster-connection-failure branch December 6, 2023 14:44
@Ladicek Ladicek added this to the 5.0.0 milestone Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants