A question about `redisClusterAsyncConnect2()` and slot map updates #249

plainbanana · 2025-01-09T15:59:46Z

Hello hiredis-cluster team,

Thank you so much for providing and maintaining this client library.

I have a question regarding redisClusterAsyncConnect2() and slot map updates.

Because the node that issues the CLUSTER SLOTS command is chosen randomly, we are considering using redisClusterAsyncConnect2(). My understanding is that when using this API, we must first run the event loop and wait until the eventcallback is triggered with HIRCLUSTER_EVENT_READY.

However, in an quite edge case (e.g. the cluster has just started up) sometimes the cluster slots information is incomplete, so the slot map cannot be updated. c.f. https://github.com/Nordix/hiredis-cluster/blob/0.14.0/hircluster.c#L3854
In such cases, the slot map will not be updated, and updateSlotMapAsync will not be retried. As a result, we can't get HIRCLUSTER_EVENT_READY event.

We considered calling redisClusterAsyncConnect2() again in that scenario. However, deciding the proper timing for this call is difficult, and since it’s not a connection error, the behavior itself is correct, but selectNode keeps choosing the same node, so I suppose this might not be an good idea.

For now, we recreate all related contexts if the slot map does not be updated within a certain period.
IMHO, (I may be wrong) it would be helpful to receive an event indicating that the slot map could not be updated because the data was incomplete, or retry updateSlotMapAsync again.
Is there a better way to handle this situation?

Thank you again.

The text was updated successfully, but these errors were encountered:

bjosv · 2025-01-10T10:28:09Z

It seems that hiredis-cluster should perform a retry in this scenario so that a user don't need to call redisClusterAsyncConnect2() again. I will look into adding a testcase and a fix.

I guess there are additional cases that need to be handled/decided on what should happen, like when the client receives an ERR This instance has cluster support disabled. In this case the retry could be unnecessary.

bjosv · 2025-01-17T08:33:16Z

The initial slotmap update should behave better now with your suggestion to retry, additionally there are some fixes to handle empty address which I have seen coming from starting clusters.

Note that the event:
HIRCLUSTER_EVENT_READY when the slot mapping has been fetched for the first time and the client is ready to accept commands, `
is fired when there is at least one node handling slots. It doesn't mean that all slots are covered, but the slotmap will eventually be updated while sending commands, see this testcase. Any suggestions/improvements are welcomed.

zuiderkwast assigned bjosv Jan 9, 2025

bjosv mentioned this issue Jan 14, 2025

Retry when an async slotmap update fails #252

Merged

bjosv closed this as completed in #252 Jan 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A question about `redisClusterAsyncConnect2()` and slot map updates #249

A question about `redisClusterAsyncConnect2()` and slot map updates #249

plainbanana commented Jan 9, 2025 •

edited

Loading

bjosv commented Jan 10, 2025

bjosv commented Jan 17, 2025

A question about redisClusterAsyncConnect2() and slot map updates #249

A question about redisClusterAsyncConnect2() and slot map updates #249

Comments

plainbanana commented Jan 9, 2025 • edited Loading

bjosv commented Jan 10, 2025

bjosv commented Jan 17, 2025

A question about `redisClusterAsyncConnect2()` and slot map updates #249

A question about `redisClusterAsyncConnect2()` and slot map updates #249

plainbanana commented Jan 9, 2025 •

edited

Loading