Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A question about redisClusterAsyncConnect2() and slot map updates #249

Closed
plainbanana opened this issue Jan 9, 2025 · 2 comments · Fixed by #252
Closed

A question about redisClusterAsyncConnect2() and slot map updates #249

plainbanana opened this issue Jan 9, 2025 · 2 comments · Fixed by #252
Assignees

Comments

@plainbanana
Copy link
Contributor

plainbanana commented Jan 9, 2025

Hello hiredis-cluster team,

Thank you so much for providing and maintaining this client library.

I have a question regarding redisClusterAsyncConnect2() and slot map updates.

Because the node that issues the CLUSTER SLOTS command is chosen randomly, we are considering using redisClusterAsyncConnect2(). My understanding is that when using this API, we must first run the event loop and wait until the eventcallback is triggered with HIRCLUSTER_EVENT_READY.

However, in an quite edge case (e.g. the cluster has just started up) sometimes the cluster slots information is incomplete, so the slot map cannot be updated. c.f. https://github.com/Nordix/hiredis-cluster/blob/0.14.0/hircluster.c#L3854
In such cases, the slot map will not be updated, and updateSlotMapAsync will not be retried. As a result, we can't get HIRCLUSTER_EVENT_READY event.

We considered calling redisClusterAsyncConnect2() again in that scenario. However, deciding the proper timing for this call is difficult, and since it’s not a connection error, the behavior itself is correct, but selectNode keeps choosing the same node, so I suppose this might not be an good idea.

For now, we recreate all related contexts if the slot map does not be updated within a certain period.
IMHO, (I may be wrong) it would be helpful to receive an event indicating that the slot map could not be updated because the data was incomplete, or retry updateSlotMapAsync again.
Is there a better way to handle this situation?

Thank you again.

@bjosv
Copy link
Collaborator

bjosv commented Jan 10, 2025

It seems that hiredis-cluster should perform a retry in this scenario so that a user don't need to call redisClusterAsyncConnect2() again. I will look into adding a testcase and a fix.

I guess there are additional cases that need to be handled/decided on what should happen, like when the client receives an ERR This instance has cluster support disabled. In this case the retry could be unnecessary.

@bjosv
Copy link
Collaborator

bjosv commented Jan 17, 2025

The initial slotmap update should behave better now with your suggestion to retry, additionally there are some fixes to handle empty address which I have seen coming from starting clusters.

Note that the event:
HIRCLUSTER_EVENT_READY when the slot mapping has been fetched for the first time and the client is ready to accept commands, `
is fired when there is at least one node handling slots. It doesn't mean that all slots are covered, but the slotmap will eventually be updated while sending commands, see this testcase. Any suggestions/improvements are welcomed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants