feature / bug: Enhance Redis Failures and Reconnects #2565
Comments
But are these issues with IORedis, or are they specific to BullMQ? We really rely on IORedis's ability to reconnect automatically. Recently I found an issue where the blocking BZPOPMIN command could hang forever after a disconnect, but other than that it should just work.
I’ve found that it’s basically both IORedis and BullMQ that have issues with reconnecting. My test script shows that IORedis has an issue where I have to force a reconnect, while BullMQ has issues if Redis is completely restarted. I’ll post my testing process later today, as there are 3 different types of failure.
Another note: BullMQ does reconnect, but jobs get stuck in waiting forever.
There was a strange issue with IORedis (redis/ioredis#1888), but I made a fix in BullMQ to work around it, and since then I cannot reproduce it anymore.
Is your feature request related to a problem? Please describe.
Redis can disconnect in several different ways, and some of them can leave BullMQ unable to reconnect. This matters to me mostly because my setup requires that a job always gets added and run.
For instance, on a DNS resolution failure I have to set up a check that resumes the queue after we get the disconnect.
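A minimal sketch of that kind of check, not my actual code. It assumes the connection object emits `close` and `ready` events (as an ioredis client does) and that the thing to resume exposes a `resume()` method (as a BullMQ `Worker` or `Queue` does). The interfaces and the `resumeAfterReconnect` helper are hypothetical names used only for illustration.

```typescript
// Minimal shapes so the sketch compiles without ioredis or bullmq installed.
// Assumption: an ioredis client and a BullMQ Worker/Queue fit these structurally.
interface ConnectionEvents {
  on(event: "close" | "ready", listener: () => void): unknown;
}

interface Resumable {
  resume(): void | Promise<void>;
}

// Hypothetical helper: call target.resume() whenever the connection becomes
// ready again after it has been closed at least once.
function resumeAfterReconnect(conn: ConnectionEvents, target: Resumable): void {
  let wasDisconnected = false;

  conn.on("close", () => {
    wasDisconnected = true;
  });

  conn.on("ready", () => {
    if (!wasDisconnected) return; // initial connect, nothing to resume
    wasDisconnected = false;
    // resume() may be sync or async depending on the object; handle both.
    Promise.resolve(target.resume()).catch((err) => {
      console.error("resume after reconnect failed", err);
    });
  });
}
```

Usage would look roughly like `resumeAfterReconnect(redisClient, worker)`. Whether `resume()` alone is enough to unstick jobs after a DNS failure depends on the setup, so treat this as a starting point.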
Other times Redis has crashed, and if you're running it in-memory only, the queue is gone. To fix that we needed to set up a ping-pong system to force a reconnect to Redis. BullMQ seems to pick that up better: it is able to do what it does and continues to work after the reconnect.
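Roughly, the ping-pong idea can be sketched like this. It is an illustration under stated assumptions, not my production code: the client is assumed to expose `ping()` and `disconnect(reconnect)` (ioredis has both, and as far as I understand `disconnect(true)` drops the socket while leaving automatic reconnection enabled). `PingableClient`, `withTimeout`, `FailureTracker`, `startPingWatchdog` and the default intervals are all made-up names and values.

```typescript
// Assumption: an ioredis client fits this shape structurally.
interface PingableClient {
  ping(): Promise<unknown>;
  disconnect(reconnect?: boolean): void;
}

// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Counts consecutive failed pings and reports when the threshold is reached.
class FailureTracker {
  private failures = 0;
  constructor(private readonly threshold: number) {}

  // Returns true when a forced reconnect should be triggered.
  record(ok: boolean): boolean {
    if (ok) {
      this.failures = 0;
      return false;
    }
    this.failures += 1;
    if (this.failures < this.threshold) return false;
    this.failures = 0; // start counting again after forcing a reconnect
    return true;
  }
}

// Hypothetical watchdog: ping every `intervalMs`; after `threshold`
// consecutive failures, drop the socket so the client reconnects.
// Returns a function that stops the watchdog.
function startPingWatchdog(
  client: PingableClient,
  intervalMs = 5000,
  timeoutMs = 2000,
  threshold = 3,
): () => void {
  const tracker = new FailureTracker(threshold);
  const timer = setInterval(async () => {
    let ok = true;
    try {
      await withTimeout(client.ping(), timeoutMs);
    } catch {
      ok = false; // a rejected or hung ping both count as a failure
    }
    if (tracker.record(ok)) client.disconnect(true);
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The timeout matters because a ping issued while the connection is down may be queued rather than rejected, so without it the check could hang just like the connection it is meant to watch.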
Describe the solution you'd like
It would be nice if BullMQ could detect these failures the same way and handle the reconnects / resumes of the worker, producer, etc.
Additional context
Most of my code is closed source, but I'm willing to share some code if needed.
Feel free to close the issue, but I wanted to warn people that the BullMQ reconnect may or may not work, depending on their configuration and the kind of error.
Also note that we run our Redis in a Kubernetes cluster, so DNS resolution can fail (annoying) or can even change between different IPs.
Here is example code to handle the disconnects caused by Redis crashing