-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Haraka dies when connection to redis is lost #45
Haraka dies when connection to redis is lost #45
Comments
After a deeper look, I realize this actually happens because of other haraka plugins creating redis connections without setting an error handler on the client, which translates in redis-node's error events generating exceptions (as per node's docs). At least these plugins are concerned: The package haraka-plugin-redis also has 2 client creations that don't set an error handler, here and here, though I'm unsure how they are used. Should we create PRs to add error handlers to all of these? With error handlers everywhere I cannot reproduce this exact issue. I can still get Haraka to die by attempting to e.g. send an email while redis is loading the RDB dump, which also generates an exception. That may be a separate issue. |
Thanks for merging #67 and #61 @msimerson. No merge for #46? |
This is probably more of an issue with node-redis than Haraka's redis plugin.
If the connection to redis is lost (e.g. redis failover, transient network issue, ...), an exception is thrown by node-redis that isn't catched, and Haraka dies. Ideally, Haraka should be resilient to such events, and instead attempt to reconnect for a reasonable amount of time.
My understanding is that node-redis is actually supposed to try reconnecting, via the default socket.reconnectStrategy.
It can be reproduced quite easily by simply starting redis + Haraka, then stopping redis and wait for a few seconds.
I'm not sure if there's something that can be done about it in haraka-plugin-redis. I'm open to ideas, and at the very least I thought I should mention the issue.
The text was updated successfully, but these errors were encountered: