Ziggurat drops messages while pushing to rabbitmq when connection with rabbitmq is broken #131

Closed
theanirudhvyas opened this issue Dec 6, 2019 · 1 comment · Fixed by #146
Labels
bug Something isn't working

Comments

theanirudhvyas commented Dec 6, 2019

When the connection with RabbitMQ is broken and the service attempts to push messages to it, the service retries the publish a couple of times, then drops the message and moves on to the next one.

The expected behaviour is that it should raise an exception, and the message offset should not be committed to Kafka.

@theanirudhvyas theanirudhvyas added the bug Something isn't working label Dec 17, 2019
theanirudhvyas commented Jan 14, 2020

@mjayprateek and I are picking this up.

The problem:

When the connection with RabbitMQ is broken while the service is running, Ziggurat does not exit; it keeps on processing messages. ziggurat.producer/publish retries publishing to RabbitMQ 5 times, but if it is still failing, it just reports the issue to Sentry and returns.

Since it returns without an exception, the streams application commits the message's offset and moves on to the next message, so the message is lost.
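
To make the failure mode concrete, here is a minimal, language-neutral sketch in Python (not Ziggurat's actual Clojure code). All names are hypothetical, and the broker failure is simulated. It assumes a consumer loop that commits an offset whenever the handler returns normally, which is the behaviour described above.

```python
class BrokerDown(Exception):
    """Hypothetical stand-in for a broken RabbitMQ connection."""


def publish_swallowing(message, attempts=5):
    """Retry, then return normally even though nothing was published."""
    for _ in range(attempts):
        try:
            raise BrokerDown("connection refused")  # simulated: every attempt fails
        except BrokerDown:
            continue
    return None  # the failure would be reported elsewhere; the caller sees success


def publish_raising(message, attempts=5):
    """Retry, then re-raise the last error so the caller can react."""
    last_error = None
    for _ in range(attempts):
        try:
            raise BrokerDown("connection refused")  # simulated: every attempt fails
        except BrokerDown as err:
            last_error = err
    raise last_error


def consume(messages, publish):
    """Commit an offset only when the handler returns without raising."""
    committed = -1
    for offset, message in enumerate(messages):
        try:
            publish(message)
        except BrokerDown:
            break  # stop here; the offset stays uncommitted and can be re-read
        committed = offset
    return committed


lost = consume(["a", "b", "c"], publish_swallowing)  # offsets committed, nothing published
safe = consume(["a", "b", "c"], publish_raising)     # nothing committed
```

With the swallowing variant every offset is committed although no message reached the broker; with the raising variant no offset is committed, so the messages can be consumed again later.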

Proposed Solution:

In ziggurat.producer/publish, if publishing still fails after the retries, we'll stop the streams, so that no new messages are read from or committed to Kafka.
The streams can then be restarted manually by restarting the service (or we could provide an API for restarting the streams).
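
A rough sketch of this proposal, again in Python rather than Ziggurat's Clojure, and under stated assumptions: FakeBroker, FakeStreams and publish_or_stop are hypothetical names invented for illustration, and the stop/start methods only mimic the idea of halting and resuming consumption. This is not the implementation that landed in the fix.

```python
class FakeBroker:
    """Hypothetical broker whose connection can be toggled up or down."""

    def __init__(self):
        self.up = False
        self.delivered = []

    def send(self, message):
        if not self.up:
            raise ConnectionError("broker unreachable")
        self.delivered.append(message)


class FakeStreams:
    """Hypothetical stream runner: commits an offset only while running."""

    def __init__(self, messages):
        self.messages = messages
        self.running = True
        self.committed = -1

    def stop(self):
        self.running = False

    def start(self):
        self.running = True

    def run(self, handler):
        while self.running and self.committed + 1 < len(self.messages):
            offset = self.committed + 1
            handler(self, self.messages[offset])
            if self.running:  # skip the commit if the handler stopped the streams
                self.committed = offset


def publish_or_stop(streams, message, send, attempts=5):
    """Retry the publish; if every attempt fails, stop the streams."""
    for _ in range(attempts):
        try:
            send(message)
            return True
        except ConnectionError:
            continue
    streams.stop()
    return False


broker = FakeBroker()
streams = FakeStreams(["a", "b", "c"])


def handler(s, m):
    return publish_or_stop(s, m, broker.send)


streams.run(handler)  # broker is down: streams stop, nothing is committed
state_after_failure = (streams.running, streams.committed)

broker.up = True      # connection restored
streams.start()       # stands in for the manual restart
streams.run(handler)  # processing resumes from the first uncommitted offset
```

Because the offset of the failed message is never committed, the restart picks up from that same message, so nothing is skipped.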
