feat: add a NATS "dead letter queue" stream for failing messages #5035

Merged 1 commit on Nov 27, 2024

@fnichol fnichol commented Nov 27, 2024

This change introduces a new NATS JetStream stream called `DEAD_LETTER_QUEUES`, which will contain metadata for messages that reached their consumer's `max_deliver` limit. Currently, only the `forklift-server` consumer for the `AUDIT_LOGS` stream is configured to use it, but more streams and consumers could be configured to use it in the future.

The forklift/audit logs NATS consumer is configured to attempt `4` deliveries of each message with a linear backoff (i.e. when a message is `nack`ed) of 5 seconds, then 10, then 15, for a total of 30 seconds before the message metadata is added to the "dead letter queue" stream. Note that a "failed" message is *not* deleted from its source stream, but rather skipped over by that consumer. This allows us to inspect each failed message in place in the stream and decide how to triage and remediate it.
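To make the timing concrete, here is a minimal, std-only sketch of the retry schedule described above. It does not talk to NATS and is not the code from this change: `RetryPolicy`, `delay_after_failure`, and `total_wait` are hypothetical names used only for illustration, and reusing the last backoff value when the list is shorter than the number of retries is an assumption of this sketch.

```rust
use std::time::Duration;

/// Hypothetical mirror of the two consumer settings discussed in this PR.
/// Not a type from any NATS client library.
struct RetryPolicy {
    max_deliver: usize,
    backoff: Vec<Duration>,
}

impl RetryPolicy {
    /// The values quoted in the description: 4 deliveries, 5s/10s/15s backoff.
    fn forklift_audit_logs() -> Self {
        Self {
            max_deliver: 4,
            backoff: vec![
                Duration::from_secs(5),
                Duration::from_secs(10),
                Duration::from_secs(15),
            ],
        }
    }

    /// Delay before the next delivery once attempt `failed_attempt` (1-based)
    /// has failed. Returns `None` when no further delivery will happen, i.e.
    /// the message is now a candidate for the dead letter queue stream.
    fn delay_after_failure(&self, failed_attempt: usize) -> Option<Duration> {
        if failed_attempt == 0 || failed_attempt >= self.max_deliver {
            return None;
        }
        // Assumption of this sketch: fall back to the last backoff entry if
        // the list is shorter than the number of retries.
        self.backoff
            .get(failed_attempt - 1)
            .or(self.backoff.last())
            .copied()
    }

    /// Total time spent waiting between the first and the final delivery.
    fn total_wait(&self) -> Duration {
        (1..self.max_deliver)
            .filter_map(|attempt| self.delay_after_failure(attempt))
            .sum()
    }
}
```

With the values from this change, failures of attempts 1, 2, and 3 are followed by waits of 5, 10, and 15 seconds, and a failure of attempt 4 yields `None`, which is the point at which the message metadata would land in `DEAD_LETTER_QUEUES`.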

Co-authored-by: Nick Gerace <[email protected]>
Signed-off-by: Fletcher Nichol <[email protected]>
Comment on lines +83 to +88
max_deliver: 4,
backoff: vec![
    Duration::from_secs(5),
    Duration::from_secs(10),
    Duration::from_secs(15),
],

Excellent!

@fnichol fnichol added this pull request to the merge queue Nov 27, 2024
Merged via the queue into main with commit f2d09cc Nov 27, 2024
8 checks passed
@fnichol fnichol deleted the nf/forklift-dead-letter-queue branch November 27, 2024 20:18
2 participants