Support dead letter queue on sinks #1772
Comments
I support this. I'm curious whether this could be architected in a way that wouldn't require a ton of manual, per-sink work?
Noting here that it'd be useful to "replay" messages from the DLQs easily.
Use case from Discord: https://discord.com/channels/742820443487993987/746070591097798688/875360138612064286
Any news regarding this? We desperately need this DLQ feature. Logstash's DLQ helps us a lot, and this is the only feature that keeps us from migrating.
Hi @shamil! It's still something we very much want to do, but it hasn't been scheduled yet.
Thanks @jszwedko, I will wait ;)
Would like this feature as well :) We were previously using the Dead Letter Queue on Logstash for this, where we read the failed documents straight out of the Dead Letter Queue with a dedicated plugin, did some transforms, and indexed them back (in a different format) into Elasticsearch for further investigation.
Just bumping - the need for this feature came up for me again.
I think it's important that the DLQ contains actionable messages, where an actionable message looks like: {error msg, payload that caused the error}. Through such actionable messages, we can figure out what failed and also work out exactly why it failed. If it is too much work to architect a DLQ that chains off of sinks, then for the sake of delivering such functionality, I think it is worth extending a feature that you already have: the internal logs that pick up sink errors.
Note: I'm aware that this implementation does not really produce a DLQ, and that it does not technically satisfy the OP's description, but I believe many people would be satisfied with this idea for now, because it at least drastically helps in diagnosing sink errors in components. Thumbs up if you think this is "good enough" for now, or thumbs down in case I'm misguided. I just want something to be done as opposed to waiting too long for the perfect solution. @jszwedko @binarylogic, what do you think? I haven't read the code, but I would guess that it's straightforward to implement, considering you already seem to have the infrastructure set up for internal logs to pick up sink errors. Thank you for the read!
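A rough sketch of what this workaround could look like with Vector's existing `internal_logs` source. All component names, the file path, and the crude string match on `"error"` are illustrative assumptions, not a tested recipe:

```toml
# Illustrative sketch only: surface sink errors via the internal_logs source.
[sources.vector_internal]
type = "internal_logs"

# Keep only internal log events that mention an error (crude string match;
# a real setup would filter on the specific component/error fields it needs).
[transforms.sink_errors]
type = "filter"
inputs = ["vector_internal"]
condition = 'contains((string(.message) ?? ""), "error")'

# Write the diagnostics somewhere durable for later inspection.
[sinks.error_log]
type = "file"
inputs = ["sink_errors"]
path = "/var/log/vector-sink-errors.log"
encoding.codec = "json"
```

As the comment itself concedes, internal logs describe the failure but do not carry the failed payload, so this helps with diagnosis rather than replay.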
Thanks for the thoughts @rlazimi-dev! I think this is something we might pick up in Q3.
Any updates on this? It's something that could be really useful.
I just effectively ran into this issue too. I'm using the HTTP sink and I'd really like to have access to the output produced from the HTTP request, for use in other transforms and sinks.
Folks, any updates on this? We cannot use this for any stable pipeline if a DLQ is not supported. Or is there an alternative to DLQ strategies?
For example: https://github.com/vectordotdev/vector-test-harness/blob/master/cases/tcp_to_http_performance/ansible/config_files/vector.toml |
No updates yet unfortunately. Sinks do retry requests for transient failures. |
Have to say this is a pretty big gap to be missing, as some of those logs could be critical and need recovering, or you could be put in a position where you end up duplicating logs through the stack just to recover a couple. I think this is a must-have feature for sinks, especially given that something similar has been implemented in VRL. Maybe a simpler approach for the Elasticsearch sink is to add an additional configuration item specifying a DLQ index, so Vector can output to that index if writes to the current one fail all retries.
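As a sketch of that suggestion (the `dlq_index` option is hypothetical and does not exist in Vector today; it only illustrates the proposed behavior):

```toml
[sinks.es]
type = "elasticsearch"
inputs = ["logs"]
bulk.index = "logs-%Y-%m-%d"
# Hypothetical option: if a document is rejected after all retries,
# write it to this index instead of dropping it.
dlq_index = "logs-dlq"
```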
I think it's not about retries, especially when we're talking about Elasticsearch.
I have the same exact case as @ravenjm: using the elasticsearch sink, logs rejected because of a format mismatch are just dropped.
I have run into the same issue, and it would be really nice to send failed messages to another sink - maybe back to a dedicated DLQ Kafka topic, or to a file on disk.
One particular use case I have is around using Vector for delivering logs. We persist some logs in Kafka which then go through a processing pipeline, but occasionally someone will send through a log line that is bigger than the maximum Kafka payload size (that we probably have as a default). In these cases, it would be nice to have a DLQ action. Probably not a literal queue, but say, rerouting an event like that to an S3 bucket. Ironically, the standard pattern of publishing to a dead letter Kafka topic wouldn't work, because it too would not be sized large enough; plus it's rare enough that ordering is not that useful of a property.
@marcus-crane for such cases we developed a number of remaps that handle "dead letter" cases. The first one checks the event's size and then "abort"s processing:
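The original code block was lost from this comment; a plausible reconstruction of such a size-check remap, where the transform names and the ~1 MB limit are assumptions, not the commenter's actual config:

```toml
[transforms.check_size]
type = "remap"
inputs = ["kafka_in"]   # hypothetical upstream component
drop_on_abort = true    # aborted events leave the main stream...
reroute_dropped = true  # ...and are rerouted to the `check_size.dropped` output
source = '''
# Abort processing for events whose serialized form exceeds the broker limit
# (1000000 bytes here is an assumed, roughly default Kafka message size).
if length(encode_json(.)) > 1000000 {
  abort
}
'''
```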
Next, all messages aborted in the previous remaps go through a catch-all remap that tags them properly. "route" is an internal field name we use for dynamic event routing:
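A plausible reconstruction of that catch-all remap, assuming the aborting size-check transform is named `check_size` (all names here are illustrative):

```toml
[transforms.tag_dead_letters]
type = "remap"
inputs = ["check_size.dropped"]  # the rerouted output of the aborting remap(s)
source = '''
# "route" is the internal field used downstream for dynamic event routing.
.route = "dead_letter"
'''
```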
And last, all our Kafka sinks are configured to support large messages with:
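Likely something along these lines, passing the relevant librdkafka producer option through the kafka sink (the exact size limit and component names are assumptions):

```toml
[sinks.kafka_dlq]
type = "kafka"
inputs = ["tag_dead_letters"]  # hypothetical upstream transform
bootstrap_servers = "kafka:9092"
topic = "dead-letters"
encoding.codec = "json"
# Raise the producer's maximum message size (librdkafka passthrough).
librdkafka_options."message.max.bytes" = "10485760"
```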
This of course can't handle cases where Elasticsearch rejects a message due to mapping conflicts, which is a shame, but theoretically that can be worked around by having strict schema verification in Vector; that is still on our to-do list. Hopefully this RFC will be finished and merged before then :)
In the context of an event processor, a dead letter queue can mean a number of things. We can already support content-based DLQs, using transforms to route certain events to a secondary sink when they aren't suitable for our primary sink.
However, it would also be nice to support dead letter queuing by chaining sinks as fallback options, where failed sends can be routed under certain circumstances (rather than retrying in an infinite loop, or being dropped completely).
It'd be cool to be able to do something like this:
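The example this sentence introduced did not survive extraction; something like the following captures the idea, where a sink exposes its failed events as a named output that another sink can consume. This syntax is hypothetical, not something Vector supports today:

```toml
[sinks.es]
type = "elasticsearch"
inputs = ["logs"]

# Hypothetical: consume the events that `es` gave up on after retries.
[sinks.s3_fallback]
type = "aws_s3"
inputs = ["es.failed"]  # hypothetical named output of failed sends
bucket = "my-dlq-bucket"
```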
And since we're just producing output from sinks here there's no reason we can't add transforms in there as well:
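Continuing the same hypothetical syntax, a transform could sit between the failing sink and the fallback, annotating events before they are shipped off (the `es.failed` output and all names here are illustrative, not real Vector features):

```toml
# Hypothetical: enrich failed events with context before the fallback sink.
[transforms.annotate_failures]
type = "remap"
inputs = ["es.failed"]  # hypothetical named output of failed sends
source = '.failed_at = now()'

[sinks.s3_fallback]
type = "aws_s3"
inputs = ["annotate_failures"]
bucket = "my-dlq-bucket"
```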