producer message queue persistence on disk #31
Persisting the producer message queue on disk would allow longer broker unavailability times.
Comments
Hi, I am interested in this feature, especially if it can persist any pending messages that are still in the buffer without an ack before destroy (to be re-queued on startup). Do you have any ideas on how you would implement this? Otherwise I will look into it and see if I can put together a pull request with this functionality at some point. |
My idea for this is to simply use mmap()ed files and point the rkbufs there. |
Hi Magnus, would this have to survive graceful shutdowns and startups? My initial thoughts:
- create files equally sized to the max.messages …
- write to this file not on every queued message, but let a separate …
- for as long as there are messages in a file which are not added in the …
- since the Kafka and topic settings can change (if we want to restore …

Regards, B.
|
Good input, B!

Regarding writev: …
Regarding reading incomplete spool files on restart after a non-graceful shutdown: …
Re message metadata: …

Since messages may finish (through delivery or error) in a non-sequential order (mainly because they belong to different partitions), there needs to be a spool file management layer that knows when the last message from a spool has finished and then removes the spool file. I do have an old working branch somewhere with some early work for something like this, I'll see if I can dig it up. |
The main issue for persistence in my use case is producing the messages. I'd rather have the consumer client rely on Kafka for reading, so if you want to re-read data that was not properly read earlier, you do so by restarting from the last known good point (offset). This minimizes the duplication of Kafka functionality in the client. It also reduces the need to store data locally to producers that connect to brokers that are not highly available (safety net). |
Hi, I'm really interested in this thread. We are about to start using Kafka over an inferior (but simpler!) in-house "buffer" store that was based on Facebook's Scribe. We need to continue to support "fire and forget" style publishing from a large PHP web app for high-volume logging purposes, where delaying user requests is not acceptable, and we have several other parts of our infrastructure integrated with Scribe already.

A significant benefit of Scribe is its Buffer store: if it loses the connection to the upstream Scribe server it starts writing to disk until a periodic check re-establishes the connection. It then streams the disk file back upstream and, when done, returns to straightforward in-memory streaming. Trying to figure out how to write a Scribe store that uses librdkafka to write to a Kafka cluster as upstream is pretty complex, given the same desire to tolerate longer outages without memory pressure on the web app machines. Although Kafka is more reliable/highly available than our current solution, we don't want to lose the protection of disk spooling on web servers for best-effort eventual delivery, especially since Kafka might be deployed in a different datacenter from the web servers (low latency, but still a bigger risk of partition/network failure).

In some ways the easiest option would be if librdkafka had a totally synchronous producer API, such that we can try to produce a batch (to a single partition) and get either synchronous success or an error returned. That way we can leave Scribe doing the partitioning and batching, and handle failures by switching to buffering. Even then we would need a reasonable way to periodically check that the partition is writable again in order to start re-delivery of spooled messages.

@edenhill do you have any thoughts on this? I know this thread is about implementing something similar within the library, but would it make sense to you to instead provide additional public API hooks that allow apps to customise this behaviour for themselves? For example, the library could accept a callback that is notified when a partition becomes unavailable and another when it becomes available again (in terms of being successfully connected to a leader broker). Is there some way I can get close to that now without re-implementing a lot of the details of metadata retrieval etc. - the … |
Hi @banks, that is an interesting use case, here are my thoughts:

**Sync API**
While librdkafka is very asynchronous by nature, it would be possible to create a synchronous batch producer API with a bit of work. However, the problem with sync APIs is performance: they are bound to the round-trip time (rtt) to the broker. This can be alleviated to some extent by sending larger batches, but those are in turn limited by the max message size and will increase end-to-end latency. So if possible it is mostly a better idea to use async APIs, but the door isn't closed.

**Topic and partition state**
Propagating topic & partition state to the application in some way has been mentioned a bunch of times in different issues, and maybe it's about time to actually implement it! The thinking goes along these lines: …

**Persistent queue**
The existing plan is to provide a pluggable framework for message queue persistence. There are two alternative approaches here: …

Approach 1 is simpler to implement, and with mmapped files the overhead should be minimal, except for the occasional disk I/O flushes. If reordering is not a (big) concern, a naive implementation on your side could simply persist messages from the … |
@edenhill Thanks for the detailed thoughts. I need a bit of time to digest that.

I understand the performance issue with a sync API, but the rationale is that Scribe already handles many of the concerns such as partitioning and batching, and I'm trying to plug into the part that just sends a batch upstream and either succeeds or fails. If I wrote my own Kafka protocol serializers for the metadata and produce RPCs I could make that work really simply, at the expense of duplicating low-level client code you've already written and tested; I couldn't see an easy way to re-use that code from librdkafka at a quick glance.

The max message size restriction in Kafka limits batch size, and thus throughput, for messages sent this way, but in practice for our uses it is unlikely to be a constraint. Especially since multiple partitions can send batches in parallel (at least waiting on acks in separate threads) over the same broker connection without interfering, and still give a per-partition sync send/ack interface.

I'll let you know what we end up doing once we've had a chance to try a few options. |
Requests for a sync interface come up quite frequently and I try to steer people clear of those ideas, but I acknowledge there are situations where such an interface is warranted, and I'd like to accommodate those use cases in the best manner. To summarize:

**Throughput**
The main problem with a sync interface is throughput: the application must wait for the full server round-trip time (rtt) before being able to send the next message(s).

**Batching**
Batching alleviates this, so the sync interface should also provide a batched version.

**Transactional batching**
Should the produced batches be treated as atomic batches when sent to the broker?

I'd be happy to work out a solution with you that suits your needs; it is not always optimal to conceive new functionality without an existing use case/problem to solve. |
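To make the batched-sync discussion concrete, here is a minimal sketch of how an application could emulate a synchronous batch produce on top of librdkafka's existing async API. It assumes the producer was created with the `dr_msg_cb` below registered via `rd_kafka_conf_set_dr_msg_cb()`; `produce_batch_sync` and `batch_state` are hypothetical names, not an API librdkafka provides.

```c
/* Sketch only: a synchronous "batch produce" built on the async API.
 * Each message carries a pointer to a per-batch counter as its
 * msg_opaque; the delivery report callback decrements the counter and
 * records the first error seen. */
#include <librdkafka/rdkafka.h>

struct batch_state {
        int remaining;                  /* messages still in flight */
        rd_kafka_resp_err_t first_err;  /* first delivery error, if any */
};

/* Must be registered with rd_kafka_conf_set_dr_msg_cb() before rd_kafka_new(). */
static void dr_msg_cb(rd_kafka_t *rk, const rd_kafka_message_t *m, void *opaque) {
        struct batch_state *bs = m->_private;  /* msg_opaque passed to produce() */
        if (m->err && !bs->first_err)
                bs->first_err = m->err;
        bs->remaining--;
}

/* Produce `cnt` payloads to one topic and wait until all are acked or failed. */
rd_kafka_resp_err_t produce_batch_sync(rd_kafka_t *rk, rd_kafka_topic_t *rkt,
                                       char **payloads, size_t *lens, int cnt) {
        struct batch_state bs = { .remaining = cnt,
                                  .first_err = RD_KAFKA_RESP_ERR_NO_ERROR };
        int i;

        for (i = 0; i < cnt; i++)
                if (rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA,
                                     RD_KAFKA_MSG_F_COPY,
                                     payloads[i], lens[i],
                                     NULL, 0, &bs) == -1) {
                        bs.remaining--;                 /* never enqueued */
                        if (!bs.first_err)
                                bs.first_err = rd_kafka_last_error();
                }

        while (bs.remaining > 0)
                rd_kafka_poll(rk, 100);  /* serves delivery reports into dr_msg_cb */

        return bs.first_err;
}
```

As the comments above note, the caller is still bound by the broker round-trip time per batch; the batching only amortizes that cost over `cnt` messages.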
In my company we use librdkafka to send events collected from an nginx server. These servers can be in locations that occasionally have connection issues to our Kafka cluster. Because of this we wanted a (somewhat) reliable buffer so we do not lose too much if our producer dies while there are a lot of messages stuck in the output queue. What we do right now is parse the nginx log files and create a separate progress file per log file that stores the sent/unsent status per log line. This way we can make sure all messages are sent at least once. Throughput has not been an issue because so far it has been faster than our nginx server under normal network conditions (20k+ messages/second).

**Pros**
…

**Cons**
…

**Mmap**
The problem I see with mmap (correct me if I am wrong) is that it would still be a fixed-size queue and could potentially block the producer and whatever application it is connected to. In our case it would still help a lot though, since I could simply store a single offset in the progress file (we process nginx log files sequentially in a single thread).

**Transactional batching**
I like this feature and think it could be very useful (it provides a lot of flexibility). Could each batch be treated as a queue and block when it is full? That way the application does not necessarily need to know about the sizes, only what to do when the batch is full.

Thanks! |
Hi @cleaton and thanks for your input.

**Mmap**
It would use fixed-size ledger files: when the first file is full, a second one is created and new messages are appended to it, and so on.

**Transactional batching**
That's a good idea, what would such an interface look like?

There is no point in blocking on an atomic sync batch send with too many messages, since it is limited by a hard configuration value rather than any current queue depths. |
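To illustrate the fixed-size ledger idea above, here is a rough sketch (not librdkafka code, and not necessarily how the library would implement it) of appending length-prefixed records to a pre-sized mmap'ed spool file and signalling the caller to roll over to a new file when it is full. The names `spool_open`/`spool_append` and the 64 MB size are made up for the example.

```c
/* Hypothetical fixed-size mmap'ed spool ("ledger") file: records are
 * appended until the file is full, then the caller rolls over to a new
 * file. Error handling is largely omitted. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SPOOL_SIZE (64 * 1024 * 1024)   /* fixed ledger size */

struct spool {
        int fd;
        char *base;     /* mmap'ed region */
        size_t wpos;    /* next write offset */
};

static int spool_open(struct spool *sp, const char *path) {
        sp->fd = open(path, O_RDWR | O_CREAT, 0600);
        if (sp->fd == -1 || ftruncate(sp->fd, SPOOL_SIZE) == -1)
                return -1;
        sp->base = mmap(NULL, SPOOL_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED, sp->fd, 0);
        sp->wpos = 0;
        return sp->base == MAP_FAILED ? -1 : 0;
}

/* Returns 0 on success, -1 if the ledger is full (caller opens the next file). */
static int spool_append(struct spool *sp, const void *buf, size_t len) {
        if (sp->wpos + sizeof(len) + len > SPOOL_SIZE)
                return -1;                               /* roll over */
        memcpy(sp->base + sp->wpos, &len, sizeof(len));  /* length prefix */
        memcpy(sp->base + sp->wpos + sizeof(len), buf, len);
        sp->wpos += sizeof(len) + len;
        /* Dirty pages can be flushed periodically; see the msync() note below. */
        return 0;
}
```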
@edenhill, that transactional batching interface suggested would work wonderfully for my case too I think.
I'm not sure it's directly related to this issue (built-in persistence), so would you prefer to have that conversation elsewhere? Another issue? Or an alternate discussion thread? I'm very happy to write up a more thorough overview of Scribe and our specific needs (with suggestions on potential APIs I think would fit them) if that helps? |
Is the sync interface for producer available now? |
@ankurjohri The sync interface is something else, see here: … This issue tracks persisting messages on disk on the producer until they are acked by the broker. |
+1 I am also interested in this. |
mmap or not, file corruption is related to the sync frequency, so it can be controlled at the expense of performance. |
It should optionally encrypt messages before writing to disk. |
@edenhill: At my company, we are looking to replace a rsyslog based sender with librdkafka, and one of the sticking points is the advantage given by rsyslog's "disk-assisted queues" (DAQ), which appears to be the same thing as is being discussed here, with the exception that with disk-assisted queues, the messages are written to disk only when they fail to be received by the rsyslog receiver, which sounds like your second alternative ("messages are only provided to the persistence framework when the destination is down"). I have considered rolling my own disk-persistence triggered from the "dr_cb" callback. I find it quite excellent that the entire message sticks around and is passed to that callback, and isn't deallocated until after the callback returns. Do you see any issues with using that callback for the purposes of writing failed messages to disk? I'd much rather use whatever you eventually come up with for this internal to librdkafka, so here's me hoping this gets implemented soon! Thanks! |
@billygout Unfortunately there are no plans to implement this in librdkafka in the foreseeable future, so you are better off writing your own, and you should be fine using the delivery callback. I do suggest, however, that you write the message to disk prior to produce(), and only mark the on-disk copy as "delivered" from your dr_msg_cb, this way your messages will also survive application crashes. |
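A minimal sketch of the approach suggested above (persist before produce(), mark the on-disk copy delivered from the delivery report callback), assuming a hypothetical on-disk spool API (`spool_store`/`spool_mark_delivered`/`spool_requeue`) that you would implement yourself:

```c
/* Sketch: write each message to local storage before produce(), pass a
 * reference to the stored copy as the message opaque, and resolve the
 * on-disk copy from dr_msg_cb. The spool_* functions are placeholders
 * for whatever storage layer you use. */
#include <stdint.h>
#include <librdkafka/rdkafka.h>

extern long spool_store(const void *payload, size_t len);  /* hypothetical */
extern void spool_mark_delivered(long id);                 /* hypothetical */
extern void spool_requeue(long id);                        /* hypothetical */

/* Register with rd_kafka_conf_set_dr_msg_cb() before creating the producer. */
static void dr_msg_cb(rd_kafka_t *rk, const rd_kafka_message_t *m, void *opaque) {
        long spool_id = (long)(intptr_t)m->_private;  /* set via msg_opaque */

        if (!m->err)
                spool_mark_delivered(spool_id);  /* safe to purge on-disk copy */
        else
                spool_requeue(spool_id);         /* retry later, e.g. on restart */
}

int produce_persistent(rd_kafka_t *rk, rd_kafka_topic_t *rkt,
                       const void *payload, size_t len) {
        long spool_id = spool_store(payload, len);       /* 1. persist first */

        return rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA,  /* 2. then produce */
                                RD_KAFKA_MSG_F_COPY,
                                (void *)payload, len, NULL, 0,
                                (void *)(intptr_t)spool_id);
        /* 3. rd_kafka_poll() must be called regularly so dr_msg_cb runs. */
}
```

The point of this ordering is the one made above: a message only leaves the on-disk spool after the broker has acked it, so a crash between produce() and the delivery report does not lose it.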
@edenhill OK, thanks!
|
@billygout Do you plan to release this feature or make a PR? I'm currently looking for something similar. |
@zyrikby nope. i wish i was :) |
I implemented this with SQLite for our librdkafka wrapper. It works at about 350 msg/s on a bare-metal HP server, so I can tell you it works at (for me) acceptable speed. |
@DEvil0000, can you reveal any of the design aspects of it? I'd like to know: 1) whether you store every record or only the ones that fail, 2) whether you use the delivery report callback (or any callback, for that matter), 3) how you keep track of which ones have been sent, 4) whether the same program sends the current messages along with the stored ones or whether you have a separate program to send the stored ones, and 5) assuming you use the same program for producing both current and stored messages, how you switch between these tasks. |
The SQLite DB structure looks like: (id, key, message, state)

on producer startup (after the Kafka connection is OK): …
on produce call: …
on dr_callback: …
|
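A rough, hypothetical sketch of what such a SQLite-backed flow could look like; the table name, state values and SQL below are illustrative, not the poster's actual implementation.

```c
/* Illustrative only: an "outbox" table with (id, key, message, state),
 * where 0 = pending, 1 = in flight, 2 = delivered. */
#include <sqlite3.h>

static const char *SQL_SCHEMA =
        "CREATE TABLE IF NOT EXISTS outbox ("
        "  id      INTEGER PRIMARY KEY AUTOINCREMENT,"
        "  key     BLOB,"
        "  message BLOB,"
        "  state   INTEGER NOT NULL DEFAULT 0);";

/* On producer startup (once the Kafka connection is OK): reset anything
 * that was in flight when we died, then re-produce pending rows. */
static const char *SQL_RECOVER = "UPDATE outbox SET state = 0 WHERE state = 1;";

/* On produce(): insert the row first, hand its rowid to librdkafka as the
 * message opaque, and mark it in flight. */
static const char *SQL_ENQUEUE =
        "INSERT INTO outbox (key, message, state) VALUES (?, ?, 1);";

/* In dr_callback: on success mark delivered (or DELETE); on error set it
 * back to pending so a retry loop picks it up again. */
static const char *SQL_DELIVERED = "UPDATE outbox SET state = 2 WHERE id = ?;";
static const char *SQL_FAILED    = "UPDATE outbox SET state = 0 WHERE id = ?;";

int outbox_open(sqlite3 **db, const char *path) {
        if (sqlite3_open(path, db) != SQLITE_OK)
                return -1;
        return sqlite3_exec(*db, SQL_SCHEMA, NULL, NULL, NULL) == SQLITE_OK ? 0 : -1;
}
```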
@DEvil0000, I thought partition leadership changes when brokers go online/offline, while consumers are rebalanced when consumers go offline/online.

@edenhill, do you think the storage management complexity warrants tools such as SQLite or RocksDB? |
|
I was initially using a conservatively configured SQLite approach for this, but I do not remember the full details. It did the retry based on metadata, but also other things. The maximum message rate at a reasonable message size was about 10-15k msg/s (single threaded). I think mmap is not a good idea for persisting data. I have since moved to a different architecture, Erlang brod with Mnesia (to persist), but have not tested the throughput there. |
@DEvil0000 I'm curious why mmap would be a bad idea for persisting data, care to elaborate? |
I am not an expert on this, but as far as I know the data might get corrupted if the process crashes or similar.
|
Use msync() to flush page cache to disk, just like fsync(): |
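For example, a short sketch of flushing an mmap'ed spool region (such as the one sketched earlier in this thread), assuming `addr` and `len` come from the earlier mmap() call:

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Flush the dirty pages of an mmap'ed region to disk, analogous to fsync()
 * on a regular fd. MS_SYNC blocks until the write-out completes; MS_ASYNC
 * only schedules it. */
static int flush_spool(void *addr, size_t len) {
        if (msync(addr, len, MS_SYNC) == -1) {
                perror("msync");
                return -1;
        }
        return 0;
}
```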
Thanks guys for the tips. @edenhill: another question: for our case, we would like to send messages as soon as possible. I see queue.buffering.max.ms can be set to 1 or 0. What does the value 0 mean? Should we set it to 0 instead of 1? |
Kafka is not meant to deliver messages as fast as possible end-to-end in terms of rtt. You can tune it to some degree with those settings and with related settings in the broker config, but this will lower your bandwidth and overall throughput dramatically. You cannot have all the things. If you need fast end-to-end delivery you may need to use something else or rethink your architecture.
|
@chienhsingwu See CONFIGURATION.md. Basically it sets the maximum amount of time the producer will buffer messages into a message-set/batch before sending it to the broker. A lower buffer time means lower latency, and perhaps smaller batches depending on your produce rate; smaller batches will result in slightly lower throughput. |
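As an illustration, a minimal sketch of setting queue.buffering.max.ms on a producer configuration before creating the handle; the broker list, the value of 0, and the helper name are assumptions for the example (see CONFIGURATION.md for the authoritative property descriptions):

```c
#include <stdio.h>
#include <librdkafka/rdkafka.h>

rd_kafka_t *make_low_latency_producer(const char *brokers) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();

        if (rd_kafka_conf_set(conf, "bootstrap.servers", brokers,
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK ||
            /* Max time to buffer messages into a batch; 0 = send as soon as
             * possible, at the cost of smaller batches / lower throughput. */
            rd_kafka_conf_set(conf, "queue.buffering.max.ms", "0",
                              errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK) {
                fprintf(stderr, "%s\n", errstr);
                rd_kafka_conf_destroy(conf);
                return NULL;
        }

        return rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));
}
```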
@DEvil0000 It all depends on your expectations and requirements; round trips in single-digit milliseconds are indeed possible with Kafka. |
Hi @edenhill, looking into the difference between callbacks such as dr_msg_cb and interceptors, it sounds like interceptors are called independently of poll calls, while dr_msg_cb requires poll calls. Do I need to call poll when using an interceptor? |
on_send() is triggered from produce*() while on_acknowledgement() is called from internal librdkafka threads or produce*(), so there is no need to call poll() from the interceptors point of view. |
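For reference, a sketch of registering on_send/on_acknowledgement interceptors directly on a configuration object using the interceptor registration functions declared in rdkafka.h (rd_kafka_conf_interceptor_add_on_send() / rd_kafka_conf_interceptor_add_on_acknowledgement()); the callback bodies and names are placeholders and error handling is omitted.

```c
/* Sketch: producer interceptors. As noted above, on_acknowledgement may
 * fire from internal librdkafka threads, so keep it fast and thread-safe. */
#include <stdio.h>
#include <librdkafka/rdkafka.h>

static rd_kafka_resp_err_t my_on_send(rd_kafka_t *rk,
                                      rd_kafka_message_t *m, void *opaque) {
        /* Called from produce*(): e.g. persist the message here. */
        return RD_KAFKA_RESP_ERR_NO_ERROR;
}

static rd_kafka_resp_err_t my_on_ack(rd_kafka_t *rk,
                                     rd_kafka_message_t *m, void *opaque) {
        /* Called when the broker acks the message or it fails permanently. */
        if (m->err)
                fprintf(stderr, "delivery failed: %s\n",
                        rd_kafka_err2str(m->err));
        return RD_KAFKA_RESP_ERR_NO_ERROR;
}

void add_interceptors(rd_kafka_conf_t *conf) {
        rd_kafka_conf_interceptor_add_on_send(conf, "my_ic", my_on_send, NULL);
        rd_kafka_conf_interceptor_add_on_acknowledgement(conf, "my_ic",
                                                         my_on_ack, NULL);
}
```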
Hi @edenhill, we started implementing this based on an interceptor and hit a couple of snags. We tested sending messages to an invalid topic and to an invalid broker list to understand the callback semantics. The on_acknowledgement interceptor reports successful delivery for a few of the 10 messages before reporting that the rest failed; the number of successful ones varies for the invalid topic. For the invalid broker list, all were reported successful. We are using version 0.11.4. Are those reasonable test cases? Any insights to share with us? Below is the on_acknowledgement callback we use: … |
@chienhsingwu Interesting! What was your request.required.acks setting? |
We did not configure request.required.acks, so it's whatever the default is. The code does not use dr_cb/dr_msg_cb, but we did a separate test using them and it seemed fine. |
The default acks is 1, so that should be fine for this (but do consider using acks=all). I think the problem might be the missing dr_msg_cb; could you try setting one up (it doesn't need to do anything) and see if you get the same behaviour? |
I don't need to call poll, right? |
You will need to call poll() eventually to avoid the queue filling up. |
Hmm, in that case there is not much of a difference between an interceptor and dr_msg_cb from our perspective... Independently of that, it is probably desirable to break that dependency between the interceptor and dr_msg_cb, right? |
The idea was to set a dr_msg_cb now to check if the lack of a dr_msg_cb is causing on_ack to receive success even on failure. |
Thanks. I will report back the results. |
OK just did that. The results did not change. |
Did you get the same error code in the dr_msg_cb as on_ack? |
I did not call poll, so dr_msg_cb was never invoked. But a separate test that did call poll and used dr_msg_cb, without the interceptor, showed the correct errors. |
Can you try with dr_msg_cb and your interceptor at the same time? It would be interesting to compare error codes between the two |
@edenhill, we did that. The result remained the same. |
Thank you. We'll fix this on master soon and include it in the next maintenance release. |
Thanks! |