
Change of behavior in 2.10.0/1.5.0 - standard example does not consume when subscribing before the topic is created #863

Closed
peterbroadhurst opened this issue Dec 20, 2020 · 1 comment

@peterbroadhurst

Environment Information

  • OS: Docker (Ubuntu 18.04)
  • Node Version: 14
  • NPM Version: 14
  • C++ Toolchain: Ubuntu 18.04
  • node-rdkafka version: 2.10.0

Steps to Reproduce

Use the "Flowing mode" sample here: https://github.com/Blizzard/node-rdkafka#standard-api-1

Create the consumer, then create the topic (such as automatically from the producer).

node-rdkafka Configuration Settings

n/a

Additional context

The consume-before-create pattern is documented in various node-rdkafka and rdkafka issues as something that should be supported.
However, since confluentinc/librdkafka#1540 was introduced in 1.5.0 (2.10.0), I believe there is a problem, and it seems to point at missing guidance/documentation in node-rdkafka on the right usage of the API.

What I found is:

  • Here node-rdkafka discards an err that occurs, if you simply use consume() (no parameters) and rely on the data event:
    /**
    * Data event. called whenever a message is received.
    *
    * @event KafkaConsumer#data
    * @type {KafkaConsumer~Message}
    */
    self.emit('data', message);
    cb(err, message);
  • Here (in the C layer) node-rdkafka exits the consume loop if we hit an error:
    looping = false;

So if I've understood the above correctly, the problem with the standard "Flowing mode" sample is that any error (such as the new error introduced in confluentinc/librdkafka#1540) causes the consumer to silently stop listening for messages. Even when the subscription topic/regex list detects new topics and fetches, the app never receives the messages, because there is no longer a running consume loop to drive the data events.

I believe there are other error cases in the past that would have exposed this same pattern.
For our code, we moved to the consume(cb) variant and handle the err by scheduling an asynchronous reconnect/resubscribe loop.

However, I'm not sure we've taken the right approach - it's quite heavyweight to pull all that retry logic outside of the client.
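For illustration, the reconnect/resubscribe approach described above can be sketched with a small backoff scheduler. This is a hypothetical, self-contained helper, not our actual code; the node-rdkafka calls themselves are only indicated in comments, since the exact wiring depends on your consumer setup:

```javascript
// Hypothetical sketch: when the consume loop reports an error, schedule an
// asynchronous retry with capped exponential backoff instead of letting the
// consumer silently stop.

function nextBackoff(attempt, baseMs = 500, maxMs = 30000) {
  // Exponential backoff capped at maxMs: 500, 1000, 2000, ..., 30000
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

function scheduleRetry(state, retryFn) {
  // state = { attempt: 0, timer: null }; guards against stacking up
  // more than one pending retry at a time.
  if (state.timer) return;
  const delay = nextBackoff(state.attempt++);
  state.timer = setTimeout(() => {
    state.timer = null;
    // In real code, this is where you would unsubscribe, resubscribe,
    // and restart consumption, e.g. via the consumer instance.
    retryFn();
  }, delay);
}
```

Resetting `state.attempt` to 0 after a successful consume keeps the backoff from growing forever across unrelated failures.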

So I thought I'd raise this issue for discussion, in case it helped others, and to see if one of the following would be useful:

  1. The "Flowing mode" example could be updated with a best practice approach to handling the case of an error on the consume loop
  2. The default callback could be given a non-empty implementation that, for example, restarts the consumer loop. I see the default implementation is currently empty:
    cb = function() {};
@iradul iradul added the bug label Dec 22, 2020
@iradul (Collaborator) commented Dec 22, 2020

Thank you for the great report.

As you pointed out, there is a change introduced with librdkafka 1.5.0 which broke the current logic in two edge cases: when the topic or partition doesn't exist, or when authorization fails for the topic.

One way to solve this would be to simply ignore those errors and keep the consumption loop retrying, pretty much the same way it was done before. It seems to me this makes the most sense.
At some point, we may also want to propagate those errors to the JavaScript consumer, maybe in the form of warnings (?)
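The "ignore and keep retrying" idea could be expressed as a small classifier over librdkafka error codes that decides whether the consume loop should stop. The numeric values below are my reading of librdkafka's rd_kafka_resp_err_t for the two edge cases mentioned; treat them as assumptions and verify against your librdkafka version:

```javascript
// Hypothetical helper: decide whether an error from the consume loop should
// stop consumption, or be treated as transient so the loop keeps polling.
// Error code values are assumptions based on librdkafka's rd_kafka_resp_err_t.
const TRANSIENT_ERROR_CODES = new Set([
  3,  // RD_KAFKA_RESP_ERR_UNKNOWN_TOPIC_OR_PART (topic/partition doesn't exist yet)
  29, // RD_KAFKA_RESP_ERR_TOPIC_AUTHORIZATION_FAILED
]);

function shouldStopConsuming(err) {
  if (!err) return false; // no error: keep consuming
  return !TRANSIENT_ERROR_CODES.has(err.code);
}
```

A warning event could then be emitted for the transient codes instead of breaking the loop, which matches the propagate-as-warnings idea above.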
