
Change of behavior in 2.10.0/1.5.0 - standard example does not consume when subscribing before the topic is created #863

Closed
peterbroadhurst opened this issue Dec 20, 2020 · 1 comment

@peterbroadhurst

Environment Information

  • OS: Docker (Ubuntu 18.04)
  • Node Version: 14
  • NPM Version: 14
  • C++ Toolchain: Ubuntu 18.04
  • node-rdkafka version: 2.10.0

Steps to Reproduce

Use the "Flowing mode" sample here: https://github.com/Blizzard/node-rdkafka#standard-api-1

Create the consumer, then create the topic (such as automatically from the producer).

node-rdkafka Configuration Settings

n/a

Additional context

The consume-before-create pattern is documented in various node-rdkafka and rdkafka issues as something that should be supported.
However, since confluentinc/librdkafka#1540 was introduced in 1.5.0 (2.10.0), I believe there is a problem, and it seems to point at missing guidance/documentation in node-rdkafka on the right usage of the API.

What I found is:

  • Here node-rdkafka discards an err that occurs, if you simply use consume() (no parameters) and rely on the data event:
    /**
    * Data event. called whenever a message is received.
    *
    * @event KafkaConsumer#data
    * @type {KafkaConsumer~Message}
    */
    self.emit('data', message);
    cb(err, message);
  • Here (in the C layer) node-rdkafka exits the consume loop if we hit an error:
    looping = false;

So if I've understood the above correctly, the problem with the standard "Flowing mode" sample is that any error (such as the new error introduced in confluentinc/librdkafka#1540) causes the consumer to silently stop listening for messages. Even when the subscription topic/regex list detects new topics and fetches, the app never receives the messages, because there is no longer a running consume loop to drive the data events.

I believe there are other error cases in the past that would have exposed this same pattern.
For our code, we moved to the consume(cb) variant and handle the err by scheduling an asynchronous reconnect/resubscribe loop.

However, I'm not sure we've taken the right approach - it's quite heavyweight to pull all that retry logic outside of the client.
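For illustration, the reconnect/resubscribe approach described above can be sketched with a small backoff scheduler. This is a hypothetical, self-contained helper, not our actual code; the node-rdkafka calls themselves are only indicated in comments, since the exact wiring depends on your consumer setup:

```javascript
// Hypothetical sketch: when the consume loop reports an error, schedule an
// asynchronous retry with capped exponential backoff instead of letting the
// consumer silently stop.

function nextBackoff(attempt, baseMs = 500, maxMs = 30000) {
  // Exponential backoff capped at maxMs: 500, 1000, 2000, ..., 30000
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

function scheduleRetry(state, retryFn) {
  // state = { attempt: 0, timer: null }; guards against stacking up
  // more than one pending retry at a time.
  if (state.timer) return;
  const delay = nextBackoff(state.attempt++);
  state.timer = setTimeout(() => {
    state.timer = null;
    // In real code, this is where you would unsubscribe, resubscribe,
    // and restart consumption, e.g. via the consumer instance.
    retryFn();
  }, delay);
}
```

Resetting `state.attempt` to 0 after a successful consume keeps the backoff from growing forever across unrelated failures.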

So I thought I'd raise this issue for discussion, in case it helped others, and to see if one of the following would be useful:

  1. The "Flowing mode" example could be updated with a best practice approach to handling the case of an error on the consume loop
  2. The default callback could be given a non-empty implementation that, for example, restarts the consumer loop. I see the default implementation is currently empty:
    cb = function() {};
@iradul iradul added the bug label Dec 22, 2020
@iradul (Collaborator) commented Dec 22, 2020

Thank you for the great report.

As you pointed out, there is a change introduced with librdkafka 1.5.0 which broke the current logic in two edge cases: when the topic or partition doesn't exist, or when authorization fails for the topic.

One way to solve this would be to simply ignore those errors and keep the consumption loop retrying, pretty much the same way it was done before. It seems to me this makes the most sense.
At some point, we may also want to propagate those errors to the JavaScript consumer, maybe in the form of warnings (?)
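The "ignore and keep retrying" idea could be expressed as a small classifier over librdkafka error codes that decides whether the consume loop should stop. The numeric values below are my reading of librdkafka's rd_kafka_resp_err_t for the two edge cases mentioned; treat them as assumptions and verify against your librdkafka version:

```javascript
// Hypothetical helper: decide whether an error from the consume loop should
// stop consumption, or be treated as transient so the loop keeps polling.
// Error code values are assumptions based on librdkafka's rd_kafka_resp_err_t.
const TRANSIENT_ERROR_CODES = new Set([
  3,  // RD_KAFKA_RESP_ERR_UNKNOWN_TOPIC_OR_PART (topic/partition doesn't exist yet)
  29, // RD_KAFKA_RESP_ERR_TOPIC_AUTHORIZATION_FAILED
]);

function shouldStopConsuming(err) {
  if (!err) return false; // no error: keep consuming
  return !TRANSIENT_ERROR_CODES.has(err.code);
}
```

A warning event could then be emitted for the transient codes instead of breaking the loop, which matches the propagate-as-warnings idea above.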
