[Bug] Pulsar keeps creating dead letter queue producers and exceeds the maximum limit #20635
Comments
A single bookie failure shouldn't have that type of impact unless it drops the number of active/writable bookies below the write/ack quorum. Can you share the number of bookies in your cluster and the write and ack quorum settings?
There are 4 bookies configured for each environment. We also have a configuration (reconstructed below) that requires 3 bookies to be up and running. In other words, only 1 bookie can be down without service interruption.
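The original configuration snippet was elided from the thread; broker.conf quorum settings of roughly this shape would produce the described behavior. The values are illustrative, not the reporter's actual configuration:

```properties
# Assumed broker.conf values -- with an ensemble/write quorum of 3, a
# 4-bookie cluster tolerates only a single bookie outage before ledger
# writes start failing.
managedLedgerDefaultEnsembleSize=3
managedLedgerDefaultWriteQuorum=3
managedLedgerDefaultAckQuorum=3
```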
The issue had no activity for 30 days, marked with the Stale label.
I think we are facing the same problem, and this issue might also be related. We face it from time to time when, due to an engineering mistake, we encounter a schema incompatibility between the received messages and the ones the consumer expects. We have disabled schema validation on the broker side by choice. However, rather than just sending those messages to the DLQ, the Pulsar client fails validation and goes into a producer-creation loop. We had a look at the client code, and the problem seems to be in the client's DLQ producer creation path.

We were able to "fix" this problem by creating the DLQ producer with ((ProducerBuilderImpl<byte[]>) client.newProducer(Schema.BYTES)) instead of ((ProducerBuilderImpl<byte[]>) client.newProducer(Schema.AUTO_PRODUCE_BYTES(schema))), and by sending the message as raw bytes (a reconstruction of this workaround is sketched below).

I am not sure if this is the proper way to fix it, as it leaves the DLQ with a binary schema in the registry. That would be fine for us, but someone else may be relying on the DLQ having a more descriptive schema. However, I would like to imagine the DLQ as a dumping ground that should be able to accept all types of messages. If this is an OK fix, I can create a pull request for it.
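The commenter's code was not preserved in the thread; the following is a minimal sketch of the described workaround, assuming a byte-schema DLQ producer with placeholder topic names and an application-level failure hook — not the original code:

```java
import org.apache.pulsar.client.api.*;

public class DlqBytesWorkaround {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // hypothetical broker URL
                .build();

        // Workaround: give the DLQ producer the plain BYTES schema so the
        // client never runs schema validation when forwarding a message
        // whose payload is incompatible with the source topic's schema.
        Producer<byte[]> dlqProducer = client.newProducer(Schema.BYTES)
                .topic("persistent://public/default/my-topic-DLQ") // hypothetical DLQ topic
                .create();

        Consumer<byte[]> consumer = client.newConsumer(Schema.BYTES)
                .topic("persistent://public/default/my-topic")     // hypothetical source topic
                .subscriptionName("my-sub")
                .subscribe();

        Message<byte[]> msg = consumer.receive();
        try {
            // ... application-level decoding that may fail on incompatible payloads ...
            consumer.acknowledge(msg);
        } catch (Exception e) {
            // Forward the raw payload untouched; no schema check can fail here.
            dlqProducer.newMessage()
                    .value(msg.getData())
                    .properties(msg.getProperties())
                    .send();
            consumer.acknowledge(msg);
        }
    }
}
```

The trade-off the commenter names applies here: because the producer uses Schema.BYTES, the DLQ topic's registered schema is plain bytes rather than the source topic's more descriptive schema.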
The issue had no activity for 30 days, marked with the Stale label.
@tonisojandu I have submitted PR #23824 to address the producer leak. There's also a retry backoff so that such failures don't cause excessive load on the broker. This PR doesn't make changes to the schema validation. I'm planning to address that in a separate PR. Most likely there would need to be a separate mode for AUTO_PRODUCE_BYTES where it can be chosen on the client side to skip schema validation. That feature is also useful in other use cases, since it can reduce CPU usage on the client side when schema validation is skipped at producing time.
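The PR's actual implementation isn't reproduced in this thread; as a rough sketch of the retry-backoff idea, producer creation with a bounded exponential backoff could look like this (the helper name, delays, and schema choice are assumptions, not code from PR #23824):

```java
import org.apache.pulsar.client.api.*;

public class BackoffProducerCreation {
    // Hypothetical helper illustrating bounded exponential backoff when a
    // producer (e.g., the internal DLQ producer) repeatedly fails to create.
    static Producer<byte[]> createWithBackoff(PulsarClient client, String topic)
            throws InterruptedException {
        long backoffMs = 100;              // assumed initial delay
        final long maxBackoffMs = 60_000;  // assumed cap on the delay
        while (true) {
            try {
                // One creation attempt at a time: a failed attempt throws,
                // so no unclosed producers accumulate on the broker.
                return client.newProducer(Schema.BYTES).topic(topic).create();
            } catch (PulsarClientException e) {
                Thread.sleep(backoffMs);                           // back off before retrying
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs); // grow the delay exponentially
            }
        }
    }
}
```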
Original issue
Version
2.11
Minimal reproduce step
Use the Pulsar Java client library to create a consumer with a DLQ producer (a minimal setup is sketched below). The client keeps creating dead letter queue producers and eventually hits the maximum limit.
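The reporter's code isn't included in the issue; a consumer configured along these lines exercises the DLQ producer path (topic names, subscription name, and redelivery count are placeholders):

```java
import org.apache.pulsar.client.api.*;
import java.util.concurrent.TimeUnit;

public class DlqReproduce {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // hypothetical broker URL
                .build();

        // Consumer with a dead letter policy: once a message exceeds
        // maxRedeliverCount redeliveries, the client creates an internal
        // DLQ producer and forwards the message to the dead letter topic.
        Consumer<byte[]> consumer = client.newConsumer(Schema.BYTES)
                .topic("persistent://public/default/my-topic")     // hypothetical topic
                .subscriptionName("my-sub")
                .subscriptionType(SubscriptionType.Shared)         // DLQ requires Shared/Key_Shared
                .deadLetterPolicy(DeadLetterPolicy.builder()
                        .maxRedeliverCount(3)
                        .deadLetterTopic("persistent://public/default/my-topic-DLQ")
                        .build())
                .negativeAckRedeliveryDelay(1, TimeUnit.SECONDS)
                .subscribe();

        while (true) {
            Message<byte[]> msg = consumer.receive();
            // Negatively acknowledge everything so messages exceed the
            // redelivery count and are routed through the DLQ producer.
            consumer.negativeAcknowledge(msg);
        }
    }
}
```

With a setup like this, any failure while the client creates the internal DLQ producer (for example, during a bookie outage) triggers the repeated producer-creation attempts this issue describes.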
What did you expect to see?
Extra producers should not be created when there is an issue on the Pulsar side.
What did you see instead?
The client created over 10,000 producers and eventually exceeded the limit.
Anything else?
See logs.