Program crashed because rd_kafka_toppar_t was used after destroy #1846
When using librdkafka I hit a problem: the program crashed. I found that a rktp (rd_kafka_toppar_t) was used after its memory had been destroyed; I suspect something is wrong with its refcnt.
There's not much to go on here.
I am using librdkafka version 0.11.4 (the latest). I just run the rdkafka_consumer_example_cpp program that ships with the examples, with a changed config.
Thank you, this is great!
OK, I have sent the logs to your email ([email protected]); please check.
Thanks. I guess the program crashed at the end of this log file? Did you manually try to terminate the program before the crash (e.g., with Ctrl-C)?
I just run the program normally. I found that the problem is caused by creating too many partitions on the topic (320 partitions); when I create the topic with 8 partitions, the program runs fine. I think if you change your topic to 320 or more partitions and consume from it while producing data to it, you may see the problem.
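For reference, a minimal sketch of the consume side of such a reproduction, written against the librdkafka C API rather than the C++ example. The broker address, group id, and topic name are placeholders, and the topic is assumed to have been pre-created with ~320 partitions while another process produces to it:

```c
#include <librdkafka/rdkafka.h>

int main(void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();

        /* Placeholder values; adjust for your cluster. */
        rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                          errstr, sizeof(errstr));
        rd_kafka_conf_set(conf, "group.id", "repro-group",
                          errstr, sizeof(errstr));

        rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_CONSUMER, conf,
                                      errstr, sizeof(errstr));
        rd_kafka_poll_set_consumer(rk);

        /* Topic assumed pre-created with ~320 partitions;
         * produce to it from another process in parallel. */
        rd_kafka_topic_partition_list_t *topics =
                rd_kafka_topic_partition_list_new(1);
        rd_kafka_topic_partition_list_add(topics, "many_partitions_topic",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_subscribe(rk, topics);
        rd_kafka_topic_partition_list_destroy(topics);

        /* Poll for a while; the crash reportedly shows up under load. */
        for (int i = 0; i < 100000; i++) {
                rd_kafka_message_t *msg = rd_kafka_consumer_poll(rk, 1000);
                if (msg)
                        rd_kafka_message_destroy(msg);
        }

        rd_kafka_consumer_close(rk);
        rd_kafka_destroy(rk);
        return 0;
}
```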
@TomGitter What librdkafka version are you using? What operating system and version?
I am using librdkafka version 0.11.4 (the latest) and Kafka version 0.10.1.0. I can reproduce the problem on both Windows 7 Ultimate and Red Hat Enterprise Linux Server release 6.7.
@edenhill, we have met a similar issue in versions 0.11.4 and 0.11.6; we have the logs and gdb stacks. It seems that the rktp=0x7f342c08c870 was destroyed twice, so the rkq_refcnt is 0 in the second destroy operation. I have checked the code but did not find the root cause. Can you help check? E001:3-8 11:19:58.149(296302|7679)RDKAFKA-7-REQERR: rdkafka#consumer-3: [thrd:main]: ssl://10.249.92.197:26328/20001: MetadataRequest failed: Local: Timed out: actions Retry
Is this somewhat easily reproducible for you? There have been some fixes to partition management on master for the upcoming v1.0.0 release. Otherwise, make sure your rd_kafka_topic_new() / rd_kafka_topic_destroy() calls are symmetric: exactly one destroy per new.
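For illustration, a minimal sketch of the symmetric pattern being suggested, with hypothetical handle and topic names: exactly one rd_kafka_topic_destroy() per rd_kafka_topic_new(), done once at shutdown:

```c
#include <librdkafka/rdkafka.h>

static rd_kafka_topic_t *rkt;           /* created once at startup */

void app_start(rd_kafka_t *rk) {
        /* One rd_kafka_topic_new() per topic handle. */
        rkt = rd_kafka_topic_new(rk, "my_topic", NULL);
}

void app_stop(rd_kafka_t *rk) {
        rd_kafka_topic_destroy(rkt);    /* exactly one destroy per new */
        rkt = NULL;                     /* guard against a second destroy */
        rd_kafka_destroy(rk);           /* destroy the handle last */
}
```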
@edenhill, thanks for your reply. Yes, it is easily reproduced in our performance test environment. In my application I call rd_kafka_topic_new() at process start and rd_kafka_topic_destroy() at process exit, so I think that is ok. I have also been studying the code these days. From the logs, I saw that when the operation version is outdated, the rd_kafka_toppar_t object 0x7f342c08c870 gets destroyed; the call stack is roughly like this:
At almost the same time, in the same thread (I append the process id and thread id to the logs, e.g., 296302|7679), the same rd_kafka_toppar_t object 0x7f342c08c870 was destroyed a second time and the program crashed. The stack is below:
The logs show:
I am not sure whether it is normal to free the same object twice; it should not be referenced any more after being destroyed. Would you please help check?
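To illustrate the concern, here is a minimal, generic sketch (not librdkafka's actual internals) of why an extra destroy call on a reference-counted object leads to a double free / use-after-free:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct obj_s {
        int refcnt;
} obj_t;

static obj_t *obj_new(void) {
        obj_t *o = calloc(1, sizeof(*o));
        o->refcnt = 1;                  /* creator holds one reference */
        return o;
}

static void obj_keep(obj_t *o) {
        o->refcnt++;                    /* e.g., a queue keeping the toppar */
}

static void obj_destroy(obj_t *o) {
        assert(o->refcnt > 0);          /* an extra destroy trips this */
        if (--o->refcnt == 0)
                free(o);                /* any later use is use-after-free */
}

int main(void) {
        obj_t *o = obj_new();
        obj_keep(o);            /* second reference taken */
        obj_destroy(o);         /* one holder drops its reference */
        obj_destroy(o);         /* last holder drops it: freed here */
        /* obj_destroy(o);         a further call would be the double-destroy bug */
        return 0;
}
```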