fix fd leak in out_fluentd #2

Merged: 2 commits, May 21, 2015

enukane (Contributor) commented May 21, 2015

The out_fluentd plugin causes the following fd/socket leak.
The cb_flush_buf handler should close the sockets it opens.

# about 5 minutes after fluent-bit was started

$ lsof -p 2705
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/112/gvfs
      Output information may be incomplete.
COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
fluent-bi 2705 enukane  cwd    DIR    8,1     4096 817620 /home/enukane/Dev/fluent-bit/build
fluent-bi 2705 enukane  rtd    DIR    8,1     4096      2 /
fluent-bi 2705 enukane  txt    REG    8,1    56056 818517 /home/enukane/Dev/fluent-bit/build/bin/fluent-bit
fluent-bi 2705 enukane  mem    REG    8,1  1754876 671725 /lib/i386-linux-gnu/libc-2.19.so
fluent-bi 2705 enukane  mem    REG    8,1   134380 671728 /lib/i386-linux-gnu/ld-2.19.so
fluent-bi 2705 enukane    0u   CHR   4,64      0t0   8239 /dev/ttyS0
fluent-bi 2705 enukane    1u   CHR   4,64      0t0   8239 /dev/ttyS0
fluent-bi 2705 enukane    2u   CHR   4,64      0t0   8239 /dev/ttyS0
fluent-bi 2705 enukane    3u  IPv4  14907      0t0    TCP 192.168.0.230:59222->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane    4u  0000    0,9        0   7811 anon_inode
fluent-bi 2705 enukane    5u  0000    0,9        0   7811 anon_inode
fluent-bi 2705 enukane    6u  0000    0,9        0   7811 anon_inode
fluent-bi 2705 enukane    8u  IPv4  14909      0t0    TCP 192.168.0.230:59224->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane    9u  IPv4  14914      0t0    TCP 192.168.0.230:59226->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   10u  IPv4  14931      0t0    TCP 192.168.0.230:59228->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   11u  IPv4  16078      0t0    TCP 192.168.0.230:59230->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   12u  IPv4  14958      0t0    TCP 192.168.0.230:59232->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   13u  IPv4  14966      0t0    TCP 192.168.0.230:59234->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   14u  IPv4  14971      0t0    TCP 192.168.0.230:59236->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   15u  IPv4  14973      0t0    TCP 192.168.0.230:59238->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   16u  IPv4  14975      0t0    TCP 192.168.0.230:59240->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   17u  IPv4  14990      0t0    TCP 192.168.0.230:59242->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   18u  IPv4  15060      0t0    TCP 192.168.0.230:59244->192.168.0.240:11185 (ESTABLISHED)
fluent-bi 2705 enukane   19u  IPv4  15122      0t0    TCP 192.168.0.230:59246->192.168.0.240:11185 (ESTABLISHED)
(snip)
fluent-bi 2705 enukane   59u  IPv4  17373      0t0    TCP 192.168.0.230:59326->192.168.0.240:11185 (ESTABLISHED)

Since file-max on my Ubuntu machine is about 40000, this won't become a problem for a while.
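
The rule stated above (a flush handler closes whatever it opened) can be sketched as follows. This is a minimal illustration, not the actual out_fluentd code: `flush_once` and `count_open_fds` are hypothetical names, and `socketpair()` stands in for the real TCP connection so the example runs without a server.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Count descriptors currently open in this process (checks fds 0..1023). */
int count_open_fds(void)
{
    int fd, n = 0;

    for (fd = 0; fd < 1024; fd++) {
        if (fcntl(fd, F_GETFD) != -1) {
            n++;
        }
    }
    return n;
}

/*
 * Hypothetical stand-in for a flush callback (assumption, not Fluent Bit
 * code). socketpair() replaces the real connect step.
 * When close_after is 0 the descriptors stay open, which models the leak.
 */
int flush_once(const char *buf, size_t len, int close_after)
{
    int sv[2];
    ssize_t sent;

    assert(buf != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        return -1;
    }
    sent = write(sv[0], buf, len);

    /* Close on every path after a successful open, even if write failed. */
    if (close_after) {
        close(sv[0]);
        close(sv[1]);
    }
    return sent == (ssize_t) len ? 0 : -1;
}
```

Calling `flush_once(msg, strlen(msg), 1)` repeatedly should leave `count_open_fds()` unchanged, while passing `0` leaves two more descriptors open per call, much like the growing list of ESTABLISHED sockets in the lsof output.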

enukane added 2 commits May 21, 2015 17:48
cb_flush_buf handler should close new socket opened by itself

Signed-off-by: enukane <[email protected]>
edsiper added a commit that referenced this pull request May 21, 2015
out_fluentd: fix fd leak and context
@edsiper edsiper merged commit 4adfebf into fluent:master May 21, 2015
edsiper (Member) commented May 21, 2015

thanks for catching this.

@prashantvicky prashantvicky mentioned this pull request Aug 28, 2018
fujimotos pushed a commit to fujimotos/fluent-bit that referenced this pull request Jan 15, 2020
When Fluent Bit encounters a partial parser definition, it
crashes badly with a segmentation fault.

    $ ./bin/fluent-bit -R parser.conf -c tail.conf
    ...
    [2020/01/15 16:11:21] [error] [parser] no parser 'format' found for 'simple' in file 'conf/timestamp.parser'
    [engine] caught signal (SIGSEGV)
    #0  0x558bc4a0a226      in  flb_parser_decoder_list_destroy() at src/flb_parser_decoder.c:700
    #1  0x558bc4a05d75      in  flb_parser_conf_file() at src/flb_parser.c:566
    #2  0x558bc49f4bdd      in  flb_config_set_property() at src/flb_config.c:406
    #3  0x558bc49e24ae      in  flb_service_conf() at src/fluent-bit.c:446
    #4  0x558bc49e2f90      in  main() at src/fluent-bit.c:807
    #5  0x7fa1cb7f109a      in  ???() at ???:0
    #6  0x558bc49e13a9      in  ???() at ???:0
    #7  0xffffffffffffffff  in  ???() at ???:0
    Aborted

This happens because `decoders` is not initialized properly, which
causes Fluent Bit to deallocate a random memory block on the
cleanup path. Fix it.
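
The failure mode described here (a pointer that is never initialized and then handed to a destroy function on the error path) can be illustrated with a small sketch. All names below (`parser_def`, `decoder_list`, `parser_load`, `decoder_list_destroy`) are hypothetical and only mirror the shape of the problem; this is not the code in src/flb_parser.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types (assumption); they only model the situation. */
struct decoder_list {
    int count;
};

struct parser_def {
    const char *name;
    const char *format;   /* NULL models a partial definition */
};

/* Cleanup that tolerates "nothing was allocated". */
void decoder_list_destroy(struct decoder_list *list)
{
    if (list == NULL) {
        return;
    }
    free(list);
}

/* Returns 0 on success, -1 when the definition is incomplete or on OOM. */
int parser_load(const struct parser_def *def)
{
    /* Initialized up front so every error path sees a known value. */
    struct decoder_list *decoders = NULL;
    int ret = -1;

    assert(def != NULL);
    if (def->format == NULL) {
        goto cleanup;   /* decoders is still NULL here, not garbage */
    }

    decoders = calloc(1, sizeof(*decoders));
    if (decoders == NULL) {
        goto cleanup;
    }
    ret = 0;

cleanup:
    decoder_list_destroy(decoders);
    return ret;
}
```

With the pointer set to NULL at declaration, the early exit for a missing `format` reaches the cleanup label with a value the destroy function can safely ignore.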

Signed-off-by: Fujimoto Seiji <[email protected]>
edsiper pushed a commit that referenced this pull request Jan 16, 2020
edsiper pushed a commit that referenced this pull request Jan 17, 2020
edsiper pushed a commit that referenced this pull request Jan 23, 2020
@lanphan lanphan mentioned this pull request Jun 18, 2020
hsmatulisgoogle referenced this pull request in hsmatulisgoogle/fluent-bit Dec 21, 2020
This could cause a hang:
Worker #1 gets list lock,
-> Worker #1 coro #1 sends TLS handshake
-> Worker #1 coro #1 yields
-> Worker #1 coro #2 tries to acquire list lock
-> deadlock :(
cosmo0920 added a commit that referenced this pull request Oct 5, 2022
…es strictly

Without this check, the following odd error occurs
intermittently:

```log
[0] dummy.0: [1664938706.407551000, {"message"=>"dummy"}]
[2022/10/05 11:58:27] [ info] [test] flush record
flb-rt-core_chunk_trace(32205,0x16fe87000) malloc: *** error for object 0x600002600074: pointer being realloc'd was not allocated
flb-rt-core_chunk_trace(32205,0x16fe87000) malloc: *** set a breakpoint in malloc_error_break to debug
```

The main reason is that the num_records index is corrupted in some cases:

```
flb-rt-core_chunk_trace(32205,0x16fe87000) malloc: *** error for object 0x600002600074: pointer being realloc'd was not allocated
flb-rt-core_chunk_trace(32205,0x16fe87000) malloc: *** set a breakpoint in malloc_error_break to debug
[2022/10/05 11:58:27] [ info] [input] pausing dummy.0
Process 32205 stopped
* thread #2, name = 'flb-pipeline', stop reason = breakpoint 1.1
    frame #0: 0x00000001b34a3120 libsystem_malloc.dylib`malloc_error_break
libsystem_malloc.dylib`malloc_error_break:
->  0x1b34a3120 <+0>:  pacibsp
    0x1b34a3124 <+4>:  stp    x29, x30, [sp, #-0x10]!
    0x1b34a3128 <+8>:  mov    x29, sp
    0x1b34a312c <+12>: nop
Target 0: (flb-rt-core_chunk_trace) stopped.
(lldb) bt
* thread #2, name = 'flb-pipeline', stop reason = breakpoint 1.1
  * frame #0: 0x00000001b34a3120 libsystem_malloc.dylib`malloc_error_break
    frame #1: 0x00000001b3494844 libsystem_malloc.dylib`malloc_vreport + 428
    frame #2: 0x00000001b3497f34 libsystem_malloc.dylib`malloc_report + 64
    frame #3: 0x00000001b3488210 libsystem_malloc.dylib`realloc + 328
    frame #4: 0x0000000100006154 flb-rt-core_chunk_trace`flb_realloc(ptr=0x0000600002600074, size=18446744064764412176) at flb_mem.h:94:12
    frame #5: 0x0000000100005fc8 flb-rt-core_chunk_trace`callback_add_record(data=0x0000600003014000, size=135, cb_data=0x0000600000004010) at core_chunk_trace.c:51:28
    frame #6: 0x00000001001268b0 flb-rt-core_chunk_trace`out_lib_flush(event_chunk=0x0000600000c14000, out_flush=0x0000600001714000, i_ins=0x0000000100b09ab0, out_context=0x0000600000204a80, config=0x000000010181d200) at out_lib.c:197:9
    frame #7: 0x0000000100029d70 flb-rt-core_chunk_trace`output_pre_cb_flush at flb_output.h:517:5
    frame #8: 0x000000010044fa64 flb-rt-core_chunk_trace`co_switch(handle=0x000000010044fa64) at aarch64.c:133:4
(lldb) frane select 5
error: 'frane' is not a valid command.
(lldb) frame select 5
frame #5: 0x0000000100005fc8 flb-rt-core_chunk_trace`callback_add_record(data=0x0000600003014000, size=135, cb_data=0x0000600000004010) at core_chunk_trace.c:51:28
   48  	                           flb_calloc(1, sizeof(struct callback_record));
   49  	        } else {
   50  	            ctx->records = (struct callback_record *)
-> 51  	                           flb_realloc(ctx->records,
   52  	                                       (ctx->num_records+1)*sizeof(struct callback_record));
   53  	        }
   54  	        if (ctx->records ==  NULL) {
(lldb) po ctx->records
0x0000600002600074

(lldb) po ctx->records
0x0000600002600074

(lldb) po ctx->num_records
-559071216
```
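
One way to guard against a corrupted count like the one in the log is sketched below. The actual check added by the commit is not shown on this page, so this is an assumption about what such a guard could look like: the context starts from a known state, and the count is validated before it is used to compute a `realloc` size. The struct and function names are hypothetical and only echo the fields seen in the debugger output.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type and context (assumption). */
struct callback_record {
    size_t size;
};

struct callback_ctx {
    struct callback_record *records;
    int num_records;
};

/* Start from a known state: no records, count of zero. */
void ctx_init(struct callback_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->records = NULL;
    ctx->num_records = 0;
}

/* Append one record. Returns 0 on success, -1 on a bad count or OOM. */
int ctx_add_record(struct callback_ctx *ctx, size_t size)
{
    struct callback_record *tmp;
    size_t want;

    assert(ctx != NULL);
    /* Reject a negative or inconsistent count before using it in a size. */
    if (ctx->num_records < 0 || ctx->num_records == INT_MAX ||
        (ctx->num_records > 0 && ctx->records == NULL)) {
        return -1;
    }
    want = (size_t) ctx->num_records + 1;
    if (want > SIZE_MAX / sizeof(*tmp)) {
        return -1;
    }
    /* realloc(NULL, n) behaves like malloc(n), so one path covers both. */
    tmp = realloc(ctx->records, want * sizeof(*tmp));
    if (tmp == NULL) {
        return -1;   /* ctx->records is still valid and unchanged */
    }
    tmp[want - 1].size = size;
    ctx->records = tmp;
    ctx->num_records = (int) want;
    return 0;
}
```

Assigning the `realloc` result to a temporary first also keeps the original array reachable if the allocation fails.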

Signed-off-by: Hiroshi Hatake <[email protected]>
cosmo0920 added a commit that referenced this pull request Mar 14, 2024
# This is the 1st commit message:

processor: input_metric: Prevent dangling pointer if cmt context is recreated

Signed-off-by: Hiroshi Hatake <[email protected]>

# This is the commit message #2:

processor_labels: Follow change of the signature of metrics callback

Signed-off-by: Hiroshi Hatake <[email protected]>

# This is the commit message #3:

processor_selector: Implement selector processor for metrics

For future extensibility, we use "selector" as a name for this processor.

Signed-off-by: Hiroshi Hatake <[email protected]>

# This is the commit message #4:

lib: Support setter for processor

Signed-off-by: Hiroshi Hatake <[email protected]>
zecke added a commit to zecke/fluent-bit that referenced this pull request May 25, 2024
The TLS variable for out_flush_params is not initialized because the
flb_start function is not called during the dry run. Call flb_init
directly and then shut down the engine.
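
The ordering issue can be sketched with a per-thread slot that must be set up before shutdown reads it. This is a simplified model under stated assumptions: `engine_init`, `engine_start`, and `engine_shutdown` are hypothetical stand-ins (not flb_init, flb_start, or the FLB_TLS_GET macro), and C11 `_Thread_local` replaces whatever TLS mechanism the real code uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-thread parameters (assumption). */
struct flush_params {
    int dummy;
};

static _Thread_local struct flush_params *tls_params;
static _Thread_local int tls_ready;

/* Must run before anything reads the slot, including in a dry run. */
void engine_init(void)
{
    tls_params = NULL;
    tls_ready = 1;
}

/* Normal start path: allocates the per-thread parameters. */
int engine_start(void)
{
    assert(tls_ready);
    tls_params = calloc(1, sizeof(*tls_params));
    return tls_params != NULL ? 0 : -1;
}

/* Frees the slot only if it was set up. Returns the number of blocks freed. */
int engine_shutdown(void)
{
    int freed = 0;

    if (!tls_ready) {
        return 0;   /* slot never initialized: nothing safe to free */
    }
    if (tls_params != NULL) {
        free(tls_params);
        tls_params = NULL;
        freed = 1;
    }
    tls_ready = 0;
    return freed;
}
```

A dry run then becomes `engine_init()` followed by `engine_shutdown()`, which frees nothing, while the normal path adds `engine_start()` in between and frees exactly one block.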

configuration test is successful
=================================================================
==63633==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x0001f71b3ac0 in thread T0
    #0 0x103c9f260 in wrap_free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x53260)
    #1 0x100179d9c in flb_free flb_mem.h:127
    #2 0x10017f4a0 in flb_output_exit flb_output.c:481
    #3 0x1001cb038 in flb_engine_shutdown flb_engine.c:1119
    #4 0x10010d45c in flb_destroy flb_lib.c:240
    #5 0x100008c40 in flb_main fluent-bit.c:1348
    #6 0x10000c644 in main fluent-bit.c:1456
    #7 0x18f11e0dc  (<unknown module>)

frame #6: 0x000000010017f4a4 fluent-bit`flb_output_exit(config=0x0000000102b00200) at flb_output.c:481:9
   478
   479 	    params = FLB_TLS_GET(out_flush_params);
   480 	    if (params) {
-> 481 	        flb_free(params);
   482 	    }
   483 	}

Signed-off-by: Holger Hans Peter Freyther <[email protected]>
@zecke zecke mentioned this pull request May 25, 2024
edsiper pushed a commit that referenced this pull request May 26, 2024
markuman pushed a commit to markuman/fluent-bit that referenced this pull request May 29, 2024
Signed-off-by: Markus Bergholz <[email protected]>