Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to safely close HTTP chunked encoding stream with SSL #4991

Closed
vladfaust opened this issue Sep 17, 2017 · 12 comments
Closed

Unable to safely close HTTP chunked encoding stream with SSL #4991

vladfaust opened this issue Sep 17, 2017 · 12 comments

Comments

@vladfaust
Copy link
Contributor

vladfaust commented Sep 17, 2017

I'd like to implement long polling with some query params changeable in the runtime. So the request must be stopped and run again:

@client = HTTP::Client.new

loop do
  @client.get "/long_polling?#{updated_params}" do |response|
    loop do
      puts response.body_io.gets
    end
  end
end

spawn do
  loop do
    sleep 5
    update_params
    @client.close # From docs: Closes this client. If used again, a new connection will be opened.
  end
end

Calling @client.close always causes Closed stream (IO::Error) in the first loop. If rescuing this particular error, another one will definitely appear after some successfull closes:

Unhandled exception in spawn:
SSL_shutdown: error:1408F081:SSL routines:SSL3_GET_RECORD:block cipher pad is wrong (OpenSSL::SSL::Error)
0x4a6a57: *CallStack::unwind:Array(Pointer(Void)) at ??
0x551a16: unbuffered_close at /opt/crystal/src/openssl/ssl/socket.cr 140:11
0x5518ff: close at /opt/crystal/src/io/buffered.cr 197:5
0x571d55: close at /opt/crystal/src/http/client.cr 634:19
0x571c6b: loop at /example.cr 42:9
0x4bce8e: run at /opt/crystal/src/fiber.cr 255:3
0x49a646: ~proc2Proc(Fiber, (IO::FileDescriptor | Nil)) at /opt/crystal/src/concurrent.cr 61:3
0x0: ??? at ??

And even if rescuing this line @client.close rescue nil, the program crashes after some time:

*** Error in `/home/faust/.cache/crystal/crystal-run-worker.tmp': free(): corrupted unsorted chunks: 0x00000000016fb920 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f6121a107e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f6121a1937a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6121a1d53c]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*Event::Event#free:Bool+0x1b)[0x4dfadb]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*IO::FileDescriptor++0x11e)[0x4de34e]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*IO::FileDescriptor++0x6f)[0x4de21f]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*OpenSSL::SSL::Socket++0x631)[0x5521e1]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*OpenSSL::SSL::Socket++0x6f)[0x551b9f]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*HTTP::Client#close:Nil+0x85)[0x572745]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*Refbot::Streamer#run_watcher:NoReturn+0x78)[0x572648]
/home/faust/.cache/crystal/crystal-run-worker.tmp[0x4a1ba9]
/home/faust/.cache/crystal/crystal-run-worker.tmp(*Fiber#run:(IO::FileDescriptor | Nil)+0xae)[0x4bd0ce]
/home/faust/.cache/crystal/crystal-run-worker.tmp(~proc2Proc(Fiber, (IO::FileDescriptor | Nil))+0x6)[0x49a886]
======= Memory map: ========
00400000-0064e000 r-xp 00000000 00:2c 810313                             /home/faust/.cache/crystal/crystal-run-worker.tmp
0084d000-0084e000 r--p 0024d000 00:2c 810313                             /home/faust/.cache/crystal/crystal-run-worker.tmp
0084e000-0084f000 rw-p 0024e000 00:2c 810313                             /home/faust/.cache/crystal/crystal-run-worker.tmp
0084f000-00892000 rw-p 00000000 00:00 0 
01670000-017ef000 rw-p 00000000 00:00 0                                  [heap]
7f6108000000-7f6108021000 rw-p 00000000 00:00 0 
@vladfaust
Copy link
Contributor Author

I might be misunderstanding term "Streaming". Maybe I don't need a block? 🤔 Closing for now, will check later today, I'm going to bar! 🍺

@vladfaust
Copy link
Contributor Author

vladfaust commented Sep 17, 2017

Nope, I see this the only way of implementing long polling and it doesn't work. Need some help here!

Still going to bar...

@vladfaust vladfaust reopened this Sep 17, 2017
@vladfaust
Copy link
Contributor Author

image
The best day of my life

However, still concerned about the issue

@vladfaust
Copy link
Contributor Author

This issue is essential to me, because my client wants Twitter streaming with a list of donors changeable in runtime. I told him Crystal is awesome but I can't figure this out. Anyone?

@asterite
Copy link
Member

It works for me if I do:

begin
  client.get "/" do |response|
    loop do
      puts response.body_io.gets
    end
  end
rescue ex
  # here the client should be closed, probably
end

@akzhan
Copy link
Contributor

akzhan commented Sep 18, 2017

Can you provide minimal POC - I have not seen updated_params etc.?

Anyway we need shortest full example that reproduces erroneous behavior.

@asterite
Copy link
Member

Yes, please, I forgot to mention this. Your example doesn't compile. To help you faster we need a full working example (and better if we have a server that sends data for a long polling client). Otherwise the work to fix the example has to be done by everyone who wants to help you.

@vladfaust
Copy link
Contributor Author

vladfaust commented Sep 18, 2017

I thought it would be obvious for you from the example - like, who doesn't know that IO should be closed by #fancy_method_i_dont_know. But looks like the issue is deeper than that. Okay, will write a more detailed example.

@vladfaust
Copy link
Contributor Author

I can't reproduce the issue yet due to I don't know how to do streaming in crystal (#5004)

@vladfaust
Copy link
Contributor Author

Not actual for now

@vladfaust
Copy link
Contributor Author

It's extremely hard to reproduce, because the code is private for my client. But I'll describe the context.

A program streams content from a server with chunked encoding. It needs to restart the stream from time to time.

That's a SIMPLIFIED piece of code:

class Stream
  def stream(&block : String -> Nil)
    begin
      @client = HTTP::Client.new(Host, tls: true)
      @client.get("/stream") do |response|
        yield response.body_io.gets
      end
    # I have to rescue it, it's a workaround
    rescue ex : Errno | IO::Error
      logger.info("Rescued #{ex}, means the stream is closed externally! Exiting stream.")
    rescue ex
      logger.error("Rescued #{ex}")
    end
  end

  def close
    begin
      @client.close
    rescue ex
      logger.error(ex)
    end
  end
end

spawn do
  sleep 5
  @stream.close
end

loop do
  @stream = Stream.new
  stream.stream do |response|
    puts response
  end
end

#close is called from time to time to restart stream afterwards. Eventually (not always) @client.close will raise SSL::SomeError (it happens rarely, I don't remember it for now), after rescuing it and trying to instantiate the client again it will crash the entire program with the following trace: https://pastebin.com/2zxGC7AD

That what's most important I think:

*** Error in `bin/worker': corrupted double-linked list: 0x0000000001638a10 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f2975f497e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7d814)[0x7f2975f4f814]
/lib/x86_64-linux-gnu/libc.so.6(+0x8288e)[0x7f2975f5488e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7f2975f56184]
/lib/x86_64-linux-gnu/libcrypto.so.1.0.0(CRYPTO_malloc+0x58)[0x7f2976e46e78]
/lib/x86_64-linux-gnu/libcrypto.so.1.0.0(BUF_MEM_grow+0x114)[0x7f2976ef4354]
/lib/x86_64-linux-gnu/libssl.so.1.0.0(+0x2b186)[0x7f2977253186]
bin/worker(*OpenSSL::SSL::Socket::Client#initialize<TCPSocket, OpenSSL::SSL::Context::Client, Bool, String>:Nil+0x15e)[0x6a6ece]
bin/worker(*OpenSSL::SSL::Socket::Client::new:context:sync_close:hostname<TCPSocket, OpenSSL::SSL::Context::Client, Bool, String>:OpenSSL::SSL::Socket::Client+0xe1)[0x6a6d51]
bin/worker(*HTTP::Client#socket:(OpenSSL::SSL::Socket+ | TCPSocket+)+0x137)[0x6ff327]
bin/worker(*RepostBot::Forest::Streamer#stream<Array(Int64), &Proc(Twitter::Tweet, Nil)>:(File::PReader | IO::FileDescriptor+ | OpenSSL::SSL::Socket+ | Nil)+0x330)[0x6ecac0]
bin/worker(*RepostBot::Forest#listen:NoReturn+0x1d7)[0x6da9c7]
bin/worker[0x50d00c]
bin/worker(main+0x29)[0x520ca9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f2975ef2830]
bin/worker(_start+0x29)[0x50b9d9]

The issue occurs on production machine only, which is docker with Ngnix. The app runs in docker containers, obviously.

Disclaimer

I UNDERSTAND, that this code is not reproducible, but MAYBE someone could point me where to dig.

@vladfaust vladfaust reopened this Sep 21, 2017
@vladfaust vladfaust changed the title Unable to safely close HTTP stream Unable to safely close HTTP chunked encoding stream with SSL Sep 21, 2017
@asterite
Copy link
Member

Closed as duplicate of #5020

(the later provides reproducible code)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants