-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some OpenSSL connections will cause Other OpenSSL connections (hangs | waiting)? #8626
Comments
UpdateOpenSSL Version
Call graph
|
It looks like an OpenSSL call is blocking the whole thread, which prevents the event loop/fibers to run? BIO_read should be event loop compatible, but maybe there is a bug, or we call a blocking openssl function. How does it fare with MT enabled? |
Sorry, you enabled MT but force same thread, which prevents fibers to run on many threads, thus don't benefit much from it (one blocking ssl connection will block all others). |
Stupid question: i see you spawn a fiber for each server, but do you spawn a fiber to handle each client connection? If not, each server will only be capable to handle a single connection at a time. |
@ysbaddaden !!!
Oops! I did as written on the official website, Maybe I understood the mistake at the time (May be due to translation problems. 😅) Okay, I should delete all |
See my last question, if you missed it. |
@ysbaddaden Yes.
def self.listen_servers(servers : Array(HttpServerType))
channel = Channel(Nil).new
servers.each do |server|
spawn server.listen
end ensure channel.send nil
servers.size.times { channel.receive }
end
def self.listen_all
Config.load_default_setup do |setup|
servers = [] of HttpServerType
Tunnel.create_servers servers, setup
Relay.create_servers servers, setup
listen_servers servers
end
end
{% for name in ["relay", "tunnel"] %}
class {{name.capitalize.id}}
alias ToucanType = Toucan::{{name.capitalize.id}}
alias OptionType = Option::{{name.capitalize.id}}
def self.to_option(setup : Parser::Setup, &block)
setup.{{name.id}}.try &.each do |item|
yield OptionType.from_setup item
end
end
def self.create_servers(servers, setup : Parser::Setup)
to_option setup do |option|
ToucanType.bind option do |server|
servers << server if server
end
end
end
def self.listen(option)
Config.load_setup option.setupPath do |setup|
setup.try do |_setup|
servers = [] of HttpServerType
create_servers servers, _setup
Setup.listen_servers servers
end
end
end
end
{% end %}
def accept!(server : Socket::Server)
while socket = server.accept?
socket.try do |client|
_client = client.as TCPSocket
spawn same_thread: true do
handle_client _client
end
end
end
end
def listen
raise "Can't re-start closed server" if closed?
raise "Can't start server with no sockets to listen to, use Http::Proxy#bind first" if servers.empty?
raise "Can't start running server" if listening?
@listening = true
channel = Channel(Nil).new
servers.each do |server|
spawn same_thread: true do
until closed?
accept server
end
ensure
channel.send nil
end
end
servers.size.times { channel.receive }
end ---
tunnel:
- bindHost: 0.0.0.0
bindPort: 1234
- bindHost: 0.0.0.0
bindPort: 2345
... |
@ysbaddaden By default, each tunnel server will only listen on one. |
@ysbaddaden Under normal circumstances, no problem, this problem does not occur often. When busy? Or when the connection speed is poor? This problem will appear. |
I added another fiber in I also tried to remove Thank you for your help! 🙏 def perform(request : HTTP::Request)
> spawn same_thread: true do
client = upgrade_client _client, request
unless client
return ...
end
logger.try &.connect_to host, port
upstream = connect_server
unless upstream
return ...
end
set_remote_socket_timeout upstream
server = upgrade_server upstream, request
unless server
return ...
end
transport client, server, _extract
end
> end
end |
OK. There is probabky a blocking openssl call somewhere or a bug in BIO. Same thread prevents other threads from handling the fiber, thus doesn't benefit from MT, since a single thread handles all fibers. |
@ysbaddaden After testing, I have deleted all if I kill (HTTP(S) Tunnel with MITM), Then the (HTTP(S) Relay) It will quickly process the remaining accepted connections. if I kill (HTTP(S) Relay), and then restart (HTTP(S) Relay), It won't get stuck and process the request as usual. If you use
HTTP (S) Relay uses only
def call(context : HTTP::Server::Context)
_logger = Logger.new logger, 3_i32
set_logger_option _logger
_logger.connect_to_relay context
load_..., logger
Request.new context do |request|
request... = ...
request.logger = _logger
request.option = option
request.chunk_stream request.body_io do |stream|
request.set_auto_decompress
...
request.show_summary
request.create stream do |response|
...
response.show_summary
response.set_stream_body
# response.upgrade request do |processor|
# processor.perform
# end
unless response.upgraded?
_logger.finished unless response.transportError
end
end
end
end
end If you kill (HTTP (S) Tunnel with MITM) at this time,
Call graph
|
If it's "blocked until a tcp read timeout occurs" maybe related to #8108 ? I could only debug the issue by adding some |
@rdp, @ysbaddaden
I will upgrade crystal to see if this problem is solved. The latest version of Crystal requires a full Xcode installation? Too big, my internet is a bit slow, and I may need some time. @rdp, Thank you, brother :) . |
@636f7374 was this problem completely solved only by removing |
@ysbaddaden @rdp @RX14 Oops! Tested for a period of time, it seems that it has little to do with it, it appears more frequently after adding The latest version (Crystal 0.32.1 (2019-12-31)) has been upgraded, as well as openssl
I found this problem in before, I patched def bind_tls(host : String, port : Int32, context : OpenSSL::SSL::Context::Server,
write_timeout : Int32? = nil, read_timeout : Int32? = nil, reuse_port : Bool = false) : Socket::IPAddress
tcp_server = TCPServer.new host, port, reuse_port: reuse_port
read_timeout.try { |_read_timeout| tcp_server.client_read_timeout = _read_timeout }
write_timeout.try { |_write_timeout| tcp_server.client_write_timeout = _write_timeout }
private def accept(server : Socket::Server)
begin
accept! server
rescue ex
handle_exception ex
end
end
private def accept!(server : Socket::Server)
while socket = server.accept?
next unless client = socket
spawn handle_client server, client
end
end
private def set_socket_timeout(server : Socket::Server, socket : IO?)
if socket.responds_to? :read_timeout=
socket.read_timeout = server.client_read_timeout
end
if socket.responds_to? :write_timeout=
socket.write_timeout = server.client_write_timeout
end
end
def listen
raise "Can't re-start closed server" if closed?
raise "Can't start server with no sockets to listen to, use HTTP::Server#bind first" if @sockets.empty?
raise "Can't start running server" if listening?
@listening = true
done = Channel(Nil).new
@sockets.each do |socket|
spawn do
until closed?
accept socket
end
ensure
done.send nil
end
end
@sockets.size.times { done.receive }
end
private def handle_client(server, client : IO)
set_socket_timeout server, client
...
end
very possible! Crystal Version
Call graph
|
#8108 describes a situation where the normal |
Update@rdp Yes, after testing, it is not caused by DNS. I used a custom DNS resolver, but the problem still exists.
Yes, mine too.
Sorry i understand wrong, i will test this part. class OpenSSL::SSL::Server
...
def accept? : OpenSSL::SSL::Socket::Server?
if socket = @wrapped.accept?
...
end
end
end Call Graph
Lsof Verbose
|
@rdp I set some tracking tags and got some information, which is being analyzed. (I only submitted the part that appeared to be stuck). Sorry, tags verbose is a bit messy.
Tracking Tags (
|
@rdp Full verbose. tunnel.txt - 175 KiloBytes
|
Does it hang after that log? What failures are reported when
ssl_accept_failed, does it always fail?
…On Tue, Dec 31, 2019 at 9:44 PM _ ***@***.***> wrote:
@rdp <https://github.com/rdp> full verbose
log.txt <https://github.com/crystal-lang/crystal/files/4013805/log.txt>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8626?email_source=notifications&email_token=AAADBUAZDMUFKXOL4USBW2LQ3QNSTA5CNFSM4J7I5BF2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEH45EOI#issuecomment-570020409>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADBUFEESDYOPEQPYIQ3MTQ3QNSTANCNFSM4J7I5BFQ>
.
|
@rdp I updated the pending location information, I re-edited this post, I may not see the updated content in the email, so I repost it. 2020-01-01 03:49:21.881545000 - Tunnel invalid memory access (my cause), Restart |
Does it always hang for around 10s? What's a backtrace (ex: gdb) during a
hang?
…On Wed, Jan 1, 2020 at 11:45 AM _ ***@***.***> wrote:
@rdp <https://github.com/rdp> I updated the pending location information,
I re-edited this post, I may not see the updated content in the email, so I
repost it.
2020-01-01 03:49:21.881545000 - Tunnel invalid memory access (my cause),
Restart
... (Processing)
2020-01-01 03:49:48.694403000 - Relay Suspected hangs ~ 10 seconds.
... (Processed some)
2020-01-01 03:50:00.379686000 - continued to hangs.
... (Processed some)
2020-01-01 03:50:00.379686000 - continued to hangs.
... (Processed some)
2020-01-01 03:50:18.787294000 - continued to hangs.
... (Processed some)
2020-01-01 03:50:33.663965000 - continued to hangs.
(Kill tunnel, at the same time, Relay starts accepting clients quickly)
... (Done)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8626?email_source=notifications&email_token=AAADBUDUNBDOVKVOH3MAD2TQ3TQDVA5CNFSM4J7I5BF2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEH5KLSI#issuecomment-570074569>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADBUDYP7MMUNRVPBQMBH3Q3TQDVANCNFSM4J7I5BFQ>
.
|
Under normal circumstances (without My stupid question: Can |
The logs indicate that this problem may occur near |
OK |
Sorry, for some reason (Fix bugs), I didn't reply to you in time. After testing, this problem does not occur with
Other than that, I couldn't find any conditions that caused this problem. Before:
After:
I am going to make crystal SOCKS5 Client / Server, Maybe this problem will be resolved some days. thank you for your help @rdp. 👍 |
OK if you can come up with some way to repro it in isolation (simplified
test case would be even better) feel free to post back, maybe very specific
instructions, or something like that, good luck!
…On Mon, Jan 13, 2020 at 2:33 PM _ ***@***.***> wrote:
@rdp <https://github.com/rdp>
Sorry, for some reason (Fix bugs), I didn't reply to you in time.
After testing, this problem does not occur with TCPServer (bind_tcp), UNIXServer
(bind_unix) (so far).
So I dropped my plan to use OpenSSL::SSL::Server (bind_tls).
- I guess
- I still suspect BIO_read (at some point, it will cause the
context switch to fail). ?
- [RFC] Fiber preemption, blocking calls and other concurrency
issues #1454 <#1454> ?
- Using OpenSSL::SSL::Server and HTTP::Client at the same time can
cause context switching to fail at certain times. ?
Other than that, I couldn't find any conditions that caused this problem.
Before:
- Client <=Encrypt=> Tunnel <=Decrypt and ReCrypt=> Relay (SSL Server)
<=HTTP::Client=> Remote
After:
- Client <=Encrypt=> Tunnel <=Decrypt=> Relay (TCP Server)
<=HTTP::Client=> Remote
I am going to make crystal SOCKS5 Client / Server, Maybe this problem will
be resolved some days.
thank you for your help @rdp <https://github.com/rdp>. 👍
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8626?email_source=notifications&email_token=AAADBUBI32O7JUXM73JXA6DQ5TMZRA5CNFSM4J7I5BF2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI2LW5I#issuecomment-573881205>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADBUBD774W7CTZXCSYVP3Q5TMZRANCNFSM4J7I5BFQ>
.
|
@waj Sorry, my program has been substantially modified and has been replaced with UNIXSocket, I read your pull request, I think there is no problem :) |
Summary
Process
Guess?
C.getaddrinfo
Some TCPSocket connections will cause HTTP::Server accept (freeze | blocking | hangs | waiting)? #8376 .BIO_read
? (See Call graph).same_thread: true
,-Dpreview_mt
not work?)Some Code?
Due to the large amount of code, I only capture a part.
OpenSSL::SSL::SuperSocket::{Client|Server}.new
are just lightweight wrapped forOpenSSL::SSL::Socket::{Client|Server}.new
.HTTP(S) Tunnel with MITM
Lsof Verbose
Logging (Some privacy data is involved, I am obscure some of it.)
Call graph
Crystal Version
Crystal 0.32.1
for the time being.Finally
The text was updated successfully, but these errors were encountered: