blocking call in one fiber can cause IO timeouts in others #8480

rdp · 2019-11-16T05:33:14Z

It seems that if one fiber blocks its thread on a call, IO in other fibers (for instance, one waiting for a connect to complete) can be marked as "timed out" even though the socket operation might still be in a healthy state.

Example:

require "socket"
lib C
  fun sleep(value : UInt32) : UInt32
end

spawn {
  loop {
    TCPSocket.new("facebook.com", 80, 0.1, 0.1).close # DNS and connect timeouts of 0.1s each
    STDOUT.print "."
  }
}

loop {
  puts "sleeping"
  Fiber.yield
  C.sleep(1) # blocks the thread so the other fiber exceeds its connect timeout; with sleep(1) instead, it works
}

On Linux I get:

sleeping
sleeping
Unhandled exception in spawn: connect: Network is unreachable (Errno)
  from /usr/share/crystal/src/socket/tcp_socket.cr:75:15 in 'initialize'
  from /usr/share/crystal/src/socket/tcp_socket.cr:27:3 in 'new'
  from bad.cr:9:5 in '->'
  from /usr/share/crystal/src/fiber.cr:255:3 in 'run'
  from /usr/share/crystal/src/fiber.cr:48:34 in '->'
  from ???

and on OS X:

sleeping
sleeping
sleeping
Unhandled exception in spawn: connect timed out (IO::Timeout)
  from /usr/local/Cellar/crystal/0.31.1/src/socket/tcp_socket.cr:75:15 in 'initialize'
  from /usr/local/Cellar/crystal/0.31.1/src/socket/tcp_socket.cr:27:3 in 'new'
  from test2.cr:9:5 in '->'
  from /usr/local/Cellar/crystal/0.31.1/src/fiber.cr:255:3 in 'run'
  from /usr/local/Cellar/crystal/0.31.1/src/fiber.cr:48:34 in '->'
sleeping

Just wondering whether there's an easy fix (retry non-blocking before raising?). I suppose one answer is "don't make long-running C calls", but sometimes that's tricky to avoid...

Related to #1454, but wanted to make it a separate issue so it could be discussed on its own merits.

Crystal 0.31.1

ysbaddaden · 2019-11-16T10:40:24Z

That's because you block the current thread, which prevents other fibers from running. By the time the other fibers are resumed, their timeouts have already elapsed, so the operation is reported as timed out whatever state it is actually in.
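The same effect can be seen without any network involved. The following is only a minimal sketch, assuming a single-threaded scheduler (the default in 0.31.1); the 0.1s and 1s values are arbitrary, and the `lib C` binding is the one from the report above. A fiber asks to be woken after 0.1s, but it can only be resumed once the blocking call returns and the main fiber goes back to the scheduler:

```crystal
lib C
  fun sleep(value : UInt32) : UInt32
end

spawn do
  started = Time.monotonic
  sleep 0.1.seconds # asks the scheduler to resume this fiber after about 0.1s
  elapsed = (Time.monotonic - started).total_seconds
  puts "fiber resumed after #{elapsed.round(1)}s"
end

Fiber.yield       # let the fiber start and register its timer
C.sleep(1)        # blocks the whole thread, so the event loop cannot run
sleep 0.2.seconds # return to the scheduler so the fiber can finally be resumed
```

Under that assumption the fiber should report a delay close to the length of the blocking call rather than 0.1s. A connect timeout is measured the same way, which is why the socket looks timed out by the time its fiber runs again.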

rdp · 2019-11-16T18:22:42Z

OK, I guess with the current implementation there is no "easy fix" (i.e. retrying the operation non-blocking before raising the timeout). We mark the socket as non-blocking and set the connect timeout on it, but then don't return to it before the timeout has elapsed to check whether the connect succeeded, which apparently isn't allowed on Linux. Maybe someday multi-threading will let another thread "steal" the fiber and finish it, or there will be some other fix. Deferring to #1454. Thanks for the feedback :)

rdp closed this as completed Nov 16, 2019
