Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to handshake with localhost mongo #3346

Closed
keang opened this issue Dec 4, 2023 · 13 comments
Closed

Failed to handshake with localhost mongo #3346

keang opened this issue Dec 4, 2023 · 13 comments
Assignees
Labels

Comments

@keang
Copy link

keang commented Dec 4, 2023

I could not connect to mongo running on localhost.

On macOS 13.6, start mongo:

$ docker run -p 27018:27017 -d mongo:7-jammy

On both Truffleruby Native and Truffleruby JVM, I can't connect to that running mongo service:

$ gem i mongo
$ ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'


truffleruby 24.0.0-dev-670fdb41, like ruby 3.2.2, GraalVM CE JVM [aarch64-darwin]
Mongo gem version: 2.19.1

W, [2023-12-04T01:03:28.171351 #92821]  WARN -- : MONGODB | Error running awaited hello on localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 20 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T01:03:28.186156 #92821]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, sconn:60)
W, [2023-12-04T01:03:28.204798 #92821]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketError: IOError: closed stream (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, connection 1:1)
W, [2023-12-04T01:03:38.210658 #92821]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T01:03:38.211678 #92821]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T01:03:48.736351 #92821]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T01:03:48.736937 #92821]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
/Users/keang.song/.asdf/installs/ruby/truffleruby+graalvm-dev/lib/gems/gems/mongo-2.19.1/lib/mongo/socket.rb:476:in `map_exceptions': IOError: closed stream (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, connection 1:1, modern retry, attempt 1, later retry failed: Mongo::Error::NoServerAvailable: No primary server is available in cluster: #<Cluster topology=Single[localhost:27018] servers=[#<Server address=localhost:27018 UNKNOWN pool=#<ConnectionPool size=0 (0-20) used=0 avail=0 pending=0 paused>>]> with timeout=30, LT=0.015) (Mongo::Error::SocketError)
	from /Users/keang.song/.asdf/installs/ruby/truffleruby+graalvm-dev/lib/gems/gems/mongo-2.19.1/lib/mongo/socket.rb:203:in `read'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby+graalvm-dev/lib/gems/gems/mongo-2.19.1/lib/mongo/protocol/message.rb:253:in `deserialize'

On mri ruby it works fine:

$ ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'

ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
Mongo gem version: 2.19.1
admin
config
local

It also works fine when connecting from an Amazon Linux 2 box to a remote mongo running on another EC2 box.

@eregon
Copy link
Member

eregon commented Dec 4, 2023

I tried to reproduce locally on Linux, but for me it worked with truffleruby-dev (in Native mode):

$ chruby 3.2.2
$ ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
Mongo gem version: 2.19.3
admin
config
local

$ chruby truffleruby-dev
$ ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'
truffleruby 24.0.0-dev-4cd14c48, like ruby 3.2.2, GraalVM CE Native [x86_64-linux]
Mongo gem version: 2.19.3
admin
config
local

Could you try with the dev version of the Native Standalone (see the README)?

@eregon
Copy link
Member

eregon commented Dec 4, 2023

It also works with truffleruby+graalvm-dev from ruby-build for me locally:

$ ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'
truffleruby 24.0.0-dev-9a84a125, like ruby 3.2.2, GraalVM CE JVM [x86_64-linux]
Mongo gem version: 2.19.3
admin
config
local
ruby -e   55.37s user 1.70s system 836% cpu 6.824 total

I wonder if the issue is this 10 seconds timeout + running on a different machine & platform (macOS).
In JVM mode it's expected that startup and warmup are slower.

@eregon eregon self-assigned this Dec 4, 2023
@nirvdrum
Copy link
Collaborator

nirvdrum commented Dec 5, 2023

I was able to reproduce on macOS 14.1.2 with an M1 Pro running a 24.0.0-dev build. I haven't looked any further than that, however, so I'm not yet sure what's at fault.

truffleruby 24.0.0-dev-4cd14c48, like ruby 3.2.2, GraalVM CE JVM [aarch64-darwin]
Mongo gem version: 2.19.3
W, [2023-12-04T21:38:48.940637 #59634]  WARN -- : MONGODB | Error running awaited hello on localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 20 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T21:38:48.939514 #59634]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, sconn:10)
W, [2023-12-04T21:38:49.001346 #59634]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketError: IOError: closed stream (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, connection 1:1)
W, [2023-12-04T21:38:59.026710 #59634]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T21:38:59.032327 #59634]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T21:39:09.554606 #59634]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-04T21:39:09.557310 #59634]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
/Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/socket.rb:476:in `map_exceptions': IOError: closed stream (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, connection 1:1, modern retry, attempt 1, later retry failed: Mongo::Error::NoServerAvailable: No primary server is available in cluster: #<Cluster topology=Single[localhost:27018] servers=[#<Server address=localhost:27018 UNKNOWN pool=#<ConnectionPool size=0 (0-20) used=0 avail=0 pending=0 paused>>]> with timeout=30, LT=0.015) (Mongo::Error::SocketError)
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/socket.rb:203:in `read'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/protocol/message.rb:253:in `deserialize'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:137:in `block (3 levels) in handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_common.rb:146:in `add_server_diagnostics'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:135:in `block (2 levels) in handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/round_trip_time_averager.rb:39:in `measure'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:134:in `block in handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server.rb:502:in `handle_handshake_failure!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:132:in `handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:68:in `handshake_and_authenticate!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection.rb:276:in `do_connect'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection.rb:234:in `connect!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:980:in `connect_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:1331:in `connect_or_raise'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:1270:in `retrieve_and_connect_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:365:in `check_out'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:704:in `with_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server.rb:494:in `with_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/operation/shared/op_msg_executable.rb:35:in `execute'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:263:in `block (2 levels) in read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:193:in `modern_read_with_retry'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:117:in `read_with_retry'
	from /Users/nirvdrum/dev/workspaces/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_44DD63DDB7_JAVA21/graalvm-44dd63ddb7-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/mri/forwardable.rb:240:in `read_with_retry'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:262:in `block in read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:1156:in `with_session'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:261:in `read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:974:in `list_databases'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:944:in `database_names'
	from -e:1:in `<main>'
/Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server_selector/base.rb:306:in `select_server_impl': No primary server is available in cluster: #<Cluster topology=Single[localhost:27018] servers=[#<Server address=localhost:27018 UNKNOWN pool=#<ConnectionPool size=0 (0-20) used=0 avail=0 pending=0 paused>>]> with timeout=30, LT=0.015 (Mongo::Error::NoServerAvailable)
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server_selector/base.rb:178:in `select_server'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable.rb:50:in `select_server'
	from /Users/nirvdrum/dev/workspaces/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_44DD63DDB7_JAVA21/graalvm-44dd63ddb7-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/mri/forwardable.rb:240:in `select_server'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:265:in `retry_read'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:198:in `modern_read_with_retry'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:117:in `read_with_retry'
	from /Users/nirvdrum/dev/workspaces/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_44DD63DDB7_JAVA21/graalvm-44dd63ddb7-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/mri/forwardable.rb:240:in `read_with_retry'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:262:in `block in read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:1156:in `with_session'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:261:in `read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:974:in `list_databases'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:944:in `database_names'
	from -e:1:in `<main>'
/Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/socket.rb:476:in `map_exceptions': IOError: closed stream (for 127.0.0.1:27018 (no TLS)) (on localhost:27018, connection 1:1, modern retry, attempt 1, later retry failed: Mongo::Error::NoServerAvailable: No primary server is available in cluster: #<Cluster topology=Single[localhost:27018] servers=[#<Server address=localhost:27018 UNKNOWN pool=#<ConnectionPool size=0 (0-20) used=0 avail=0 pending=0 paused>>]> with timeout=30, LT=0.015) (Mongo::Error::SocketError)
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/socket.rb:203:in `read'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/protocol/message.rb:253:in `deserialize'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:137:in `block (3 levels) in handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_common.rb:146:in `add_server_diagnostics'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:135:in `block (2 levels) in handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/round_trip_time_averager.rb:39:in `measure'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:134:in `block in handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server.rb:502:in `handle_handshake_failure!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:132:in `handshake!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/pending_connection.rb:68:in `handshake_and_authenticate!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection.rb:276:in `do_connect'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection.rb:234:in `connect!'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:980:in `connect_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:1331:in `connect_or_raise'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:1270:in `retrieve_and_connect_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:365:in `check_out'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server/connection_pool.rb:704:in `with_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/server.rb:494:in `with_connection'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/operation/shared/op_msg_executable.rb:35:in `execute'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:263:in `block (2 levels) in read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:193:in `modern_read_with_retry'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/retryable/read_worker.rb:117:in `read_with_retry'
	from /Users/nirvdrum/dev/workspaces/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_44DD63DDB7_JAVA21/graalvm-44dd63ddb7-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/mri/forwardable.rb:240:in `read_with_retry'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:262:in `block in read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:1156:in `with_session'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/database.rb:261:in `read_command'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:974:in `list_databases'
	from /Users/nirvdrum/.gem/truffleruby-ws-dev/gems/mongo-2.19.3/lib/mongo/client.rb:944:in `database_names'
	from -e:1:in `<main>'

@keang
Copy link
Author

keang commented Dec 5, 2023

@eregon Native dev also doesn't work:

 ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'

truffleruby 24.0.0-dev-9a84a125, like ruby 3.2.2, GraalVM CE Native [aarch64-darwin]
Mongo gem version: 2.19.3
W, [2023-12-05T21:50:50.855895 #44378]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-05T21:50:50.857974 #44378]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)

Thanks @nirvdrum! This is likely a macOS issue?

@eregon eregon added the macos label Dec 5, 2023
@eregon
Copy link
Member

eregon commented Dec 5, 2023

Right, so it seems macOS-specific.
I took a look at the stacktrace and I wonder if it could be an issue of IO.select, TruffleRuby implements it using poll() but that has some weird edge case on macOS IIRC.
There is a Kernel.select at lib/mongo/socket.rb.

@eregon
Copy link
Member

eregon commented Dec 5, 2023

To test that theory, @keang could you try with TruffleRuby 23.0 to see if it works there or if it's the same or different issue?
(IO.select changed to use poll() in 23.1)

@eregon
Copy link
Member

eregon commented Dec 5, 2023

I found that edge case on macOS, it only happens for MSG_OOB though, which I don't think mongo uses: 318148843f6#diff-4c37105fd6c10e347e8a274b39b2d26c7dc196bef32c5af3b940b974ed48d61b
But it could be some other disparity between select(2) and poll(2), so it's useful if you can try 23.0.

@keang
Copy link
Author

keang commented Dec 5, 2023

I couldn't install 23.0 as I have xcode 15.0, and as per this comment I could only install truffleruby-dev 🤔 unless I could cherry pick the changes

asdf install ruby truffleruby-23.0.0
TruffleRuby 23.0 and later installed by ruby-build use the faster Oracle GraalVM distribution
Oracle GraalVM uses the GFTC license, which is free for development and production use, see https://medium.com/graalvm/161527df3d76
==> Downloading content...
-> curl -q -fL -o truffleruby-23.0.0.tar.gz https://dqw8nmjcqpjn7.cloudfront.net/3e6fa0d4a76d9d7d701fe1ea1b75a5d3eab29e77d21bf9454a67b0aa31c63bb6
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  186M  100  186M    0     0  6150k      0  0:00:31  0:00:31 --:--:-- 6411k
==> Installing truffleruby-23.0.0...
-> ./lib/truffle/post_install_hook.sh
==> Installed truffleruby-23.0.0 to /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0

gem install mongo
<internal:core> core/kernel.rb:234:in `gem_original_require': dlopen(/Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/psych.bundle, 0x0002): Library not loaded: @rpath/libgraalvm-llvm.1.dylib (RuntimeError)
  Referenced from: <A9F0C409-605E-301D-9410-CC441B2298D3> /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/psych.bundle
  Reason: tried: '/opt/homebrew/opt/libyaml/lib/libgraalvm-llvm.1.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libyaml/lib/libgraalvm-llvm.1.dylib' (no such file), '/opt/homebrew/opt/libyaml/lib/libgraalvm-llvm.1.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libyaml/lib/libgraalvm-llvm.1.dylib' (no such file)
	from <internal:/Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/psych.rb:13:in `<top (required)>'
	from <internal:core> core/kernel.rb:234:in `gem_original_require'
	from <internal:/Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems.rb:610:in `load_yaml'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems/config_file.rb:351:in `load_file'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems/config_file.rb:194:in `initialize'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems/gem_runner.rb:71:in `do_configuration'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/lib/mri/rubygems/gem_runner.rb:33:in `run'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-23.0.0/bin/gem:40:in `<main>'

@eregon
Copy link
Member

eregon commented Dec 5, 2023

Looking around that code in mongo-ruby-driver it might be interesting to run with MONGO_RUBY_DRIVER_LINT=1 to get extra sanity checks

@andrykonchin
Copy link
Member

I reproduce the issue on macOS 13.4.1 (Intel) running truffleruby-dev and don't reproduce with 23.0.0:

truffleruby-dev:

ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'
truffleruby 24.0.0-dev-4f786f47, like ruby 3.2.2, GraalVM CE Native [x86_64-darwin]
Mongo gem version: 2.19.3
W, [2023-12-05T19:06:35.124504 #23408]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-05T19:06:35.125786 #23408]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-05T19:06:45.644465 #23408]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-05T19:06:45.644974 #23408]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-05T19:06:56.158072 #23408]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-05T19:06:56.158626 #23408]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
/Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/server_selector/base.rb:306:in `select_server_impl': No primary_preferred server is available in cluster: #<Cluster topology=Unknown[localhost:27018] servers=[#<Server address=localhost:27018 UNKNOWN>]> with timeout=30, LT=0.015 (Mongo::Error::NoServerAvailable)
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/server_selector/base.rb:178:in `select_server'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/cluster.rb:994:in `validate_session_support!'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:1245:in `get_session!'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:1125:in `get_session'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:1154:in `with_session'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/database.rb:261:in `read_command'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:974:in `list_databases'
	from /Users/andrykonchin/.rbenv/versions/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:944:in `database_names'
	from -e:1:in `<main>'

truffleruby-23.0.0:

ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'
truffleruby 23.0.0, like ruby 3.1.3, Oracle GraalVM Native [x86_64-darwin]
Mongo gem version: 2.19.3
admin
config
local

@keang
Copy link
Author

keang commented Dec 6, 2023

Here's the output after enabling lint (I checked Mongo::Lint.enabled as well):

 ❯ MONGO_RUBY_DRIVER_LINT=1  ruby -e 'require "mongo"; puts RUBY_DESCRIPTION; puts "Mongo gem version: #{Mongo::VERSION} lint: #{Mongo::Lint.enabled?}"; puts Mongo::Client.new("mongodb://localhost:27018").database_names'
truffleruby 24.0.0-dev-9a84a125, like ruby 3.2.2, GraalVM CE Native [aarch64-darwin]
Mongo gem version: 2.19.3 lint: true
W, [2023-12-06T15:04:13.482141 #76914]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-06T15:04:13.484761 #76914]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-06T15:04:24.016786 #76914]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-06T15:04:24.017973 #76914]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-06T15:04:34.538958 #76914]  WARN -- : MONGODB | Failed to handshake with localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
W, [2023-12-06T15:04:34.539430 #76914]  WARN -- : MONGODB | Error checking localhost:27018: Mongo::Error::SocketTimeoutError: Errno::ETIMEDOUT: Connection timed out - Took more than 10 seconds to receive data (select call timed out) (for 127.0.0.1:27018 (no TLS)) (on localhost:27018)
/Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/server_selector/base.rb:306:in `select_server_impl': No primary_preferred server is available in cluster: #<Cluster topology=Unknown[localhost:27018] servers=[#<Server address=localhost:27018 UNKNOWN>]> with timeout=30, LT=0.015 (Mongo::Error::NoServerAvailable)
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/server_selector/base.rb:178:in `select_server'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/cluster.rb:994:in `validate_session_support!'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:1245:in `get_session!'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:1125:in `get_session'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:1154:in `with_session'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/database.rb:261:in `read_command'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:974:in `list_databases'
	from /Users/keang.song/.asdf/installs/ruby/truffleruby-dev/lib/gems/gems/mongo-2.19.3/lib/mongo/client.rb:944:in `database_names'
	from -e:1:in `<main>'

@eregon
Copy link
Member

eregon commented Dec 8, 2023

I paired with @andrykonchin and indeed it's related to TruffleRuby's poll() usage for IO.select.
Specifically, if we change the Kernel.select calls in https://github.com/mongodb/mongo-ruby-driver/blob/5c641687c4d866c2fd2aa15323a39dc3b18a43f9/lib/mongo/socket.rb#L340-L342 to not pass the 3rd set (error set/array) then it works fine.

So I think this is a bug of macOS poll(2) which does not handle being called like:

poll([{fd, POLLIN}, {fd, POLLPRI}], 2, 0) // hangs on macOS, fine on Linux

Specifically it hangs even though the fd can be read (i.e., poll([{fd, POLLIN}], 1, 0) returns fine).
We also tried if manually merging them works and it does, i.e.:

poll([{fd, POLLIN|POLLPRI}], 1, 0) // returns 3=POLLIN|POLLPRI, good

Interestingly, this works fine on macOS:

poll([{fd, POLLIN}, {fd, POLLOUT}], 2, 0) // returns [POLLIN, POLLOUT], so works fine on macOS

So it sounds a specific issue for POLLPRI.

The poll man page does not specify whether it's allowed or not to have the same fd in multiple struct pollfd structs.

I'll find a fix. Ideas so far:

  • Manually merge to have a single struct per fd, cons: not so cheap to do and some complexity
  • Ignore the 3rd array of IO.select on macOS. Errors are actually handled in POLLIN and POLLOUT already for poll(), and POLLPRI seems very broken on macOS, + TCP OOB stuff is not portable and basically never used. We should still check that IOs in the 3rd array are either in the 1st or 2nd array to ensure errors are reported there, and throw some exception otherwise (suspicious call to IO.select then).

@eregon eregon assigned andrykonchin and eregon and unassigned eregon Dec 8, 2023
@eregon
Copy link
Member

eregon commented Dec 12, 2023

@keang This is fixed now on truffleruby-head, and @andrykonchin verified it indeed works on macOS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants