Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch HTTPClient to accept socket_connect_timeout #3174

Merged

Conversation

philippthun
Copy link
Member

Monkey patch HTTPClient to accept socket_connect_timeout as optional configuration. If set, it is used as connect_timeout parameter when creating a TCPSocket.

Set the socket_connect_timeout in Diego::Client to (overall) connect_timeout / 2.

Add test to ensure that the httpclient version is not updated without revisiting the monkey patch.

Background information

ℹ️ Use case: Mitigates the risk of Runner is unavailable and Stager is unavailable errors during a cf push when the client is unable to reach a bbs instance.


❌ First, discarded solution proposal: #3002


✔️ Second, merged solution proposal: #3048

💥 Issues caused by #3048

ℹ️ Root cause: Net::HTTP and altered Diego::Client implementations are not thread-safe.

🔄 Revert PR: #3113


✔️ Third, merged solution proposal: #3170

💩 Issues caused by #3170

  • cc-unit-tests failing (Concourse)

ℹ️ Root cause: Concourse executes the unit tests with Ruby 3.1.3 (as opposed to the version 3.1.2 specified in .ruby-version) which includes a newer Net::HTTP version that reverted the usage of the Socket.tcp's connect_timeout option (see ruby/net-http#74).

🔄 Revert PR: #3171


  • I have reviewed the contributing guide

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the main branch

  • I have run all the unit tests using bundle exec rake

  • I have run CF Acceptance Tests

Monkey patch HTTPClient to accept socket_connect_timeout as optional
configuration. If set, it is used as connect_timeout parameter when
creating a TCPSocket.

Set the socket_connect_timeout in Diego::Client to (overall)
connect_timeout / 2.

Add test to ensure that the httpclient version is not updated without
revisiting the monkey patch.

Co-authored-by: Johannes Haass <[email protected]>
@philippthun
Copy link
Member Author

@johha and me prepared a local test environment to simulate an unresponsive bbs server and ran tests with different HTTP clients:

HTTP client 10 sequential pings 50 parallel threads Notes
httpclient ❌ fails n/a current implementation
httpclient + monkey patch ✅ succeeds ✅ succeeds set connect_timeout on TCPSocket
net_http (only v0.2.x) ✅ succeeds ❌ fails
net_http_persistent
(only with net_http v0.2.x)
✅ succeeds ✅ succeeds pool_size: 50
net_http (other versions) ❌ fails n/a
http.rb ❌ fails n/a

@johha johha marked this pull request as ready for review February 3, 2023 09:55
@johha johha merged commit 71ba242 into cloudfoundry:main Feb 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants