fix: Gracefully handle several types of transient TCP errors
adamlogic committed Mar 11, 2023
1 parent 68b3020 commit bbe4813
Showing 2 changed files with 11 additions and 3 deletions.
12 changes: 10 additions & 2 deletions judoscale-ruby/lib/judoscale/adapter_api.rb
@@ -10,6 +10,14 @@ class AdapterApi
include Logger

SUCCESS = "success"
+ TRANSIENT_ERRORS = [
+ Errno::ECONNREFUSED,
+ Errno::ECONNRESET,
+ Net::OpenTimeout,
+ Net::ReadTimeout,
+ OpenSSL::SSL::SSLError,
+ ]
+

def initialize(config)
@config = config
@@ -43,14 +51,14 @@ def post_raw(options)
when 200...300 then SuccessResponse.new(response.body)
else FailureResponse.new([response.code, response.message].join(" - "))
end
- rescue Net::OpenTimeout
+ rescue *TRANSIENT_ERRORS => ex
if attempts < 3
# TCP timeouts happen sometimes, but they can usually be successfully retried in a moment
sleep 0.01
attempts += 1
retry
else
- FailureResponse.new("Timeout while obtaining TCP connection to #{uri.host}")
+ FailureResponse.new("Could not connect to #{uri.host}: #{ex.inspect}")
end
end

2 changes: 1 addition & 1 deletion judoscale-ruby/test/adapter_api_test.rb
@@ -57,7 +57,7 @@
result = adapter_api.report_metrics(report_params)

_(result).must_be_instance_of Judoscale::AdapterApi::FailureResponse
- _(result.failure_message).must_equal "Timeout while obtaining TCP connection to railsautoscale.dev"
+ _(result.failure_message).must_equal "Could not connect to railsautoscale.dev: #<Net::OpenTimeout: execution expired>"
assert_requested stub, times: 3
end
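
For readers who want to see the retry pattern outside the surrounding class, here is a minimal standalone sketch. It assumes the attempt counter starts at 1, which matches the test's expectation of three requests. The helper name `with_transient_retries`, the `MAX_ATTEMPTS` constant, and the plain-string return value are hypothetical and stand in for the `post_raw` internals and `FailureResponse`, which this diff only shows in part.

```ruby
require "net/http"
require "openssl"

# Same error classes as the TRANSIENT_ERRORS list added in this commit.
TRANSIENT_ERRORS = [
  Errno::ECONNREFUSED,
  Errno::ECONNRESET,
  Net::OpenTimeout,
  Net::ReadTimeout,
  OpenSSL::SSL::SSLError
].freeze

# Hypothetical constant; the commit hard-codes the limit as `attempts < 3`.
MAX_ATTEMPTS = 3

# Hypothetical helper: runs the block and retries it when one of the
# transient errors is raised. After the final attempt it returns a failure
# message string instead of raising.
def with_transient_retries(host)
  attempts = 1 # assumed starting value, so the block runs at most 3 times
  begin
    yield
  rescue *TRANSIENT_ERRORS => ex
    if attempts < MAX_ATTEMPTS
      sleep 0.01 # brief pause before retrying, as in the commit
      attempts += 1
      retry
    else
      "Could not connect to #{host}: #{ex.inspect}"
    end
  end
end
```

The splat in `rescue *TRANSIENT_ERRORS` expands the array into a list of exception classes, so adding another error type only requires changing the constant. Including `ex.inspect` in the message keeps the exception class visible now that one rescue clause covers several failure modes.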
