Skip to content

Commit

Permalink
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
better handling of network related timeouts
Browse files Browse the repository at this point in the history
With a working network connection:
* command batch is called on non gvl thread
* tinytds_err_handler is called with timeout error and returns INT_TIMEOUT
* dbcancel is called and command batch returns
* nogvl_cleanup is called and timeout error is raised

This is all great. The timeout is hit, the db connection lives, and a error is thrown.

With a network failure:
* command batch is called on non gvl thread
* tinytds_err_handler is called with timeout error and returns INT_TIMEOUT
* dbcancel is called and does not succeed. command batch never returns
* nogvl_cleanup is not called

This is not great. dbcancel does not succeed because of the network failure. the command batch does not return until the underlying network timeout on the os is hit. TinyTds doesn't throw an error in the expected timeout window.

To fix, we set a flag when a timeout is encountered. We use dbsetinterrupt to check this flag periodically while waiting on a read from the server. Once the flag is set the interrupt with send INT_CANCEL causing the pending command batch to return early. This means nogvl_cleanup will be called and raise the timeout error.

This shouldn't have any affect in "normal" timeout conditions due to the fact that dbcancel will actually succeed and cause the normal flow before the interrupt can be called/handled. This is good because in these situtations we still want the dbproc to remain in working condition.
bvogelzang committed Apr 16, 2021

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
1 parent 4953cd9 commit a284169
Showing 9 changed files with 128 additions and 26 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
@@ -14,9 +14,9 @@ rvm:
- 2.7.0
before_install:
- docker info
- docker-compose up -d
- sudo ./test/bin/install-openssl.sh
- sudo ./test/bin/install-freetds.sh
- sudo ./test/bin/setup.sh
install:
- gem install bundler
- bundle --version
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
## (unreleased)

* Improve handling of network related timeouts

## 2.1.3

* Removed old/unused appveyor config
11 changes: 7 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -419,17 +419,20 @@ First, clone the repo using the command line or your Git GUI of choice.
$ git clone [email protected]:rails-sqlserver/tiny_tds.git
```

After that, the quickest way to get setup for development is to use [Docker](https://www.docker.com/). Assuming you have [downloaded docker](https://www.docker.com/products/docker) for your platform and you have , you can run our test setup script.
After that, the quickest way to get setup for development is to use [Docker](https://www.docker.com/). Assuming you have [downloaded docker](https://www.docker.com/products/docker) for your platform, you can use [docker-compose](https://docs.docker.com/compose/install/) to run the necessary containers for testing.

```shell
$ ./test/bin/setup.sh
$ docker-compose up -d
```

This will download our SQL Server for Linux Docker image based from [microsoft/mssql-server-linux/](https://hub.docker.com/r/microsoft/mssql-server-linux/). Our image already has the `[tinytdstest]` DB and `tinytds` users created. Basically, it does the following.
This will download our SQL Server for Linux Docker image based from [microsoft/mssql-server-linux/](https://hub.docker.com/r/microsoft/mssql-server-linux/). Our image already has the `[tinytdstest]` DB and `tinytds` users created. This will also download a [toxiproxy](https://github.com/shopify/toxiproxy) Docker image which we can use to simulate network failures for tests. Basically, it does the following.

```shell
$ docker network create main-network
$ docker pull metaskills/mssql-server-linux-tinytds
$ docker run -p 1433:1433 -d metaskills/mssql-server-linux-tinytds
$ docker run -p 1433:1433 -d --name sqlserver --network main-network metaskills/mssql-server-linux-tinytds
$ docker pull shopify/toxiproxy
$ docker run -p 8474:8474 -p 1234:1234 -d --name toxiproxy --network main-network shopify/toxiproxy
```

If you are using your own database. Make sure to run these SQL commands as SA to get the test database and user installed.
22 changes: 22 additions & 0 deletions docker-compose.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
version: '3'

networks:
main-network:

services:
mssql:
image: metaskills/mssql-server-linux-tinytds:2017-GA
container_name: sqlserver
ports:
- "1433:1433"
networks:
- main-network

toxiproxy:
image: shopify/toxiproxy
container_name: toxiproxy
ports:
- "8474:8474"
- "1234:1234"
networks:
- main-network
36 changes: 35 additions & 1 deletion ext/tiny_tds/client.c
Original file line number Diff line number Diff line change
@@ -86,7 +86,13 @@ int tinytds_err_handler(DBPROCESS *dbproc, int severity, int dberr, int oserr, c
but we don't ever want to automatically retry. Instead have the app
decide what to do.
*/
return_value = INT_TIMEOUT;
if (userdata->timing_out) {
return INT_CANCEL;
}
else {
userdata->timing_out = 1;
return_value = INT_TIMEOUT;
}
cancel = 1;
break;

@@ -165,6 +171,33 @@ int tinytds_msg_handler(DBPROCESS *dbproc, DBINT msgno, int msgstate, int severi
return 0;
}

/*
Used by dbsetinterrupt -
This gets called periodically while waiting on a read from the server
Right now, we only care about cases where a read from the server is
taking longer than the specified timeout and dbcancel is not working.
In these cases we decide that we actually want to handle the interrupt
*/
static int check_interrupt(void *ptr) {
GET_CLIENT_USERDATA((DBPROCESS *)ptr);
return userdata->timing_out;
}

/*
Used by dbsetinterrupt -
This gets called if check_interrupt returns TRUE.
Right now, this is only used in cases where a read from the server is
taking longer than the specified timeout and dbcancel is not working.
Return INT_CANCEL to abort the current command batch.
*/
static int handle_interrupt(void *ptr) {
GET_CLIENT_USERDATA((DBPROCESS *)ptr);
if (userdata->timing_out) {
return INT_CANCEL;
}
return INT_CONTINUE;
}

static void rb_tinytds_client_reset_userdata(tinytds_client_userdata *userdata) {
userdata->timing_out = 0;
userdata->dbsql_sent = 0;
@@ -381,6 +414,7 @@ static VALUE rb_tinytds_connect(VALUE self, VALUE opts) {
}
}
dbsetuserdata(cwrap->client, (BYTE*)cwrap->userdata);
dbsetinterrupt(cwrap->client, check_interrupt, handle_interrupt);
cwrap->userdata->closed = 0;
if (!NIL_P(database) && (azure != Qtrue)) {
dbuse(cwrap->client, StringValueCStr(database));
1 change: 1 addition & 0 deletions ext/tiny_tds/result.c
Original file line number Diff line number Diff line change
@@ -91,6 +91,7 @@ static void nogvl_setup(DBPROCESS *client) {
static void nogvl_cleanup(DBPROCESS *client) {
GET_CLIENT_USERDATA(client);
userdata->nonblocking = 0;
userdata->timing_out = 0;
/*
Now that the blocking operation is done, we can finally throw any
exceptions based on errors from SQL Server.
57 changes: 38 additions & 19 deletions test/client_test.rb
Original file line number Diff line number Diff line change
@@ -68,6 +68,9 @@ class ClientTest < TinyTds::TestCase
end

describe 'With in-valid options' do
before(:all) do
init_toxiproxy
end

it 'raises an argument error when no :host given and :dataserver is blank' do
assert_raises(ArgumentError) { new_connection :dataserver => nil, :host => nil }
@@ -129,30 +132,46 @@ class ClientTest < TinyTds::TestCase
end
end

it 'must run this test to prove we account for dropped connections' do
skip
it 'raises TinyTds exception with tcp socket network failure' do
skip if ENV['CI'] && ENV['APPVEYOR_BUILD_FOLDER'] # only CI using docker
begin
client = new_connection :login_timeout => 2, :timeout => 2
client = new_connection timeout: 2, port: 1234
assert_client_works(client)
STDOUT.puts "Disconnect network!"
sleep 10
STDOUT.puts "This should not get stuck past 6 seconds!"
action = lambda { client.execute('SELECT 1 as [one]').each }
assert_raise_tinytds_error(action) do |e|
assert_equal 20003, e.db_error_number
assert_equal 6, e.severity
assert_match %r{timed out}i, e.message, 'ignore if non-english test run'
action = lambda { client.execute("waitfor delay '00:00:05'").do }

# Use toxiproxy to close the TCP socket after 1 second.
# We want TinyTds to execute the statement, hit the timeout configured above, and then not be able to use the network to cancel
# the network connection needs to close after the sql batch is sent and before the timeout above is hit
Toxiproxy[:sqlserver_test].toxic(:slow_close, delay: 1000).apply do
assert_raise_tinytds_error(action) do |e|
assert_equal 20003, e.db_error_number
assert_equal 6, e.severity
assert_match %r{timed out}i, e.message, 'ignore if non-english test run'
end
end
ensure
STDOUT.puts "Reconnect network!"
sleep 10
action = lambda { client.execute('SELECT 1 as [one]').each }
assert_raise_tinytds_error(action) do |e|
assert_equal 20047, e.db_error_number
assert_equal 1, e.severity
assert_match %r{dead or not enabled}i, e.message, 'ignore if non-english test run'
assert_new_connections_work
end
end

it 'raises TinyTds exception with dead connection network failure' do
skip if ENV['CI'] && ENV['APPVEYOR_BUILD_FOLDER'] # only CI using docker
begin
client = new_connection timeout: 2, port: 1234
assert_client_works(client)
action = lambda { client.execute("waitfor delay '00:00:05'").do }

# Use toxiproxy to close the network connection after 1 second.
# We want TinyTds to execute the statement, hit the timeout configured above, and then not be able to use the network to cancel
# the network connection needs to close after the sql batch is sent and before the timeout above is hit
Toxiproxy[:sqlserver_test].toxic(:timeout, timeout: 1000).apply do
assert_raise_tinytds_error(action) do |e|
assert_equal 20047, e.db_error_number
assert_includes [1,9], e.severity
assert_match %r{dead or not enabled}i, e.message, 'ignore if non-english test run'
end
end
close_client(client)
ensure
assert_new_connections_work
end
end
22 changes: 21 additions & 1 deletion test/test_helper.rb
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
require 'bundler' ; Bundler.require :development, :test
require 'tiny_tds'
require 'minitest/autorun'
require 'toxiproxy'

TINYTDS_SCHEMAS = ['sqlserver_2000', 'sqlserver_2005', 'sqlserver_2008', 'sqlserver_2014', 'sqlserver_azure', 'sybase_ase'].freeze

@@ -212,6 +213,25 @@ def rollback_transaction(client)
client.execute("ROLLBACK TRANSACTION").do
end

def init_toxiproxy
return if ENV['APPVEYOR_BUILD_FOLDER'] # only for CI using docker

# In order for toxiproxy to work for local docker instances of mssql, the containers must be on the same network
# and the host used below must match the mssql container name so toxiproxy knows where to proxy to.
# localhost from the perspective of toxiproxy's container is its own container an *not* the mssql container it needs to proxy to.
# docker-compose.yml handles this automatically for us. In instances where someone is using their own local mssql container they'll
# need to set up the networks manually and set TINYTDS_UNIT_HOST to their mssql container name
# For anything other than localhost just use the environment config
env_host = ENV['TINYTDS_UNIT_HOST_TEST'] || ENV['TINYTDS_UNIT_HOST'] || 'localhost'
host = ['localhost', '127.0.0.1', '0.0.0.0'].include?(env_host) ? 'sqlserver' : env_host
port = ENV['TINYTDS_UNIT_PORT'] || 1433
Toxiproxy.populate([
{
name: "sqlserver_test",
listen: "0.0.0.0:1234",
upstream: "#{host}:#{port}"
}
])
end
end
end

1 change: 1 addition & 0 deletions tiny_tds.gemspec
Original file line number Diff line number Diff line change
@@ -26,4 +26,5 @@ Gem::Specification.new do |s|
s.add_development_dependency 'rake-compiler-dock', '~> 1.0'
s.add_development_dependency 'minitest', '~> 5.6'
s.add_development_dependency 'connection_pool', '~> 2.2'
s.add_development_dependency 'toxiproxy', '~> 2.0.0'
end

0 comments on commit a284169

Please sign in to comment.