06. Connection Timeouts

kwrodarmer edited this page Nov 27, 2019 · 3 revisions

Connection Timeouts

Bucket Stores

The public SRA has been replicated into AWS and GCP, which has required a number of changes. In addition, the NCBI on-premises storage has been transferred to a bucket-store system as of Fall 2019.

Bucket stores do not behave like POSIX filesystems. The database technology for SRA (VDB) is designed to work with POSIX filesystems, and those are now gone, along with random access and fast response. The first issue that comes up is Time to First Byte (TTFB), which is generally very poor. It differs between the cloud and appliance providers, but it is always significant. Once a transfer starts, the data rates are reasonable, but transfers must be made in chunks of at least 10M. Any seek within a file over HTTP incurs yet another TTFB penalty. All of these have a negative effect upon VDB (in particular, VDB2).
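To make the cost of TTFB concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the 100 ms TTFB and 50 MiB/s data rate are hypothetical figures chosen for illustration, not measurements of any provider, and "10M" is assumed to mean 10 MiB.

```python
def transfer_seconds(total_bytes, request_bytes, ttfb_s, rate_bytes_per_s):
    """Estimate the time to fetch total_bytes in requests of request_bytes each.

    Every request pays one TTFB penalty before any data arrives.
    """
    # Ceiling division: a final partial request still costs a full TTFB.
    requests = -(-total_bytes // request_bytes)
    return requests * ttfb_s + total_bytes / rate_bytes_per_s


KIB = 1024
MIB = 1024 * 1024

# Hypothetical figures for illustration only.
TTFB_S = 0.1
RATE = 50 * MIB

# Fetching 1000 MiB with small random-access style reads vs. 10 MiB chunks.
small = transfer_seconds(1000 * MIB, 64 * KIB, TTFB_S, RATE)
large = transfer_seconds(1000 * MIB, 10 * MIB, TTFB_S, RATE)
print(f"64 KiB requests: {small:.0f} s; 10 MiB requests: {large:.0f} s")
```

Under these assumed numbers the time spent moving data is the same in both cases; the difference comes entirely from the number of TTFB penalties paid.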

Caching

With release 2.10.0, we made a number of changes to our retrieval and caching strategies. We now cache by default to a temporary file in a POSIX file system and perform our reads on a background thread that uses 10M chunks to try to avoid random access. Our strategies are evolving and the VDB2 code is being tuned.
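The idea of turning arbitrary reads into whole-chunk fetches can be sketched as follows. This is not the VDB implementation: it keeps chunks in memory rather than in a temporary file, does no background threading, and the `fetch_chunk` callable and `ChunkCache` class are hypothetical names used only for illustration. "10M" is assumed to mean 10 MiB.

```python
CHUNK = 10 * 1024 * 1024  # assumed meaning of "10M"; the real value may differ


def chunks_for_read(offset, length, chunk_size=CHUNK):
    """Return the indices of the fixed-size chunks covering [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))


class ChunkCache:
    """Toy cache: fetch each whole chunk once, then serve arbitrary reads locally."""

    def __init__(self, fetch_chunk, chunk_size=CHUNK):
        self.fetch_chunk = fetch_chunk  # hypothetical callable(index) -> bytes
        self.chunk_size = chunk_size
        self.chunks = {}
        self.fetches = 0  # number of remote requests actually made

    def read(self, offset, length):
        parts = []
        for index in chunks_for_read(offset, length, self.chunk_size):
            if index not in self.chunks:
                self.chunks[index] = self.fetch_chunk(index)
                self.fetches += 1
            parts.append(self.chunks[index])
        data = b"".join(parts)
        # Position of the requested offset within the first chunk we joined.
        start = offset % self.chunk_size
        return data[start:start + length]
```

With a cache like this, a second read that falls inside an already-fetched chunk costs no additional remote request, and so no additional TTFB penalty.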

Timeout Behavior

VDB has a concept of a reliable URL: one that is not expected to fail. When accessing a reliable URL, VDB becomes very stubborn in the face of network errors and retries a number of times. In the past, you may have seen many error and warning messages about network problems that looked like hard failures, with little indication that the tools had been able to overcome them and continue. We are still learning the behavior of different storage systems, and tuning timeouts is a delicate endeavor: too short and the access attempt is prematurely aborted, while too long and the tools appear to hang.
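The retry behavior described above can be illustrated with a minimal Python sketch. It is not VDB's actual retry logic: the function name, the attempt count, and the fixed delay are all assumptions made for the example. It does show why a run can print network warnings and still finish successfully.

```python
import time


def fetch_with_retries(operation, attempts=5, delay_s=0.0, log=print):
    """Call operation() until it succeeds or the attempts are exhausted.

    Failures on all but the last attempt are logged as warnings and retried;
    only the final failure is raised to the caller as a hard error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:  # timeouts and connection errors derive from OSError
            if attempt == attempts:
                raise
            log(f"warning: attempt {attempt}/{attempts} failed: {exc}; retrying")
            time.sleep(delay_s)
```

The trade-off mentioned in the text shows up directly in the parameters: more attempts and longer waits make the caller more stubborn, but also make a genuinely dead connection look like a hang for longer.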

VDB Configuration

VDB uses a hierarchical configuration approach. When you run vdb-config in interactive or command-line mode, you are editing a user-local configuration. There are a large number of variables, most of which are currently available only from command-line mode.

/libs/kns/connect/timeout       # ms to wait for a successful connection
/libs/kns/connect/timeout/read  # ms to wait for data to become available to read
/libs/kns/connect/timeout/write # ms to wait for output buffer to accept data

/http/timeout/read              # ms to wait for an HTTP response
/http/timeout/write             # ms to wait to send an HTTP request

The first set (/libs/kns) affects general socket behavior. The second set (/http) affects the behavior of a simulated file over the HTTP protocol. There is some overlap, but to illustrate the difference: a single, arbitrary read of an HTTP file may involve multiple socket reads. This means that the HTTP timeout is intended to be a higher-level value.
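The layering of the two kinds of timeout can be sketched in Python. This illustrates the concept only and is not how VDB enforces its timeouts: `http_read` and the `socket_read` callable are hypothetical, and the way the two limits are combined here is an assumption.

```python
import time


def http_read(socket_read, nbytes, socket_timeout_ms, http_timeout_ms,
              clock=time.monotonic):
    """Assemble one logical HTTP read from several lower-level socket reads.

    socket_read(max_bytes, timeout_s) is a hypothetical callable that returns
    up to max_bytes, or raises TimeoutError if no data arrives in time.
    """
    deadline = clock() + http_timeout_ms / 1000.0
    received = bytearray()
    while len(received) < nbytes:
        remaining_s = deadline - clock()
        if remaining_s <= 0:
            raise TimeoutError("HTTP read timed out")
        # Each socket read waits no longer than its own limit or the time
        # left in the overall HTTP budget, whichever is smaller.
        wait_s = min(socket_timeout_ms / 1000.0, remaining_s)
        piece = socket_read(nbytes - len(received), wait_s)
        if not piece:
            break  # peer closed the connection
        received += piece
    return bytes(received)
```

In this model every individual socket read can finish within its own limit and the HTTP read can still time out, because the higher-level budget covers the sum of all the socket reads.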