
Implement client-side timeouts #329

Closed
wants to merge 11 commits into from

Conversation

@psarna psarna commented Nov 4, 2021

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to PR description.

This pull request introduces client-side timeouts and a management layer for orphaned stream ids. Orphaned stream ids are the ones for which the client-side timeout has already fired, but no response has arrived from the server yet. In such a scenario, the stream id cannot be released yet, because the response associated with it may still be received later. Instead, orphaned stream ids are tracked per connection, and once a threshold is reached, the connection is killed.

Fixes #304
Fixes #305
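
For illustration, the orphan-tracking idea described above could be sketched roughly like this (the type and constant names are made up for the example and are not the PR's actual types):

use std::collections::HashSet;

const MAX_ORPHANED_STREAM_IDS: usize = 1024;

// Illustrative sketch only -- not the tracking structure introduced by this PR.
#[derive(Default)]
struct OrphanTracker {
    orphaned: HashSet<i16>,
}

impl OrphanTracker {
    /// Called when a client-side timeout fires before the response arrived.
    /// Returns true if the orphan threshold was reached and the connection
    /// should be killed.
    fn mark_orphaned(&mut self, stream_id: i16) -> bool {
        self.orphaned.insert(stream_id);
        self.orphaned.len() >= MAX_ORPHANED_STREAM_IDS
    }

    /// Called when a late response finally arrives for an orphaned id,
    /// so the id can be returned to the pool.
    fn release(&mut self, stream_id: i16) {
        self.orphaned.remove(&stream_id);
    }
}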

@psarna psarna force-pushed the client_side_timeouts branch 3 times, most recently from 19eff12 to b287ef2 on November 5, 2021 08:10

psarna commented Nov 5, 2021

Tested manually by setting up a program with a very low client timeout and observing the orphan count rise to the threshold.

@psarna psarna requested review from piodul and cvybhu November 8, 2021 11:35
@psarna psarna marked this pull request as ready for review November 8, 2021 11:36
Comment on lines 66 to 71
    AllocStreamId {
        stream_id_sender: oneshot::Sender<i16>,
        response_handler: ResponseHandler,
    },
    // Send a request to the server
    Request {
        stream_id: i16,
        serialized_request: SerializedRequest,
    },
Collaborator

I think those should be a single message. Consider a situation in which the future returned from send_request sends AllocStreamId, then is dropped while waiting for the stream ID and does not send a Request message. The stream ID will not be freed in this case.

Contributor Author

Hm, right, makes sense to reduce the back-and-forth anyway. And all it takes is to add the stream_id_sender to Request. Will do in v2
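
For illustration, the merged variant could look roughly like this (a sketch based on the snippet above, not necessarily the exact shape that landed in v2; it assumes the ResponseHandler and SerializedRequest types from this PR's diff):

// Sketch: the two variants above folded into one, so a caller can no longer
// allocate a stream id without also submitting the request.
enum Task {
    Request {
        response_handler: ResponseHandler,
        // The allocated stream id is reported back on this channel.
        stream_id_sender: tokio::sync::oneshot::Sender<i16>,
        serialized_request: SerializedRequest,
    },
}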

@psarna psarna force-pushed the client_side_timeouts branch from 3cfb15a to 6b4051f on November 9, 2021 10:13

psarna commented Nov 9, 2021

v2:

  • merged AllocStreamId and Request into a single task -- there was no point in splitting it in two in the first place; the stream id is simply returned via a separate oneshot channel
  • retested

@psarna psarna mentioned this pull request Nov 9, 2021
@@ -61,9 +61,17 @@ pub struct Connection {

Collaborator

Regarding the commit message: I believe there are 32k streams available on a connection, not 16k.

Contributor Author

Yeah, but negative ids are reserved for events iirc, so users can de facto use 16k

Contributor Author

Oh wait, the pool is 64k and 32k are for the user, you're right

Comment on lines 777 to 911
if let Some(id) = hmap.allocate(response_handler) {
    stream_id_sender
        .send(id)
        .map_err(|_| QueryError::UnableToAllocStreamId)?;
    id
} else {
    error!("Unable to allocate stream id");
    return Err(QueryError::UnableToAllocStreamId);
}
Collaborator

If an error occurs here, it will be returned from the writer function, which will close the whole connection and will propagate this error to everybody who waits for a response on this connection. Previously, if we couldn't allocate a stream ID for a request, only the one who waits for this request got an error (because the Sender half of the response channel was dropped and the Receiver gets an error). Is that an intended change?

Contributor Author

Yeah, it was intended, but we can discuss whether it's too harsh or not

"Too many orphaned stream ids: {}, dropping connection",
hmap.orphan_count()
);
return Err(QueryError::TooManyOrphanedStreamIds(hmap.orphan_count()));
Collaborator

I'm a bit worried that immediately closing a connection in case of too many orphaned IDs may unnecessarily cause availability issues. What if all connections start timing out requests at the same rate and we decide to close all of them for that reason at the same time?

Comment on lines 1067 to 1068
// Spontaneous errors are expected when running with a client timeout set to 0 seconds.
// If they happen, the test case is assumed to be correct
Collaborator

I feel that it would be nice for client-side timeouts to be configurable per-query, not per-session. Latency requirements may be different for different queries - in particular, setting the timeout too low may cause some of the internal queries not to work correctly, e.g. topology refresh. Also, if we did that, this test could be written differently so that it only expects failure from the query with an explicitly set client-side timeout.

We could have a separate internal timeout for queries (maybe the connection timeout could be used?)

Contributor Author

True, I'll add a way to define a client timeout per query
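
For illustration, per-query usage might look roughly like the sketch below; the setter name and exact shape of the API are assumptions, not necessarily what v3 of this PR adds.

use std::time::Duration;
use scylla::query::Query;

fn build_query() -> Query {
    let mut query = Query::new("SELECT * FROM ks.t WHERE pk = 1");
    // Override the session-wide client-side timeout for this query only.
    // NOTE: the setter name is an assumption, not necessarily this PR's API.
    query.set_request_timeout(Some(Duration::from_millis(200)));
    query
}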

req.set_stream(stream_id);
write_half.write_all(req.get_data()).await?;
Task::Orphan { stream_id } => {
let mut hmap = handler_map.try_lock().unwrap();
Collaborator

Here, we probably can remove the entry from handler_map corresponding to the orphaned stream ID.

Contributor Author

Why? The id is orphaned, but a response to it can still arrive, so once it does, it will remove the entry as well. And if it doesn't, and we eventually hit the max orphan count, the connection dies anyway. Or do you mean that only as an optimization?

Collaborator

I mainly meant it as an optimization - the memory used by the entry can be freed at this point because the contents of the entry won't be used (it's a Sender of a oneshot channel which won't be read from).

@psarna psarna force-pushed the client_side_timeouts branch from 6b4051f to 8a32ff9 on November 12, 2021 09:56
The configuration option will be used to apply client-side timeouts
for requests.
which indicates that the query timed out client-side,
but it can still be processed by Scylla for all we know.
Client-side timeout is now applied to requests - if a request takes
longer to receive a response than the specified timeout, it will be
reported as an error.
The error fires when a threshold of orphaned stream ids,
i.e. ones on which the client stopped waiting but no response
has come from the server, is reached.
@psarna psarna force-pushed the client_side_timeouts branch from 8a32ff9 to b993a7c on November 12, 2021 10:26

psarna commented Nov 12, 2021

v3:

  • added a way to specify per-query client timeout
  • added docs


psarna commented Nov 12, 2021

The fact that the whole connection is now killed in the event of running out of stream ids is still not addressed, but after some consideration I decided to keep the original semantics. We should have better handling of running out of stream ids anyway, but there's no point in severing a connection which may still have ongoing requests just because somebody pushed one too many new requests into it.

When a client-side timeout fires, the driver stops waiting
for a response for a particular stream id. We cannot, however,
release this stream id to the pool, because the response may
still arrive late. Such abandoned stream ids are tracked separately,
and once a connection has too many of them, it breaks
and a new connection should be established instead.
The current threshold is hardcoded to 1024 (out of ~32k stream ids
total).
The number of acceptable orphaned stream ids is now configurable.
After the specified limit is hit, the connection will be killed.
It's now possible to specify the client-side timeout independently
for each query.
The test case checks whether the client-side timeout fires when
its deadline passes.
 1. a typo
 2. a statement about unlimited concurrency was a bit misleading
A short description of client-side timeouts is added to the docs.
@psarna psarna force-pushed the client_side_timeouts branch from b993a7c to d453980 on November 12, 2021 10:44

psarna commented Nov 12, 2021

v4:

  • restored the original behavior of not killing the connection when a stream id could not be allocated; instead, only the offending request will get an error

@psarna psarna requested a review from havaker November 16, 2021 16:14

psarna commented Nov 16, 2021

review ping

.map_err(|e| {
QueryError::ClientTimeout(format!(
.map_err(|_| QueryError::UnableToAllocStreamId)?;
let received = match tokio::time::timeout(self.config.client_timeout, receiver).await {
Contributor

A library user can decide to stop polling the future returned by a Session::query* method, causing this line to be the last line executed. In this case no Task::Orphan message will be sent via self.submit_channel.

I think that speculative execution can also cancel futures, so the problem persists even without the user intentionally cancelling futures.
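
A minimal, self-contained illustration of the hazard described here (not the driver's code):

use std::time::Duration;

// Stands in for a send_request-like future that waits for a response.
async fn send_request_like() {
    tokio::time::sleep(Duration::from_secs(10)).await;
    // This line never runs if the caller drops the future early, so any
    // "submit Task::Orphan" step placed here would be silently skipped.
    println!("would submit Task::Orphan here");
}

#[tokio::main]
async fn main() {
    // The outer timeout cancels (drops) the inner future after 10 ms.
    let _ = tokio::time::timeout(Duration::from_millis(10), send_request_like()).await;
    println!("future was cancelled; no orphan notification was sent");
}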

Contributor Author

It's true, but it's also no worse than what we have now - the connection would eventually run out of stream ids and die. Users should be encouraged to use our timeout instead of a tokio timeout. Do you have any suggestions on how to approach this issue?

Collaborator

One idea that comes to my mind would be to make the router responsible for detecting timeouts and marking stream IDs as orphaned. It would keep a BTreeMap-based queue of stream IDs ordered by their deadlines.
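
A rough sketch of that idea, assuming the router learns each request's deadline up front (the names are illustrative, not part of the PR):

use std::collections::BTreeMap;
use tokio::time::Instant;

// Illustrative router-side deadline queue: deadline -> stream ids expiring then.
#[derive(Default)]
struct DeadlineQueue {
    deadlines: BTreeMap<Instant, Vec<i16>>,
}

impl DeadlineQueue {
    fn insert(&mut self, deadline: Instant, stream_id: i16) {
        self.deadlines.entry(deadline).or_default().push(stream_id);
    }

    /// Drains every stream id whose deadline has already passed, so the
    /// router can mark those ids as orphaned.
    fn expire(&mut self, now: Instant) -> Vec<i16> {
        let still_pending = self.deadlines.split_off(&now);
        let expired = std::mem::replace(&mut self.deadlines, still_pending);
        expired.into_values().flatten().collect()
    }
}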

Contributor

I'm more for a different solution.
The problem in the current solution is that we are not able to deliver the orphanhood notification to the router, due to future cancellation.

To ensure that some action is run in an async function even when the function's future has been dropped, one can use the destructor of an object living in the body of that function. I propose introducing such an object into the send_request function and using it to notify the router about orphanhood.

Since destructors cannot be async, notifying via a bounded channel as is done now is not an option. Luckily, unbounded tokio channels do not require awaiting to send a value, so they can be used in Rust's Drop implementations.

use tokio::sync::mpsc::UnboundedSender;

struct OrphanhoodNotifier {
    disabled: bool, // initialized to false
    stream_id: i16,
    sender: UnboundedSender<i16>,
}

impl OrphanhoodNotifier {
    /// Called once a response arrives in time; consuming `self` runs `Drop`
    /// with the notification suppressed.
    fn disable(mut self) {
        self.disabled = true;
    }
}

impl Drop for OrphanhoodNotifier {
    fn drop(&mut self) {
        if !self.disabled {
            // An unbounded send never awaits, so it can be used in a destructor.
            let _ = self.sender.send(self.stream_id);
        }
    }
}

I imagine that the OrphanhoodNotifier would be sent instead of the raw stream_id using stream_id_sender (this would protect us from failing to notify about orphanhood when send_request is cancelled after it has submitted Task::Request but before it has received the stream_id from stream_id_receiver). After the query response is received, the orphan notification would be disabled by calling the .disable() method of OrphanhoodNotifier.
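
Roughly how the caller side could look under that proposal (a sketch with assumed channel and placeholder types, not the PR's code):

use tokio::sync::oneshot;

// Placeholder response type for the sketch.
struct Response;

async fn wait_for_response(
    notifier_receiver: oneshot::Receiver<OrphanhoodNotifier>,
    response_receiver: oneshot::Receiver<Response>,
) -> Option<Response> {
    // From this point on, dropping this future drops `notifier`, whose
    // destructor reports the stream id as orphaned.
    let notifier = notifier_receiver.await.ok()?;
    let response = response_receiver.await.ok()?;
    // A response arrived in time, so suppress the orphanhood notification.
    notifier.disable();
    Some(response)
}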

Such a design would also require changes in the router's way of allocating stream_ids and marking stream_ids as orphaned.
Why? Let's consider the following situation:

  1. A request is sent by send_request. Let's call this invocation of send_request "the first invocation".
  2. The request sent by the first invocation gets stream_id = 1.
  3. The connection's router executes this request and, after receiving the response, stream_id = 1 goes back to the pool.
  4. Next, another request is sent by another invocation of send_request (let's call it "the second invocation").
  5. The second request also gets stream_id = 1, because it was already available in the pool.
  6. Now, if a late orphanhood notification somehow arrives from the first invocation of send_request, it would mark the wrong request (the one currently holding stream_id = 1, sent by the second invocation of send_request).

One way to get rid of this problem is to track generations of stream_ids.

struct OrphanhoodNotifier {
    disabled: bool, // initialized to false
    stream_id: i16,
    generation: u64, // new field denoting the stream_id's generation number
    sender: UnboundedSender<i16>,
}

On the router side, an associative array mapping stream_id -> current generation number would be kept. When a new stream_id is allocated, its generation number inside the associative array is incremented. If an orphanhood notification is received, there should be a check whether the generations match (the one in the notification and the one in the associative array). If the check succeeds, the stream_id can be considered orphaned. If the check does not succeed, we can be sure that the router has already received the response and the stream_id has been reused.
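
A sketch of that router-side bookkeeping (a HashMap stands in for the associative array; the names are illustrative, not part of the PR):

use std::collections::HashMap;

// Illustrative sketch of router-side generation tracking.
#[derive(Default)]
struct GenerationMap {
    generations: HashMap<i16, u64>,
}

impl GenerationMap {
    /// Called whenever a stream_id is (re)allocated; returns the new generation.
    fn allocate(&mut self, stream_id: i16) -> u64 {
        let generation = self.generations.entry(stream_id).or_insert(0);
        *generation += 1;
        *generation
    }

    /// Called on an orphanhood notification: the id is only considered
    /// orphaned if the notification refers to the current generation.
    fn is_current(&self, stream_id: i16, generation: u64) -> bool {
        self.generations.get(&stream_id) == Some(&generation)
    }
}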

I believe this design is better than keeping a BTreeMap queue of stream IDs, because it allows handling timeouts via future cancellation (and therefore allows us to implement timeouts without knowing the timeout duration in advance). Being able to handle cancellations is crucial, as the implementation of speculative execution heavily depends on them (by using select!).


psarna commented Mar 4, 2022

Noble cause, but the implementation is severely outdated. Closing this one; we'll get it rewritten.

@psarna psarna closed this Mar 4, 2022
@Lorak-mmk Lorak-mmk deleted the client_side_timeouts branch October 12, 2023 13:34