-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RPC delay metrics #2293
Merged
Merged
RPC delay metrics #2293
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
cf50944
rpc: Fix hard-coded constants when sending unknown verb reply
xemul dc42c7b
rpc: Track handler execution time
xemul 5fc9acd
rpc: Exchange handler duration with server responses
xemul 41ee6ee
rpc: Calculate delay and export it via metrics
xemul 595ae18
test,rpc: Extend simple ping-pong case
xemul File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -118,6 +118,7 @@ struct client_options { | |
/// \see resource_limits::isolate_connection | ||
sstring isolation_cookie; | ||
sstring metrics_domain = "default"; | ||
bool send_handler_duration = true; | ||
}; | ||
|
||
/// @} | ||
|
@@ -179,6 +180,7 @@ enum class protocol_features : uint32_t { | |
CONNECTION_ID = 2, | ||
STREAM_PARENT = 3, | ||
ISOLATION = 4, | ||
HANDLER_DURATION = 5, | ||
}; | ||
|
||
// internal representation of feature data | ||
|
@@ -285,6 +287,7 @@ protected: | |
std::unique_ptr<compressor> _compressor; | ||
bool _propagate_timeout = false; | ||
bool _timeout_negotiated = false; | ||
bool _handler_duration_negotiated = false; | ||
// stream related fields | ||
bool _is_stream = false; | ||
connection_id _id = invalid_connection_id; | ||
|
@@ -427,6 +430,7 @@ class client : public rpc::connection, public weakly_referencable<client> { | |
struct reply_handler_base { | ||
timer<rpc_clock_type> t; | ||
cancellable* pcancel = nullptr; | ||
rpc_clock_type::time_point start; | ||
virtual void operator()(client&, id_type, rcv_buf data) = 0; | ||
virtual void timeout() {} | ||
virtual void cancel() {} | ||
|
@@ -487,7 +491,11 @@ private: | |
private: | ||
future<> negotiate_protocol(feature_map map); | ||
void negotiate(feature_map server_features); | ||
future<std::tuple<int64_t, std::optional<rcv_buf>>> | ||
// Returned future is | ||
// - message id | ||
// - optional server-side handler duration | ||
// - message payload | ||
future<std::tuple<int64_t, std::optional<uint32_t>, std::optional<rcv_buf>>> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better to document what the optional means. Note: microseconds can overflow 32 bits for an rpc call. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's ~1 hour. Are there handlers that last more? |
||
read_response_frame_compressed(input_stream<char>& in); | ||
public: | ||
/** | ||
|
@@ -592,7 +600,7 @@ public: | |
public: | ||
connection(server& s, connected_socket&& fd, socket_address&& addr, const logger& l, void* seralizer, connection_id id); | ||
future<> process(); | ||
future<> respond(int64_t msg_id, snd_buf&& data, std::optional<rpc_clock_type::time_point> timeout); | ||
future<> respond(int64_t msg_id, snd_buf&& data, std::optional<rpc_clock_type::time_point> timeout, std::optional<rpc_clock_type::duration> handler_duration); | ||
client_info& info() { return _info; } | ||
const client_info& info() const { return _info; } | ||
stats get_stats() const { | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about time waiting in the send queue (usually none) or the receive queue? Or waiting for the rpc semaphore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new metrics include all of these.
Or does your question really mean "lets adding those metrics too"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that the difference is not RTT, it's RTT + waits in queues we don't (and can't) measure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the definition of RTT I get used to
and "starting point" here is whoever calls
send_helper::operator()
. It doesn't tell if there are or there are no queues/delays/throttlers/whatever on the way of the request.I'm fine to rename it to be clear, but how do you define RTT then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RTT is the network time without any queues (which ping measures if it doesn't have a bad day).
Do you subtract service time from the measured round trip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "without any queues" is pretty optimistic requirement, I would say. I admit that outgoing is likely to take queue-less paths even in the kernel, but incoming all consists of queues.
Depends on what "service time" is. If it's duration of the verb handler, then of course, that's the point (9bbe38c)