-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store duration time of task in rally-results metrics record #1220
Conversation
# Conflicts: # esrally/metrics.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! That's very helpful. I left a couple of suggestions.
I also think that a prerequisite for this is #1198 so we have consistent time scale. Otherwise all measurements would be in milliseconds except for relative-time, which would be denoted in microseconds.
esrally/metrics.py
Outdated
@@ -1673,6 +1675,11 @@ def total_transform_metric(self, metric_name): | |||
def error_rate(self, task_name, operation_type): | |||
return self.store.get_error_rate(task=task_name, operation_type=operation_type, sample_type=SampleType.Normal) | |||
|
|||
def duration(self, task_name): | |||
values = self.store.get_raw("service_time", task_name, mapper=lambda doc: doc["relative-time"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can potentially return millions of documents (e.g. in a benchmark with thousands of clients) and we're only interested in one value. I wonder whether this would justify a specialized query method in the metrics store as it would be much cheaper only to retrieve the maximum value directly?
I also want to point out that relative-time
is determined before a request is issued:
rally/esrally/driver/driver.py
Lines 1060 to 1062 in 62595bd
@property | |
def relative_time(self): | |
return self.request_start - self.task_start |
So strictly speaking duration
does not account for the service time of the last request. If this is a long-running request (e.g. in frozen tier benchmarks) the duration measurement could be several minutes off. This could be specifically misleading if there is only one measurement iteration as duration
would report something in order of several milliseconds when the actual duration would more be like several minutes.
So if we settle on a special query method, I propose that the API allows us to retrieve one sample ordered by a criterion. E.g. we could amend the current get_one
method in the metrics store:
Line 619 in 62595bd
def get_one(self, name, sample_type=None, node_name=None, task=None): |
to allow sorting:
def get_one(self, name, sample_type=None, node_name=None, task=None, sort_key=None, sort_reverse=False):
Note: I've borrowed the names of the new parameters sort_key
an sort_reverse
from Python's sorted
function for an idiomatic API.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I will use that, thank you for the suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danielmitterdorfer interestingly the current implementation for get_one uses get and the just returns values[0] so it would still return all the values and then pick the first one locally.
Line 633 in 62595bd
return values[0] if values else None |
Was your idea to modify the search query so that it would only return one value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should implement get_one
in both subclasses (InMemoryMetricsStore
and EsMetricsStore
) and remove the generic implementation in the base class.
esrally/metrics.py
Outdated
@@ -1821,6 +1830,7 @@ def add_op_metrics(self, task, operation, throughput, latency, service_time, pro | |||
"service_time": service_time, | |||
"processing_time": processing_time, | |||
"error_rate": error_rate, | |||
"duration_time": duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that duration
by itself conveys its meaning already and the _time
suffix is redundant? (see also https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/time/Duration.html)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating! I left a couple more comments. Can you please also implement tests for get_one
for both metrics store implementations (including tests for handling an empty search result)?
esrally/metrics.py
Outdated
@@ -825,6 +828,29 @@ def _get(self, name, task, operation_type, sample_type, node_name, mapper): | |||
self.logger.debug("Metrics query produced [%s] results.", result["hits"]["total"]) | |||
return [mapper(v["_source"]) for v in result["hits"]["hits"]] | |||
|
|||
def get_one(self, name, sample_type=None, node_name=None, task=None, mapper=lambda doc: doc["value"], | |||
sort_key=None, sort_reverse=False): | |||
order = "desc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this would be simpler?
order = "desc" if sort_reverse else "asc"
order = "desc" | ||
if not sort_reverse: | ||
order = "asc" | ||
if sort_key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively you could implement this block as follows:
query = {
"query": self._query_by_name(name, task, None, sample_type, node_name),
"size": 1
}
if sort_key:
query["sort"] = [{sort_key: {"order": order}}]
This avoids repetition of common elements.
esrally/metrics.py
Outdated
self.logger.debug("Issuing get against index=[%s], query=[%s].", self._index, query) | ||
result = self._client.search(index=self._index, body=query) | ||
self.logger.debug("Metrics query produced [%s] results.", result["hits"]["total"]) | ||
return mapper(result["hits"]["hits"][0]["_source"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would fail with a key error if there are no hits. Can we check this and return None
if there are no hits?
docs = sorted(self.docs, key=lambda k: k[sort_key], reverse=sort_reverse) | ||
else: | ||
docs = self.docs | ||
for doc in docs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to return None
if there are no hits (i.e. after the for loop).
tests/metrics_test.py
Outdated
@@ -1583,6 +1585,7 @@ def test_calculate_global_stats(self): | |||
self.assertEqual(collections.OrderedDict( | |||
[("50_0", 200), ("100_0", 210), ("mean", 200), ("unit", "ms")]), opm["service_time"]) | |||
self.assertAlmostEqual(0.3333333333333333, opm["error_rate"]) | |||
self.assertAlmostEqual(709*1000, opm["duration"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we able to use assertEqual
here as this is an integer value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opps thank you for catching
tests/metrics_test.py
Outdated
@@ -1595,6 +1598,7 @@ def test_calculate_global_stats(self): | |||
self.assertEqual(17.2, stats.ml_processing_time[0]["median"]) | |||
self.assertEqual(36.0, stats.ml_processing_time[0]["max"]) | |||
self.assertEqual("ms", stats.ml_processing_time[0]["unit"]) | |||
self.assertAlmostEqual(600*1000, opm2["duration"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we able to use assertEqual
here as this is an integer value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating! I left one comment but no need for another review round. LGTM
tests/metrics_test.py
Outdated
actual_duration = self.metrics_store.get_one("service_time", task="task1", mapper=lambda doc: doc["relative-time-ms"], | ||
sort_key="relative-time-ms", sort_reverse=True) | ||
|
||
self.assertEqual(None, actual_duration) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use self.assertIsNone
here.
Thank you for the review @danielmitterdorfer |
Closes: #1197