Add ability to continuously stream metrics #217
This is a tricky one. The reason why Rally does not stream metrics is to minimize any overhead. For our own long-running benchmarks, we use multiple iterations (
I agree it is a tricky one. Writing metrics directly to Elasticsearch could make the benchmark execution sensitive to back-pressure applied by the metrics store. What I have done in the past is to stream the records to a file in JSON format as they are generated. This has allowed me to set up Filebeat/Logstash to tail this file and index the records as quickly as possible. Would it be possible to have 'file' as a third mode in order to add flexibility? This also opens the door to post-processing, which would be a useful option.
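For illustration, a minimal sketch of what such a 'file' mode could look like (the class name, record fields, and file name below are hypothetical, not Rally's actual implementation): each metrics record is appended as one JSON document per line, so a tailing process such as Filebeat can pick it up independently of the benchmark.

```python
import json
import time


class FileMetricsStore:
    """Hypothetical sketch of a 'file' metrics store mode: append each
    metrics record as one JSON document per line so Filebeat/Logstash
    can tail the file and index records while the benchmark runs."""

    def __init__(self, path):
        # Line-buffered append so records become visible to a tailing
        # process shortly after they are written.
        self._f = open(path, "a", buffering=1)

    def put(self, record):
        # Add a timestamp if the caller did not provide one.
        record.setdefault("@timestamp", int(time.time() * 1000))
        self._f.write(json.dumps(record) + "\n")

    def close(self):
        self._f.close()


# Usage: write one record per request as it is generated.
store = FileMetricsStore("rally-metrics.jsonl")
store.put({"name": "latency", "value": 12.3, "unit": "ms", "task": "index-append"})
store.close()
```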
The work done in #278 should help when implementing this. That change makes it relatively easy to send request and maybe even system metrics once a task has finished (instead of doing all this at the end of a benchmark). However, streaming still remains extremely tricky given that the throughput calculation is done via a sliding window. So I think we will implement this in two steps:
Rally will store request metrics in micro-batches every 30 seconds (not configurable). This includes service time, latency and throughput samples as well as all system metrics, in case Rally is not only used as a load generator. All post-processing is done on the coordinator machine, so we don't influence the performance of the load generators.
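As a rough illustration of the micro-batching idea described above (the class name and the `bulk_add` call on the backing store are assumptions, not Rally's real API), samples could be buffered on the coordinator and flushed on a fixed 30-second timer, so load generators never block on the metrics store:

```python
import threading


class MicroBatchingStore:
    """Hypothetical sketch: buffer metrics samples on the coordinator and
    flush them to the backing metrics store every 30 seconds."""

    FLUSH_INTERVAL = 30  # seconds, fixed in this sketch

    def __init__(self, backing_store):
        self._backing_store = backing_store  # e.g. an Elasticsearch-backed store
        self._buffer = []
        self._lock = threading.Lock()
        self._timer = None
        self._schedule_flush()

    def add(self, sample):
        with self._lock:
            self._buffer.append(sample)

    def _schedule_flush(self):
        self._timer = threading.Timer(self.FLUSH_INTERVAL, self._flush)
        self._timer.daemon = True
        self._timer.start()

    def _flush(self, reschedule=True):
        # Swap the buffer under the lock, then send the batch outside of it.
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._backing_store.bulk_add(batch)  # hypothetical bulk API
        if reschedule:
            self._schedule_flush()

    def close(self):
        if self._timer:
            self._timer.cancel()
        self._flush(reschedule=False)
```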
Based on tests, it looks like all metrics are currently written to the metrics store at the end of the challenge. For long-running benchmarks this is impractical, as it leaves a potentially very large amount of data to index at the end and makes it hard to see how the benchmark is doing during execution.
It would be very useful to be able to stream metrics through bulk requests to the metrics store as the benchmark progresses.
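For example, a sketch of what periodic bulk indexing could look like with the elasticsearch-py client (the index name, document layout, and flush cadence here are illustrative assumptions, not a proposed implementation):

```python
from elasticsearch import Elasticsearch, helpers

# Hypothetical example: periodically bulk-index buffered metrics samples
# instead of writing everything at the end of the challenge.
es = Elasticsearch("http://localhost:9200")


def flush_metrics(samples, index_name="rally-metrics"):
    """Send the currently buffered samples in a single bulk request."""
    actions = (
        {"_index": index_name, "_source": sample}
        for sample in samples
    )
    helpers.bulk(es, actions)


# Usage: call flush_metrics(buffer) every N seconds while the benchmark runs,
# then clear the buffer, so progress is visible in the metrics store.
```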