This repository has been archived by the owner on Jun 27, 2019. It is now read-only.

Reporting

Zsolt Kővári edited this page Jul 1, 2015 · 4 revisions

Data

The automatic diagram generation depends on a CSV file that contains the measurement results in the appropriate format. The results follow a long, row-oriented layout: every row in the CSV represents one result of executing an atomic phase, restricted to a single metric.

The header of the CSV file must match the expected column names exactly. The required columns are the following:

  • PhaseName: Identifies the executed atomic phase. It is recommended to give different names to phases that involve different functionality.
  • Tool: The name of the measured tool.
  • Iteration: Denotes the execution count of a certain phase within a running benchmark (1 for its first execution, 2 for its second, and so on).
  • CaseName: Identifies the benchmark case by its name.
  • MetricValue: The exact value of the measured metric. It is shown on the y-axis of the plots.
  • RunIndex: Like Iteration, it is an index, but it counts the benchmark runs that share the same configuration. RunIndex ensures that every benchmark run with the same configuration can be treated separately, so that (typically) the median of the corresponding metric values is plotted. Otherwise, if runs sharing the same configuration parameters (Tool, Scenario, Size, etc.) also had the same RunIndex, the values of the same metric would be summarised together.
  • MetricName: Identifies the measured metric by its name.
  • Sequence: Defines the exact order of the evaluated phases in a benchmark. This parameter is currently not used by the reporting process.
  • Scenario: The name of the current scenario.
  • Size: Represents the model used, by its size; it can also serve as a descriptor or identifier of the model.

The result serializer in MONDO-SAM and the Python converter always provide an appropriate format.
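
For CSV files produced by other means, the header can be checked before reporting. The following is a minimal Python sketch, not part of MONDO-SAM: the function name `missing_columns` is hypothetical, and a comma delimiter is assumed.

```python
import csv
import io

# Column names taken from the list above; column order is assumed not to matter.
REQUIRED_COLUMNS = {
    "PhaseName", "Tool", "Iteration", "CaseName", "MetricValue",
    "RunIndex", "MetricName", "Sequence", "Scenario", "Size",
}


def missing_columns(csv_text, delimiter=","):
    """Return the sorted list of required columns absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)


good = ("Size,PhaseName,MetricName,Sequence,MetricValue,"
        "Scenario,CaseName,RunIndex,Tool,Iteration\n")
bad = "Size,PhaseName,MetricName\n"

print(missing_columns(good))
print(missing_columns(bad))
```

An empty result means that every required column is present; extra columns are ignored by this check.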

Example

For clarity, an example of a valid source (CSV) file is shown below:

Size PhaseName MetricName Sequence MetricValue Scenario CaseName RunIndex Tool Iteration
2 Read Time 1 3319370759 Batch ConnectedSegments 1 EMFIncQuery 1
2 Check Time 2 38932 Batch ConnectedSegments 1 EMFIncQuery 1
8 Read Time 1 2502094440 User PosLength 1 Sesame 1
8 Check Time 2 714011507 User PosLength 1 Sesame 1
8 ReCheck Time 3 136291111 User PosLength 1 Sesame 1
8 ReCheck Time 4 110182804 User PosLength 1 Sesame 2
8 ReCheck Time 5 982323145 User PosLength 1 Sesame 3
2 Read Time 1 34546353533 Batch ConnectedSegments 2 EMFIncQuery 1
2 Check Time 2 65233 Batch ConnectedSegments 2 EMFIncQuery 1

First, notice that the CSV contains three benchmark measurements. The easiest way to see this is to check the Sequence column: it starts at 1 and increases by one every time a new phase is executed. There are two different scenarios (Batch and User), two model sizes (2 and 8), two cases (ConnectedSegments and PosLength), and two measured tools (EMFIncQuery and Sesame). In this example MetricName equals Time in every row, so MetricValue holds the evaluation times of the phases.

The Iteration column orders the executions of a phase that is evaluated several times during the same benchmark. For example, the ReCheck phase is executed three times in the same benchmark (data rows 5 to 7). Note that Iteration differs from Sequence, since the latter increases every time the execution of any phase starts. Finally, RunIndex equals 2 in the last two rows, because that configuration was already used once in the first benchmark (first two rows).
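
The way Sequence separates the benchmarks can be sketched in a few lines of Python. This is only an illustration, under the assumption that the rows of each benchmark are stored contiguously and in file order; `split_benchmarks` is a hypothetical helper, and only four of the ten columns are kept for brevity.

```python
# (Sequence, PhaseName, Iteration, RunIndex) for each data row of the example, in file order.
ROWS = [
    (1, "Read", 1, 1),
    (2, "Check", 1, 1),
    (1, "Read", 1, 1),
    (2, "Check", 1, 1),
    (3, "ReCheck", 1, 1),
    (4, "ReCheck", 2, 1),
    (5, "ReCheck", 3, 1),
    (1, "Read", 1, 2),
    (2, "Check", 1, 2),
]


def split_benchmarks(rows):
    """Group rows into benchmarks; a new benchmark starts whenever Sequence is 1."""
    benchmarks = []
    for row in rows:
        sequence = row[0]
        if sequence == 1 or not benchmarks:
            benchmarks.append([])
        benchmarks[-1].append(row)
    return benchmarks


benchmarks = split_benchmarks(ROWS)
print(len(benchmarks))
print([len(b) for b in benchmarks])
```

Within the second benchmark, Sequence keeps growing across the three ReCheck rows while Iteration restarts at 1 for that phase, which is the difference described above.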

When the benchmark engine of MONDO-SAM is used, the Sequence and Iteration parameters are determined automatically. However, it is the developer's task to manage the RunIndex parameter and thereby distinguish benchmarks that have the same configuration. It is highly recommended to give repeated runs of the same benchmark different RunIndex values.
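
One possible way to manage RunIndex is to count how often each configuration has already been run. The class below is a hypothetical sketch, not a MONDO-SAM API; it assumes that a configuration is identified by Tool, Scenario, CaseName and Size.

```python
from collections import defaultdict


class RunIndexCounter:
    """Hands out 1, 2, 3, ... separately for every distinct configuration."""

    def __init__(self):
        self._runs = defaultdict(int)

    def next_index(self, tool, scenario, case_name, size):
        # Assumption: these four values together identify a configuration.
        key = (tool, scenario, case_name, size)
        self._runs[key] += 1
        return self._runs[key]


counter = RunIndexCounter()
first = counter.next_index("EMFIncQuery", "Batch", "ConnectedSegments", 2)
other = counter.next_index("Sesame", "User", "PosLength", 8)
second = counter.next_index("EMFIncQuery", "Batch", "ConnectedSegments", 2)
print(first, other, second)
```

Applied to the example CSV, the repeated EMFIncQuery configuration receives a new index on its second run, while the Sesame configuration starts its own count.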

Configuration

The reporting mechanism is controlled by the reporting/config.json file. A possible configuration is shown below:

{
  "Plot": [
    {
      "X_Dimension": "Size",
      "Legend": "Tool",
      "Summarize_Function": [
        "Check",
        "ReCheck"
      ],
      "Title": "Check+Recheck phase, SCENARIO CASENAME",
      "Metrics": [
        "Time"
      ],
      "Metric_Scale": -6,
      "Min_Iteration": 1,
      "Max_Iteration": -1,
      "Y_Label": "Time (ms)",
      "X_Axis_Scale": "con",
      "Y_Axis_Scale": "log2",
      "Extensions": "PNG"
    }
  ]
}

As illustrated above, key-value parameters define exactly which plot is generated and which data it shows. The obligatory parameters and their possible values are the following:

  • X_Dimension: Defines the dimension shown on the x-axis. Its value must be one of the column names in the predefined header.
  • Legend: Defines the legend of the plot. The measurement results are grouped by this parameter.
  • Summarize_Function: An array containing an arbitrary number of benchmark phases. The metric values of every listed phase are summarised on the diagrams. For example, the configuration above means that the values of the defined metric (Time) are summarised over the phases Check and ReCheck only.
  • Title: The string shown as the title of the plots. To get titles that fit the current diagram, the title may contain templates, which are replaced by the current value of the corresponding parameter. For instance, the configuration above defines a title with the templates SCENARIO and CASENAME; the names of the current scenario and benchmark case are inserted into the string, so the title matches the plotted data. The possible templates are SCENARIO, TOOL, SIZE and CASENAME. Each of them must be written in uppercase.
  • Metrics: Like Summarize_Function, this is an array; it contains the metrics shown on the plots. In other words, the values belonging to the listed metrics are summarised on the diagrams.
  • Metric_Scale: Rescales the values of the MetricValue column. The parameter is the exponent of ten by which MetricValue is multiplied. In this case, -6 means that the values are divided by one million. If the parameter is 0, the results are not changed; if it is 3, the metric values are multiplied by one thousand. Any integer is allowed.
  • Min_Iteration: Determines the first iteration shown. Together with Max_Iteration, it defines an interval.
  • Max_Iteration: Like the previous one, but defines the last iteration shown. If the value is -1 (or lower), every iteration is taken into account.
  • Y_Label: The label of the y-axis.
  • X_Axis_Scale and Y_Axis_Scale: Determine the scales of the axes. Three options are available: con (continuous), log2 (base-2 logarithm), and factor (handle the values as factors).
  • Extension: The generated plots are saved in the given format.
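
The effect of Summarize_Function, Metrics, Metric_Scale, the iteration interval and the title templates can be illustrated with a short Python sketch. It is not the reporting tool's implementation: the function names are hypothetical, and "summarised" is assumed here to mean that the selected values are added up.

```python
import re


def summarise(rows, phases, metrics, metric_scale, min_iteration, max_iteration):
    """Add up the selected metric values and multiply the sum by 10 ** metric_scale."""
    raw_total = 0.0
    for row in rows:
        if row["PhaseName"] not in phases or row["MetricName"] not in metrics:
            continue
        iteration = int(row["Iteration"])
        if iteration < min_iteration:
            continue
        # A Max_Iteration of -1 (or lower) means that there is no upper bound.
        if max_iteration >= 0 and iteration > max_iteration:
            continue
        raw_total += float(row["MetricValue"])
    return raw_total * 10 ** metric_scale


def render_title(template, scenario, tool, size, case_name):
    """Replace the uppercase templates in a single pass over the title."""
    values = {"SCENARIO": scenario, "TOOL": tool,
              "SIZE": str(size), "CASENAME": case_name}
    return re.sub("SCENARIO|TOOL|SIZE|CASENAME",
                  lambda match: values[match.group(0)], template)


# The Sesame rows of the example CSV, reduced to the columns used here.
SESAME_ROWS = [
    {"PhaseName": "Read", "MetricName": "Time", "Iteration": 1, "MetricValue": 2502094440},
    {"PhaseName": "Check", "MetricName": "Time", "Iteration": 1, "MetricValue": 714011507},
    {"PhaseName": "ReCheck", "MetricName": "Time", "Iteration": 1, "MetricValue": 136291111},
    {"PhaseName": "ReCheck", "MetricName": "Time", "Iteration": 2, "MetricValue": 110182804},
    {"PhaseName": "ReCheck", "MetricName": "Time", "Iteration": 3, "MetricValue": 982323145},
]

print(summarise(SESAME_ROWS, ["Check", "ReCheck"], ["Time"], -6, 1, -1))
print(render_title("Check+Recheck phase, SCENARIO CASENAME", "User", "Sesame", 8, "PosLength"))
```

With Metric_Scale set to -6, raw values of the magnitude seen in the example are brought into the range suggested by the "Time (ms)" label used in the configuration.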

Optional parameters can be set as well. These are the following:

| Parameter | Possible values | Default value | Description |
|---|---|---|---|
| Legend_Filters | character array | | An array that contains values from the Legend column. If it is given, only the values in the array are shown on the plots. |
| X_Label | character | "" | Label on the x-axis. |
| Show_Values | TRUE/FALSE | FALSE | Show measurement values on the diagrams. |
| Draw_Lines | TRUE/FALSE | TRUE | Show lines on the diagrams. |
| Legend_Position | ["top", "right", "bottom", "left"] | "bottom" | |
| Legend_Direction | ["vertical", "horizontal"] | "vertical" | |
| Theme | ["r", "black", "sam", "specific"] | "sam" | See below. |
| Point_Size | numeric | 2 | |
| Line_Size | numeric | 1 | |
| Text_Size | numeric | 16 | The size of the title and legends. |
| Text_Font | ["Times", "Helvetica", "Bookman", "AvantGarde", "Courier", "NewCenturySchoolbook", "Palatino"] | "Helvetica" | |
| X_Text_Size | numeric | 13 | Size of the labels on the x-axis. |
| Y_Text_Size | numeric | 15 | Size of the labels on the y-axis. |
| X_Axis_Horizontal_Justice | [-1, 1] | 0.5 | Between -1 and 1. |
| X_Axis_Vertical_Justice | [-1, 1] | 0 | Between -1 and 1. |
| Y_Axis_Horizontal_Justice | [-1, 1] | 1 | Between -1 and 1. |
| Y_Axis_Vertical_Justice | [-1, 1] | 0.5 | Between -1 and 1. |
| Automatic_Filename | TRUE/FALSE | TRUE | Generate an automatic filename that includes the current date. |
| Specified_Filename | character | "" | Defined by the user. |
| Diagram_Width | numeric | 14 | Unit: cm |
| Diagram_Height | numeric | 7 | Unit: cm |
| Diagram_DPI | numeric | 300 | |

Theme: Defines the theme of the plots. "r" is the default ggplot theme used by R; "black" is a black-and-white theme; and "sam" is the MONDO-SAM default theme. Furthermore, setting "specific" makes it possible to give further parameters (marked with bold).

A typical use of the extended configuration is the following. For the sake of illustration, let us rely on the same CSV file as earlier:

{
  "Plot": [
    {
      "X_Dimension": "Size",
      "Legend": "Tool",
      "Legend_Filters": ["Sesame", "EMFIncQuery"],
      "Summarize_Function": [
        "Check"
      ],
      "Metrics": [
        "Time"
      ],
      "Min_Iteration": null,
      "Max_Iteration": null,
      "Title": "",
      "Metric_Scale": -6,
      "X_Label": "",
      "Y_Label": "",
      "X_Axis_Scale": "log2",
      "Y_Axis_Scale": "log2",
      "Show_Values": false,
      "Draw_Lines": true,
      "Legend_Position": "bottom",
      "Legend_Direction": "vertical",
      "Theme": "sam",
      "Point_Size": 2,
      "Line_Size": 1,
      "Text_Size": 16,
      "Text_Font": "Helvetica",
      "X_Text_Size": 13,
      "Y_Text_Size": 13,
      "X_Axis_Horizontal_Justice": 0.5,
      "X_Axis_Vertical_Justice": 0,
      "Y_Axis_Horizontal_Justice": 1,
      "Y_Axis_Vertical_Justice": 0.5,
      "Extension": "PDF",
      "Automatic_Filename": true,
      "Specified_Filename": null,
      "Diagram_Width": 14,
      "Diagram_Height": 7,
      "Diagram_DPI": 300
    }
  ]
}
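
Since the optional parameters have default values, a plot object only needs to list the ones it changes. The fallback can be pictured with the following Python sketch; it is an illustration rather than the tool's own code, and the defaults are copied from the table of optional parameters above (Legend_Filters is omitted because no default is listed for it).

```python
# Defaults copied from the table of optional parameters.
OPTIONAL_DEFAULTS = {
    "X_Label": "",
    "Show_Values": False,
    "Draw_Lines": True,
    "Legend_Position": "bottom",
    "Legend_Direction": "vertical",
    "Theme": "sam",
    "Point_Size": 2,
    "Line_Size": 1,
    "Text_Size": 16,
    "Text_Font": "Helvetica",
    "X_Text_Size": 13,
    "Y_Text_Size": 15,
    "X_Axis_Horizontal_Justice": 0.5,
    "X_Axis_Vertical_Justice": 0,
    "Y_Axis_Horizontal_Justice": 1,
    "Y_Axis_Vertical_Justice": 0.5,
    "Automatic_Filename": True,
    "Specified_Filename": "",
    "Diagram_Width": 14,
    "Diagram_Height": 7,
    "Diagram_DPI": 300,
}


def with_defaults(plot):
    """Return a new dict in which explicitly set values override the defaults."""
    merged = dict(OPTIONAL_DEFAULTS)
    merged.update(plot)
    return merged


plot = {"X_Dimension": "Size", "Legend": "Tool", "Theme": "black"}
merged = with_defaults(plot)
print(merged["Theme"], merged["Diagram_DPI"])
```

The input object is left unchanged, so the same partial configuration can be reused for several plots.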

Several plot configurations can be defined in the same file by adding more objects to the Plot array. For example:

{
  "Plot": [
    {
      "X_Dimension": "Size",
      "Legend": "Tool",
      "Summarize_Function": [
        "Check",
        "ReCheck"
      ],
      "Title": "Check+Recheck phase, SCENARIO CASENAME",
      "Metrics": [
        "Time"
      ],
      "Metric_Scale": -6,
      "Min_Iteration": 1,
      "Max_Iteration": -1,
      "Y_Label": "Time (ms)",
      "X_Axis_Scale": "con",
      "Y_Axis_Scale": "log2",
      "Extensions": "PNG"
    },
    {
      "X_Dimension": "Size",
      "Legend": "Tool",
      "Legend_Filters": ["Sesame"],
      "Summarize_Function": [
        "ReCheck"
      ],
      "Title": "Recheck phase, SCENARIO CASENAME",
      "Metrics": [
        "Time"
      ],
      "Metric_Scale": -6,
      "Min_Iteration": 1,
      "Max_Iteration": 2,
      "Y_Label": "Time (ms)",
      "X_Axis_Scale": "factor",
      "Y_Axis_Scale": "log2",
      "Extensions": "PDF"
    }
  ]
}