Richard Hsu edited this page Jul 7, 2014 · 15 revisions

Collecting Data / Metrics

Naarad provides a shell script that collects performance metrics, which you can later feed into naarad for analysis. The script logs Linux system metrics using the sar and top commands and the /proc/meminfo, /proc/vmstat and /proc/zoneinfo files.

Copy the script to each machine that you want to analyze and run it to start collecting data. It takes one argument, the directory in which to save the logs:

./sar.sh /tmp/logs

You can also change the COUNT and INTERVAL parameters in sar.sh to control how many times and how frequently metrics are collected. By default, the script collects sar data every 2 seconds, 450 times, for a total of 15 minutes.
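As a quick check of the arithmetic above, the following sketch computes the collection window from COUNT and INTERVAL, and the COUNT needed for a target duration. The function names are hypothetical helpers for illustration and are not part of sar.sh or naarad.

```python
import math


def collection_duration_seconds(count: int, interval_seconds: int) -> int:
    """Total time covered by `count` samples taken every `interval_seconds`."""
    return count * interval_seconds


def count_for_duration(minutes: float, interval_seconds: int) -> int:
    """Smallest COUNT that covers at least `minutes` at the given INTERVAL."""
    return math.ceil(minutes * 60 / interval_seconds)


# Defaults described above: COUNT=450, INTERVAL=2
default_seconds = collection_duration_seconds(count=450, interval_seconds=2)
print(default_seconds / 60)  # duration in minutes

# COUNT to set in sar.sh for a one-hour run at the same interval
print(count_for_duration(minutes=60, interval_seconds=2))
```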

Config

Naarad needs a config file that lists all the metrics and the graphing options. Example config files can be found in the naarad/examples/conf directory. Here is a sample config:

[GC]
infile=/tmp/logs/gc.log
gc-options=GC appstop alloc promo used0 used1 used commit0 commit1 commit gen0 gen0t gen0usr gen0sys cmsIM cmsRM cmsRS GC cmsCM

[SAR-cpuusage]
infile=/tmp/logs/sar.cpuusage.out
 
[GRAPH]
outdir=/tmp/naarad-out

The config is in INI format. Each section describes one metric, and a special section called GRAPH specifies the graphing options.
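Because the config is plain INI, it can be sanity-checked before a run with Python's standard configparser module. The sketch below only illustrates the section layout, using the sample config above. It is not how naarad itself reads the file, and treating every section other than GRAPH as a metric is an assumption made for this example.

```python
import configparser

# The sample config from above, inlined so the example is self-contained.
SAMPLE_CONFIG = """\
[GC]
infile=/tmp/logs/gc.log
gc-options=GC appstop alloc promo used0 used1 used commit0 commit1 commit gen0 gen0t gen0usr gen0sys cmsIM cmsRM cmsRS GC cmsCM

[SAR-cpuusage]
infile=/tmp/logs/sar.cpuusage.out

[GRAPH]
outdir=/tmp/naarad-out
"""

# interpolation=None keeps any '%' characters in values literal.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CONFIG)

# Assumption for this sketch: every section except GRAPH describes a metric.
metric_sections = [name for name in parser.sections() if name != "GRAPH"]

for name in metric_sections:
    print(name, "->", parser[name]["infile"])
print("output directory:", parser["GRAPH"]["outdir"])
```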

Once you have a config describing your metrics and your parsing and plotting needs, call naarad with the config file as its argument. It should produce all the plots in a basic HTML report in the outdir specified in the config:

 naarad -c config

Naarad also accepts the command-line arguments -i (or --input_dir) and -o (or --output_dir). If input_dir is specified, all the infile options in the config are assumed to be relative to input_dir. You can also specify output_dir on the command line and omit the outdir option from the config. If outdir is specified in the config, however, it takes precedence.

So you could have a shorter config file:

[GC]
infile=gc.log
gc-options=GC appstop alloc promo used0 used1 used commit0 commit1 commit gen0 gen0t gen0usr gen0sys cmsIM cmsRM cmsRS GC cmsCM

[SAR-cpuusage]
infile=sar.cpuusage.out

[GRAPH]
graphs=GC.GC,all GC.cmsRM,GC.cmsIM,GC.gen0t GC.promo,GC.alloc

And run it as:

 naarad -c config -i /tmp/logs -o /tmp/naarad-out
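The path rules described above can be sketched as follows. This illustrates the documented behavior and is not naarad's own code; the function names resolve_infile and resolve_outdir are hypothetical.

```python
import posixpath
from typing import Optional


def resolve_infile(infile: str, input_dir: Optional[str]) -> str:
    """Treat `infile` as relative to `input_dir` when one is given."""
    if input_dir:
        # Note: posixpath.join returns `infile` unchanged if it is absolute.
        return posixpath.join(input_dir, infile)
    return infile


def resolve_outdir(config_outdir: Optional[str],
                   cli_output_dir: Optional[str]) -> Optional[str]:
    """outdir from the config takes precedence over -o / --output_dir."""
    return config_outdir or cli_output_dir


# Equivalent of: naarad -c config -i /tmp/logs -o /tmp/naarad-out
print(resolve_infile("gc.log", "/tmp/logs"))
print(resolve_infile("sar.cpuusage.out", "/tmp/logs"))
print(resolve_outdir(None, "/tmp/naarad-out"))
```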

Calculated metrics

Naarad supports basic calculation over a single metric. In particular, it supports calculating rate and diff.

  • rate - calculate the rate of a metric, defined as the difference between 2 consecutive data points divided by the time difference. Or, (V[n+1] - V[n])/(T[n+1] - T[n])
  • diff - calculate the diff of a metric, defined as the difference between 2 consecutive data points. Or, (V[n+1] - V[n])

These metrics can be defined by adding a calc_metrics option to the metric definition. Example config:

[GC]
infile=/tmp/logs/loggc-small
gc-options=alloc promo
calc_metrics=alloc-rate=rate(alloc) promo-rate=rate(promo) alloc-diff=diff(alloc)
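The rate and diff definitions above can be written out as a minimal sketch. It assumes samples are (timestamp in seconds, value) pairs and attaches each result to the later timestamp of the pair; both choices are assumptions for illustration, and naarad's own implementation may differ.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def diff(samples: List[Sample]) -> List[Sample]:
    """V[n+1] - V[n] for each pair of consecutive data points."""
    return [(t1, v1 - v0)
            for (t0, v0), (t1, v1) in zip(samples, samples[1:])]


def rate(samples: List[Sample]) -> List[Sample]:
    """(V[n+1] - V[n]) / (T[n+1] - T[n]) for each consecutive pair."""
    return [(t1, (v1 - v0) / (t1 - t0))
            for (t0, v0), (t1, v1) in zip(samples, samples[1:])
            if t1 != t0]  # skip pairs with identical timestamps


# Made-up sample values, 2 seconds apart
alloc = [(0.0, 100.0), (2.0, 160.0), (4.0, 150.0)]
print(diff(alloc))  # [(2.0, 60.0), (4.0, -10.0)]
print(rate(alloc))  # [(2.0, 30.0), (4.0, -5.0)]
```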

SLAs

Naarad also supports basic SLAs to pass or fail a performance run based on rules over metric stats. The SLA rules are specified in the config, in each metric section. The attribute name should be the sub_metric name with a .sla suffix, and the value should be a space-separated list of rules. E.g.,

[GC-machine1]
infile=/tmp/logs/gc.log
gc-options=GC alloc promo
GC.sla=mean<0.05 p50<0.05 p99<0.05

Naarad generates a results file for each metric called <metric-name>.sla.csv. The file reports each SLA rule and whether it passed or failed. The CSV columns are: sub_metric,stat_name,threshold,sla_type,stat_value,sla_passed.

GC,mean,0.05,lt,0.0365469535637,True
GC,p50,0.05,lt,0.0352635,True
GC,p99,0.05,lt,0.06489572,False
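A results file in this layout can be scanned for failed rules with a few lines of Python. The sketch below is a hypothetical post-processing step, not part of naarad. It assumes the file has no header row, as in the sample above, and the three sample rows are inlined so the example is self-contained.

```python
import csv
import io
from typing import Dict, List

# Column order as documented above.
COLUMNS = ["sub_metric", "stat_name", "threshold",
           "sla_type", "stat_value", "sla_passed"]

SAMPLE_SLA_CSV = """\
GC,mean,0.05,lt,0.0365469535637,True
GC,p50,0.05,lt,0.0352635,True
GC,p99,0.05,lt,0.06489572,False
"""


def failed_rules(csv_text: str) -> List[Dict[str, str]]:
    """Return the rows whose sla_passed column is not 'True'."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    return [row for row in reader if row["sla_passed"].strip() != "True"]


failures = failed_rules(SAMPLE_SLA_CSV)
for row in failures:
    print("{sub_metric}.{stat_name}: {stat_value} violates "
          "{sla_type} {threshold}".format(**row))
```

To check a real run, read the contents of the <metric-name>.sla.csv file from the output directory and pass them to failed_rules in place of the inlined sample.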