libCacheSim provides a set of tools to help you analyze traces. After building the project, you can find a binary called traceAnalyzer.
This doc shows how to use the tool.
If you are interested, the source code is located in the bin/traceAnalyzer/ and traceAnalyzer directories.
# ./bin/traceAnalyzer --help for a list of tasks and options
./bin/traceAnalyzer PATH_TO_TRACE traceType [--task1] [--task2]
--common: run all common tasks, including --stat
: run all tasks
--accessPattern: generate access pattern data for plotting using scripts/traceAnalysis/
: generate request rate data for plotting using scripts/traceAnalysis/
: generate size distribution data for plotting using scripts/traceAnalysis/ and scripts/traceAnalysis/
: generate reuse distribution data for plotting using scripts/traceAnalysis/ and scripts/traceAnalysis/
: generate popularity data for plotting using scripts/traceAnalysis/
: generate popularity data for plotting using scripts/traceAnalysis/
# run all common tasks
./bin/traceAnalyzer PATH_TO_TRACE traceType --common
The trace analyzer will generate statistics of the trace and save them to stat and traceStat.
An example output running a block cache workload:
dat: w92.oracleGeneral.bin.zst
number of requests: 4284658, number of objects: 606386
number of req GiB: 133.5165, number of obj GiB: 21.7219
compulsory miss ratio (req/byte): 0.1415/0.1627
object size weighted by req/obj: 33459/38463
frequency mean: 7.0659
time span: 609774(7.0576 day)
request rate min 0.5400 req/s, max 257.7533 req/s, window 300s
object rate min 0.1333 obj/s, max 256.6933 obj/s, window 300s
X-hit (number of obj accessed X times): 5720(0.0094), 4813(0.0079), 4514(0.0074), 5044(0.0083), 34919(0.0576), 529106(0.8726), 2551(0.0042), 2161(0.0036),
freq (fraction) of the most popular obj: 74030(0.0173), 74007(0.0173), 59484(0.0139), 39656(0.0093), 27403(0.0064), 27391(0.0064), 21380(0.0050), 19828(0.0046),
A second example, using the first 10 million requests of the Twitter cluster52 trace (available in the data directory):
dat: ../data/twitter_cluster52_10m.csv
number of requests: 10000000, number of objects: 897664
number of req GiB: 1.8806, number of obj GiB: 0.1627
compulsory miss ratio (req/byte): 0.0898/0.0865
object size weighted by req/obj: 201/194
frequency mean: 11.1400
time span: 5293(0.0613 day)
write: 0(0), overwrite: 0(0), del:0(0)
request rate min 1753.7533 req/s, max 1986.3433 req/s, window 300s
object rate min 300.3567 obj/s, max 319.8633 obj/s, window 300s
popularity: Zipf linear fitting slope=0.9472
X-hit (number of obj accessed X times): 323699(0.3606), 218436(0.2433), 51516(0.0574), 128181(0.1428), 48785(0.0543), 25172(0.0280), 14606(0.0163), 14769(0.0165),
freq (fraction) of the most popular obj: 546563(0.0547), 365140(0.0365), 221311(0.0221), 190811(0.0191), 154037(0.0154), 151832(0.0152), 127070(0.0127), 98851(0.0099),
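Several of the derived figures in these summaries appear to follow directly from the four totals on the first two lines. The sketch below is not part of libCacheSim. It parses those two lines and recomputes the derived values, assuming (based on the numbers in the first example, not on the tool's source code) that the mean frequency is requests divided by objects, the compulsory miss ratio is objects divided by requests (or object bytes divided by request bytes), and the weighted object size is bytes divided by count. The helper names `parse_totals` and `derive` are hypothetical.

```python
import re

# First two summary lines of the block cache example above.
SAMPLE = """\
number of requests: 4284658, number of objects: 606386
number of req GiB: 133.5165, number of obj GiB: 21.7219
"""

GIB = 1024 ** 3


def parse_totals(text):
    """Extract the four totals from the summary text (hypothetical helper)."""
    patterns = {
        "n_req": r"number of requests: (\d+)",
        "n_obj": r"number of objects: (\d+)",
        "req_gib": r"number of req GiB: ([\d.]+)",
        "obj_gib": r"number of obj GiB: ([\d.]+)",
    }
    return {key: float(re.search(p, text).group(1)) for key, p in patterns.items()}


def derive(totals):
    """Recompute derived metrics; the formulas are assumptions, see the lead-in."""
    return {
        "freq_mean": totals["n_req"] / totals["n_obj"],
        "miss_ratio_req": totals["n_obj"] / totals["n_req"],
        "miss_ratio_byte": totals["obj_gib"] / totals["req_gib"],
        "size_by_req": totals["req_gib"] * GIB / totals["n_req"],
        "size_by_obj": totals["obj_gib"] * GIB / totals["n_obj"],
    }


if __name__ == "__main__":
    for name, value in derive(parse_totals(SAMPLE)).items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, the recomputed values for the first example agree with the printed ones up to rounding, which is a quick way to sanity-check a summary before plotting.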
We provide plot scripts in scripts/traceAnalysis/ to help you plot the trace statistics. After generating the plot data, you can plot the access pattern, request rate, size, reuse, and popularity using the following commands:
# plot the access pattern using wall clock (real) time
python3 scripts/traceAnalysis/ ${dataname}.accessRtime
# plot the access pattern using logical/virtual (request count) time
python3 scripts/traceAnalysis/ ${dataname}.accessVtime
Some example plots are shown below:
# this is only supported for traces that have a (wall clock) time field
python3 scripts/traceAnalysis/ ${dataname}.reqRate_w300
Some example plots are shown below:
The block workload has a daily request spike, while the Twitter workload is too short to observe a pattern.
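To make the `window 300s` figures in the summary and the `_w300` suffix more concrete, here is a minimal sketch of a windowed request rate. It is not the traceAnalyzer implementation. It assumes timestamps are in seconds and sorted, that windows are fixed and non-overlapping, and that the rate is simply the request count in a window divided by the window length. The function name `request_rate` is hypothetical.

```python
from collections import Counter


def request_rate(timestamps, window=300):
    """Return requests per second for each consecutive `window`-second bucket.

    Assumes `timestamps` are wall clock seconds in non-decreasing order.
    Windows with no requests are reported with a rate of 0.0.
    """
    if not timestamps:
        return []
    start = timestamps[0]
    # Map each request to the index of the window it falls into.
    counts = Counter(int((t - start) // window) for t in timestamps)
    n_windows = max(counts) + 1
    return [counts.get(i, 0) / window for i in range(n_windows)]


if __name__ == "__main__":
    rates = request_rate([0, 1, 2, 300, 301, 900])
    print("min %.4f req/s, max %.4f req/s" % (min(rates), max(rates)))
```

A daily spike like the one in the block workload would show up as a periodic peak in the returned list, one entry per window.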
# this is only supported for traces that have object size
python3 scripts/traceAnalysis/ ${dataname}.size
Some example plots are shown below:
In the block workload, most objects are 4 KiB or 64 KiB, while in the Twitter workload most objects are 64 B.
The Request curve is weighted by request count, and the Object curve is weighted by object count.
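The difference between the two weightings can be sketched as follows. This is an illustration rather than the tool's code: it assumes each request is an `(obj_id, size)` pair and that an object's size is the size seen on its first request. The function name `size_distributions` is hypothetical.

```python
from collections import Counter


def size_distributions(requests):
    """Return (by_request, by_object) size histograms.

    by_request counts every request, so popular objects count many times.
    by_object counts each distinct object once, using the size of its first
    request (an assumption made for this sketch).
    """
    by_request = Counter()
    first_size = {}
    for obj_id, size in requests:
        by_request[size] += 1
        first_size.setdefault(obj_id, size)
    by_object = Counter(first_size.values())
    return by_request, by_object


if __name__ == "__main__":
    trace = [("a", 4096), ("a", 4096), ("a", 4096), ("b", 65536)]
    by_req, by_obj = size_distributions(trace)
    print(dict(by_req))
    print(dict(by_obj))
```

In this toy trace the 4 KiB size accounts for three of four requests but only one of two objects, which is why the two curves can diverge when a few objects are hot.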
Reuse time is the time since the last access of the object.
python3 scripts/traceAnalysis/ ${dataname}.reuse
Some example plots are shown below:
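As a rough illustration of what is being measured, the sketch below computes the reuse time for every request, together with a reuse distance counted in requests. It is not the traceAnalyzer implementation. It assumes requests are `(timestamp, obj_id)` pairs in trace order and that two back-to-back requests to the same object have a reuse distance of 1. The function name `reuse_stats` is hypothetical.

```python
def reuse_stats(requests):
    """Return a list of (reuse_time, reuse_distance), one entry per request.

    reuse_time is the wall clock time since the previous access of the same
    object. reuse_distance is the number of requests since that access.
    The first access of an object yields (None, None).
    """
    last_seen = {}  # obj_id -> (timestamp, request index) of the last access
    result = []
    for idx, (ts, obj_id) in enumerate(requests):
        if obj_id in last_seen:
            prev_ts, prev_idx = last_seen[obj_id]
            result.append((ts - prev_ts, idx - prev_idx))
        else:
            result.append((None, None))
        last_seen[obj_id] = (ts, idx)
    return result


if __name__ == "__main__":
    trace = [(0, "a"), (5, "b"), (7, "a"), (8, "a")]
    for req, stats in zip(trace, reuse_stats(trace)):
        print(req, stats)
```

A histogram of the non-`None` reuse times is the kind of data a reuse distribution plot summarizes.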
# the popularity skewness ($\alpha$) is in the output of traceAnalyzer
# this plots the request count/freq over object rank
# note that a popularity plot is not meaningful for very small traces and some block workloads
# and note that popularity is highly affected by the layer of the cache hierarchy
python3 scripts/traceAnalysis/ ${dataname}.popularity
Some example plots are shown below:
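The summary output reports a `Zipf linear fitting slope`. The exact fitting procedure used by traceAnalyzer is not described here, so the following is only a sketch of one common approach: rank objects by request count, then fit a straight line to log(frequency) against log(rank) with ordinary least squares and report the negated slope as $\alpha$. The function name `zipf_slope` is hypothetical.

```python
import math
from collections import Counter


def zipf_slope(object_ids):
    """Estimate the Zipf skewness from a sequence of requested object ids.

    Fits log(freq) = c - alpha * log(rank) by least squares and returns alpha.
    Requires at least two distinct objects.
    """
    freqs = sorted(Counter(object_ids).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct objects")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic trace in which the object at rank r is requested about 10000 / r times.
    trace = [r for r in range(1, 101) for _ in range(round(10000 / r))]
    print(f"estimated alpha: {zipf_slope(trace):.4f}")
```

On very small traces the fit is dominated by a handful of objects, which is consistent with the note above that popularity plots are not meaningful there.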
This and the following plots are more expensive to generate and need more CPU time and DRAM. This plot requires both wall clock time and object size in the trace. It is a heatmap of the size distribution of the trace over time: the x-axis is the wall clock time, the y-axis is the size, and the color represents the number of requests in a given size range at that time. The darker the color, the more requests of that size at that time. The heatmap is generated using the following command:
python3 scripts/traceAnalysis/ ${dataname}.sizeWindow_w300
Some example plots are shown below:
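The data behind such a heatmap can be thought of as a two-dimensional histogram. The sketch below is an assumption-laden illustration, not the tool's code: it uses fixed 300-second windows on the time axis and power-of-two buckets on the size axis, and it expects integer timestamps and sizes. The function name `size_heatmap` and the bucketing scheme are hypothetical.

```python
from collections import Counter


def size_heatmap(requests, window=300):
    """Count requests per (time window, size bucket) cell.

    requests: iterable of (timestamp, size) with integer values, in trace order.
    The size bucket is floor(log2(size)), so bucket 12 covers 4096..8191 bytes.
    """
    cells = Counter()
    start = None
    for ts, size in requests:
        if start is None:
            start = ts  # time is measured from the first request
        window_idx = (ts - start) // window
        bucket = size.bit_length() - 1 if size > 0 else 0
        cells[(window_idx, bucket)] += 1
    return cells


if __name__ == "__main__":
    trace = [(0, 4096), (10, 4096), (20, 65536), (310, 64)]
    for cell, count in sorted(size_heatmap(trace).items()):
        print(cell, count)
```

Each cell count would then be mapped to a color, with larger counts drawn darker.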
This is a heatmap of the reuse distribution of the trace. The x-axis is the wall clock time, and the y-axis is the reuse time (in seconds) or reuse distance (the number of requests since the last access of the object). The color represents the number of requests with that reuse time or reuse distance. The heatmap is generated using the following command:
python3 scripts/traceAnalysis/ ${dataname}.reuseWindow_w300
Some example plots are shown below:
There are two versions of the plot: a line plot and a heatmap.
# this requires a long trace (e.g., 7 days) to generate a meaningful plot
# and most block workloads do not have enough requests to plot meaningful popularity decay
python3 scripts/traceAnalysis/ ${dataname}.popularityDecay_w300