i#3995 multi-window: Write separate raw files per window (#5450)
Adds a drmemtrace feature under a new on-by-default -split_windows
option to create a separate subdirectory with a separate set of raw
files per traced window.  This avoids disk space issues with a single
file, and splitting at the raw stage is relatively simple for regular
drmemtrace usage (though not as simple for external users of the file
i/o redirection).

Files in raw/window.NNNN/ subdirectories are mirrored in
trace/window.NNNN/ subdirectories upon being post-processed.
Post-processing handles just the first window by default; the others
must be explicitly passed as input directories in separate
post-processing invocations.
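
The directory layout described above can be sketched in a few lines of C++. This is only an illustration: the helper names `window_subdir_name` and `mirror_raw_to_trace` are hypothetical, the sketch assumes '/' as the separator, and it uses `%04ld` with a `long` where the commit's `WINDOW_SUBDIR_FORMAT` uses `%04zd`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the zero-padded per-window subdirectory name,
// following the "window.NNNN" pattern from the commit message.
std::string
window_subdir_name(long window_id)
{
    assert(window_id >= 0); // Window numbers are 0-based.
    char buf[32];
    std::snprintf(buf, sizeof(buf), "window.%04ld", window_id);
    return std::string(buf);
}

// Hypothetical helper: maps ".../raw[/window.NNNN]" to ".../trace[/window.NNNN]"
// by replacing the last "/raw" path component. Assumes '/' separators.
// Returns the input unchanged if no whole "/raw" component is found.
std::string
mirror_raw_to_trace(const std::string &rawdir)
{
    const std::string raw_sub = "/raw";
    const size_t pos = rawdir.rfind(raw_sub);
    if (pos == std::string::npos)
        return rawdir;
    const size_t end = pos + raw_sub.size();
    // Require a whole component: "/raw" must end the path or precede a '/'.
    if (end != rawdir.size() && rawdir[end] != '/')
        return rawdir;
    std::string out = rawdir;
    out.replace(pos, raw_sub.size(), "/trace");
    return out;
}
```

For instance, `mirror_raw_to_trace("out/raw/" + window_subdir_name(1))` yields `out/trace/window.0001`, which is the kind of directory a separate post-processing invocation for the second window would produce.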

This changes the non-window behavior to not create an output file
until tracing starts, which necessitated changing the
tool.drcacheoff.delay-func test to check for no output files as a
slightly different type of test.

Adds a test of split-file offline windows.

Fixes an infinite loop bug in raw2trace that occurs when a file is truncated;
this was hit while the windows were buggy and missing footers.
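
The truncated-file case can be pictured with a toy reader loop. This is only a sketch under stated assumptions: the one-character records, the 'F' footer, and the names `count_records` and `count_records_in_string` are invented for illustration and are not raw2trace's actual format or API. The point is the extra exit condition: stop when the stream is exhausted even though no footer was seen.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical toy format: each record is one character and 'F' is the footer.
// Returns the number of non-footer records read; sets *truncated if the
// stream ran out before a footer was seen.
int
count_records(std::istream &in, bool *truncated)
{
    assert(truncated != nullptr);
    *truncated = false;
    int count = 0;
    bool end_of_file = false; // Set only when the footer record is read.
    while (!end_of_file) {
        char rec;
        if (in.get(rec)) {
            if (rec == 'F')
                end_of_file = true;
            else
                ++count;
        }
        // Without this check, a stream that ends before any footer would
        // never set end_of_file and the loop would spin forever.
        if (!end_of_file && !in.good()) {
            *truncated = true;
            break;
        }
    }
    return count;
}

// Convenience wrapper for trying the loop on an in-memory string.
int
count_records_in_string(const std::string &data, bool *truncated)
{
    std::istringstream in(data);
    return count_records(in, truncated);
}
```

Calling `count_records_in_string("ab", &truncated)` returns 2 with `truncated` set, while `"abF"` returns 2 with `truncated` clear, so partial results are still reported for a cut-off file.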

Issue: #3995
derekbruening authored Apr 6, 2022
1 parent ceb1c8d commit eae2f4e
Showing 19 changed files with 424 additions and 166 deletions.
6 changes: 5 additions & 1 deletion api/docs/multi_trace_window.dox
@@ -96,7 +96,7 @@ For typical use with files on the local disk, we could add creation of a new
directory (and duplication of the module file) for each window by the tracing thread
that hits the end-of-window trigger. The other threads would each create a new
output raw file each time they transitioned to a new window (see also the Proposal A
discussion below).
discussion below). This is implemented today with the \p -split_windows option.

## Splitting during raw2trace

@@ -154,6 +154,10 @@ online traces we will probably stick with multi-window-at-once.

We’ll create a tool to manually split up multi-window trace files.

Update: We ended up implementing splitting at the raw file output level for the
local-disk use case; we may additionally implement splitting in an analyzer for other
use cases.

# Design Point: Continuous Control v. Re-Attach

One method of obtaining multiple traces is to repeat today’s bursts over and over,
3 changes: 2 additions & 1 deletion clients/drcachesim/analyzer_multi.cpp
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2016-2021 Google, Inc. All rights reserved.
* Copyright (c) 2016-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -114,6 +114,7 @@ analyzer_multi_t::analyzer_multi_t()
error_string_ = "raw2trace failed: " + error;
}
}
tracedir = raw2trace_directory_t::tracedir_from_rawdir(op_indir.get_value());
if (!init_file_reader(tracedir, op_verbose.get_value()))
success_ = false;
} else if (op_infile.get_value().empty()) {
25 changes: 21 additions & 4 deletions clients/drcachesim/common/directory_iterator.cpp
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2017-2020 Google, Inc. All rights reserved.
* Copyright (c) 2017-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -30,6 +30,7 @@
* DAMAGE.
*/

#include <algorithm>
#include "directory_iterator.h"
#include "utils.h"
#include "dr_frontend.h"
@@ -128,8 +129,24 @@ directory_iterator_t::is_directory(const std::string &path)
}

bool
directory_iterator_t::create_directory(const std::string &path)
directory_iterator_t::create_directory(const std::string &path_in)
{
drfront_status_t res = drfront_create_dir(path.c_str());
return res == DRFRONT_SUCCESS;
std::string path = path_in;
#ifdef WINDOWS
// Canonicalize.
std::replace(path.begin(), path.end(), ALT_DIRSEP[0], DIRSEP[0]);
#endif
// Create all components.
drfront_status_t res;
auto pos = path.find(DIRSEP, 1);
while (pos != std::string::npos) {
std::string sub = path.substr(0, pos);
if (!is_directory(sub)) {
res = drfront_create_dir(sub.c_str());
if (res != DRFRONT_SUCCESS)
return false;
}
pos = path.find(DIRSEP, pos + 1);
}
return drfront_create_dir(path.c_str()) == DRFRONT_SUCCESS;
}
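
The component-by-component walk in the new `create_directory` can be isolated as a pure function for clarity. This is a hedged sketch, not code from the commit: `path_prefixes` is a hypothetical name, the separator defaults to "/" rather than `DIRSEP`, and it only lists the prefixes that would be tried, without touching the file system.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns every leading prefix of `path` in the order a
// parent-first directory creation would visit them, ending with `path` itself.
std::vector<std::string>
path_prefixes(const std::string &path, const std::string &sep = "/")
{
    assert(!sep.empty());
    std::vector<std::string> out;
    // Start at index 1 so that a leading separator (an absolute path) does
    // not produce an empty first component.
    auto pos = path.find(sep, 1);
    while (pos != std::string::npos) {
        out.push_back(path.substr(0, pos));
        pos = path.find(sep, pos + 1);
    }
    out.push_back(path);
    return out;
}
```

For `"/tmp/a/b"` this lists `/tmp`, `/tmp/a`, and `/tmp/a/b`, matching the order in which the loop above checks and creates each component before the final directory.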
3 changes: 2 additions & 1 deletion clients/drcachesim/common/directory_iterator.h
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2017-2020 Google, Inc. All rights reserved.
* Copyright (c) 2017-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -98,6 +98,7 @@ class directory_iterator_t : public std::iterator<std::input_iterator_tag, std::
// Static cross-platform utility functions.
static bool
is_directory(const std::string &path);
// Recursively creates all sub-directories.
static bool
create_directory(const std::string &path);

14 changes: 12 additions & 2 deletions clients/drcachesim/common/options.cpp
@@ -261,9 +261,19 @@ droption_t<bytesize_t> op_retrace_every_instrs(
"Trace for -trace_for_instrs, execute this many, and repeat.",
"This option augments -trace_for_instrs. After tracing concludes, this option "
"causes non-traced instructions to be counted and after the number specified by "
"this option, tracing start up again for the -trace_for_instrs duration. This "
"this option, tracing will start up again for the -trace_for_instrs duration. This "
"process repeats itself. This can be combined with -trace_after_instrs for an "
"initial period of non-tracing.");
"initial period of non-tracing. Each tracing window is delimited by "
"TRACE_MARKER_TYPE_WINDOW_ID markers. For -offline traces, each window is placed "
"into its own separate set of output files, unless -no_split_windows is set.");

droption_t<bool> op_split_windows(
DROPTION_SCOPE_CLIENT, "split_windows", true,
"Whether -retrace_every_instrs should write separate files",
"By default, offline traces in separate windows from -retrace_every_instrs are "
"written to a different set of files for each window. If this option is disabled, "
"all windows are concatenated into a single trace, separated by "
"TRACE_MARKER_TYPE_WINDOW_ID markers.");

droption_t<bytesize_t> op_exit_after_tracing(
DROPTION_SCOPE_CLIENT, "exit_after_tracing", 0,
1 change: 1 addition & 0 deletions clients/drcachesim/common/options.h
@@ -91,6 +91,7 @@ extern droption_t<bytesize_t> op_max_global_trace_refs;
extern droption_t<bytesize_t> op_trace_after_instrs;
extern droption_t<bytesize_t> op_trace_for_instrs;
extern droption_t<bytesize_t> op_retrace_every_instrs;
extern droption_t<bool> op_split_windows;
extern droption_t<bytesize_t> op_exit_after_tracing;
extern droption_t<std::string> op_raw_compress;
extern droption_t<bool> op_online_instr_types;
7 changes: 6 additions & 1 deletion clients/drcachesim/drcachesim.dox.in
@@ -1062,7 +1062,12 @@ limited in several ways:
executing its specified instruction count without tracing and then
re-enabling tracing for \p -trace_for_instrs again, resulting in
tracing windows repeated at regular intervals throughout the execution.
A single final trace is created at the end, with #TRACE_MARKER_TYPE_WINDOW_ID
There are two options for how these windows are stored for offline traces.
If the \p -split_windows option is set (which is the default), each window
produces a separate set of output files inside a window.NNNN subdirectory.
Post-processing by default targets the first window; the others must be explicitly
passed to separate post-processing invocations. If \p -no_split_windows is set,
a single trace is created with #TRACE_MARKER_TYPE_WINDOW_ID
markers (see \ref sec_drcachesim_format_other) identifying the trace window
transitions.
- The \p -max_global_trace_refs option causes the recording of trace
8 changes: 7 additions & 1 deletion clients/drcachesim/simulator/analyzer_interface.cpp
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2017-2021 Google, Inc. All rights reserved.
* Copyright (c) 2017-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -49,6 +49,7 @@
#include "../tools/func_view_create.h"
#include "../tools/invariant_checker_create.h"
#include "../tracer/raw2trace.h"
#include "../tracer/raw2trace_directory.h"
#include <fstream>

/* Get the path to an auxiliary file by examining
@@ -75,6 +76,11 @@ get_aux_file_path(std::string option_val, std::string default_filename)
if (sep_index != std::string::npos)
trace_dir = std::string(op_infile.get_value(), 0, sep_index);
}
if (raw2trace_directory_t::is_window_subdir(trace_dir)) {
// If we're operating on a specific window, point at the parent for the
// modfile.
trace_dir += std::string(DIRSEP) + "..";
}
file_path = trace_dir + std::string(DIRSEP) + default_filename;
/* Support the aux file in either raw/ or trace/. */
if (!std::ifstream(file_path.c_str()).good()) {
19 changes: 0 additions & 19 deletions clients/drcachesim/tests/offline-delay-func.templatex
@@ -1,20 +1 @@
Hello, world!
Basic counts tool results:
Total counts:
0 total \(fetched\) instructions
0 total unique \(fetched\) instructions
0 total non-fetched instructions
0 total prefetches
0 total data loads
0 total data stores
0 total icache flushes
0 total dcache flushes
1 total threads
2 total scheduling markers
0 total transfer markers
0 total function id markers
0 total function return address markers
0 total function argument markers
0 total function return value markers
3 total other markers
.*
10 changes: 10 additions & 0 deletions clients/drcachesim/tests/offline-windows-split.templatex
@@ -0,0 +1,10 @@
Hit delay threshold: enabling tracing.
Hit tracing window #0 limit: disabling tracing.
Hit retrace threshold: enabling tracing for window #1.
.*
Basic counts tool results:
.*
Basic counts tool results:
.*
Basic counts tool results:
.*
21 changes: 17 additions & 4 deletions clients/drcachesim/tools/basic_counts.cpp
@@ -113,10 +113,23 @@ basic_counts_t::parallel_shard_memref(void *shard_data, const memref_t &memref)
++counters->xfer_markers;
} else {
if (memref.marker.marker_type == TRACE_MARKER_TYPE_WINDOW_ID &&
memref.marker.marker_value != per_shard->last_window) {
per_shard->last_window = memref.marker.marker_value;
per_shard->counters.resize(per_shard->last_window + 1 /*0-based*/);
counters = &per_shard->counters[per_shard->counters.size() - 1];
static_cast<intptr_t>(memref.marker.marker_value) !=
per_shard->last_window) {
if (per_shard->last_window == -1 && memref.marker.marker_value != 0) {
// We assume that a single file with multiple windows always
// starts at 0, which is how we distinguish it from a split
// file starting at a high window number. We check this below.
per_shard->last_window = memref.marker.marker_value;
} else if (per_shard->last_window != -1 &&
per_shard->counters.size() !=
static_cast<size_t>(per_shard->last_window + 1)) {
per_shard->error = "Multi-window file must start at 0";
return false;
} else {
per_shard->last_window = memref.marker.marker_value;
per_shard->counters.resize(per_shard->last_window + 1 /*0-based*/);
counters = &per_shard->counters[per_shard->counters.size() - 1];
}
}
switch (memref.marker.marker_type) {
case TRACE_MARKER_TYPE_FUNC_ID: ++counters->func_id_markers; break;
2 changes: 1 addition & 1 deletion clients/drcachesim/tools/basic_counts.h
@@ -111,7 +111,7 @@ class basic_counts_t : public analysis_tool_t {
// A vector to support windows.
std::vector<counters_t> counters;
std::string error;
uintptr_t last_window = 0;
intptr_t last_window = -1;
};

static bool
4 changes: 3 additions & 1 deletion clients/drcachesim/tracer/drmemtrace.h
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2016-2021 Google, Inc. All rights reserved.
* Copyright (c) 2016-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -135,6 +135,8 @@ typedef bool (*drmemtrace_create_dir_func_t)(const char *dir);
DR_EXPORT
/**
* Registers functions to replace the default file operations for offline tracing.
* If tracing windows are used and separate files per window are not meant to
* be supported by "open_file_func", it is up to the user to set \p -no_split_windows.
*
* \note The caller is responsible for the transparency and isolation of using
* those functions, which will be called in the middle of arbitrary
2 changes: 1 addition & 1 deletion clients/drcachesim/tracer/raw2trace.cpp
@@ -629,7 +629,7 @@ raw2trace_t::process_thread_file(raw2trace_thread_data_t *tdata)
VPRINT(4, "About to read thread #%d==%d at pos %d\n", tdata->index,
(uint)tdata->tid, (int)tdata->thread_file->tellg());
tdata->error = process_next_thread_buffer(tdata, &end_of_file);
if (!tdata->error.empty()) {
if (!tdata->error.empty() || (!end_of_file && thread_file_at_eof(tdata))) {
if (thread_file_at_eof(tdata)) {
// Rather than a fatal error we try to continue to provide partial
// results in case the disk was full or there was some other issue.
3 changes: 3 additions & 0 deletions clients/drcachesim/tracer/raw2trace.h
@@ -68,6 +68,9 @@
# define OUTFILE_SUFFIX_SZ "raw.sz"
#endif
#define OUTFILE_SUBDIR "raw"
#define WINDOW_SUBDIR_PREFIX "window"
#define WINDOW_SUBDIR_FORMAT "window.%04zd" /* ptr_int_t is the window number type. */
#define WINDOW_SUBDIR_FIRST "window.0000"
#define TRACE_SUBDIR "trace"
#ifdef HAS_ZLIB
# define TRACE_SUFFIX "trace.gz"
50 changes: 42 additions & 8 deletions clients/drcachesim/tracer/raw2trace_directory.cpp
@@ -205,6 +205,25 @@ raw2trace_directory_t::read_module_file(const std::string &modfilename)
return "";
}

bool
raw2trace_directory_t::is_window_subdir(const std::string &dir)
{
return dir.rfind(WINDOW_SUBDIR_PREFIX) != std::string::npos &&
dir.rfind(WINDOW_SUBDIR_PREFIX) >= dir.size() - strlen(WINDOW_SUBDIR_FIRST);
}

std::string
raw2trace_directory_t::window_subdir_if_present(const std::string &dir)
{
// Support window subdirs. If the base is passed, target the first.
if (is_window_subdir(dir))
return dir;
std::string windir = dir + std::string(DIRSEP) + WINDOW_SUBDIR_FIRST;
if (directory_iterator_t::is_directory(windir))
return windir;
return dir;
}

std::string
raw2trace_directory_t::tracedir_from_rawdir(const std::string &rawdir_in)
{
@@ -221,10 +240,11 @@ raw2trace_directory_t::tracedir_from_rawdir(const std::string &rawdir_in)
if (rawdir.size() > trace_sub.size() &&
rawdir.compare(rawdir.size() - trace_sub.size(), trace_sub.size(), trace_sub) ==
0)
return rawdir;
// If it ends in "/raw", replace with "/trace".
if (rawdir.size() > raw_sub.size() &&
rawdir.compare(rawdir.size() - raw_sub.size(), raw_sub.size(), raw_sub) == 0) {
return window_subdir_if_present(rawdir);
// If it ends in "/raw" or a window subdir, replace with "/trace".
if ((rawdir.size() > raw_sub.size() &&
rawdir.compare(rawdir.size() - raw_sub.size(), raw_sub.size(), raw_sub) == 0) ||
is_window_subdir(rawdir)) {
std::string tracedir = rawdir;
size_t pos = rawdir.rfind(raw_sub);
if (pos == std::string::npos)
@@ -236,7 +256,7 @@ raw2trace_directory_t::tracedir_from_rawdir(const std::string &rawdir_in)
// If it contains a "/raw" or "/trace" subdir, add "/trace" to it.
if (directory_iterator_t::is_directory(rawdir + raw_sub) ||
directory_iterator_t::is_directory(rawdir + trace_sub)) {
return rawdir + trace_sub;
return window_subdir_if_present(rawdir + trace_sub);
}
// Use it directly.
return rawdir;
@@ -257,10 +277,24 @@ raw2trace_directory_t::initialize(const std::string &indir, const std::string &o
if (!directory_iterator_t::is_directory(indir_))
return "Directory does not exist: " + indir_;
// Support passing both base dir and raw/ subdir.
if (indir_.rfind(OUTFILE_SUBDIR) == std::string::npos ||
indir_.rfind(OUTFILE_SUBDIR) < indir_.size() - strlen(OUTFILE_SUBDIR)) {
if (!is_window_subdir(indir_) &&
(indir_.rfind(OUTFILE_SUBDIR) == std::string::npos ||
indir_.rfind(OUTFILE_SUBDIR) < indir_.size() - strlen(OUTFILE_SUBDIR))) {
indir_ += std::string(DIRSEP) + OUTFILE_SUBDIR;
}
std::string modfile_dir = indir_;
// Support window subdirs.
indir_ = window_subdir_if_present(indir_);
if (is_window_subdir(indir_)) {
// If we're operating on a specific window, point at the parent for the modfile.
// Windows dr_open_file() doesn't like "..".
modfile_dir = indir_;
auto pos = modfile_dir.rfind(DIRSEP);
if (pos == std::string::npos)
return "Window subdir missing slash";
modfile_dir.erase(pos);
}

// Support a default outdir_.
if (outdir_.empty()) {
outdir_ = tracedir_from_rawdir(indir_);
@@ -271,7 +305,7 @@ raw2trace_directory_t::initialize(const std::string &indir, const std::string &o
}
}
std::string modfilename =
indir_ + std::string(DIRSEP) + DRMEMTRACE_MODULE_LIST_FILENAME;
modfile_dir + std::string(DIRSEP) + DRMEMTRACE_MODULE_LIST_FILENAME;
std::string err = read_module_file(modfilename);
if (!err.empty())
return err;
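
The suffix test behind `is_window_subdir` can be tried in isolation. The following is a standalone sketch, not the commit's code: `looks_like_window_subdir`, `kWindowPrefix`, and `kWindowFirst` are hypothetical names that mirror `WINDOW_SUBDIR_PREFIX` and `WINDOW_SUBDIR_FIRST`, and the explicit length guard is an addition made here to keep the unsigned subtraction obviously safe.

```cpp
#include <cassert>
#include <cstring>
#include <string>

static const char *const kWindowPrefix = "window";     // Mirrors WINDOW_SUBDIR_PREFIX.
static const char *const kWindowFirst = "window.0000"; // Mirrors WINDOW_SUBDIR_FIRST.

// Hypothetical helper: reports whether the last occurrence of "window" in
// `dir` starts within the final strlen("window.0000") characters, i.e. whether
// the path appears to end in a window subdirectory.
bool
looks_like_window_subdir(const std::string &dir)
{
    const size_t tail_len = std::strlen(kWindowFirst);
    assert(std::strlen(kWindowPrefix) <= tail_len);
    // Guard added in this sketch: paths shorter than the tail cannot match.
    if (dir.size() < tail_len)
        return false;
    const size_t pos = dir.rfind(kWindowPrefix);
    return pos != std::string::npos && pos >= dir.size() - tail_len;
}
```

With this check, `out/raw/window.0003` is treated as a window subdirectory while `out/raw` is not. It is a positional heuristic rather than a full parse of the final path component.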
8 changes: 7 additions & 1 deletion clients/drcachesim/tracer/raw2trace_directory.h
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2017-2020 Google, Inc. All rights reserved.
* Copyright (c) 2017-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -72,6 +72,12 @@ class raw2trace_directory_t {
static std::string
tracedir_from_rawdir(const std::string &rawdir);

static std::string
window_subdir_if_present(const std::string &dir);

static bool
is_window_subdir(const std::string &dir);

char *modfile_bytes_;
std::vector<std::istream *> in_files_;
std::vector<std::ostream *> out_files_;