i#5520 memtrace encodings: Add online encodings (#5682)
Adds a new option -instr_encodings which enables instruction encoding
records for online traces.  The option is off by default, as encodings
add significant overhead to online tracing and are only needed by
some tools.

Currently we emit the encoding again for every dynamic instance
of an instruction.  Future work involves recording which encodings
we have already emitted, to avoid duplicates (the reader caches prior
encodings), but this requires careful consideration of global locks
and per-thread invalidation on code changes, which is out of scope
for this initial implementation.
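
As a rough illustration of why emitting each encoding only once would be enough for the reader, here is a minimal sketch of a reader-side cache keyed by instruction address. The class name `encoding_cache_t` and its methods are hypothetical and are not taken from the drmemtrace reader; the sketch only assumes that an instruction record can arrive either with fresh encoding bytes or without any.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical reader-side cache (not the actual drmemtrace reader code):
// maps an instruction address to the most recently seen encoding bytes.
class encoding_cache_t {
public:
    // Called when encoding bytes accompany an instruction record.
    // A newer encoding for the same address replaces the old one, which
    // models invalidation after the application's code changes.
    void
    update(uint64_t pc, const std::vector<uint8_t> &bytes)
    {
        cache_[pc] = bytes;
    }

    // Called when an instruction record arrives without encoding bytes.
    // Returns nullptr if no encoding has been seen for this address.
    const std::vector<uint8_t> *
    lookup(uint64_t pc) const
    {
        auto it = cache_.find(pc);
        return it == cache_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<uint64_t, std::vector<uint8_t>> cache_;
};
```

With a cache like this on the reading side, the tracer would only need to re-emit an encoding when the code at that address changes, which is the part the commit message defers.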

Updates the launcher and docs.

Adds an online opcode_mix test and adds encodings to a basic_counts
online test.

Manually tested the view tool with and without -instr_encodings:

  $ bin64/drrun -t drcachesim -instr_encodings -simulator_type view -- suite/tests/bin/allasm_x86_64
      ...
      166: T1718245 ifetch       4 byte(s) @ 0x0000000000401028 48 83 eb 01          sub    $0x0000000000000001 %rbx -> %rbx
      167: T1718245 ifetch       4 byte(s) @ 0x000000000040102c 48 83 fb 00          cmp    %rbx $0x0000000000000000
      168: T1718245 ifetch       2 byte(s) @ 0x0000000000401030 75 d9                jnz    $0x000000000040100b
      ...

  $ bin64/drrun -t drcachesim -simulator_type view -- suite/tests/bin/allasm_x86_64
      ...
      126: T1933188 ifetch       4 byte(s) @ 0x0000000000401028 non-branch
      127: T1933188 ifetch       4 byte(s) @ 0x000000000040102c non-branch
      128: T1933188 ifetch       2 byte(s) @ 0x0000000000401030 conditional jump
      ...

Fixes #5520
derekbruening authored Oct 13, 2022
1 parent 7bc0b6a commit 2273888
Showing 13 changed files with 150 additions and 26 deletions.
6 changes: 6 additions & 0 deletions clients/drcachesim/common/options.cpp
@@ -122,6 +122,12 @@ droption_t<bytesize_t> op_chunk_instr_count(
"support for writing .zip files, this option is ignored. "
"For 32-bit this cannot exceed 4G.");

droption_t<bool> op_instr_encodings(
DROPTION_SCOPE_CLIENT, "instr_encodings", false,
"Whether to include encodings for online tools",
"By default instruction encodings are not sent to online tools, to reduce "
"overhead. (Offline tools have them added by default.)");

droption_t<std::string> op_funclist_file(
DROPTION_SCOPE_ALL, "funclist_file", "",
"Path to function map file for func_view tool",
1 change: 1 addition & 0 deletions clients/drcachesim/common/options.h
@@ -69,6 +69,7 @@ extern droption_t<std::string> op_indir;
extern droption_t<std::string> op_module_file;
extern droption_t<std::string> op_alt_module_dir;
extern droption_t<bytesize_t> op_chunk_instr_count;
extern droption_t<bool> op_instr_encodings;
extern droption_t<std::string> op_funclist_file;
extern droption_t<unsigned int> op_num_cores;
extern droption_t<unsigned int> op_line_size;
11 changes: 7 additions & 4 deletions clients/drcachesim/drcachesim.dox.in
@@ -115,7 +115,9 @@ The raw encoding of the instruction is provided. This can be decoded
using the drdecode decoder or any other decoder. An additional field
`encoding_is_new` is provided to indicate when any cached decoding
information should be invalidated due to possibly changed application
code.
code. (For online traces, encodings are not provided unless the
option `-instr_encodings` is passed, as encodings add overhead and
are not needed for many tools.)

Older legacy traces may not contain instruction encodings. For those
traces, encodings for static code can be obtained by
@@ -586,8 +588,8 @@ to cache simulators.
The opcode_mix tool uses the non-fetched instruction information along with
the preserved libraries and binaries from the traced execution to gather
more information on each executed instruction than was stored in the trace.
It only supports offline traces, and the \p modules.log file created during
post-processing of the trace must be preserved. The results are broken
To run on online traces, pass the `-instr_encodings` option.
The results are broken
down by the opcodes used in DR's IR, where for x86 \p mov is split into a separate
opcode for load and store but both have the same public string "mov":

@@ -628,7 +630,8 @@ Opcode mix tool results:
\section sec_tool_view Human-Readable View

The view tool prints out the contents of the trace for human viewing, including
disassembling instructions in AT&T, Intel, Arm, or DR format, for offline traces. The
disassembling instructions in AT&T, Intel, Arm, or DR format (to see
disassembly for online traces, pass the `-instr_encodings` option). The
-skip_refs and -sim_refs flags can be used to
set a start point and end point for the disassembled view. Note that these
flags compute the number of instructions which are skipped or displayed which
5 changes: 3 additions & 2 deletions clients/drcachesim/simulator/analyzer_interface.cpp
@@ -187,8 +187,9 @@ drmemtrace_analysis_tool_create()
} else if (op_simulator_type.get_value() == OPCODE_MIX) {
std::string module_file_path = get_module_file_path();
if (module_file_path.empty() && op_indir.get_value().empty() &&
op_infile.get_value().empty()) {
ERRMSG("Usage error: the opcode_mix tool requires offline traces.\n");
op_infile.get_value().empty() && !op_instr_encodings.get_value()) {
ERRMSG("Usage error: the opcode_mix tool requires offline traces, or "
"-instr_encodings for online traces.\n");
return nullptr;
}
return opcode_mix_tool_create(module_file_path, op_verbose.get_value(),
4 changes: 2 additions & 2 deletions clients/drcachesim/tests/allasm-repstr-basic-counts.templatex
@@ -29,7 +29,7 @@ Total counts:
0 total physical address \+ virtual address marker pairs
0 total physical address unavailable markers
4 total other markers
0 total encodings
102 total encodings
Thread [0-9]* counts:
98 \(fetched\) instructions
26 unique \(fetched\) instructions
@@ -48,4 +48,4 @@ Thread [0-9]* counts:
0 physical address \+ virtual address marker pairs
0 physical address unavailable markers
4 other markers
0 encodings
102 encodings
9 changes: 9 additions & 0 deletions clients/drcachesim/tests/opcode_mix.templatex
@@ -0,0 +1,9 @@
Hello, world!
---- <application exited with code 0> ----
Opcode mix tool results:
*[0-9]* : total executed instructions
*[0-9]* : [a-z ]*
*[0-9]* : [a-z ]*
*[0-9]* : [a-z ]*
*[0-9]* : [a-z ]*
.*
2 changes: 1 addition & 1 deletion clients/drcachesim/tools/view.cpp
@@ -418,7 +418,7 @@ view_t::process_memref(const memref_t &memref)
<< std::setw(2) << memref.instr.size << " byte(s) @ 0x" << std::hex
<< std::setfill('0') << std::setw(sizeof(void *) * 2) << memref.instr.addr
<< std::dec << std::setfill(' ');
if (!has_modules_) {
if (!TESTANY(OFFLINE_FILE_TYPE_ENCODINGS, filetype_) && !has_modules_) {
// We can't disassemble so we provide what info the trace itself contains.
// XXX i#5486: We may want to store the taken target for conditional
// branches; if added, we can print it here.
16 changes: 14 additions & 2 deletions clients/drcachesim/tracer/instru.h
@@ -239,6 +239,10 @@ class instru_t {
instrument_ibundle(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, int adjust, instr_t **delay_instrs,
int num_delay_instrs) = 0;
virtual int
instrument_instr_encoding(void *drcontext, void *tag, void *bb_field,
instrlist_t *ilist, instr_t *where, reg_id_t reg_ptr,
int adjust, instr_t *app) = 0;

virtual void
bb_analysis(void *drcontext, void *tag, void **bb_field, instrlist_t *ilist,
@@ -347,6 +351,10 @@ class online_instru_t : public instru_t {
instrument_ibundle(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, int adjust, instr_t **delay_instrs,
int num_delay_instrs) override;
int
instrument_instr_encoding(void *drcontext, void *tag, void *bb_field,
instrlist_t *ilist, instr_t *where, reg_id_t reg_ptr,
int adjust, instr_t *app) override;

void
bb_analysis(void *drcontext, void *tag, void **bb_field, instrlist_t *ilist,
@@ -356,8 +364,8 @@ class online_instru_t : public instru_t {

private:
void
insert_save_pc(void *drcontext, instrlist_t *ilist, instr_t *where, reg_id_t base,
reg_id_t scratch, app_pc pc, int adjust);
insert_save_immed(void *drcontext, instrlist_t *ilist, instr_t *where, reg_id_t base,
reg_id_t scratch, ptr_int_t immed, int adjust);
void
insert_save_addr(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, reg_id_t reg_addr, int adjust, opnd_t ref);
@@ -424,6 +432,10 @@ class offline_instru_t : public instru_t {
instrument_ibundle(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, int adjust, instr_t **delay_instrs,
int num_delay_instrs) override;
int
instrument_instr_encoding(void *drcontext, void *tag, void *bb_field,
instrlist_t *ilist, instr_t *where, reg_id_t reg_ptr,
int adjust, instr_t *app) override;

void
bb_analysis(void *drcontext, void *tag, void **bb_field, instrlist_t *ilist,
11 changes: 11 additions & 0 deletions clients/drcachesim/tracer/instru_offline.cpp
@@ -754,6 +754,17 @@ offline_instru_t::instrument_ibundle(void *drcontext, instrlist_t *ilist, instr_
return adjust;
}

int
offline_instru_t::instrument_instr_encoding(void *drcontext, void *tag, void *bb_field,
instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, int adjust, instr_t *app)
{
// We emit non-module-code or modified-module-code encodings separately in
// record_instr_encodings(). Encodings for static code are added in the
// post-processor.
return adjust;
}

void
offline_instru_t::bb_analysis(void *drcontext, void *tag, void **bb_field,
instrlist_t *ilist, bool repstr_expanded)
74 changes: 63 additions & 11 deletions clients/drcachesim/tracer/instru_online.cpp
@@ -33,6 +33,7 @@
/* instru_online: inserts instrumentation for online traces.
*/

#define NOMINMAX // Avoid windows.h messing up std::min.
#include "dr_api.h"
#include "drreg.h"
#include "drutil.h"
@@ -196,23 +197,23 @@ online_instru_t::append_unit_header(byte *buf_ptr, thread_id_t tid, intptr_t win
}

void
online_instru_t::insert_save_pc(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t base, reg_id_t scratch, app_pc pc, int adjust)
online_instru_t::insert_save_immed(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t base, reg_id_t scratch, ptr_int_t immed,
int adjust)
{
int disp = adjust + offsetof(trace_entry_t, addr);
#ifdef X86_32
ptr_int_t val = (ptr_int_t)pc;
MINSERT(ilist, where,
INSTR_CREATE_mov_st(drcontext, OPND_CREATE_MEM32(base, disp),
OPND_CREATE_INT32((int)val)));
OPND_CREATE_INT32((int)immed)));
#else
// For X86_64, we can't write the PC immed directly to memory and
// skip the top half for a <4GB PC b/c if we're in the sentinel
// region of the buffer we'll be leaving 0xffffffff in the top
// half (i#1735). Thus we go through a register on x86 (where we
// can skip the top half), just like on ARM.
instrlist_insert_mov_immed_ptrsz(drcontext, (ptr_int_t)pc, opnd_create_reg(scratch),
ilist, where, NULL, NULL);
instrlist_insert_mov_immed_ptrsz(drcontext, immed, opnd_create_reg(scratch), ilist,
where, NULL, NULL);
MINSERT(ilist, where,
XINST_CREATE_store(drcontext, OPND_CREATE_MEMPTR(base, disp),
opnd_create_reg(scratch)));
@@ -315,8 +316,8 @@ online_instru_t::instrument_memref(void *drcontext, void *bb_field, instrlist_t
// The 0 size indicates it's a non-icache entry.
insert_save_type_and_size(drcontext, ilist, where, reg_ptr, reg_tmp,
TRACE_TYPE_INSTR, 0, adjust);
insert_save_pc(drcontext, ilist, where, reg_ptr, reg_tmp, instr_get_app_pc(app),
adjust);
insert_save_immed(drcontext, ilist, where, reg_ptr, reg_tmp,
reinterpret_cast<ptr_int_t>(instr_get_app_pc(app)), adjust);
adjust += sizeof(trace_entry_t);
}
insert_save_addr(drcontext, ilist, where, reg_ptr, reg_tmp, adjust, ref);
@@ -367,12 +368,63 @@ online_instru_t::instrument_instr(void *drcontext, void *tag, void *bb_field,
ushort size = (ushort)instr_length(drcontext, app);
insert_save_type_and_size(drcontext, ilist, where, reg_ptr, reg_tmp, type, size,
adjust);
insert_save_pc(drcontext, ilist, where, reg_ptr, reg_tmp, pc, adjust);
insert_save_immed(drcontext, ilist, where, reg_ptr, reg_tmp,
reinterpret_cast<ptr_int_t>(pc), adjust);
res = drreg_unreserve_register(drcontext, ilist, where, reg_tmp);
DR_ASSERT(res == DRREG_SUCCESS); // Can't recover.
return (adjust + sizeof(trace_entry_t));
}

int
online_instru_t::instrument_instr_encoding(void *drcontext, void *tag, void *bb_field,
instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, int adjust, instr_t *app)
{
// TODO i#5520: Currently we emit the encoding again for every dynamic instance
// of an instruction. We should record which we've emitted and avoid duplicate
// instances (the reader caches prior encodings). For offline we do this
// separately per thread, which makes knowing when code has changed complex as
// any per-thread structures would need a global walk across them on a fragment
// deletion event. For online, however, we may be able to emit just once globally
// if the reader is always interleaving all threads: but while that simplifies
// invalidation/code changes it requires global locks to update the structure.
// Since encodings are off by default we leave it as emitting every time
// with corresponding extra overhead for now.

DR_ASSERT(instr_is_app(app));

byte buf[MAX_ENCODING_LENGTH];
size_t len = 0;
// Most of the time this will be a memcpy, but in some cases we need to encode.
byte *end_pc = instr_encode_to_copy(drcontext, app, buf, instr_get_app_pc(app));
DR_ASSERT(end_pc != nullptr);
len = end_pc - buf;
DR_ASSERT(len < sizeof(buf));

reg_id_t reg_tmp;
drreg_status_t res =
drreg_reserve_register(drcontext, ilist, where, reg_vector_, &reg_tmp);
DR_ASSERT(res == DRREG_SUCCESS); // Can't recover.

size_t len_left = len;
size_t buf_offs = 0;
do {
size_t len_cur = std::min(len_left, sizeof(((trace_entry_t *)0)->encoding));
insert_save_type_and_size(drcontext, ilist, where, reg_ptr, reg_tmp,
TRACE_TYPE_ENCODING, static_cast<ushort>(len_cur),
adjust);
ptr_int_t immed = *(ptr_int_t *)(buf + buf_offs);
insert_save_immed(drcontext, ilist, where, reg_ptr, reg_tmp, immed, adjust);
buf_offs += len_cur;
len_left -= len_cur;
adjust += sizeof(trace_entry_t);
} while (len_left > 0);

res = drreg_unreserve_register(drcontext, ilist, where, reg_tmp);
DR_ASSERT(res == DRREG_SUCCESS); // Can't recover.
return adjust;
}

int
online_instru_t::instrument_ibundle(void *drcontext, instrlist_t *ilist, instr_t *where,
reg_id_t reg_ptr, int adjust, instr_t **delay_instrs,
@@ -395,8 +447,8 @@ online_instru_t::instrument_ibundle(void *drcontext, instrlist_t *ilist, instr_t
if (entry.size == sizeof(entry.length) || i == num_delay_instrs - 1) {
insert_save_type_and_size(drcontext, ilist, where, reg_ptr, reg_tmp,
entry.type, entry.size, adjust);
insert_save_pc(drcontext, ilist, where, reg_ptr, reg_tmp, (app_pc)entry.addr,
adjust);
insert_save_immed(drcontext, ilist, where, reg_ptr, reg_tmp, entry.addr,
adjust);
adjust += sizeof(trace_entry_t);
entry.size = 0;
}
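
The chunking loop in `online_instru_t::instrument_instr_encoding` above can be modeled outside DynamoRIO with the following minimal sketch. The names `encoding_chunk_t`, `kChunkBytes`, and `split_encoding` are hypothetical, and the 8-byte chunk capacity is an assumption standing in for the size of the `encoding` field of `trace_entry_t`. Unlike the instrumentation code, which reads a full pointer-sized immediate per entry, this sketch copies only the valid bytes and zero-pads the rest.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed per-entry payload capacity; the real value is the size of the
// encoding field in trace_entry_t, which this sketch does not know.
constexpr size_t kChunkBytes = 8;

// Hypothetical stand-in for one encoding trace entry: the number of valid
// bytes followed by the zero-padded payload.
struct encoding_chunk_t {
    uint16_t size;
    unsigned char bytes[kChunkBytes];
};

// Splits an instruction encoding of len bytes into as many fixed-capacity
// chunks as needed, in order. Only the final chunk may be partially filled.
std::vector<encoding_chunk_t>
split_encoding(const unsigned char *buf, size_t len)
{
    std::vector<encoding_chunk_t> chunks;
    size_t offs = 0;
    while (offs < len) {
        encoding_chunk_t chunk = {}; // Zero-initialize so padding is 0.
        size_t cur = std::min(len - offs, kChunkBytes);
        chunk.size = static_cast<uint16_t>(cur);
        std::memcpy(chunk.bytes, buf + offs, cur);
        chunks.push_back(chunk);
        offs += cur;
    }
    return chunks;
}
```

Under these assumptions a 4-byte encoding such as `48 83 eb 01` fits in a single chunk, while a 10-byte encoding produces one full chunk followed by a 2-byte chunk.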
12 changes: 11 additions & 1 deletion clients/drcachesim/tracer/output.cpp
@@ -173,6 +173,12 @@ get_file_type()
file_type = static_cast<offline_file_type_t>(file_type |
OFFLINE_FILE_TYPE_INSTRUCTION_ONLY);
}
if (op_instr_encodings.get_value()) {
// This is generally only for online tracing, as raw2trace adds this
// flag during post-processing for offline.
file_type =
static_cast<offline_file_type_t>(file_type | OFFLINE_FILE_TYPE_ENCODINGS);
}
file_type = static_cast<offline_file_type_t>(
file_type |
IF_X86_ELSE(
@@ -659,7 +665,11 @@ create_v2p_buffer(per_thread_t *data)
static bool
is_ok_to_split_before(trace_type_t type)
{
return type_is_instr(type) || type == TRACE_TYPE_INSTR_MAYBE_FETCH ||
// We can split before the start of each sequence: we don't want to split
// an <encoding, instruction, address> combination.
return (op_instr_encodings.get_value()
? type == TRACE_TYPE_ENCODING
: (type_is_instr(type) || type == TRACE_TYPE_INSTR_MAYBE_FETCH)) ||
type == TRACE_TYPE_MARKER || type == TRACE_TYPE_THREAD_EXIT ||
op_L0I_filter.get_value();
}
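
The effect of the updated `is_ok_to_split_before` predicate can be seen in isolation with the simplified sketch below. The `rec_t` enum and the `encodings_enabled` parameter are hypothetical stand-ins for `trace_type_t` and `op_instr_encodings`, and the sketch leaves out the `TRACE_TYPE_INSTR_MAYBE_FETCH` and `-L0I_filter` cases handled by the real function.

```cpp
// Hypothetical record kinds; the real tracer uses trace_type_t values.
enum class rec_t { ENCODING, INSTR, MEMREF, MARKER, THREAD_EXIT };

// Returns whether the output buffer may be split just before a record of
// the given kind. With encodings enabled, a split is allowed before an
// encoding record instead of before the instruction record, so that an
// <encoding, instruction, address> combination stays together.
bool
ok_to_split_before(rec_t type, bool encodings_enabled)
{
    bool at_boundary =
        encodings_enabled ? type == rec_t::ENCODING : type == rec_t::INSTR;
    return at_boundary || type == rec_t::MARKER || type == rec_t::THREAD_EXIT;
}
```

The point of the change is visible in the second argument: once encodings are on, an instruction record is no longer a legal split point because its encoding precedes it in the buffer.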
19 changes: 17 additions & 2 deletions clients/drcachesim/tracer/tracer.cpp
@@ -425,13 +425,24 @@ instrument_delay_instrs(void *drcontext, void *tag, instrlist_t *ilist, user_dat
instr_t *where, reg_id_t reg_ptr, int adjust)
{
// Instrument to add a full instr entry for the first instr.
if (op_instr_encodings.get_value()) {
adjust = instru->instrument_instr_encoding(drcontext, tag, ud->instru_field,
ilist, where, reg_ptr, adjust,
ud->delay_instrs[0]);
}
adjust = instru->instrument_instr(drcontext, tag, ud->instru_field, ilist, where,
reg_ptr, adjust, ud->delay_instrs[0]);
if (op_use_physical.get_value()) {
if (op_use_physical.get_value() || op_instr_encodings.get_value()) {
// No instr bundle if physical-2-virtual since instr bundle may
// cross page bundary.
// cross page bundary, and no bundles for encodings so we can easily
// insert encoding entries.
int i;
for (i = 1; i < ud->num_delay_instrs; i++) {
if (op_instr_encodings.get_value()) {
adjust = instru->instrument_instr_encoding(
drcontext, tag, ud->instru_field, ilist, where, reg_ptr, adjust,
ud->delay_instrs[i]);
}
adjust =
instru->instrument_instr(drcontext, tag, ud->instru_field, ilist, where,
reg_ptr, adjust, ud->delay_instrs[i]);
@@ -904,6 +915,10 @@ instrument_instr(void *drcontext, void *tag, user_data_t *ud, instrlist_t *ilist
}
if (op_L0I_filter.get_value() || op_L0D_filter.get_value()) // Else already loaded.
insert_load_buf_ptr(drcontext, ilist, where, reg_ptr);
if (op_instr_encodings.get_value()) {
adjust = instru->instrument_instr_encoding(drcontext, tag, ud->instru_field,
ilist, where, reg_ptr, adjust, app);
}
adjust = instru->instrument_instr(drcontext, tag, ud->instru_field, ilist, where,
reg_ptr, adjust, app);
if ((op_L0I_filter.get_value() || op_L0D_filter.get_value()) && adjust != 0) {
6 changes: 5 additions & 1 deletion suite/tests/CMakeLists.txt
@@ -3564,6 +3564,9 @@ if (BUILD_CLIENTS)
get_target_path_for_execution(drraw2trace_path drraw2trace "${location_suffix}")
prefix_cmd_if_necessary(drraw2trace_path ON ${drraw2trace_path})

torunonly_drcachesim(opcode_mix ${ci_shared_app}
"-instr_encodings -simulator_type opcode_mix" "")

# The sim_atops should start with @ and be in @-as-space format (hence "atops").
# If the exetgt has drcachesim statically linked in, the _nodr property must be
# set *before* invoking this macro.
@@ -3979,7 +3982,8 @@ if (BUILD_CLIENTS)
"" "@-simulator_type@basic_counts" "")
unset(tool.drcacheoff.allasm-repstr-basic-counts_rawtemp) # use preprocessor
torunonly_drcachesim(allasm-repstr-basic-counts allasm_repstr
"-simulator_type basic_counts" "")
# We test counting encodings for online.
"-instr_encodings -simulator_type basic_counts" "")
unset(tool.drcachesim.allasm-repstr-basic-counts_rawtemp) # use preprocessor
endif (UNIX AND X86 AND X64)
