generalize drcachesim to support arbitrary trace analysis tools #2006

Closed
derekbruening opened this issue Sep 15, 2016 · 9 comments

@derekbruening
Contributor

We plan to generalize drcachesim to support two types of trace analysis tools:

  1. Tools that are cache-agnostic, will not interact with the cache or TLB simulator, and will operate on the memory stream in the trace on their own. These will simply use the tracer and the trace reader but not the simulator.
  2. Tools built on top of the cache or TLB simulator that will analyze additional aspects of the trace in relation to cache misses and other simulator features. For these, the simulator will become a platform that may provide an event API.
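
To make the two categories concrete, here is a minimal sketch. Everything in it is hypothetical: `trace_record_t`, `write_counter_t`, and `toy_cache_t` are invented for illustration and are not drcachesim types. The "cache" never evicts, so it reports only first-touch misses. The first tool reads the memory stream on its own; the second registers a callback with the simulator, which is one possible shape for the event API mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical trace record; not the real drcachesim record type.
struct trace_record_t {
    uint64_t addr;
    bool is_write;
};

// Type 1: a cache-agnostic tool that consumes the memory stream directly.
class write_counter_t {
public:
    void process(const trace_record_t &rec)
    {
        if (rec.is_write)
            ++writes_;
    }
    uint64_t writes() const { return writes_; }

private:
    uint64_t writes_ = 0;
};

// Type 2 needs a simulator to sit on. This toy one has unlimited capacity,
// so a miss happens only the first time a cache line is touched.
class toy_cache_t {
public:
    using miss_callback_t = std::function<void(const trace_record_t &)>;

    explicit toy_cache_t(uint64_t line_size)
        : line_size_(line_size)
    {
        assert(line_size_ > 0);
    }
    // Assumed event API: tools subscribe to miss events.
    void on_miss(miss_callback_t cb) { callbacks_.push_back(std::move(cb)); }
    void process(const trace_record_t &rec)
    {
        uint64_t line = rec.addr / line_size_;
        if (seen_lines_.insert(line).second) {
            for (auto &cb : callbacks_)
                cb(rec);
        }
    }

private:
    uint64_t line_size_;
    std::unordered_set<uint64_t> seen_lines_;
    std::vector<miss_callback_t> callbacks_;
};
```

A type-1 tool would be fed records straight from the trace reader, while a type-2 tool would call `on_miss()` once and let the simulator drive it.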
@derekbruening
Contributor Author

Xref #1703, #1729

@derekbruening
Contributor Author

We've put a bunch of work into this:

However, we still have not finalized the best way to split non-simulator and simulator tools: today the simulator is dispatching on non-simulator tools, though we do have support for making simple front ends for other tools.

derekbruening added a commit that referenced this issue Nov 2, 2017
Marks the file_reader_t::is_complete() and analyzer_t::init_file_reader()
methods as virtual to provide more flexibility in subclassing.

Cleans up some now-stale comments about drcachesim optimizations.

Issue: #2006
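
As a rough illustration of why the virtual markers matter, the sketch below uses simplified stand-in classes (`toy_file_reader_t`, `toy_analyzer_t`, and the `partial_*` subclasses are all hypothetical, and the method signatures are assumptions rather than the real ones). Because both methods are virtual, a subclass can swap in its own reader and its own completeness check without touching the base class.

```cpp
#include <memory>

// Stand-in for a trace file reader; the real signatures may differ.
class toy_file_reader_t {
public:
    virtual ~toy_file_reader_t() = default;
    // Default policy: assume the trace on disk is complete.
    virtual bool is_complete() const { return true; }
};

// A subclass with its own notion of completeness.
class partial_file_reader_t : public toy_file_reader_t {
public:
    explicit partial_file_reader_t(bool saw_footer)
        : saw_footer_(saw_footer)
    {
    }
    bool is_complete() const override { return saw_footer_; }

private:
    bool saw_footer_;
};

// Stand-in for the analyzer: init() goes through the virtual factory.
class toy_analyzer_t {
public:
    virtual ~toy_analyzer_t() = default;
    bool init()
    {
        reader_ = init_file_reader();
        return reader_ != nullptr && reader_->is_complete();
    }

protected:
    virtual std::unique_ptr<toy_file_reader_t> init_file_reader()
    {
        return std::unique_ptr<toy_file_reader_t>(new toy_file_reader_t());
    }

private:
    std::unique_ptr<toy_file_reader_t> reader_;
};

// A subclass that substitutes its own reader.
class partial_analyzer_t : public toy_analyzer_t {
public:
    explicit partial_analyzer_t(bool saw_footer)
        : saw_footer_(saw_footer)
    {
    }

protected:
    std::unique_ptr<toy_file_reader_t> init_file_reader() override
    {
        return std::unique_ptr<toy_file_reader_t>(
            new partial_file_reader_t(saw_footer_));
    }

private:
    bool saw_footer_;
};
```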
derekbruening added a commit that referenced this issue Nov 11, 2017
Adds an explicit index field to drmodtrack_info_t so that the data
structure can be passed around without an accompanying iterator index.

Issue: #2006
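
A small sketch of the idea, using a hypothetical `toy_module_info_t` rather than the real `drmodtrack_info_t` layout: once the index lives inside the structure, a function that receives only the structure can still say which module it is looking at.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified module record that carries its own index.
struct toy_module_info_t {
    std::size_t index;
    std::string path;
};

// The index is filled in once, while iterating.
std::vector<toy_module_info_t>
load_modules(const std::vector<std::string> &paths)
{
    std::vector<toy_module_info_t> out;
    for (std::size_t i = 0; i < paths.size(); ++i)
        out.push_back({ i, paths[i] });
    return out;
}

// Needs no separate iterator index argument.
std::string
describe(const toy_module_info_t &info)
{
    return "module #" + std::to_string(info.index) + ": " + info.path;
}
```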
derekbruening added a commit that referenced this issue Nov 12, 2017
…2697)

Adds an explicit index field to drmodtrack_info_t so that the data
structure can be passed around without an accompanying iterator index.

Issue: #2006
fhahn pushed a commit that referenced this issue Dec 4, 2017
Marks the file_reader_t::is_complete() and analyzer_t::init_file_reader()
methods as virtual to provide more flexibility in subclassing.

Cleans up some now-stale comments about drcachesim optimizations.

Issue: #2006
fhahn pushed a commit that referenced this issue Dec 4, 2017
…2697)

Adds an explicit index field to drmodtrack_info_t so that the data
structure can be passed around without an accompanying iterator index.

Issue: #2006
@derekbruening
Contributor Author

We have settled on a drmemtrace_analyzer library that includes a reader and two iterator modes (internal vs external control).
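
The two control modes can be sketched as follows. This is only an illustration under assumed names: `sketch_analyzer_t`, `toy_tool_t`, and `toy_memref_t` are hypothetical and much simpler than the real library. With internal control the analyzer owns the loop and calls into registered tools; with external control the caller owns the loop and pulls records through an iterator.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical memory reference record.
struct toy_memref_t {
    uint64_t addr;
    uint32_t size;
};

// Hypothetical tool interface used in internal-control mode.
class toy_tool_t {
public:
    virtual ~toy_tool_t() = default;
    virtual void process_memref(const toy_memref_t &ref) = 0;
};

class sketch_analyzer_t {
public:
    typedef std::vector<toy_memref_t>::const_iterator iterator_t;

    explicit sketch_analyzer_t(std::vector<toy_memref_t> trace)
        : trace_(std::move(trace))
    {
    }
    void add_tool(toy_tool_t *tool) { tools_.push_back(tool); }

    // Internal control: the analyzer drives the loop.
    void run()
    {
        for (const auto &ref : trace_) {
            for (toy_tool_t *tool : tools_)
                tool->process_memref(ref);
        }
    }

    // External control: the caller drives the loop.
    iterator_t begin() const { return trace_.begin(); }
    iterator_t end() const { return trace_.end(); }

private:
    std::vector<toy_memref_t> trace_;
    std::vector<toy_tool_t *> tools_;
};

// Example tool: sums the bytes touched by all references.
class byte_counter_t : public toy_tool_t {
public:
    void process_memref(const toy_memref_t &ref) override { bytes_ += ref.size; }
    uint64_t bytes() const { return bytes_; }

private:
    uint64_t bytes_ = 0;
};
```

Internal control suits several tools sharing one pass over the trace; external control suits a caller that wants to interleave its own logic, stop early, or embed the reader in a larger program.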

I am now exporting libraries and headers. Some decisions:

  • I'm renaming the libraries to have a drmemtrace_ prefix (e.g., drmemtrace_reuse_distance) and I'm storing them in tools/lib* (clients/lib* for build dir)

  • I'm making a new header location, tools/include/, and putting the headers in a drmemtrace/ subdir. I'm leaving the poorly-named-for-export utils.h: the subdir will keep it isolated.

  • I'm adding a new package-provided function use_DynamoRIO_drmemtrace()

  • I'm also adding a new function configure_DynamoRIO_main_headers() to pull in dr_frontend.h without any other headers

  • I'm adding a --build-and-test test to test building from a separate cmake project

  • I decided to leave the tracer's drmemtrace.h in ext/ as it's for client use, rather than putting it into tools/include/drmemtrace/tracer.h or something similar

  • I decided to not export simulator headers for extending the simulator for now

Still TODO:

  • Use doxygen comments in headers and have genapi process them so they show up in the html docs (this includes drmemtrace.h)

derekbruening added a commit that referenced this issue Dec 20, 2017
Exports a new drmemtrace_analyzer library along with all trace analysis
libraries as drmemtrace_basic_counts, drmemtrace_histogram,
drmemtrace_reuse_distance, drmemtrace_reuse_time, and drmemtrace_simulator.

Renames the drmemtrace_histogram executable to histogram_launcher to avoid
clashing with the new public library drmemtrace_ prefix.

Exports header files for building trace analysis tools and using our
provided tools in a new location tools/include/drmemtrace/.

Adds a new DynamoRIO CMake package command
use_DynamoRIO_drmemtrace() to facilitate building third-party trace
analysis tools.

Adds a new DynamoRIO CMake package command
configure_DynamoRIO_main_headers() to facilitate using just drfrontendlib
and finding dr_frontend.h, and fixes dr_frontend.h typedef issues when used
by itself.

Adds a build-and-test test of using the framework from a separate CMake
project.

Adds documentation of the new features.

Manually tested in a release package.  Tweaks the package.cmake for a
missing utils function.

Still missing: adding extensive doxygen comments in all of the headers, and
including the headers in genapi so they show up in the html docs.

Issue: #2006
@derekbruening
Contributor Author

A pain point is the zlib link:

    # XXX i#2006: Can we automate the zlib link somehow?  Should we provide two versions
    # of our libs, one with and one without?  Can we include some version of zlib.a?

derekbruening added a commit that referenced this issue Dec 20, 2017
…2780)

Exports a new drmemtrace_analyzer library along with all trace analysis
libraries as drmemtrace_basic_counts, drmemtrace_histogram,
drmemtrace_reuse_distance, drmemtrace_reuse_time, and drmemtrace_simulator.

Renames the drmemtrace_histogram executable to histogram_launcher to avoid
clashing with the new public library drmemtrace_ prefix.

Exports header files for building trace analysis tools and using our
provided tools in a new location tools/include/drmemtrace/.

Adds a new DynamoRIO CMake package command
use_DynamoRIO_drmemtrace() to facilitate building third-party trace
analysis tools.

Adds a new DynamoRIO CMake package command
configure_DynamoRIO_main_headers() to facilitate using just drfrontendlib
and finding dr_frontend.h, and fixes dr_frontend.h typedef issues when used
by itself.

Adds a build-and-test test of using the framework from a separate CMake
project.

Adds documentation of the new features.

Manually tested in a release package.  Tweaks the package.cmake for a
missing utils function.

Still missing: adding extensive doxygen comments in all of the headers, and
including the headers in genapi so they show up in the html docs.

Issue: #2006
derekbruening added a commit that referenced this issue Dec 21, 2017
Adds doxygen docs to the exported headers in the drmemtrace framework and
includes them all in the doxygen-generated html docs.

Adds hierarchy to the html docs file list to group the drmemtrace headers
together.

Moves drmemtrace.h from ext/include/ to tools/include/drmemtrace with the
analysis tool headers and adds a new CMake function
use_DynamoRIO_drmemtrace_tracer() for finding the header.

Adds a tracer customization section to the drcachesim prose docs.

Installation tested manually and tweaked to remove superfluous drcachesim
debug files.

Issue: #2006
@derekbruening
Contributor Author

We may also want to doxygen-ize and make public all of the simulator headers for subclassing and extending the simulator?

Perhaps we should also split off drmemtrace from drcachesim in the docs, with a separate Memory Trace Analysis page instead of having it underneath the cache simulator?

derekbruening added a commit that referenced this issue Dec 21, 2017
)

Adds doxygen docs to the exported headers in the drmemtrace framework and
includes them all in the doxygen-generated html docs.

Adds hierarchy to the html docs file list to group the drmemtrace headers
together.

Moves drmemtrace.h from ext/include/ to tools/include/drmemtrace with the
analysis tool headers and adds a new CMake function
use_DynamoRIO_drmemtrace_tracer() for finding the header.

Adds a tracer customization section to the drcachesim prose docs.

Installation tested manually and tweaked to remove superfluous drcachesim
debug files.

Issue: #2006
derekbruening pushed a commit that referenced this issue Jan 18, 2018
….h portable (#2810)

Removes relative paths from the histogram.h and reuse_distance.h headers to make it easier to
extend these tools in third-party code.

Issue: #2006
derekbruening added a commit that referenced this issue Feb 4, 2018
Adds further support for tools that want information beyond just memory
addresses by adding an API to leverage the raw2trace code to map in the
binaries used during traced execution and examine the instruction bytes.
This takes the shape of two new routines:
raw2trace_t::do_module_parsing_and_mapping() and
raw2trace_t::find_mapped_trace_address().

Adds a new simulator tool "opcode_mix" which uses the new API to decode the
opcode for each executed instruction and print out the dynamic count of
each opcode.  The tool only operates with offline traces and needs access
to the modules.log and binaries of the traced execution.

Adds documentation and a test.

Issue: #2006
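
The flow described in this commit (map the binaries, translate each traced address to its mapped bytes, then tally opcodes) can be sketched roughly as below. This is a toy under stated assumptions: `mapped_module_t`, `find_mapped_address()`, and `count_opcodes()` are hypothetical helpers rather than the raw2trace API, and treating the first byte as the opcode stands in for real instruction decoding.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record of one binary mapped locally for analysis.
struct mapped_module_t {
    uint64_t orig_base;         // Base address during the traced run.
    std::vector<uint8_t> bytes; // Local copy of the module contents.
};

// Translates an address from the trace into a pointer into the local copy,
// or returns nullptr if no mapped module covers it.
const uint8_t *
find_mapped_address(const std::vector<mapped_module_t> &modules,
                    uint64_t trace_addr)
{
    for (const auto &mod : modules) {
        if (trace_addr >= mod.orig_base &&
            trace_addr - mod.orig_base < mod.bytes.size())
            return mod.bytes.data() + (trace_addr - mod.orig_base);
    }
    return nullptr;
}

// Toy "decode": uses the first instruction byte as the opcode and counts
// how many times each one was executed.
std::map<uint8_t, uint64_t>
count_opcodes(const std::vector<mapped_module_t> &modules,
              const std::vector<uint64_t> &executed_pcs)
{
    std::map<uint8_t, uint64_t> counts;
    for (uint64_t pc : executed_pcs) {
        const uint8_t *instr = find_mapped_address(modules, pc);
        if (instr != nullptr)
            ++counts[*instr];
    }
    return counts;
}
```

Addresses that fall outside every mapped module are skipped here; how the real tool treats them is not stated in this thread.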
@derekbruening
Contributor Author

Quoting from #975:

drmemtrace's use of C++ and droption incurs malloc and free calls at init time (libstdc++ lib init, droption static initializers, droption_parser_t::parse_argv) and exit time (droption). We would have to rewrite the tracer in C and not use droption to avoid these.

Right now we live with that and we just avoid mid-run malloc calls.

The regression mentioned above only happens mid-run with custom user data and dynamic app library loading.

derekbruening added a commit that referenced this issue Feb 28, 2018
Adds a feature where a client library can request that the private loader
complain if malloc & co. are called at any time other than process init or
exit, to help clients that want to support being linked statically with the
app.  Because it has to be early, the feature is triggered by a variable
declaration DR_DISALLOW_UNSAFE_STATIC.  It can be overridden
dynamically by a new API routine dr_allow_unsafe_static_behavior().

Fixes drcachesim to use placement new for its offline custom module data
allocations.

Issue: #975, #2006
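
For readers unfamiliar with the technique, placement new constructs an object in memory that was obtained some other way, so the construction itself makes no malloc call. The sketch below is a generic illustration with hypothetical names (`custom_module_data_t`, `module_data_arena_t`) and a fixed in-object buffer; where the actual drcachesim code gets its memory from is not stated in this thread.

```cpp
#include <cstddef>
#include <new>

// Hypothetical per-module user data.
struct custom_module_data_t {
    int module_id;
    const char *path;
    custom_module_data_t(int id, const char *p)
        : module_id(id)
        , path(p)
    {
    }
};

// Fixed-capacity arena: storage is reserved up front, so create() never
// touches the heap.
class module_data_arena_t {
public:
    module_data_arena_t()
        : used_(0)
    {
    }
    module_data_arena_t(const module_data_arena_t &) = delete;
    module_data_arena_t &operator=(const module_data_arena_t &) = delete;
    ~module_data_arena_t()
    {
        // Objects built with placement new are destroyed explicitly.
        for (std::size_t i = 0; i < used_; ++i)
            items_[i]->~custom_module_data_t();
    }
    custom_module_data_t *create(int id, const char *path)
    {
        if (used_ == CAPACITY)
            return nullptr;
        void *slot = storage_ + used_ * sizeof(custom_module_data_t);
        items_[used_] = new (slot) custom_module_data_t(id, path);
        return items_[used_++];
    }
    std::size_t size() const { return used_; }

private:
    static const std::size_t CAPACITY = 8;
    alignas(custom_module_data_t) unsigned char
        storage_[CAPACITY * sizeof(custom_module_data_t)];
    custom_module_data_t *items_[CAPACITY];
    std::size_t used_;
};
```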
derekbruening pushed a commit that referenced this issue Apr 19, 2018
…ts (#2940)

Changes the modules.log format to provide a separate entry
for each segment instead of combining entries for contiguous segments
mapped from each image. The module file version is incremented to 4 and
adds a new field called offset after the "entry" field.

The file offset information for unix is added to DR's module_segment_data_t
and propagated to drmodtrack to be written out with each segment.

The client-interface/drmodtrack-test.dll.cpp test is extended to check
whether the offset recorded by drmodtrack matches the offset for the
segment in DR's module data.  Also fixed lint issue.

Issue: #2006
derekbruening added a commit that referenced this issue Apr 25, 2018
…nt offsets (#2940)"

This reverts commit 8c90e2d as it pushes
the module count for drcachesim offline traces beyond the limit in the
module index bitfield (#2956).

Issue: #2006, #2939, #2956.
derekbruening added a commit that referenced this issue Apr 25, 2018
…nt offsets (#2940)" (#2963)

This reverts commit 8c90e2d
as it pushes the module count for drcachesim offline traces beyond
the limit in the module index bitfield (#2956) for some apps.

Issue: #2006, #2939, #2956.
derekbruening pushed a commit that referenced this issue Apr 29, 2018
This reverts commit 48db566 which
reverted the changes in PR #2940. The changes pushed caused some apps to
overflow the modidx field (issue #2956). PR #2969 increased the width of
the modidx field. We can now safely revert the revert.

Issue: #2006, #2939, #2956
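
The overflow behind this revert and re-land can be shown with a small sketch. The field widths below (8 and 16 bits) and the struct names are assumptions picked for illustration, not the actual trace entry layout: one entry per segment means more module indices, and an index that no longer fits in the bitfield is silently truncated until the field is widened.

```cpp
#include <cstdint>

// Hypothetical entry layouts; the real widths are not given in this thread.
struct narrow_entry_t {
    uint64_t modidx : 8;
    uint64_t modoffs : 56;
};

struct wide_entry_t {
    uint64_t modidx : 16;
    uint64_t modoffs : 48;
};

// Returns true if the index survives a round trip through the bitfield.
// Assigning an out-of-range value to an unsigned bitfield keeps only the
// low bits, so a mismatch means the index overflowed the field.
template <typename entry_t>
bool
module_index_fits(uint64_t index)
{
    entry_t entry = {};
    entry.modidx = index;
    return entry.modidx == index;
}
```

Widening `modidx` comes at the cost of offset bits when the entry stays the same size, which is the trade-off visible in the two layouts above.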
derekbruening added a commit that referenced this issue May 29, 2018
Adds documentation for several recent features geared toward core
simulator support: avoiding thread switch gaps after branches; cpu
markers; kernel xfer markers; and the raw2trace mapping interfaces.

Issue: #2638, #2843, #2708, #2006
fhahn pushed a commit that referenced this issue Jun 18, 2018
Adds documentation for several recent features geared toward core
simulator support: avoiding thread switch gaps after branches; cpu
markers; kernel xfer markers; and the raw2trace mapping interfaces.

Issue: #2638, #2843, #2708, #2006
derekbruening added a commit that referenced this issue Jun 27, 2018
Eliminates the too-early destruction of raw2trace_directory_t, as we
need to reference the vdso contents later on.

Issue: #2006
@derekbruening
Contributor Author

Xref ba12147

derekbruening added a commit that referenced this issue Aug 21, 2018
Adds a standalone launcher for opcode_mix primarily as a test of the
linking complexities beyond what histogram_launcher hits, due to
raw2trace and drdecode being required and the history of
not-perfectly-cleanly-isolated libraries there (xref #1409).

Issue: #2006, #1409
derekbruening added a commit that referenced this issue Aug 22, 2018
Adds a standalone launcher for opcode_mix primarily as a test of the
linking complexities beyond what histogram_launcher hits, due to
raw2trace and drdecode being required and the history of
not-perfectly-cleanly-isolated libraries there (xref #1409).

Issue: #2006, #1409
derekbruening added a commit that referenced this issue Aug 22, 2018
Fixes 32-bit gcc7.3 build breakage from 4ac9441 by adding
__x86.get_pc_thunk.bp to the globalize_pc_thunks list.

Issue: #2006
derekbruening added a commit that referenced this issue Oct 16, 2020
Changes the drcachesim configuration file interface to use a
std::istream rather than a file path string, to support non-standard
filesystems.

Tested with a proprietary filesystem library.

Issue: #2006
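
The benefit of taking a `std::istream` is that the parser no longer cares where the bytes come from. Below is a minimal sketch with a hypothetical `read_config()` and a made-up whitespace-separated key/value format, which is not drcachesim's actual configuration syntax.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical parser: reads "key value" pairs from any input stream.
std::map<std::string, std::string>
read_config(std::istream &in)
{
    std::map<std::string, std::string> params;
    std::string key, value;
    while (in >> key >> value)
        params[key] = value;
    return params;
}
```

A caller with an ordinary file passes a `std::ifstream`, a test passes a `std::istringstream`, and a custom filesystem library can supply its own stream type derived from `std::istream`.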
@derekbruening
Contributor Author

This is complete: we have a variety of tools now.
