Add support for -use_physical with drcachesim -offline #4014
Comments
derekbruening
added a commit
that referenced
this issue
Jan 13, 2020
In drcachesim, the combination of -offline and -use_physical is not supported at this time. We make that clear in the option docs and with an up-front exit when the two are requested at once. Issue: #4014
derekbruening
added a commit
that referenced
this issue
May 27, 2022
Switches from std::unordered_map in physaddr_t to a drcontainers hashtable to avoid malloc and make things safe for statically-linked drmemtrace. Similarly, switches from std::ostringstream to dr_snprintf in physaddr_t::init() to avoid malloc. Tested on a multi-threaded app which hits the post-init malloc warning without both fixes (test will be added in a forthcoming PR: it cannot be added now as physaddr_t is not thread-safe yet). Issue: #4014
derekbruening
added a commit
that referenced
this issue
May 27, 2022
The physaddr_t class is not thread-safe and was previously used racily in the drmemtrace code. We fix that by creating a separate instance per thread. A test with multiple threads is added. Issue: #4014
derekbruening
added a commit
that referenced
this issue
May 27, 2022
The physaddr_t class is not thread-safe and was previously used racily in the drmemtrace code. We fix that by creating a separate instance per thread. A test with multiple threads is added. This does result in a per-thread file descriptor being opened, which may not scale well: it will exhaust DR's private file-descriptor space and could possibly hit rlimits. Improving scaling is left as future work. Issue: #4014
derekbruening
added a commit
that referenced
this issue
May 27, 2022
Adds a new reverse-lookup routine drmodtrack_lookup_pc_from_index() which is needed to implement physical address support for offline dr$sim traces. Adds a simple test. Issue: #4014
derekbruening
added a commit
that referenced
this issue
May 31, 2022
Adds a new reverse-lookup routine drmodtrack_lookup_pc_from_index() which is needed to implement physical address support for offline dr$sim traces. Adds a simple test. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 2, 2022
Adds support for different page sizes from 4K for drmemtrace's -use_physical. Adds hugepage support, which does not require anything special: just updated the comment. Improves the error reporting for physical translation. Tested on an AArch64 machine with 64K pages. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 3, 2022
Adds support for different page sizes from 4K for drmemtrace's -use_physical. Adds hugepage support, which does not require anything special: just updated the comment. Improves the error reporting for physical translation. Tested on an AArch64 machine with 64K pages. Issue: #4014
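For readers unfamiliar with where these translations come from on Linux, here is a minimal sketch of decoding one 64-bit entry of the pagemap file for an arbitrary page size. The bit layout (page frame number in bits 0-54, present flag in bit 63) follows the Linux kernel pagemap documentation; the helper names are hypothetical and this is not the physaddr_t code from these commits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of DynamoRIO.

// Byte offset of the pagemap entry describing vaddr: the file holds one
// 8-byte entry per virtual page, indexed by virtual page number.
inline uint64_t
pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (vaddr / page_size) * sizeof(uint64_t);
}

// Decodes one pagemap entry. Returns false if the page is not present.
// On success, *paddr is the frame base plus the in-page offset of vaddr.
inline bool
pagemap_entry_to_physical(uint64_t entry, uint64_t vaddr, uint64_t page_size,
                          uint64_t *paddr)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    const uint64_t present_bit = 1ULL << 63;
    const uint64_t pfn_mask = (1ULL << 55) - 1; // Bits 0-54.
    if ((entry & present_bit) == 0)
        return false;
    const uint64_t pfn = entry & pfn_mask;
    *paddr = pfn * page_size + (vaddr & (page_size - 1));
    return true;
}
```

Passing the page size as a parameter, rather than hardcoding 4096, is what lets the same arithmetic work on a 64K-page AArch64 machine like the one mentioned in the commit message.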
derekbruening
added a commit
that referenced
this issue
Jun 3, 2022
Adds support for physical addresses to offline dr$sim traces.
To support simulators wanting both virtual and physical addresses, and to simplify post-processing where the virtual PC values are needed, the regular trace entries remain all virtual. A new marker type TRACE_MARKER_TYPE_PHYSICAL_ADDRESS listing the corresponding physical address is added. The mappings are assumed to not change, allowing just one marker for each newly-observed page. This is done per-thread.
An explicit TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE marker is inserted on failure to translate, to prevent analyzers from having to infer this due to the lack of the already-sparse markers.
Separately emitted pairs of virtual and physical address markers were considered, with raw2trace inserting the physical at the right place, but that presents complexities with buffer handoff and with the first buffer. Instead, the physical markers are inserted via memmove directly into the buffer. This does not seem to be a performance concern: the translation lookup is the bottleneck.
Adds support for the new markers to the view tool.
Adds a Linux x86_64 test that runs a tiny asm app and ensures a physical address marker is inserted. The test needs to run as sudo, along with its pre- and post- commands. Currently it is enabled everywhere, so a user running interactive tests will have it pause while it waits for input. This might cause issues with manually running the test suite.
A number of items remain for further work:
+ Performance is poor: the hashtable and caching need improvement.
+ There is a hardcoded limit on how many markers can be added per buffer. Once this is exceeded, further markers are dropped. We should split the buffer to handle this.
+ We may want to add a mode that checks for mapping changes.
+ Missing privileges results in every physical address being 0 instead of showing the failure. We need to check the capabilities to distinguish.
+ Better testing that we're actually getting physical addresses for online tests.
+ Better offline testing with larger apps.
+ Basic blocks that cross a page have only the first page translated.
+ A file descriptor per thread is used, which will not scale well with DR's descriptor protection and might hit rlimits.
Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 7, 2022
For virtual-to-physical translation, we cannot use a 3rd-party library such as STL due to static linking constraints. Yet the drcontainers hashtable performs poorly; we need an open-address hashtable. Since DR has one we export it here in a new interface. Adds a simple test and documentation. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 7, 2022
Adds support for physical addresses to offline dr$sim traces.
To support simulators wanting both virtual and physical addresses, and to simplify post-processing where the virtual PC values are needed, the regular trace entries remain all virtual. A new marker type TRACE_MARKER_TYPE_PHYSICAL_ADDRESS listing the corresponding physical address is added. The mappings are assumed to not change, allowing just one marker for each newly-observed page. This is done per-thread.
An explicit TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE marker is inserted on failure to translate, to prevent analyzers from having to infer this due to the lack of the already-sparse markers.
Separately emitted pairs of virtual and physical address markers were considered, with raw2trace inserting the physical at the right place, but that presents complexities with buffer handoff and with the first buffer. Instead, the physical markers are inserted via memmove directly into the buffer. This does not seem to be a performance concern: the translation lookup is the bottleneck. Since the memmoves occur only on the first instance of each page, they are much rarer than all the virtual-to-physical translations.
Adds support for the new markers to the view tool.
Adds a Linux x86_64 test that runs a tiny asm app and ensures a physical address marker is inserted. The test needs to run as sudo, along with its pre- and post- commands. To avoid a confusing blocking password query in local runs, a new set of tests controlled by a new CMake option RUN_SUDO_TESTS is added. It is set only for automated_ci, where we assume a passwordless sudo.
A number of items remain for further work:
+ Performance is poor: the hashtable and caching need improvement.
+ There is a hardcoded limit on how many markers can be added per buffer. Once this is exceeded, further markers are dropped. We should split the buffer to handle this.
+ We may want to add a mode that checks for mapping changes.
+ Missing privileges results in every physical address being 0 instead of showing the failure. We need to check the capabilities to distinguish.
+ Better testing that we're actually getting physical addresses for online tests.
+ Better offline testing with larger apps.
+ Basic blocks that cross a page have only the first page translated.
+ A file descriptor per thread is used, which will not scale well with DR's descriptor protection and might hit rlimits.
+ Online traces still replace all virtual addresses with physical. We should break compatibility and transition them to use these markers, with dr$sim computing the physical addresses from the markers.
Issue: #4014
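The "one marker per newly-observed page, per thread" scheme described in this commit message can be sketched roughly as follows. This is an illustrative model only: the class, the enum, and the translation callback are hypothetical stand-ins rather than drmemtrace code, and the value carried by the not-available marker (the virtual address here) is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <utility>

// Hypothetical stand-ins for the two marker types named above.
enum class marker_kind_t { NONE, PHYSICAL_ADDRESS, PHYSICAL_ADDRESS_NOT_AVAILABLE };

struct marker_t {
    marker_kind_t kind;
    uint64_t value;
};

// One instance per thread: each thread tracks which pages it has already seen.
class per_thread_marker_emitter_t {
public:
    // Translates a virtual page base to a physical page base; false on failure.
    using translate_fn_t = std::function<bool(uint64_t vpage, uint64_t *ppage)>;

    per_thread_marker_emitter_t(uint64_t page_size, translate_fn_t translate)
        : page_mask_(page_size - 1)
        , translate_(std::move(translate))
    {
        assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    }

    // Returns the marker to emit before the entry for vaddr, if any.
    marker_t
    observe(uint64_t vaddr)
    {
        const uint64_t vpage = vaddr & ~page_mask_;
        // Mappings are assumed not to change, so a page is translated once.
        if (!seen_.insert(vpage).second)
            return { marker_kind_t::NONE, 0 };
        uint64_t ppage = 0;
        if (!translate_(vpage, &ppage)) {
            // Assumption: the failure marker carries the virtual address.
            return { marker_kind_t::PHYSICAL_ADDRESS_NOT_AVAILABLE, vaddr };
        }
        return { marker_kind_t::PHYSICAL_ADDRESS, ppage | (vaddr & page_mask_) };
    }

private:
    uint64_t page_mask_;
    translate_fn_t translate_;
    std::unordered_set<uint64_t> seen_;
};
```

Emitting the explicit failure marker once per page keeps an analyzer from having to guess whether a missing marker means "already seen" or "could not translate".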
derekbruening
added a commit
that referenced
this issue
Jun 8, 2022
Switches from the drcontainers hashtable to the new open-address hashtable provided by DR. This is 2x to 3x faster due to the reduced dereferences from the inlined data. Increases the last-value cache from one entry to an array of 8 entries. This was found to provide improved performance on small benchmarks. Measured on bzip2 local and SPEC2006 runs as they are short enough to allow interactive experimentation. -use_physical still incurs a ~2.5x slowdown, but it was 9x before these changes. The bottleneck is no longer the hashtable but is now spread across all the address iteration and querying code. Issue: #4014
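As a rough illustration of the last-value cache idea, the sketch below puts an 8-entry array of recent translations in front of a slower table. It uses std::unordered_map purely for brevity (the commits explain that the real code cannot use STL containers), the class and method names are hypothetical, and the round-robin replacement policy is an assumption rather than something the commit message states.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical sketch: a small array of recent translations checked before
// falling back to the full table.
class page_cache_sketch_t {
public:
    // Returns true and sets *ppage if vpage has a known translation.
    bool
    lookup(uint64_t vpage, uint64_t *ppage)
    {
        for (const entry_t &e : recent_) {
            if (e.valid && e.vpage == vpage) {
                ++fast_hits_;
                *ppage = e.ppage;
                return true;
            }
        }
        auto it = table_.find(vpage);
        if (it == table_.end())
            return false;
        remember(vpage, it->second);
        *ppage = it->second;
        return true;
    }

    // Records a new translation. Mappings are assumed never to change, so
    // vpage is expected not to have been inserted before.
    void
    insert(uint64_t vpage, uint64_t ppage)
    {
        table_[vpage] = ppage;
        remember(vpage, ppage);
    }

    size_t
    fast_hits() const
    {
        return fast_hits_;
    }

private:
    struct entry_t {
        uint64_t vpage = 0;
        uint64_t ppage = 0;
        bool valid = false;
    };

    // Round-robin replacement (an assumption made for this sketch).
    void
    remember(uint64_t vpage, uint64_t ppage)
    {
        entry_t &slot = recent_[next_];
        slot.vpage = vpage;
        slot.ppage = ppage;
        slot.valid = true;
        next_ = (next_ + 1) % recent_.size();
    }

    std::array<entry_t, 8> recent_ {};
    std::unordered_map<uint64_t, uint64_t> table_;
    size_t next_ = 0;
    size_t fast_hits_ = 0;
};
```

A single-entry cache only helps while consecutive accesses stay on one page; several entries let code, stack, and a few data pages all hit the fast path when their accesses are interleaved.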
dolanzhao
pushed a commit
that referenced
this issue
Jun 8, 2022
Adds support for physical addresses to offline dr$sim traces.
To support simulators wanting both virtual and physical addresses, and to simplify post-processing where the virtual PC values are needed, the regular trace entries remain all virtual. A new marker type TRACE_MARKER_TYPE_PHYSICAL_ADDRESS listing the corresponding physical address is added. The mappings are assumed to not change, allowing just one marker for each newly-observed page. This is done per-thread.
An explicit TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE marker is inserted on failure to translate, to prevent analyzers from having to infer this due to the lack of the already-sparse markers.
Separately emitted pairs of virtual and physical address markers were considered, with raw2trace inserting the physical at the right place, but that presents complexities with buffer handoff and with the first buffer. Instead, the physical markers are inserted via memmove directly into the buffer. This does not seem to be a performance concern: the translation lookup is the bottleneck. Since the memmoves occur only on the first instance of each page, they are much rarer than all the virtual-to-physical translations.
Adds support for the new markers to the view tool.
Adds a Linux x86_64 test that runs a tiny asm app and ensures a physical address marker is inserted. The test needs to run as sudo, along with its pre- and post- commands. To avoid a confusing blocking password query in local runs, a new set of tests controlled by a new CMake option RUN_SUDO_TESTS is added. It is set only for automated_ci, where we assume a passwordless sudo.
A number of items remain for further work:
+ Performance is poor: the hashtable and caching need improvement.
+ There is a hardcoded limit on how many markers can be added per buffer. Once this is exceeded, further markers are dropped. We should split the buffer to handle this.
+ We may want to add a mode that checks for mapping changes.
+ Missing privileges results in every physical address being 0 instead of showing the failure. We need to check the capabilities to distinguish.
+ Better testing that we're actually getting physical addresses for online tests.
+ Better offline testing with larger apps.
+ Basic blocks that cross a page have only the first page translated.
+ A file descriptor per thread is used, which will not scale well with DR's descriptor protection and might hit rlimits.
+ Online traces still replace all virtual addresses with physical. We should break compatibility and transition them to use these markers, with dr$sim computing the physical addresses from the markers.
Issue: #4014
dolanzhao
pushed a commit
that referenced
this issue
Jun 8, 2022
For virtual-to-physical translation, we cannot use a 3rd-party library such as STL due to static linking constraints. Yet the drcontainers hashtable performs poorly; we need an open-address hashtable. Since DR has one we export it here in a new interface. Adds a simple test and documentation. Issue: #4014
dolanzhao
pushed a commit
that referenced
this issue
Jun 8, 2022
Switches from the drcontainers hashtable to the new open-address hashtable provided by DR. This is 2x to 3x faster due to the reduced dereferences from the inlined data. Increases the last-value cache from one entry to an array of 8 entries. This was found to provide improved performance on small benchmarks. Measured on bzip2 local and SPEC2006 runs as they are short enough to allow interactive experimentation. -use_physical still incurs a ~2.5x slowdown, but it was 9x before these changes. The bottleneck is no longer the hashtable but is now spread across all the address iteration and querying code. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 14, 2022
Solves issues with requiring precise placement of physical markers prior to their corresponding regular virtual address entry (ifetch entries being inserted between the marker and memref by raw2trace; delayed branches needing to carry their marker; page-spanning extra markers needing to land in the right place) by switching to a scheme where markers come in pairs with a physical and a virtual. Such a pair has flexibility on where it can appear in the trace so long as it is prior to the regular entry with that virtual address. This eliminates all work and complexity for raw2trace to move the markers around.
Solves issues with capacity where the tracer runs out of pre-allocated space inside the main buffer for marker-heavy output by switching to use a separately-allocated-and-emitted "v2p" buffer for the physical,virtual pairs. This is made possible by the flexible location. Two complexities of this scheme are solved: the buffer is page-aligned and heap-allocated for buffer handoff; the initial thread header is added to the v2p buffer and skipped in the regular buffer to avoid the pairs appearing before the top-level metadata.
Refactors header insertion and buffer output into separate functions out of the large memtrace() function.
Multi-output was tested on SPEC2K6 gzip which hits the 1-page limit 4 times during its run with a scaled-down test input set workload.
The view tool, invariant checker, release notes, and regression test are updated. The invariant checker now also checks that the bottom bits of the physical and virtual pair members are identical.
Includes a significant fix for physical offline tracing: Elision and address displacement optimizations are disabled for physical addresses as the final address is required at tracing time.
Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 15, 2022
To use the new physical address markers in a trace analysis tool, the page size must be known. Here we add a new marker to the thread headers holding the page size. Sanity tests are added. Issue: #4014
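To make the role of the page size concrete, here is a hedged sketch of how an analysis tool might consume the physical,virtual marker pairs once it has read the page size from the thread header. The class and its methods are hypothetical and do not use the real drmemtrace analysis-tool interfaces; the rejection of pairs whose in-page offsets differ mirrors the invariant check described in the preceding commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical per-thread translator built from (physical, virtual) pairs.
class v2p_translator_sketch_t {
public:
    explicit v2p_translator_sketch_t(uint64_t page_size)
        : page_mask_(page_size - 1)
    {
        assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    }

    // Records one marker pair. Returns false if the two addresses disagree in
    // their in-page offset bits, which would indicate a malformed pair.
    bool
    add_pair(uint64_t physical, uint64_t virt)
    {
        if ((physical & page_mask_) != (virt & page_mask_))
            return false;
        v2p_[virt & ~page_mask_] = physical & ~page_mask_;
        return true;
    }

    // Translates a later regular (virtual) trace address using recorded pairs.
    bool
    translate(uint64_t virt, uint64_t *physical) const
    {
        auto it = v2p_.find(virt & ~page_mask_);
        if (it == v2p_.end())
            return false;
        *physical = it->second | (virt & page_mask_);
        return true;
    }

private:
    uint64_t page_mask_;
    std::unordered_map<uint64_t, uint64_t> v2p_;
};
```

Without the page size, a tool could not tell how many low bits of a regular entry's address to carry over unchanged, which is why the header marker is needed.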
derekbruening
added a commit
that referenced
this issue
Jun 15, 2022
Solves issues with requiring precise placement of physical markers prior to their corresponding regular virtual address entry (ifetch entries being inserted between the marker and memref by raw2trace; delayed branches needing to carry their marker; page-spanning extra markers needing to land in the right place) by switching to a scheme where markers come in pairs with a physical and a virtual. Such a pair has flexibility on where it can appear in the trace so long as it is prior to the regular entry with that virtual address. This eliminates all work and complexity for raw2trace to move the markers around.
Solves issues with capacity where the tracer runs out of pre-allocated space inside the main buffer for marker-heavy output by switching to use a separately-allocated-and-emitted "v2p" buffer for the physical,virtual pairs. This is made possible by the flexible location. Two complexities of this scheme are solved: the buffer is page-aligned and heap-allocated for buffer handoff; the initial thread header is added to the v2p buffer and skipped in the regular buffer to avoid the pairs appearing before the top-level metadata.
Refactors header insertion and buffer output into separate functions out of the large memtrace() function.
Multi-output was tested on SPEC2K6 gzip which hits the 1-page limit 4 times during its run with a scaled-down test input set workload.
The view tool, invariant checker, release notes, and regression test are updated. The invariant checker now also checks that the bottom bits of the physical and virtual pair members are identical.
Includes a significant fix for physical offline tracing: Elision and address displacement optimizations are disabled for physical addresses as the final address is required at tracing time.
Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 15, 2022
Since the kernel lets an unprivileged user read the pagemap file but supplies 0 values for the physical pages (and 0 is a possible legitimate value), we add an explicit check for privileges and fail up front if physical addresses are requested but not available. This is a change in behavior where before execution would continue with a warning. Adds a test of missing privileges. Fixes a sentinel issue where a 0 physical page was stored as 1 in the table: but 1 is a possible valid number as well, so -1 is used instead. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 16, 2022
Since the kernel lets an unprivileged user read the pagemap file but supplies 0 values for the physical pages (and 0 is a possible legitimate value), we add an explicit check for privileges and fail up front if physical addresses are requested but not available. This is a change in behavior where before execution would continue with a warning. Adds a test of missing privileges. Adds sudo to the online drcachesim -use_physical tests. Only enables RUN_SUDO_TESTS if CI_TRIGGER is set, since Jenkins does not have password-less sudo. Fixes a sentinel issue where a 0 physical page was stored as 1 in the table: but 1 is a possible valid number as well, so -1 is used instead. Issue: #4014
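The sentinel problem mentioned at the end of this commit message can be illustrated with a small sketch. It assumes, as is common for pointer-sized hashtable payloads, that the table reports a miss by returning 0, so a genuine physical page of 0 needs a stand-in value. The class is hypothetical and std::unordered_map is used only to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// All bits set (i.e., -1): not a page-aligned value, so it cannot collide
// with a real physical page address, unlike 1 when page numbers are stored.
constexpr uint64_t ZERO_PAGE_SENTINEL = ~static_cast<uint64_t>(0);

// Hypothetical table whose raw lookup returns 0 to mean "not found".
class ppage_table_sketch_t {
public:
    void
    store(uint64_t vpage, uint64_t ppage)
    {
        assert(ppage != ZERO_PAGE_SENTINEL);
        table_[vpage] = (ppage == 0) ? ZERO_PAGE_SENTINEL : ppage;
    }

    // Returns true and sets *ppage if vpage was stored, including when the
    // stored physical page is legitimately 0.
    bool
    find(uint64_t vpage, uint64_t *ppage) const
    {
        const uint64_t payload = raw_lookup(vpage);
        if (payload == 0)
            return false;
        *ppage = (payload == ZERO_PAGE_SENTINEL) ? 0 : payload;
        return true;
    }

private:
    // Mimics a hashtable API that returns a 0 payload on a miss.
    uint64_t
    raw_lookup(uint64_t vpage) const
    {
        auto it = table_.find(vpage);
        return it == table_.end() ? 0 : it->second;
    }

    std::unordered_map<uint64_t, uint64_t> table_;
};
```

With 1 as the stand-in, a stored page 0 and a stored page 1 would be indistinguishable on lookup, which is the bug the commit describes.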
derekbruening
added a commit
that referenced
this issue
Jun 17, 2022
Adds handling of page-spanning instruction fetch and data accesses. For offline traces these are heuristics as we do not know the precise sizes at tracing time. Since an extra physical page value is not harmful, the heuristics emit a second page if there is any decent chance that it is reached. Adds a data page span test by switching the drcacheoff.phys test to use allasm_repstr and adding such an access there. An instruction cross is more difficult to synthesize. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 18, 2022
Adds handling of page-spanning instruction fetch and data accesses. For offline traces these are heuristics as we do not know the precise sizes at tracing time. Since an extra physical page value is not harmful, the heuristics emit a second page if there is any decent chance that it is reached. Adds a data page span test by switching the drcacheoff.phys test to use allasm_repstr and adding such an access there. An instruction cross is more difficult to synthesize. Issue: #4014
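The page-span test itself is simple arithmetic. The sketch below shows the exact check for a known access size; the function name is hypothetical and the code assumes the access is no larger than one page, so at most two pages are touched.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper. Returns true and sets *second_page to the base of the
// following page when the access [addr, addr + size) crosses a page boundary.
// Assumes size <= page_size, so the access touches at most two pages.
inline bool
spans_second_page(uint64_t addr, uint64_t size, uint64_t page_size,
                  uint64_t *second_page)
{
    assert(size > 0 && size <= page_size);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    const uint64_t mask = ~(page_size - 1);
    const uint64_t first = addr & mask;
    const uint64_t last = (addr + size - 1) & mask;
    if (first == last)
        return false;
    *second_page = first + page_size;
    return true;
}
```

When the precise size is unknown at tracing time, one conservative option is to call such a check with an upper bound on the size (for instance, a maximum instruction length). That may occasionally report a second page that is never reached, which the commit message notes is harmless.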
This was referenced Jun 21, 2022
derekbruening
added a commit
that referenced
this issue
Jun 21, 2022
Fixes two bugs causing crashes with -use_physical: + Records the drcontext used for hashtable creation to ensure the same one is used at destruction as a different thread can call the exit event. + Orders the drmemtrace thread exit event before the drmodtrack one to ensure drmodtrack access during the final thread buffer output is safe. These fixes were manually tested on large multi-threaded applications where these two crashes showed up before and disappear with the fixes. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 22, 2022
Fixes two bugs causing crashes with -use_physical: + Records the drcontext used for hashtable creation to ensure the same one is used at destruction as a different thread can call the exit event. + Orders the drmemtrace thread exit event before the drmodtrack one to ensure drmodtrack access during the final thread buffer output is safe. These fixes were manually tested on large multi-threaded applications where these two crashes showed up before and disappear with the fixes. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 24, 2022
The view tool was not printing the virtual address for virtual-to-physical translations, nor the address for software prefetches, which made it hard to diagnose physical address failures in general and in particular when the culprit virtual address was from a prefetch as it did not appear in the text output. We add that here. Since some prefetches have long names, we abbreviate them, but still need to widen the name field to align them all.
While at it, we add the function marker values to the view tool as well.
Example output:
    10: T462774 ifetch 7 byte(s) @ 0x00007f6cae908a54 non-branch
    11: T462774 read 8 byte(s) @ 0x00007f6cae90ca20 by PC 0x00007f6cae908a54
    12: T462774 ifetch 4 byte(s) @ 0x00007f6cae908a5c non-branch
    13: T462774 pref-w-L3-NT 8 byte(s) @ 0x00007f6c389xd238 by PC 0x00007f6cae908a5c
    ...
    441522: T48816 <marker: function #9>
    441523: T48816 <marker: function return address 0xaaaad5de4944>
    441524: T48816 <marker: function argument 0x15187bd4ae1d>
    441526: T48816 <marker: function argument 0x1>
    ...
    441546: T48816 <marker: function #9>
    441547: T48816 <marker: function return value 0x15187bd4ae1d>
Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 24, 2022
Adds counts of physical,virtual marker pairs and of unavailable physical address markers to the basic_counts tool. Updates various test outputs. Issue: #4014
derekbruening
added a commit
that referenced
this issue
Jun 28, 2022
…5544) The view tool was not printing the virtual address for virtual-to-physical translations, nor the address for software prefetches, which made it hard to diagnose physical address failures in general and in particular when the culprit virtual address was from a prefetch as it did not appear in the text output. We add that here. Since some prefetches have long names, we abbreviate them, but still need to widen the name field to align them all.
While at it, we add the function marker values to the view tool as well.
Example output:
    10: T462774 ifetch 7 byte(s) @ 0x00007f6cae908a54 non-branch
    11: T462774 read 8 byte(s) @ 0x00007f6cae90ca20 by PC 0x00007f6cae908a54
    12: T462774 ifetch 4 byte(s) @ 0x00007f6cae908a5c non-branch
    13: T462774 pref-w-L3-NT 8 byte(s) @ 0x00007f6c389xd238 by PC 0x00007f6cae908a5c
    ...
    441522: T48816 <marker: function #9>
    441523: T48816 <marker: function return address 0xaaaad5de4944>
    441524: T48816 <marker: function argument 0x15187bd4ae1d>
    441526: T48816 <marker: function argument 0x1>
    ...
    441546: T48816 <marker: function #9>
    441547: T48816 <marker: function return value 0x15187bd4ae1d>
Issue: #4014
derekbruening
added a commit
that referenced
this issue
Aug 3, 2022
When -use_physical is set, the cache and TLB simulators read the new virtual-to-physical translation markers and use them to simulate physical addresses.
--------------------------------------------------
Tested: Ran manually and looked at logs. Open to suggestions for how to automate testing.
    $ rm -rf drmemtrace.sim*.dir; ninja && sudo bin64/drrun -stderr_mask 0 -t drcachesim -use_physical -offline -- suite/tests/bin/simple_app && sudo chown -R $USER drmemtrace.sim*.dir && bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -verbose 3 > OUT 2>&1
    $ less OUT
    translating virtual 0x7fed14de0050 to 0xf52ace050
    ::3036256.3036256:: @0xf52ace050 instr x3
    translating virtual 0x7fed14de0053 to 0xf52ace053
    ::3036256.3036256:: @0xf52ace053 instr x5
    translating virtual 0x7ffca0af9068 to 0xb3f975068
    translating virtual 0x7fed14de0053 to 0xf52ace053
    ::3036256.3036256:: @0xf52ace053 write 0xb3f975068 x8
    $ bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -simulator_type TLB -verbose 3 > OUT 2>&1
    $ less OUT
    translating virtual 0x7f6f263a3050 to 0xf52ace050
    ::3080711.3080711:: @0x7070615f656c706d instr 0xf52ace050 x3
    translating virtual 0x7f6f263a3053 to 0xf52ace053
    ::3080711.3080711:: @0x7070615f656c706d direct_call 0xf52ace053 x5
    translating virtual 0x7ffdeaab3798 to 0xed4c52798
    translating virtual 0x7f6f263a3053 to 0xf52ace053
--------------------------------------------------
Issue: #4014
derekbruening
added a commit
that referenced
this issue
Aug 4, 2022
When -use_physical is set, the cache and TLB simulators read the new virtual-to-physical translation markers and use them to simulate physical addresses.
Changes drcachesim online mode to leave addresses virtual and insert markers instead, just like offline. Adds a compatibility change note and updates the docs.
Includes a fix for -cpu_scheduling where the cached last thread was not reset on a cpu change with no thread change in between.
--------------------------------------------------
Tested: Ran manually and looked at logs. Open to suggestions for how to automate testing.
    $ rm -rf drmemtrace.sim*.dir; ninja && sudo bin64/drrun -stderr_mask 0 -t drcachesim -use_physical -offline -- suite/tests/bin/simple_app && sudo chown -R $USER drmemtrace.sim*.dir && bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -verbose 3 > OUT 2>&1
    $ less OUT
    translating virtual 0x7fed14de0050 to 0xf52ace050
    ::3036256.3036256:: @0xf52ace050 instr x3
    translating virtual 0x7fed14de0053 to 0xf52ace053
    ::3036256.3036256:: @0xf52ace053 instr x5
    translating virtual 0x7ffca0af9068 to 0xb3f975068
    translating virtual 0x7fed14de0053 to 0xf52ace053
    ::3036256.3036256:: @0xf52ace053 write 0xb3f975068 x8
    $ bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -simulator_type TLB -verbose 3 > OUT 2>&1
    $ less OUT
    translating virtual 0x7f6f263a3050 to 0xf52ace050
    ::3080711.3080711:: @0x7070615f656c706d instr 0xf52ace050 x3
    translating virtual 0x7f6f263a3053 to 0xf52ace053
    ::3080711.3080711:: @0x7070615f656c706d direct_call 0xf52ace053 x5
    translating virtual 0x7ffdeaab3798 to 0xed4c52798
    translating virtual 0x7f6f263a3053 to 0xf52ace053
--------------------------------------------------
Issue: #4014
dolanzhao
pushed a commit
that referenced
this issue
Aug 4, 2022
When -use_physical is set, the cache and TLB simulators read the new virtual-to-physical translation markers and use them to simulate physical addresses.
Changes drcachesim online mode to leave addresses virtual and insert markers instead, just like offline. Adds a compatibility change note and updates the docs.
Includes a fix for -cpu_scheduling where the cached last thread was not reset on a cpu change with no thread change in between.
--------------------------------------------------
Tested: Ran manually and looked at logs. Open to suggestions for how to automate testing.
    $ rm -rf drmemtrace.sim*.dir; ninja && sudo bin64/drrun -stderr_mask 0 -t drcachesim -use_physical -offline -- suite/tests/bin/simple_app && sudo chown -R $USER drmemtrace.sim*.dir && bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -verbose 3 > OUT 2>&1
    $ less OUT
    translating virtual 0x7fed14de0050 to 0xf52ace050
    ::3036256.3036256:: @0xf52ace050 instr x3
    translating virtual 0x7fed14de0053 to 0xf52ace053
    ::3036256.3036256:: @0xf52ace053 instr x5
    translating virtual 0x7ffca0af9068 to 0xb3f975068
    translating virtual 0x7fed14de0053 to 0xf52ace053
    ::3036256.3036256:: @0xf52ace053 write 0xb3f975068 x8
    $ bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -simulator_type TLB -verbose 3 > OUT 2>&1
    $ less OUT
    translating virtual 0x7f6f263a3050 to 0xf52ace050
    ::3080711.3080711:: @0x7070615f656c706d instr 0xf52ace050 x3
    translating virtual 0x7f6f263a3053 to 0xf52ace053
    ::3080711.3080711:: @0x7070615f656c706d direct_call 0xf52ace053 x5
    translating virtual 0x7ffdeaab3798 to 0xed4c52798
    translating virtual 0x7f6f263a3053 to 0xf52ace053
--------------------------------------------------
Issue: #4014
-use_physical is only supported for online drcachesim today, but that is not made clear in the docs nor in actual usage. The tool lets you run -offline -use_physical and ends up with a post-processed trace with physical data addresses but virtual PC fetches, due to an accident. instr_offline_t::get_entry_addr blindly treats the entry as a data ref, so it tries to translate bogus addresses composed of module id + offset encodings.
And indeed we get warnings at -verbose 1+:
Those are the instruction entries:
So we have two action items. First, we should have the front-end refuse to combine -use_physical and -offline for now, and update the docs to reflect this.
Next, we need to decide whether to try to support this combination. Is it possible that future Linux distros will all shut down access to pagemap? Although for research purposes running as sudo or something may still be feasible.
How would we support this? We'd either have to store extra info for each data ref plus an entry for each instr like for DGC (#2062), or have a custom solution just for offline where each block PC has 2 entries: one virtual and one physical. The former will not work well w/ tools that want operands (like opcode_mix or micro-arch simulators): but that would be the same for DGC, so we might need an option to store the full instr bytes.
Another limitation today (which perhaps should have its own issue) with -use_physical relates to static linking which we often use for offline tracing:
Xref #2912 but that issue has many confusing entries so it seemed better to start clean here.