-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Request: Add a count for total number of instruction for filtered trace #4857
Comments
To add context, this is about drcachesim/drmemtrace traces, not core traces. |
A prototype (after hitting and fixing #4861) implementing per-miss count markers
The decision is to abandon per-miss entries and just have a single count marker on thread exit. |
… or 2 misses; REMINDER: update 1st commit to Fixes #4857
Adds tracking of the dynamic (pre-filtered) instruction count for each thread. Stores the count in a marker at thread exit (storing prior to each miss entry might be useful but in experiments it adds too much overhead to include without a clear mandate and possibly under an off-by-default runtime option). Adds a check to the filter tests confirming that the instruction count marker is present at thread exit for filtered traces. This is done by invoked test_mode for these traces. To simplify testing for online and offline, the trace type marker is added to online traces. Adds a new tool.drcachesim.filter-asm test which ensures the instr count is the precise count expected for that fully-deterministic tiny static binary. A larger-than-static-asm application was manually tested, confirming the need for the rep string expansion special case in the code to limit to 1 instruction for that block. Fixes #4857
More frequent count entries in the traces is desirable. Re-opening to add the count to each buffer, which should be very low overhead. |
Adds the instruction count to the buffer header for drcachesim filtered traces, instead of just at thread exit, to aid in correlating filtered and unfiltered traces. Updates the documentation and the trace_invariants test. Augments the view tool to print out these markers. Further verified manually on a larger trace with the view tool: <marker: filetype 65> <marker: cache line size 64> <marker: timestamp 13268850698443940> <marker: tid 2539348 on core 1> <marker: instruction count 33326> 0x00007f6f64314090 48 89 e7 mov %rsp, %rdi 0x00007f6f6432dce0 b8 0c 00 00 00 mov $0x0000000c, %eax <marker: timestamp 13268850698727951> <marker: tid 2539348 on core 1> <marker: instruction count 40130> 0x00007f6f6432c906 c7 44 24 3c 00 00 00 movl $0x00000000, 0x3c(%rsp) <...> 0x00007f6f63cc3670 4c 8b 05 f9 27 0f 00 mov <rel> 0x00007f6f63db5e70, %r8 <marker: timestamp 13268850699555438> <marker: tid 2539348 on core 1> <marker: instruction count 126115> Fixes #4857
Adds the instruction count to the buffer header for drcachesim filtered traces, instead of just at thread exit, to aid in correlating filtered and unfiltered traces. Updates the documentation and the trace_invariants test. Augments the view tool to print out these markers. Further verified manually on a larger trace with the view tool: <marker: filetype 65> <marker: cache line size 64> <marker: timestamp 13268850698443940> <marker: tid 2539348 on core 1> <marker: instruction count 33326> 0x00007f6f64314090 48 89 e7 mov %rsp, %rdi 0x00007f6f6432dce0 b8 0c 00 00 00 mov $0x0000000c, %eax <marker: timestamp 13268850698727951> <marker: tid 2539348 on core 1> <marker: instruction count 40130> 0x00007f6f6432c906 c7 44 24 3c 00 00 00 movl $0x00000000, 0x3c(%rsp) <...> 0x00007f6f63cc3670 4c 8b 05 f9 27 0f 00 mov <rel> 0x00007f6f63db5e70, %r8 <marker: timestamp 13268850699555438> <marker: tid 2539348 on core 1> <marker: instruction count 126115> Fixes #4857
When traced in filtered mode, the total number of records does not equal to the number of instructions executed during tracing. It would be very useful to have accurate instruction count for filtered trace. At the minimum, it should provide a total number of instructions executed during the tracing period. Additional details such as # of instructions per thread, or # of instructions since last record (i.e. for every filtered miss) would great with the understanding that these will add additional overhead in speed and size.
This feature would be extremely useful to try to correlate filtered trace with unfiltered trace and would enable filtered trace be more useful for cache studies.
The text was updated successfully, but these errors were encountered: