i#5520 memtrace encodings: Update samples with encodings
Updates the x86_64 and aarch64 traces to the new .zip format with
embedded instruction encodings.

Removes the .raw directories entirely and the .so binaries as they are
no longer needed.

Updates all the READMEs for the new traces and for no longer needing
library support for mapping in binaries for decoding.

Issue: DynamoRIO/dynamorio#5520
derekbruening committed Oct 4, 2022
1 parent 8df7f64 commit d257d04
Showing 95 changed files with 274 additions and 283 deletions.
150 changes: 75 additions & 75 deletions README.md
@@ -1,18 +1,20 @@
# drmemtrace_samples

Memory trace samples from DynamoRIO's drmemtrace tracer for its [drcachesim analyzer](http://dynamorio.org/page_drcachesim.html).
Memory trace samples from DynamoRIO's drmemtrace tracer for its [trace analysis framework](http://dynamorio.org/page_drcachesim.html).

## Trace format

The memory address tracer we use is part of the [drcachesim open-source
tool](http://dynamorio.org/page_drcachesim.html), which is
part of the [DynamoRIO dynamic binary instrumentation
framework](http://dynamorio.org). Here we summarize the tracing format.
See the [drcachesim documentation](http://dynamorio.org/page_drcachesim.html)
for further information.
See the [tracing and analysis framework
documentation](http://dynamorio.org/page_drcachesim.html) for further
information.

A trace contains a sequence of user-mode instruction and memory fetches for
each thread in a target application. Each 32KB block of thread data has a
each thread in a target application. The instruction encodings are also included.
Each 32KB block of thread data has a
timestamp and records which cpu it executed on, allowing reconstructing the
thread interleaving at that granularity.
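
As a rough illustration of reconstructing the interleaving, the sketch below merges per-thread block lists by timestamp. The `Block` type and the `interleave` helper are hypothetical names invented for this example and are not part of the drmemtrace libraries. The timestamps, thread ids, and cores are copied from the markers in the sample view output further down.

```python
import heapq
from typing import Dict, List, NamedTuple


class Block(NamedTuple):
    """One block of thread data, identified by its leading markers (hypothetical type)."""
    timestamp: int
    tid: int
    cpu: int


def interleave(per_thread: Dict[int, List[Block]]) -> List[Block]:
    """Merge per-thread block lists, each already in time order, into one schedule."""
    # Tuples compare field by field, so the merge orders blocks by timestamp first.
    return list(heapq.merge(*per_thread.values()))


per_thread = {
    1774668: [
        Block(13309323446391801, 1774668, 3),
        Block(13309323446391808, 1774668, 3),
        Block(13309323446395891, 1774668, 3),
    ],
    1774669: [Block(13309323446393338, 1774669, 11)],
}

schedule = interleave(per_thread)
for block in schedule:
    print(f"{block.timestamp}: T{block.tid} on core {block.cpu}")
```

The ordering is only as fine as the block granularity: records inside one block carry no timestamps of their own.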

@@ -25,86 +27,84 @@ A human-readable view of a sample trace highlighting thread switches and
a signal handler:
```
$ bin64/drrun -t drcachesim -indir drmemtrace.threadsig.[0-9]*.dir -simulator_type view 2>&1 | less
$ bin64/drrun -t drcachesim -indir drmemtrace.threadsig.[0-9]*.dir -simulator_type view 2>&1 | less
Output format:
<record#>: T<tid> <record details>
------------------------------------------------------------
1: T1774673 <marker: version 4>
2: T1774673 <marker: filetype 0x240>
3: T1774673 <marker: cache line size 64>
4: T1774673 <marker: chunk instruction count 10000000>
5: T1774673 <marker: page size 4096>
<...>
173369: T1774669 ifetch 3 byte(s) @ 0x0000000000402680 f3 48 a5 rep movsq
173370: T1774669 read 8 byte(s) @ 0x00007f8881e37c70 by PC 0x0000000000402680
173371: T1774669 write 8 byte(s) @ 0x00007f8881e37a70 by PC 0x0000000000402680
------------------------------------------------------------
173372: T1774668 <marker: timestamp 13309323446399692>
173373: T1774668 <marker: tid 1774668 on core 3>
173374: T1774668 ifetch 4 byte(s) @ 0x000000000040361a 48 8b 45 f8 mov -0x08(%rbp), %rax
173375: T1774668 read 8 byte(s) @ 0x00007ffdf9d85088 by PC 0x000000000040361a
173376: T1774668 ifetch 1 byte(s) @ 0x000000000040361e 5d pop %rbp
173377: T1774668 read 8 byte(s) @ 0x00007ffdf9d85090 by PC 0x000000000040361e
173378: T1774668 ifetch 1 byte(s) @ 0x000000000040361f c3 ret
173379: T1774668 read 8 byte(s) @ 0x00007ffdf9d85098 by PC 0x000000000040361f
<...>
T468608 0x0000000000467c45 4c 8b 54 24 08 mov 0x08(%rsp), %r10
T468608 read 8 byte(s) @ 0x7fff9f5fd9b0
T468608 0x0000000000467c4a b8 38 00 00 00 mov $0x00000038, %eax
T468608 0x0000000000467c4f 0f 05 syscall
47292: T1774668 ifetch 6 byte(s) @ 0x00000000004053bd 0f 85 e5 01 00 00 jnz $0x00000000004055a8
47293: T1774668 <marker: kernel xfer from 0x4053c3 to handler>
47294: T1774668 <marker: timestamp 13309323446391801>
47295: T1774668 <marker: tid 1774668 on core 3>
47296: T1774668 ifetch 1 byte(s) @ 0x000000000040257d 55 push %rbp
47297: T1774668 write 8 byte(s) @ 0x00007ffdf9d843f0 by PC 0x000000000040257d
47298: T1774668 ifetch 3 byte(s) @ 0x000000000040257e 48 89 e5 mov %rsp, %rbp
47299: T1774668 ifetch 3 byte(s) @ 0x0000000000402581 89 7d fc mov %edi, -0x04(%rbp)
47300: T1774668 write 4 byte(s) @ 0x00007ffdf9d843ec by PC 0x0000000000402581
47301: T1774668 ifetch 4 byte(s) @ 0x0000000000402584 48 89 75 f0 mov %rsi, -0x10(%rbp)
47302: T1774668 write 8 byte(s) @ 0x00007ffdf9d843e0 by PC 0x0000000000402584
47303: T1774668 ifetch 4 byte(s) @ 0x0000000000402588 48 89 55 e8 mov %rdx, -0x18(%rbp)
47304: T1774668 write 8 byte(s) @ 0x00007ffdf9d843d8 by PC 0x0000000000402588
47305: T1774668 ifetch 4 byte(s) @ 0x000000000040258c 83 7d fc 1a cmp -0x04(%rbp), $0x1a
47306: T1774668 read 4 byte(s) @ 0x00007ffdf9d843ec by PC 0x000000000040258c
47307: T1774668 ifetch 2 byte(s) @ 0x0000000000402590 75 0f jnz $0x00000000004025a1
47308: T1774668 ifetch 6 byte(s) @ 0x0000000000402592 8b 05 5c 0f 0e 00 mov <rel> 0x00000000004e34f4, %eax
47309: T1774668 read 4 byte(s) @ 0x00000000004e34f4 by PC 0x0000000000402592
47310: T1774668 ifetch 3 byte(s) @ 0x0000000000402598 83 c0 01 add $0x01, %eax
47311: T1774668 ifetch 6 byte(s) @ 0x000000000040259b 89 05 53 0f 0e 00 mov %eax, <rel> 0x00000000004e34f4
47312: T1774668 write 4 byte(s) @ 0x00000000004e34f4 by PC 0x000000000040259b
47313: T1774668 ifetch 1 byte(s) @ 0x00000000004025a1 90 nop
47314: T1774668 ifetch 1 byte(s) @ 0x00000000004025a2 5d pop %rbp
47315: T1774668 read 8 byte(s) @ 0x00007ffdf9d843f0 by PC 0x00000000004025a2
47316: T1774668 ifetch 1 byte(s) @ 0x00000000004025a3 c3 ret
47317: T1774668 read 8 byte(s) @ 0x00007ffdf9d843f8 by PC 0x00000000004025a3
47318: T1774668 ifetch 7 byte(s) @ 0x0000000000407bb0 48 c7 c0 0f 00 00 00 mov $0x0000000f, %rax
47319: T1774668 ifetch 2 byte(s) @ 0x0000000000407bb7 0f 05 syscall
47320: T1774668 <marker: timestamp 13309323446391808>
47321: T1774668 <marker: tid 1774668 on core 3>
47322: T1774668 <marker: syscall xfer from 0x407bb9>
------------------------------------------------------------
T468610 <marker: timestamp 13270239527782712>
T468610 <marker: tid 468610 on core 0>
T468610 0x0000000000467c51 48 85 c0 test %rax, %rax
T468610 0x0000000000467c54 7c 13 jl $0x0000000000467c69
T468610 0x0000000000467c56 74 01 jz $0x0000000000467c59
T468610 0x0000000000467c59 31 ed xor %ebp, %ebp
T468610 0x0000000000467c5b 58 pop %rax
T468610 read 8 byte(s) @ 0x7f669dc77e70
T468610 0x0000000000467c5c 5f pop %rdi
T468610 read 8 byte(s) @ 0x7f669dc77e78
T468610 0x0000000000467c5d ff d0 call %rax
47323: T1774669 <marker: timestamp 13309323446393338>
47324: T1774669 <marker: tid 1774669 on core 11>
47325: T1774669 ifetch 3 byte(s) @ 0x0000000000467c51 48 85 c0 test %rax, %rax
<...>
T468608 0x0000000000405376 64 c7 04 25 18 00 00 movl $0x00000001, %fs:0x18
T468608 00 01 00 00 00
T468608 write 4 byte(s) @ 0x4eb898
T468608 0x0000000000405382 45 31 c0 xor %r8d, %r8d
T468608 0x0000000000405385 eb 2f jmp $0x00000000004053b6
T468608 <marker: kernel xfer from 0x4053b6 to handler>
T468608 <marker: timestamp 13270239527784929>
T468608 <marker: tid 468608 on core 2>
T468608 0x000000000040257d 55 push %rbp
T468608 write 8 byte(s) @ 0x7fff9f5fd330
T468608 0x000000000040257e 48 89 e5 mov %rsp, %rbp
T468608 0x0000000000402581 89 7d fc mov %edi, -0x04(%rbp)
T468608 write 4 byte(s) @ 0x7fff9f5fd32c
T468608 0x0000000000402584 48 89 75 f0 mov %rsi, -0x10(%rbp)
T468608 write 8 byte(s) @ 0x7fff9f5fd320
T468608 0x0000000000402588 48 89 55 e8 mov %rdx, -0x18(%rbp)
T468608 write 8 byte(s) @ 0x7fff9f5fd318
T468608 0x000000000040258c 83 7d fc 1a cmp -0x04(%rbp), $0x1a
T468608 read 4 byte(s) @ 0x7fff9f5fd32c
T468608 0x0000000000402590 75 0f jnz $0x00000000004025a1
T468608 0x0000000000402592 8b 05 5c 0f 0e 00 mov <rel> 0x00000000004e34f4, %eax
T468608 read 4 byte(s) @ 0x4e34f4
T468608 0x0000000000402598 83 c0 01 add $0x01, %eax
T468608 0x000000000040259b 89 05 53 0f 0e 00 mov %eax, <rel> 0x00000000004e34f4
T468608 write 4 byte(s) @ 0x4e34f4
T468608 0x00000000004025a1 90 nop
T468608 0x00000000004025a2 5d pop %rbp
T468608 read 8 byte(s) @ 0x7fff9f5fd330
T468608 0x00000000004025a3 c3 ret
T468608 read 8 byte(s) @ 0x7fff9f5fd338
T468608 0x0000000000407bb0 48 c7 c0 0f 00 00 00 mov $0x0000000f, %rax
T468608 0x0000000000407bb7 0f 05 syscall
T468608 <marker: timestamp 13270239527784936>
T468608 <marker: tid 468608 on core 2>
T468608 <marker: syscall xfer from 0x407bb9>
T468608 <marker: timestamp 13270239527792923>
T468608 <marker: tid 468608 on core 2>
T468608 0x00000000004053b6 80 bd 7c ff ff ff 00 cmp -0x84(%rbp), $0x00
T468608 read 1 byte(s) @ 0x7fff9f5fda4c
------------------------------------------------------------
47603: T1774668 <marker: timestamp 13309323446395891>
47604: T1774668 <marker: tid 1774668 on core 3>
47605: T1774668 ifetch 4 byte(s) @ 0x00000000004053c3 48 8b 45 c8 mov -0x38(%rbp), %rax
<...>
```

It is a series of instruction fetch, data fetch, and metadata entries. The
fetches contain addresses and sizes. The addresses are all virtual
(it is possible to [gather physical addresses in some
It is a series of user-mode instruction fetch, data fetch, and metadata
entries. The fetches contain addresses and sizes and, for instruction
fetches, instruction encodings. The addresses are all virtual (it is
possible to [gather physical addresses in some
circumstances](https://dynamorio.org/sec_drcachesim_phys.html)).
The metadata "markers" indicate things like which core a thread executed
on, timestamps, an arriving signal causing a PC discontinuity, etc.
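
To make the record layout concrete, here is a minimal sketch that parses lines of the "view" tool's text output (the numbered format shown above) into dictionaries. It works on the printed text only and does not use the drmemtrace reader libraries. The function name `parse_line` and the dictionary keys are assumptions made for this example, and it assumes each instruction's encoding bytes fit on a single line.

```python
import re

# "<record#>: T<tid> <record details>"
RECORD = re.compile(r"^\s*(\d+):\s+T(\d+)\s+(.*)$")
# "ifetch 3 byte(s) @ 0x... <rest>", likewise for read and write
ACCESS = re.compile(r"^(ifetch|read|write)\s+(\d+) byte\(s\) @ (0x[0-9a-f]+)\s*(.*)$")


def parse_line(line):
    """Return a dict for one view-output record, or None for non-record lines."""
    m = RECORD.match(line)
    if m is None:
        return None  # separators, the format header, "<...>" elisions
    rec = {"ordinal": int(m.group(1)), "tid": int(m.group(2))}
    details = m.group(3).strip()
    if details.startswith("<marker:"):
        rec["kind"] = "marker"
        rec["text"] = details[len("<marker:"):].rstrip(">").strip()
        return rec
    a = ACCESS.match(details)
    if a is None:
        rec["kind"] = "other"
        rec["text"] = details
        return rec
    kind, size, addr, rest = a.groups()
    rec.update(kind=kind, size=int(size), addr=int(addr, 16))
    tokens = rest.split()
    if kind == "ifetch":
        # The fetch size says how many encoding bytes precede the disassembly.
        rec["encoding"] = bytes.fromhex("".join(tokens[:rec["size"]]))
        rec["disasm"] = " ".join(tokens[rec["size"]:])
    elif tokens[:2] == ["by", "PC"]:
        rec["pc"] = int(tokens[2], 16)
    return rec


sample = [
    "173369: T1774669 ifetch 3 byte(s) @ 0x0000000000402680 f3 48 a5 rep movsq",
    "173370: T1774669 read 8 byte(s) @ 0x00007f8881e37c70 by PC 0x0000000000402680",
    "173372: T1774668 <marker: timestamp 13309323446399692>",
    "------------------------------------------------------------",
]
records = [parse_line(line) for line in sample]
```

Here the data read is tied back to its instruction through the `by PC` field, which matches the address of the preceding `ifetch` record.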

## Using a trace for core simulation

For using a trace in a core simulator, you will want to obtain the opcodes.
These are not part of the base trace. They are obtained by decoding the
instruction fetch addresses from the binaries. Library support makes this
straightforward. A sample tool that does this is
[opcode_mix.cpp](https://github.com/DynamoRIO/dynamorio/blob/master/clients/drcachesim/tools/opcode_mix.cpp).
It uses library routines to read the "modules.log" file, which contains the
mappings of the binary and libraries from the traced execution, and map
those binaries into the address space, allowing examining the instruction
bytes. The modules.log file and all of the binaries used by the
application are required, in addition to the trace itself. For the vdso
the raw bytes are embedded in the modules file.

Other aspects of the trace which help core simulation are [discussed in our
documentation](https://dynamorio.org/sec_drcachesim_core.html).
For using a trace in a core simulator, the provided instruction encodings
can be decoded into opcodes and operands as is done with the "view" tool above.
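This can be done with DynamoRIO's decoder, as in the provided sample tool
[opcode_mix.cpp](https://github.com/DynamoRIO/dynamorio/blob/master/clients/drcachesim/tools/opcode_mix.cpp).
Other aspects of the trace which help core simulation are [discussed in our
documentation](https://dynamorio.org/sec_drcachesim_core.html).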
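
For a quick look at the instruction mix without writing an analysis tool, the sketch below tallies mnemonics from the "view" tool's text output. This is a text-level stand-in and not DynamoRIO's decoder: it relies on the disassembly already printed by the view tool. The name `mnemonic_mix` and the `PREFIXES` set are assumptions made for this example, as is the premise that each instruction's encoding bytes fit on one line.

```python
import re
from collections import Counter

IFETCH = re.compile(r"ifetch\s+(\d+) byte\(s\) @ 0x[0-9a-f]+\s+(.*)$")
# Assumed spellings of prefixes that the disassembly prints before the mnemonic.
PREFIXES = {"rep", "repne", "lock"}


def mnemonic_mix(lines):
    """Count instruction mnemonics appearing in view-tool ifetch records."""
    counts = Counter()
    for line in lines:
        m = IFETCH.search(line)
        if m is None:
            continue  # data accesses, markers, separators
        size, tokens = int(m.group(1)), m.group(2).split()
        words = tokens[size:]  # skip the encoding bytes
        if not words:
            continue
        name = words[0]
        if name in PREFIXES and len(words) > 1:
            name += " " + words[1]
        counts[name] += 1
    return counts


sample = [
    "47296: T1774668 ifetch 1 byte(s) @ 0x000000000040257d 55 push %rbp",
    "47297: T1774668 write 8 byte(s) @ 0x00007ffdf9d843f0 by PC 0x000000000040257d",
    "47298: T1774668 ifetch 3 byte(s) @ 0x000000000040257e 48 89 e5 mov %rsp, %rbp",
    "47299: T1774668 ifetch 3 byte(s) @ 0x0000000000402581 89 7d fc mov %edi, -0x04(%rbp)",
    "173369: T1774669 ifetch 3 byte(s) @ 0x0000000000402680 f3 48 a5 rep movsq",
]
mix = mnemonic_mix(sample)
for name, count in mix.most_common():
    print(f"{count:6d} {name}")
```

A core simulator would instead decode the encoding bytes themselves to get full opcode and operand information, as the opcode_mix sample tool does.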