-
Notifications
You must be signed in to change notification settings - Fork 10
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Deduplicate instruction prefixes when disassembling
Previously, the opcode lookup table would encode every possible permutation of allowable prefixes for each instruction as a separate path. This is expensive in both space and time, as observed in #40. The new approach taken in this patch, as described in `Note [x86_64 disassembly]` in `Flexdis86.Disassembler`, is to only encode the VEX prefixe and opcode bytes in the lookup table, leaving out all other forms of prefix bytes entirely. Instead, disassembly will start by eagerly parsing as many prefix bytes as possible, proceeding to parse opcode bytes after the first non-prefix byte is encountered. After identifying the possible instructions from the opcode, we will then narrow down exactly which instruction it is by validating them against the set of parsed prefixes. As noted in `Note [x86_64 disassembly]`, we had to add some special cases for `nop`-like instructions—namely, `endbr32`, endbr64`, `pause`, and `xchg`—to avoid some prefix byte–related ambiguity. The new handling for `xchg` is more accurate than it was before, so this patch fixes #42 as a side effect. This patch also addresses part (1) of #40 in that it should reduce the amount of memory usage that the lookup table takes, although there is potentially more work to be done (see part (2) of #40).
- Loading branch information
1 parent
7109bdc
commit 94c1ff7
Showing
6 changed files
with
609 additions
and
226 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.