Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate instruction prefixes when disassembling #43

Merged
merged 1 commit into from
Jul 1, 2022

Conversation

RyanGlScott
Copy link
Contributor

@RyanGlScott RyanGlScott commented Jun 29, 2022

Previously, the opcode lookup table would encode every possible permutation of allowable prefixes for each instruction as a separate path. This is expensive in both space and time, as observed in #40. The new approach taken in this patch, as described in Note [x86_64 disassembly] in Flexdis86.Disassembler, is to only encode the VEX prefixe and opcode bytes in the lookup table, leaving out all other forms of prefix bytes entirely. Instead, disassembly will start by eagerly parsing as many prefix bytes as possible, proceeding to parse opcode bytes after the first non-prefix byte is encountered. After identifying the possible instructions from the opcode, we will then narrow down exactly which instruction it is by validating them against the set of parsed prefixes.

As noted in Note [x86_64 disassembly], we had to add some special cases for nop-like instructions—namely, endbr32, endbr64, pause, and xchg—to avoid some prefix byte–related ambiguity. The new handling for xchg is more accurate than it was before, so this patch fixes #42 as a side effect. This patch also addresses part (1) of #40 in that it should reduce the amount of memory that the lookup table uses, although there is potentially more work to be done (see part (2) of #40).

@RyanGlScott RyanGlScott requested a review from travitch June 29, 2022 19:10
@RyanGlScott
Copy link
Contributor Author

Still to come is a comparison of profiling reports for large SAW proofs before and after this patch to compare the memory usage. In the meantime, this patch is in a suitable state for review.

src/Flexdis86/Disassembler.hs Outdated Show resolved Hide resolved
src/Flexdis86/Disassembler.hs Show resolved Hide resolved
src/Flexdis86/Disassembler.hs Outdated Show resolved Hide resolved
src/Flexdis86/Disassembler.hs Outdated Show resolved Hide resolved
src/Flexdis86/Disassembler.hs Outdated Show resolved Hide resolved
src/Flexdis86/Disassembler.hs Outdated Show resolved Hide resolved
src/Flexdis86/Disassembler.hs Outdated Show resolved Hide resolved
Previously, the opcode lookup table would encode every possible permutation of
allowable prefixes for each instruction as a separate path. This is expensive
in both space and time, as observed in #40. The new approach taken in this
patch, as described in `Note [x86_64 disassembly]` in `Flexdis86.Disassembler`,
is to only encode the VEX prefixe and opcode bytes in the lookup table, leaving
out all other forms of prefix bytes entirely. Instead, disassembly will start
by eagerly parsing as many prefix bytes as possible, proceeding to parse opcode
bytes after the first non-prefix byte is encountered. After identifying the
possible instructions from the opcode, we will then narrow down exactly which
instruction it is by validating them against the set of parsed prefixes.

As noted in `Note [x86_64 disassembly]`, we had to add some special cases for
`nop`-like instructions—namely, `endbr32`, endbr64`, `pause`, and `xchg`—to
avoid some prefix byte–related ambiguity. The new handling for `xchg` is
more accurate than it was before, so this patch fixes #42 as a side effect.
This patch also addresses part (1) of #40 in that it should reduce the amount
of memory usage that the lookup table takes, although there is potentially more
work to be done (see part (2) of #40).
@RyanGlScott
Copy link
Contributor Author

Here is a heap profiling report for the same SAW proof as in #40 (comment), but with the patch in #43 applied:

flexdis86-patch.pdf

mkX64Disassembler doesn't even show up in this report's top offenders, which is exciting. I think this is as good of a sign as any that this patch does in fact reduce the memory usage in practice.

@RyanGlScott RyanGlScott merged commit 7cb5fc6 into main Jul 1, 2022
@RyanGlScott RyanGlScott deleted the T40-part-one branch July 1, 2022 18:18
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Jul 1, 2022
This patch brings in the changes from GaloisInc/flexdis86#43, which redesigns
the x86_64 disassembler in `flexdis86` to use a substantially smaller lookup
table, thereby saving a fair bit of resident memory when doing x86-related
proofs.
robdockins pushed a commit to GaloisInc/saw-script that referenced this pull request Jul 1, 2022
This patch brings in the changes from GaloisInc/flexdis86#43, which redesigns
the x86_64 disassembler in `flexdis86` to use a substantially smaller lookup
table, thereby saving a fair bit of resident memory when doing x86-related
proofs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

flexdis86 gives incorrect size to xchg ax,ax's operands
2 participants