-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#4687: conservatively handle aflags for unrecognized (OP_xx) instructions on AArch64 #4688
Conversation
Conservatively restore aflags before unrecognized instructions in case they read/write the flags. Fixes: #4687
Just a quick thought, can unrecognized instructions also affect badly the restoration of gprs, and not just aflags? For example, failing to restore a register before the execution of an unrecognised instruction that uses that same register as a source? |
Yes, and for some uses cases missing SIMD usage would cause problems (such as clean call optimizations: though maybe today a64 does not perform those? as well as in other types of tools). One mitigating factor may be that the decoder of OP_xx looks at the typical register operand slots and fills in what GPR's it predicts the instruction might use? I'm basing that on #2626 (comment) from @egrimley -- I have not checked whether these OP_xx insructions actually have any operands? Do they? |
Ideally the fix here would be to add the missing instructions to the decoder (#2626). Perhaps we should try to understand whether there is an ETA on that in order to figure out how to proceed. We would like to avoid having to spend time tracking down any more mysterious bugs that turn out to be #2626 again. I believe @AssadHashmi planned to work on #4393 and maybe the plan was to wait for that before adding missing opcodes? |
ext/drreg/drreg.c
Outdated
* instruction, OP_xx. For these instructions conservatively restore aflags in | ||
* case they read/write the aflags. | ||
*/ | ||
IF_AARCH64(instr_get_opcode(inst) == OP_xx ||) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that all of the missing opcodes are vector instructions, I would expect a very small fraction to read or write flags (or GPR's for that matter). I wonder if adding the couple of opcodes that affect flags and GPR's would not be much work. Though it's certainly more work than this workaround. Let's wait for advice from those more familiar with the a64 decoder: @AssadHashmi?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that all of the missing opcodes are vector instructions, I would expect a very small fraction to read or write flags (or GPR's for that matter). I wonder if adding the couple of opcodes that affect flags and GPR's would not be much work. Though it's certainly more work than this workaround. Let's wait for advice from those more familiar with the a64 decoder: @AssadHashmi?
The resourcing planned to address AArch64 decoder gaps has been diverted elsewhere at the present time. Reluctantly, I'd advise carrying on with the workaround. This is far from ideal for AArch64 support but we currently have to work within limited constraints. Hopefully the situation will change later in the year.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about the implications here: workarounds will have to added to multiple places and not just drreg. The drcachesim tracer does GPR analysis to elide same-base addresses. The memval tool looks at register usage. Dr. Memory, being ported now, looks at everything. (The core does flags analysis for fragment prefixes and stay-on-trace comparisons -- though I believe these have no problem today due to flags-less IBL and no trace support yet for AArch64.)
@AssadHashmi is it the case that OP_xx insructions have GPR operands in place as hinted at here #2626 (comment)? Does that reliably cover all GPR usage in such instructions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AssadHashmi is it the case that
OP_xx
instructions have GPR operands in place as hinted at here #2626 (comment)?
@derekbruening yes I believe that is the case. @egrimley can confirm.
Does that reliably cover all GPR usage in such instructions?
The majority, if not all, OP_xx
instructions are SIMD. Of those, GPR usage is restricted to an index register. Based on that you should be reliably covered. There may be cases of non-SIMD OP_xx
but even then you should be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to summarize, the recommendation is:
- Rely on OP_xx decoding marking all possible GPR register usage as both reads and writes, which forces restores, prevents assuming a GPR is dead, voids drcachesim elision, etc. This eliminates correctness concerns, but at a performance cost.
- Change all library and client code that looks at aflags to treat OP_xx as reading and writing them.
I would then propose: let's tweak the decoder to make the 2nd point automatic: to add aflags as read and written to every OP_xx. That might be a very simple change that would then remove the need to find and change any other code and put all the workarounds in a central place. Then we're covered for correctness for flags and GPR's and only SIMD registers have correctness issues; flags and GPR's only have performance issues.
ext/drreg/drreg.c
Outdated
* case they read/write the aflags. | ||
*/ | ||
IF_AARCH64(instr_get_opcode(inst) == OP_xx ||) | ||
TESTANY(EFLAGS_READ_ARITH, instr_get_eflags(inst, DR_QUERY_DEFAULT)) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only handling reading flags: there are also points where drreg needs to update a spilled aflags value after an app instruction writes to the flags. That point would need a workaround as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that this workaround handles writing flags as well. My understanding is that even if an unrecognized application instruction (xx
) writes to the flags, this workaround will force the flags to be restored before xx
, effectively preventing insertion of a restore after xx
that overwrites the write to the flags by xx
. If there is another save/restore pair by drreg after the 'xx' inst, it will not change the application view of the flags. Are you thinking of something else or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restoring is not unreserving. drreg will assume that the app's flags value is still stored in TLS after an app read but not write of the flags, which will be incorrect after xx
writes to the flags and changes that value. drreg will proceed to use the wrong value when restoring on future reads or unreserves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
As per @derekbruening suggestion I added aflags as read and written to every OP_xx at the point of decoding. This addresses the issue in #4687 and should address any other undiscovered issue regarding usage of aflags by unrecognized instructions. |
Conservatively restore aflags before unrecognized instructions in case they read/write the flags.
Fixes: #4687