Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dwarf: Fix incorrect cfi execution #54

Merged
merged 1 commit into from
Nov 1, 2017
Merged

Conversation

yuyichao
Copy link
Contributor

Ref my mailing list post yesterday http://lists.nongnu.org/archive/html/libunwind-devel/2017-10/msg00005.html. I have since figured out the reason for the signal resume test failure (#51) and I think I also understand what the comment means by "For execution resume". It seems to basically mean signal frame and since that's how this is referred to in the code elsewhere, I changed the comment a little to hopefully make it more clear. Therefore, I'm not fairly confident that this is the right thing to do and shouldn't cause any new test failures afaict.

More detail description of the change from the commit message follows


During unwinding/resuming execution of a normal call frame,
it is not only necessary to use the previous instruction to lookup the unwind info
but also when executing the cfi program. Although the call usually don't modify
any unwinding state, it can happen for noreturn call or when the callee cleanup the stack.
In these cases, the next instruction after the call may have a cfi adjusting the state
(e.g. stack pointer) and such instruction should be executed.

3d9a694 worked around this issue by treating cfi_restore_state
specially. It works when the compiler use that instruction to restore the state, i.e.

    .cfi_remember_state
    je .L0
    push ...
    .cfi_def_cfi_offset <new_value>
    call noreturn
.L0
    .cfi_restore_state

which is what GCC ususally does. However, it is not necessarily the case and clang/LLVM doesn't
do that. Instead LLVM emits the following unwind info which is also perfectly valid but is not
handled by the special case.

    je .L0
    push ...
    .cfi_def_cfi_offset <new_value>
    call noreturn
.L0
    .cfi_def_cfi_offset <old_value>

e9e8ed7 also worked around this issue for another special case.

This patch fix this issue for all cfi types by adjusting the end_ip based on the type of the
current frame instead, similar to what's done in fetch_proc_info.
Since this requires using the same use_prev_instr value after fetch_proc_info returns,
the patch also remove the need_unwind_info parameter to the function and move the code updating
use_prev_instr after all use of the old value are done.

During unwinding/resuming execution of a normal call frame,
it is not only necessary to use the previous instruction to lookup the unwind info
but also when executing the cfi program. Although the call usually don't modify
any unwinding state, it can happen for noreturn call or when the callee cleanup the stack.
In these cases, the next instruction after the call may have a cfi adjusting the state
(e.g. stack pointer) and such instruction should be executed.

3d9a694 worked around this issue by treating `cfi_restore_state`
specially. It works when the compiler use that instruction to restore the state, i.e.

```
    .cfi_remember_state
    je .L0
    push ...
    .cfi_def_cfi_offset <new_value>
    call noreturn
.L0
    .cfi_restore_state
```

which is what GCC ususally does. However, it is not necessarily the case and clang/LLVM doesn't
do that. Instead LLVM emits the following unwind info which is also perfectly valid but is not
handled by the special case.

```
    je .L0
    push ...
    .cfi_def_cfi_offset <new_value>
    call noreturn
.L0
    .cfi_def_cfi_offset <old_value>
```

e9e8ed7 also worked around this issue for another special case.

This patch fix this issue for all cfi types by adjusting the `end_ip` based on the type of the
current frame instead, similar to what's done in `fetch_proc_info`.
Since this requires using the same `use_prev_instr` value after `fetch_proc_info` returns,
the patch also remove the `need_unwind_info` parameter to the function and move the code updating
`use_prev_instr` after all use of the old value are done.
yuyichao added a commit to JuliaLang/julia that referenced this pull request Oct 28, 2017
The issue is the most significant with FPO enabled and on x86 but can in principle happen
in other cases too.

Upstream PR libunwind/libunwind#54
@djwatson
Copy link
Member

@unkadoug Want to take a look? It looks like this will fix reg_states_iterate also.

Copy link
Contributor

@unkadoug unkadoug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change solves the problem I was having after change 3d9a694.

I don't understand the need for the variable next_use_prev_instr in find_reg_state and in dwarf _reg_states_iterate. It seems that replacing it with c->use_prev_instr would break nothing. Can you help me understand the need for this variable?

@yuyichao
Copy link
Contributor Author

It seems that replacing it with c->use_prev_instr would break nothing.

Doing that will cause the unwinding to use a different version of the value than the one used for looking up unwind info.

@unkadoug
Copy link
Contributor

In find_reg_state, you use next_use_prev_instr to save a vaiue before calling create_state_record_for, which calls parse_fde, which depends on c->use_prev_instr. So you're right in that case. However, I don't believe that create_state_record_for modifies dci->signal frame, so you could just rewrite two lines in find_reg_state as:

ret = create_state_record_for (c, sr, ci->ip);
c->use_prev_instr = ! dci->signal_frame

and eliminate the need for next_use_prev_instr there.

In dwarf_reg_states_iterate, the case for next_use_prev_instr is weaker, since dwarf_reg_states_table_iterate does not depend on c->use_prev_instr. But, suppose you decide that reg_states_table_iterate should depend on c->use_prev_instr (should it?). Then the problem can be addressed the same way, by examining ! dci->signal_frame after invoking the iterate function instead of before, and again making next_use_prev_instr unnecessary.

@yuyichao
Copy link
Contributor Author

I did not check each user of the value and I'm just doing the same thing for all users to make it clear that the update should be done after all users are done.
If there's any functions in there that will not use the previous value now and in the future, that can be documented in the code and have the code changed for that. I certainly don't have that knowledge so I'm just going with the change that's the easiest to reason about...

@unkadoug
Copy link
Contributor

I'll just restate that you don't need that variable, and I think I've explained why. Dave Watson can decide whether or not that matters to him, but I'd want to get rid of it. Other than that, the change looks okay to me, and thank you for offering it.

@djwatson
Copy link
Member

djwatson commented Nov 1, 2017

Yea I noticed the extraneous variable also, but we can clean that up later. This is a great fix. Merging, thanks

@djwatson djwatson merged commit f248ac0 into libunwind:master Nov 1, 2017
@yuyichao yuyichao deleted the cfi_bound branch November 1, 2017 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants