Workaround for weird [DW_OP_deref, DW_OP_stack_value] sequences #738
I've seen this situation a few times (clang-18, x64, Linux): a variable's type is a 24-byte struct (std::vector), but its location expression ends with `DW_OP_deref, DW_OP_stack_value`. I.e. the expression tells us that the value of a 24-byte struct is the result of reading 8 bytes (where 8 is the address size) from memory; that's indeed what gimli does. I couldn't figure out what LLVM intended when emitting such an expression.

This PR adds a workaround for this: if the expression ends with `[DW_OP_deref, DW_OP_stack_value]`, pretend those two instructions are not there. I.e. assume the whole value is available at the address that would have been dereferenced.

(If this workaround doesn't belong in gimli, that's ok, I can just do the same in my code instead. Merge this only if it seems useful for other users.)
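
To illustrate the idea, here is a minimal standalone sketch of the workaround. The `Op` enum and the `strip_trailing_deref_stack_value` function are hypothetical names made up for this example; they are not gimli's types or API, and the actual change in this PR may look different.

```rust
// Hypothetical, simplified model of a few DWARF operations.
// These are NOT gimli's types; they exist only for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Breg { register: u16, offset: i64 },
    DerefSize(u8),
    Deref,
    StackValue,
}

/// If the expression ends with [Deref, StackValue], return it without those
/// two operations; otherwise return it unchanged.
fn strip_trailing_deref_stack_value(ops: &[Op]) -> &[Op] {
    match ops {
        [rest @ .., Op::Deref, Op::StackValue] => rest,
        _ => ops,
    }
}

fn main() {
    // DW_OP_breg6 RBP-248, DW_OP_deref_size 0x8, DW_OP_deref, DW_OP_stack_value
    let expr = vec![
        Op::Breg { register: 6, offset: -248 },
        Op::DerefSize(8),
        Op::Deref,
        Op::StackValue,
    ];
    let fixed = strip_trailing_deref_stack_value(&expr);
    println!("{:?}", fixed);
}
```

With the trailing pair removed, the value left on top of the stack is treated as the address of the variable (a memory location) rather than as the variable's value, so a consumer can read all 24 bytes from that address.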
Appendix: example of clang output with this problem.
Here's debug info about a variable (produced by clang-18, x64, Linux):
So the type is a 24-byte struct.
The first location is `DW_OP_breg4 RSI+0`, which makes sense as this variable is the first argument of the function. `DW_OP_breg4` pushes the value of rsi onto the DWARF stack; then, by convention, the final value at the top of the stack is the address of the variable, i.e. `&nodes`.

The second location starts at pc 0x0000000011db5c1c. The instruction just before that is:
So, the address of the struct is written to the stack at `[rbp-0F8h]` (0F8h = 248), and then the location in DWARF changes. Makes sense.

But the new location `DW_OP_breg6 RBP-248, DW_OP_deref_size 0x8, DW_OP_deref, DW_OP_stack_value` seems to say:

- `DW_OP_breg6 RBP-248` - push `RBP-248` (aka `rbp-0F8h`) to the DWARF stack. We know that `[rbp-0F8h]` is the address of the struct, so `rbp-0F8h` is the address of the address, `&&nodes`. Makes sense.
- `DW_OP_deref_size 0x8` - dereference it, placing `[rbp-0F8h]` at the top of the DWARF stack. That's `&nodes`. Makes sense.