Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Debug Information for local variables - prerequisite record definitions #5638

Open
wants to merge 37 commits into
base: master
Choose a base branch
from

Conversation

stooke
Copy link
Contributor

@stooke stooke commented Dec 16, 2022

This PR implements issue #5635 by adding Windows CodeView records support to define local variables.

Method parameter variables (including 'this') are defined as a demonstration. No other local variable information is implemented by this PR as this PR is not aware of the lifetime of variables or stack frame information. Method parameter variables are available to the debugger only upon entry to a method.

There is a body of unused code (symbol record definitions) in this PR - it will be required for local variable definition in the followup PR.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Dec 16, 2022
Copy link
Collaborator

@adinn adinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I cannot comment on the correctness of the PECOFF record layouts or what CodeView expects. However, as far as use of the generic param info is concerned this looks ok.

@stooke stooke force-pushed the pr-windows-varinfo branch from 6e6e24a to 569ab0d Compare January 13, 2023 22:16
@@ -152,14 +191,96 @@ private void buildFunction(CompiledMethodEntry compiledEntry) {
addLineNumberRecords(compiledEntry);
}

void addLocals(CompiledMethodEntry primaryEntry) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stooke I was kind of expecting this code to at least make some use of the debug info location support added in #4505, but that doesn't seem to be the case. Is the functionality exposed via DebugLocalInfo (and friends) insufficient for this use case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pejovica this PR is a prerequisite to adding that support. This PR mostly implements the CodeView records required, and provides a quick but working proof of concept to examine register-based function parameters. A follow up PR will extend the functionality to use DebugLocalInfo, which was not available when this code was written.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pejovica, I have modified this PR to use the actual parameter names instead of synthetic names (no more 'p1, 'p2', etc), which means this PR as is is fairly useful to Windows developers, which can now set a breakpoint at function start and examine parameters in the Visual Studio IDE. I have also merged the latest Graal head. Please have a look when you get the chance.

I continue to work on the follow-up PR, which will take full advantage of @adinn 's PR #4505.

@adinn
Copy link
Collaborator

adinn commented Jul 31, 2023

@pejovica Is there any reason why this PR has still not been integrated?

Simon (@stooke) has code already lined up that will use location info. However, that code is not included here because he was explicitly asked to implement the Windows behaviour in stages.

Can we please make progress on this PR which has been sitting waiting for a final review for months?

@pejovica
Copy link
Member

pejovica commented Aug 9, 2023

@pejovica Is there any reason why this PR has still not been integrated?

Simon (@stooke) has code already lined up that will use location info. However, that code is not included here because he was explicitly asked to implement the Windows behaviour in stages.

Can we please make progress on this PR which has been sitting waiting for a final review for months?

@adinn Sorry about that. I was interrupted by other work and this PR unfortunately fell off the radar. However, it will need a bit more work before it can be integrated.

In particular, I don't think we should hardcode the same calling convention for all methods as is done in this PR because that is clearly incorrect. Instead, I think the logic in CVSymbolSubsectionBuilder.addLocals really needs to be based on the actual calling convention that was used when compiling the method.

I actually expected that the API introduced in #4505 already took care of this, but that doesn't seem to be the case. As far as I can tell, when creating a synthetic location record, ParamLocationProducer also uses the same calling convention for all methods.

So I think the location info API should be fixed to use the actual calling convention (see SubstrateCallingConvention), and this PR should be reworked to use the location info API. That should simplify the logic in CVSymbolSubsectionBuilder.addLocals quite a bit, but more importantly it should ensure that it is correct.

@adinn
Copy link
Collaborator

adinn commented Aug 9, 2023

In particular, I don't think we should hardcode the same calling convention for all methods as is done in this PR because that is clearly incorrect.

Clearly? ... Err .. I mean ... Clearly, yes! of course!

Instead, I think the logic in CVSymbolSubsectionBuilder.addLocals really needs to be based on the actual calling convention that was used when compiling the method.

I was not aware that that different compiled methods had different calling conventions (for a given architecture, that is).

I actually expected that the API introduced in #4505 already took care of this, but that doesn't seem to be the case. As far as I can tell, when creating a synthetic location record, ParamLocationProducer also uses the same calling convention for all methods.

Yes, for Linux ParamLocationProducer currently only takes CPU (AMD64/AArch64) into account when generating synthetic locations for parameters and this was certainly not questioned when #4505 was reviewed. The current implementation assumes that all compiled methods retrieved from the native image code heap will pass arguments using a single calling convention specific to the CPU type. Is that incorrect on Linux? Or is it just an issue on Windows (or MacOS)?

If there is a need for different calling conventions according to the type of the compiled method then can you clarify 1) what OS/CPU combinations it applies for; also 2) what property of the method leads to it being compiled with a given calling convention?

Is this a requirement for native image code or is it, say, only for code generated by Truffle? or some other type of generated code?

So I think the location info API should be fixed to use the actual calling convention (see SubstrateCallingConvention), and this PR should be reworked to use the location info API. That should simplify the logic in CVSymbolSubsectionBuilder.addLocals quite a bit, but more importantly it should ensure that it is correct.

So, it seems you are suggesting that ParamLocationProducer needs to generalized to operate with reference to a SubstrateCallingConvention for all cases, even Linux, and that a PR to that effect is needed as a prerequisite change before the current PR can proceed. Is that correct? If so then I'll really need answers to the above questions.

@stooke
Copy link
Contributor Author

stooke commented Aug 9, 2023

Looking at the code in my repo (which needs a rebase so is slightly our of date), in SubstrateCallingConventionKind.java, there are only three calling conventions - Java (needed), Native (generated by the underlying C/C++ compilers and ForwardReturnValue, which is never used, only checked for. I think it's perfectly valid, if inelegant, for the debuginfo generator to simply check that the calling convention is Java and complain otherwise.

@adinn
Copy link
Collaborator

adinn commented Aug 9, 2023

@pejovica Like Simon I'm also puzzled as to where you think there is a problem here. The Native calling convention is used in a very few places. One use that can be discounted immediately is when generating calls to a foreign function (which clearly has nothing to do with input parameters to compiled methods passed to the debug info generator).

The only place where I found the Native calling convention being used to describe parameters for a compiler-generated method was in class JNIJavaCallTrampolineHolder. The native methods declared by that class get custom translated to trampoline methods -- simple stubs which execute an indirect jump to a target method. However, the methods involved (such as varargsJavaCallTrampoline) are declared as static and with signature ()V so the generator has no parameters to place i.e. the calling convention is irrelevant.

Is there some other case I have missed where a called method has a non-empty call signature that needs to be interpreted using the Native calling convention?

@olpaw are you aware of a problem here?

@pejovica
Copy link
Member

@adinn @stooke There seems to be some misunderstanding about how calling from native works, but literally anything that can be called from native must be compiled with the Native calling convention for the call to succeed. This includes all methods annotated with the @CEntryPoint annotation, but as you've already noticed we also have a bunch of generated wrappers for Java methods that can be called via JNI (that's what JNICallTrampolineMethods are for). So for example, if I were to set a breakpoint on the entry to main, the parameters argc and argv would be located according to the Native calling convention, meaning that PR as is would display the wrong thing. This sounds like a problem to me.

@adinn
Copy link
Collaborator

adinn commented Aug 11, 2023

@pejovica I take your point and agree that ParamLocationProducer needs to be updated to respect use of the Native calling convention by the relevant subset of the generated method base. That is required to ensure both Windows and Linux record the correct local var locations for the those methods in the associated debug info. I will create an issue and raise a PR to fix that.

However, I don't understand why you appear to be treating that as a blocker for this patch. I don't think any of the code here will need to be changed after the above issue is fixed i.e. pushing it now does not imply rework later. I don't believe pushing the patch will risk any serious consequences, such as crashing the runtime or Windows debugger (otherwise I would expect to see the same outcome on Linux). It will mean that Windows debugger users may be misled about input arguments for a small number of methods (as currently happens on Linux). However, in the case of almost all methods the patch will provide valuable information to Windows debugger users (just as it does on Linux) and would be useful to have right now. Not only that but pushing it will also allow Simon to present the follow up patches which shodl significantly improve windows debugging. Is there a reason not to proceed with this PR that I am missing?

@pejovica
Copy link
Member

@pejovica I take your point and agree that ParamLocationProducer needs to be updated to respect use of the Native calling convention by the relevant subset of the generated method base. That is required to ensure both Windows and Linux record the correct local var locations for the those methods in the associated debug info. I will create an issue and raise a PR to fix that.

@adinn Thanks, that would be great.

However, I don't understand why you appear to be treating that as a blocker for this patch. I don't think any of the code here will need to be changed after the above issue is fixed i.e. pushing it now does not imply rework later...

No, I don't think it is a blocker, but I also don't think we need another copy of code dealing with calling conventions. Note that CVSymbolSubsectionBuilder.addLocal effectively duplicates the ParamLocationProducer code, and since it doesn't use DebugLocalValueInfo at all, I don't see how we can avoid reworking this code later. That's why I suggested that this PR should be reworked to use the appropriate API.

@stooke
Copy link
Contributor Author

stooke commented Aug 14, 2023

@pejovica , as discussed before, this code lays the groundwork for the full implementation. The current variable information code in this PR is more a proof of concept that is still incredibly useful to Windows developers.

ParamLocationProducer is only partially useful for Windows as is - the register specifications in CodeView are sensitive to type. Simply saying a short lives in r9 will not work; the correct register is 'r9w' for example. Proper modification of ParamLocationProvider is something that could happen as part of the PR to handle native code calling conventions.

@pejovica
Copy link
Member

@pejovica , as discussed before, this code lays the groundwork for the full implementation. The current variable information code in this PR is more a proof of concept that is still incredibly useful to Windows developers.

Yes, I understand that this is just laying the groundwork for a full implementation and I fully agree that it is very useful, so all I ask is that a sound foundation is laid. Note that before a proper API was introduced, this would have required direct use of our calling convention support, but even then there would be no need to duplicate any of that logic here.

ParamLocationProducer is only partially useful for Windows as is - the register specifications in CodeView are sensitive to type. Simply saying a short lives in r9 will not work; the correct register is 'r9w' for example. Proper modification of ParamLocationProvider is something that could happen as part of the PR to handle native code calling conventions.

This is actually another reason why I think this PR should be based on existing APIs: it will allow us to see if there are any issues that need to be addressed. Because if the existing APIs aren't sufficient to describe method parameters at their entry points, I don't see how they would be sufficient for anything more complicated. However, I don't expect this PR to address such issues, e.g., I'm fine if you leave dealing with the parameter sizes as a to-do task referencing the appropriate issue.

@stooke
Copy link
Contributor Author

stooke commented Aug 25, 2023

@pejovica, I have rebased the repo on Graal head, made some minor improvements, and modified my code to take registers and stack offsets from @adinn's ParamLocationProducer code, thus eliminating duplication. When his PR #7256 is integrated, this code will reap the benefits.

@stooke
Copy link
Contributor Author

stooke commented Nov 9, 2023

@pejovica, hello! I have rebased the repo on Graal head including the integration of @adinn 's PR #7256. When you have some time, could you please take another look at this PR?

@maxandersen
Copy link

@pejovica @thomaswue any update here?

@pejovica
Copy link
Member

@maxandersen No, I've been busy with support for Windows x64 unwind info, so I haven't had much chance to work on this. I plan to come back to this once I'm done with that feature, which is almost finished, so it should be soon.

@stooke
Copy link
Contributor Author

stooke commented Nov 8, 2024

@pejovica , I have rebased the repo and tested it with the latest Graal native. There was a regression as Graal can now use more registers in x64 mode; this has been fixed.

I have also update the copyright dates on the changed files.

@pejovica
Copy link
Member

@stooke Thanks for the rebase and the heads-up! I'll take another look now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement. redhat-interest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants