-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check HFA alignment for valuetype fields #106099
base: main
Are you sure you want to change the base?
Conversation
No pipelines are associated with this pull request. |
/azp list |
/azp run runtime-coreclr crossgen2 outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
@steveisok Who is working on fixing the crossgen2 outerloop? It appears that determinism in the crossgen2 compiler is broken for x64 to x86 builds, and there is a fundamental problem with composite R2R causing lock order violations? I don't see bugs filed here. @kg I'd like to see a cleaner crossgen2 outerloop pass than what you're seeing here. This lock order problem is NOT related to what you are doing, but if you could modify |
cc @dotnet/jit-contrib if anyone can assist here |
I spent a while figuring out how to run the ARM32 JIT tests and was able to confirm that they don't pass, but the necessary steps to iterate on it are pretty time consuming (>1 hr per iteration), and I don't really know enough about ARM or the JIT to make quick progress here. So it would be awesome if someone could help by digging into the ARM issue, even if they don't have time to fix it. I mentioned this in an email but essentially, it looks like on ARM32 specifically, the HFA argument is passed incorrectly. The HFA return value works correctly, and on ARM64 both HFA arguments and return values appear to work. |
Also re being productive on arm32, I have had pretty decent luck cross-building from an x64 linux host (via docker) and then mounting the repo + artifacts via SSHFS on an arm64 host (RPI4 in my case) and then running arm32 under docker there ... then if you need to rebuild anything then there's no need to copy files around. And I remote to the boxes via VS code so can direct it all from the same machine. The only real pain is that debugging requires a specially built lldb that @janvorli created and getting the right flavor of the lldb SOS plugin for that lldb takes a bit of care. |
Added some instrumentation into morph.cpp locally to examine the arg ABI info for the call into native code: printf(
"%s.%s arg[%u] numRegs=%u byteOffset=%u byteSize=%u argType=%s passedByRef=%u hfaElemKind=%u IsSplitAcrossRegistersAndStack=%u\n",
comp->info.compClassName, comp->info.compMethodName,
GetIndex(&arg),
arg.AbiInfo.NumRegs,
arg.AbiInfo.ByteOffset,
arg.AbiInfo.ByteSize,
varTypeName(arg.AbiInfo.ArgType),
(unsigned)arg.AbiInfo.PassedByRef,
(unsigned)arg.AbiInfo.GetHfaElemKind(),
abiInfo.IsSplitAcrossRegistersAndStack()
);
Based on what godbolt arm32 clang generates for the target function, the numRegs looks suspect. the clang code appears to expect the struct to be passed in |
Dumped the segments for the argument and I'm not sure how to interpret them:
The first argument being an int in register 0 makes sense if it's the return address in r0. but then it's weird that the 2nd argument has two int segments in r2/r3 and then a stack segment. why isn't r1 being used? I would expect to see four segments here if I understand this system right, first 3 for r1/r2/r3 and then a stack segment. i'm not sure what happened to r1 here. |
I feel like it's probably here, but I'm not sure how to confirm it or what the fix would be: runtime/src/coreclr/jit/targetarm.cpp Lines 80 to 85 in c4cc794
|
This issue was a bug in the C side of the test. According to the MS version of the arm32 ABI:
In order to approximate the StructLayout with explicit offsets, I had done This does raise the question of whether the natural alignment of the C# struct should still be 8 bytes even though it has a field with a weird offset. I don't know how to answer the question, but it seems like what we do right now is consistent across targets and can be reproduced in C. The question is what the real-world scenario(s) for this sort of struct look like, I think. |
/azp run runtime-coreclr crossgen2 outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Looks like the windows x86 crossgen2 comparison is failing:
|
The R2R composite lanes appear to be failing with the same error across linux/osx on both arm64 and x64, so I don't think my HFA changes can be responsible:
|
Another category of failure on the R2R lanes that looks unrelated:
|
/azp run runtime-coreclr crossgen2 outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Bitcast might be fixed (it looks like it had a casing issue?) but there's another similar one where the assembly name seem case-sensitively correct, so this regression is still mysterious:
|
/azp run runtime-coreclr crossgen2 outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Feel free to bounce this to someone else Egor, and I just need review on the HFA bits. The rest of it will be backed out once the crossgen lanes are mostly green (I'm working on that.) |
/azp run runtime-coreclr crossgen2 outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
To my eye crossgen r2r has the same set of failures on x64 and arm64, so it looks like this is good there. |
@EgorBo, please prioritize to review this PR. |
Should fix #80393
I've never really touched CoreCLR so I'm not certain this is right.