-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup BitArray code #38780
Cleanup BitArray code #38780
Conversation
Are the libraries test pipelines w/ ARM machines still run manually? It would be great if someone can trigger the pipeline manually if those aren't run for PR evaluations (I remember them not running by default some time ago). |
Tagging subscribers to this area: @eiriktsarpalis |
I thought they were automatically triggered if anything under |
We had to remove them from PRs because of the reduced hardware availability we have, however we do run them against a checked coreclr when coreclr is touched, so if you don't want to go through the manual trigger dance, you can push a dummy commit that changes any file under coreclr. |
Thanks @safern for confirming. I have triggered a run manually. |
Seeing some failing System.Net tests; those seem unrelated? the arm64 tests didn't seem to run as a result of those failures. |
I just triggered another run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
25bba24
to
adc6c87
Compare
@Gnbrkm41, this looks good. Could we have a disassembly and/or perf numbers hsowing the improvement, however? |
Some numbers:
Ran on my RPi 3b+; Not really seeing any huge problems standing out... |
437cbb0
to
017fe6d
Compare
Finally got around to get the assembly output. Not exactly sure why I'm seeing |
Thanks for getting the diffs. I don't see that you have added |
I created a regular console app that calls into
Perhaps it's an inlining thing? I can obtain the JitDump for both builds if it helps. |
Sure, can you attach Jitdump them as well? Want to see where it is coming from. |
Had to compress because the raw dumps were totalling 22MB. |
I don't think they're only present in one of them? I find exactly three occurences of |
Interesting. I don't see "Debug.Fail()" anywhere in the asms that you shared. I am guessing the sequence is for movz x0, #0xd1ffab1e
movk x0, #0xd1ffab1e LSL #16
movk x0, #0xd1ffab1e LSL #32
mov w1, #7
mov v9.d[0], v8.d[1]
bl CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
movz x0, #0xd1ffab1e
movk x0, #0xd1ffab1e LSL #16
movk x0, #0xd1ffab1e LSL #32
ldr x1, [x0]
@tannergooding should double check before we merge it. |
private static readonly Vector128<byte> s_upperShuffleMask_CopyToBoolArray = Vector128.Create(0x02020202_02020202, 0x03030303_03030303).AsByte(); | ||
private static readonly Vector128<byte> s_lowerShuffleMask_CopyToBoolArray = | ||
BitConverter.IsLittleEndian ? Vector128.Create(0, 0x01010101_01010101).AsByte() : Vector128.Create(0x03030303_03030303, 0x02020202_02020202).AsByte(); | ||
private static readonly Vector128<byte> s_upperShuffleMask_CopyToBoolArray = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that we don't need these anymore - if I remember correctly those PRs make the JIT recognise those Create calls and store the data in memory during codegen, so the code can load the data from memory - so I expect it to result in about the same, or perhaps better code. It's been a while since I've rebased the branch so I could make the change while I'm rebasing and see how it turns out.
This PR has been open for a while, is there any remaining work needed to be done to get it merged? |
I think we want to verify that this doesn't introduce any regressions and as @stephentoub mentioned we could perhaps also clean the code up a bit, following the recent change in the JIT that allows us to get rid of I'll rebase the branch first and see how the disasm / benchmark turns out, as I've been seeing some oddities around the disassembled code (where we were seeing calls to |
* This commit addresses some of the comments raised during the initial PR of the vectorised BitArray code, such as using new API/intrinsics. Also highlights a couple potential improvements.
017fe6d
to
6a6a650
Compare
Unfortunately, I don't expect to be able to work on the PR for some time. If any of you folks have concerns with this being open for several months, please do feel free to close the PR - I could eventually work on this again when I'm able to. |
Thank you @Gnbrkm41 for confirming. Feel free to reopen when you get time. |
This PR addresses some of the comments raised during the initial PR of the vectorised BitArray code, such as using new API/intrinsics. Also highlights a couple potential improvements.
cc @kunalspathak @tannergooding @echesakovMSFT (ARM intrinsics related people I can remember from the top of my head)