cmd/asm: doesn't support PCALIGN on i386/amd64 #56474
cc @golang/compiler
I believe that opcode is only implemented on arm64 and ppc64 so far. It shouldn't be hard to do - we already pad on x86 for various alignment restrictions.
I am not able to help here, as my proficiency in the Go internals is insufficient. I would be enormously grateful though if someone with the required skills decided to contribute the necessary patch.
I have looked into it and I was able to actually add the support into
I'm just wondering: would this change be acceptable, and should I try to submit a PR for this feature? How should it be tested? I personally have no intention of modifying anything in the Go language itself. I just think it would be nice to be able to align things at the asm level, since this kind of control is beneficial both for hot loops and for data used by AVX2/AVX512. I think aligning everything makes no sense; only align what you want aligned or what profiling shows to improve. The problem I have at the moment is that if a .s file implements a bunch of functions at the asm level, a change in one place can regress something completely unrelated, because some labels shift their offsets and end up in different cache lines, etc. In our software the difference can be up to 1 GB/s of throughput on a 16-core machine that normally processes around 37.5 GB/s of data.
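To illustrate the cache-line effect described above, here is a minimal Go sketch that counts how many cache lines a block of code touches given its start offset. The function name linesTouched and the 64-byte line size are assumptions for illustration (64 bytes is typical on current x86 parts, but nothing in this thread states it).

```go
package main

import "fmt"

// linesTouched reports how many cache lines of lineSize bytes are covered
// by a block of size bytes starting at offset start.
// Hypothetical helper for illustration; not part of the Go toolchain.
func linesTouched(start, size, lineSize int64) int64 {
	if size <= 0 {
		return 0
	}
	first := start / lineSize
	last := (start + size - 1) / lineSize
	return last - first + 1
}

func main() {
	// The same 10-byte loop body, before and after an unrelated edit
	// earlier in the file shifts its start offset by 4 bytes.
	fmt.Println(linesTouched(60, 10, 64))
	fmt.Println(linesTouched(64, 10, 64))
}
```

A small shift in the start offset is enough to make the same loop body straddle a line boundary, which is why an alignment directive at the label gives more stable results than relying on the layout of the surrounding code.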
Seems reasonable to me. Testing is tricky. Normally we test non-semantics-changing things like this in test/codegen, but that doesn't have any assembly tests currently. Maybe something in cmd/asm/internal/asm? It would be worth at least having a test that runs code that needs padding, just to make sure the nops are correct. Note that if you want alignments more than 32 bytes, just aligning within a function isn't enough. You'll need to increase the alignment of function starts, which will be trickier.
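The padding an assembler has to emit for an alignment directive can be sketched in Go as follows. This is only the arithmetic, under the assumption that the alignment is a power of two; the name padding is hypothetical and this is not the code in cmd/internal/obj.

```go
package main

import "fmt"

// padding returns the number of filler bytes (nops, for code) needed so
// that pc becomes a multiple of align. align must be a power of two.
// Hypothetical helper for illustration only.
func padding(pc, align int64) int64 {
	return -pc & (align - 1)
}

func main() {
	for _, pc := range []int64{0, 1, 13, 32} {
		pad := padding(pc, 16)
		fmt.Printf("pc=%d pad=%d aligned=%d\n", pc, pad, pc+pad)
	}
}
```

A test along the lines suggested above would check that pc+padding is always a multiple of the requested alignment, and that the emitted filler bytes execute as no-ops when control falls through them.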
Can the alignment of a function start be increased automatically when a |
Code alignment up to 32 bytes would already be great progress, but data alignment to a 64-byte boundary is critical to AVX512 code. The lack of it essentially prohibits the use of |
You're welcome to send a patch. Thanks! If you have not done so already, please see the contribution guide for how to contribute: https://go.dev/doc/contribute For testing, there are similar tests for other architectures. For alignment larger than the default function alignment, I think you can increase the function alignment like https://cs.opensource.google/go/go/+/master:src/cmd/internal/obj/arm64/asm7.go;l=1080
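The idea of raising the function's start alignment when a larger alignment is requested inside it can be sketched in Go like this. The type funcInfo and the function notePCAlign are hypothetical stand-ins, not the actual types or code in cmd/internal/obj; the sketch only shows why the function alignment has to grow. An offset that is 64-byte aligned relative to the function start is 64-byte aligned in memory only if the function start itself is.

```go
package main

import "fmt"

// funcInfo is a hypothetical stand-in for the per-function state an
// assembler keeps; Align is the required alignment of the function start.
type funcInfo struct {
	Align int64
}

// notePCAlign records an alignment request seen inside the function body.
// It rejects values that are not powers of two and raises the function's
// start alignment when the request exceeds it.
func notePCAlign(f *funcInfo, align int64) error {
	if align <= 0 || align&(align-1) != 0 {
		return fmt.Errorf("alignment %d is not a power of two", align)
	}
	if align > f.Align {
		f.Align = align
	}
	return nil
}

func main() {
	f := &funcInfo{Align: 32} // assumed default, for illustration
	if err := notePCAlign(f, 64); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("function start alignment:", f.Align)
}
```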
The PCALIGN asm directive was not supported on i386/amd64, causing a compile-time error when used. The same directive is currently supported on arm64 and ppc64 architectures. Fixes: golang#56474
Change https://go.dev/cl/511662 mentions this issue: |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What did you do?
Tried to compile the following assembly function:
#include "textflag.h"

TEXT test(SB), NOFRAME|NOSPLIT, $0-0
	PCALIGN $2
	RET
What did you expect to see?
I expected the function to compile correctly with the RET aligned to a 2-byte boundary.
What did you see instead?
Per my Google research, it appears that the PCALIGN directive is recognized by the assembler but is not implemented for the x86/x64 targets. Could you please implement this feature, as it is essential to high-performance assembly code? It should work with both TEXT (function/label alignment) and DATA (e.g. AVX512 64-byte constants) sections.