[MIPS] Sign-extend subwords when expanding atomic max/min #89246
cc @wzssyqa
You can test this locally with the following command:
git-clang-format --diff 3e64f8a4e74cdcaf5920879c86e7e0a827f6ec13 2e331854112b792feccb4eb2d536c2a27204874a -- llvm/lib/Target/Mips/MipsExpandPseudo.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
index 9bfef2a393..89d8a92dca 100644
--- a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
+++ b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
@@ -499,8 +499,7 @@ bool MipsExpandPseudo::expandAtomicBinOpSubword(
     BuildMI(loopMBB, DL, TII->get(Mips::AND), Incr)
         .addReg(Incr)
         .addReg(Mask);
-    BuildMI(loopMBB, DL, TII->get(Mips::CLZ), Scratch4)
-      .addReg(Mask);
+    BuildMI(loopMBB, DL, TII->get(Mips::CLZ), Scratch4).addReg(Mask);
     BuildMI(loopMBB, DL, TII->get(Mips::SLLV), OldVal)
         .addReg(OldVal)
         .addReg(Scratch4);
@@ -1118,6 +1118,8 @@ define i16 @test_max_16(ptr nocapture %ptr, i16 signext %val) {
; MIPSEL-NEXT: srav $7, $7, $10
; MIPSEL-NEXT: seh $2, $2
; MIPSEL-NEXT: seh $7, $7
; MIPSEL-NEXT: sllv $2, $2, $10
I am sorry, but I don't understand this well.
`seh` already sign-extends. The result of `seh` is good enough for `slt`.
Why do we need to extend them to a 32-bit value?
And `sllv` here may make the result incorrect.
The return value should be a signed int16, while with `sllv` it will be (signed int16) << $10.
Note that $10 here contains the offset of an int16 within a word; it may be 0 or 16.
I guess the reason we do it is that we have only `ll`, and no `llb`/`llh`.
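As a rough illustration of the offset discussed above, the sketch below computes where an aligned halfword sits inside its containing 32-bit word, which is what a word-only `ll` has to work with. The struct and function names are hypothetical and not taken from the LLVM sources, and the big-endian adjustment (XOR of the byte offset with 2) is an assumption about the usual convention rather than something stated in this thread.

```c
#include <stdint.h>

/* Hypothetical helper type: location of a halfword inside its aligned word. */
struct subword_slot {
  uintptr_t word_addr; /* address rounded down to a 4-byte boundary */
  unsigned shift;      /* bit offset of the halfword inside that word */
  uint32_t mask;       /* bits of the word that belong to the halfword */
};

/* Assumes addr is 2-byte aligned, so the byte offset is 0 or 2. */
struct subword_slot halfword_slot(uintptr_t addr, int big_endian) {
  struct subword_slot s;
  unsigned byte_off = (unsigned)(addr & 3u);
  if (big_endian)
    byte_off ^= 2u; /* assumed convention: lower address holds the upper bits */
  s.word_addr = addr & ~(uintptr_t)3;
  s.shift = byte_off * 8u; /* 0 or 16, matching the values mentioned for $10 */
  s.mask = (uint32_t)0xFFFFu << s.shift;
  return s;
}
```

For example, on a little-endian target a halfword at address `0x1002` would get a shift of 16, while the same address on a big-endian target would get a shift of 0 under the assumption above.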
> I am sorry that I don't understand it well.
> `seh` does be sign-extended. The result of `seh` is good enough for `slt`. Why do we need to extend them to 32bit value?
Because `slt` compares signed integers. When comparing subwords, we need to take the sign of the subwords into consideration. When the subword isn't at the MSB spot, its sign bit is not the sign bit of the word, so we get a result we didn't expect.
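A small C sketch of the point about `slt` may help. It emulates `slt` and `seh` on plain integers and compares two halfwords that are only masked and positioned inside a word, without sign extension. All function names here are made up for illustration, and the casts to `int32_t` assume the usual two's-complement behavior.

```c
#include <stdint.h>

/* Emulates MIPS slt: 1 if a < b as signed 32-bit values, else 0. */
static int slt(uint32_t a, uint32_t b) { return (int32_t)a < (int32_t)b; }

/* Emulates MIPS seh: sign-extend the low 16 bits to 32 bits. */
static uint32_t seh(uint32_t x) { return ((x & 0xFFFFu) ^ 0x8000u) - 0x8000u; }

/* Place a halfword at bit offset `shift` (0 or 16), zero elsewhere. */
static uint32_t place(int16_t v, unsigned shift) {
  return (uint32_t)(uint16_t)v << shift;
}

/* Compare the positioned halfwords directly, with no sign extension. */
int naive_less(int16_t a, int16_t b, unsigned shift) {
  return slt(place(a, shift), place(b, shift));
}

/* Move each halfword to the low bits and sign-extend before comparing. */
int extended_less(int16_t a, int16_t b, unsigned shift) {
  return slt(seh(place(a, shift) >> shift), seh(place(b, shift) >> shift));
}
```

With a shift of 0, `naive_less(-1, 1, 0)` sees `0x0000FFFF` as 65535 and reports that -1 is not less than 1, whereas with a shift of 16 the halfword's sign bit coincides with the word's sign bit and the naive comparison happens to work.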
> And `sllv` here may make the result incorrect.
> The return value should be a signed int16, while with `sllv` it will be (sign int16)<<$10.
> Note, $10 here contains the offset of a int16 in the a word, it may be 0 or 16.
Correct, $10 contains the offset. The code after my changes needs the subwords to be placed with the provided offset inside a word. That is what #77072 didn't do: the subword was shifted to the LSB spot and left there, causing unexpected behavior in the subsequent code.
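The sequence described here (extract, sign-extend, then shift back to the original offset) can be sketched in C as below. This is only a model of the data flow under stated assumptions: it is not the LL/SC loop and provides no atomicity, and the function names are hypothetical. It loosely mirrors the `srav`/`seh`/`sllv` pattern visible in the test diff, followed by a masked merge so that the neighboring halfword is left untouched.

```c
#include <stdint.h>

/* Emulates MIPS seh: sign-extend the low 16 bits to 32 bits. */
static uint32_t seh(uint32_t x) { return ((x & 0xFFFFu) ^ 0x8000u) - 0x8000u; }

/* Shift the halfword down, sign-extend it, and shift it back to `shift`. */
static uint32_t extend_in_place(uint32_t word, unsigned shift) {
  return seh(word >> shift) << shift;
}

/* Non-atomic model of a signed 16-bit max applied inside a 32-bit word. */
uint32_t max16_in_word(uint32_t old_word, int16_t val, unsigned shift) {
  uint32_t mask = (uint32_t)0xFFFFu << shift;
  uint32_t old_ext = extend_in_place(old_word, shift);
  uint32_t val_ext = extend_in_place((uint32_t)(uint16_t)val << shift, shift);
  /* Signed compare, as slt would do on the in-place extended values. */
  uint32_t chosen = ((int32_t)old_ext < (int32_t)val_ext) ? val_ext : old_ext;
  /* Merge only the halfword's bits back; other bits keep their old value. */
  return (old_word & ~mask) | (chosen & mask);
}
```

For instance, `max16_in_word(0x1234FFFFu, 1, 0)` treats the low halfword as -1, picks 1, and returns `0x12340001`, leaving the upper halfword `0x1234` unchanged.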
I see. Thanks.
However, there seems to be another problem, introduced by the previous patch (not your current one):
if `ptr` points into something like
struct xx {
  int16 a;
  int16 b;
}
our code will overwrite the other halfword.
Another question: does `atomicrmw` need to support unaligned access?
> Another question, does `atomicrmw` need to support unaligned access?
I don't believe so. According to the documentation, the `alignment` field is always present for in-memory IR, and a default alignment is provided when the alignment field isn't present.
However, I'm unsure how your questions tie in with this PR.
Just for reference: there is
Maybe they are not needed at all. The ABI requires arguments passed in registers to be properly sign-extended.
It seems to make the result wrong.
In order for the following `SLT` instruction to work properly, we need to sign-extend appropriate subwords. In addition, subwords must remain in the same position from before sign-extension. Resolves llvm#61881. Also, downstream bugs rust-lang/rust#100650 and rust-lang/rust#123772 are fixed.
2e33185 to 9d39f61
Maybe this asm code is helpful.
And I find that the current code cannot work with big-endian.
@jdmitrovic-syrmia I figured out a patch. I have tested it on both big-endian and little-endian.
The `slt` instruction requires both parameters to be sign-extended.
Sure. So I sign-extend the
You mean
I mean the caller of
This patch fixes some problems for big-endian.
I tested with this C code.
@jdmitrovic-syrmia can you review #89575?
Rework of the #77072 PR.
In order for the following `SLT` instruction to work properly, we need to sign-extend appropriate subwords. In addition, subwords must remain in the same position from before sign-extension.
Resolves #61881. Also, downstream bugs rust-lang/rust#100650 and rust-lang/rust#123772 are fixed.