Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Mips] In LowerShift*Parts, xor with bits-1 instead of -1. #71149

Merged
merged 2 commits into from
Nov 3, 2023

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Nov 3, 2023

If we start with an i128 shift, the initial shift amount would usually have zeros in bit 8 and above. xoring the shift amount with -1 will set those upper bits to 1. If DAGCombiner is able to prove those bits are now 1, then the shift that uses the xor will be replaced with undef. Which we don't want.

Reduce the xor constant to VT.bits-1 where VT is half the size of the larger shift type. This avoids toggling the upper bits. The hardware shift instruction only uses the lower bits of the shift amount. I assume the code used NOT because the hardware doesn't use the upper bits, but that isn't compatible with the LLVM poison semantics.

Fixes #71142.

I don't know who can review Mips anymore so I'm just adding mostly people who know about SelectionDAG and/or lowering shift parts.

If we start with an i128 shift, the initial shift amount would
usually have zeros in bit 8 and above. xoring the shift amount with
-1 will set those upper bits to 1. If DAGCombiner is able to prove those
bits are now 1, then the shift that uses the xor will be replaced
with undef. Which we don't want.

Reduce the xor constant to VT.bits-1 where VT is half the size of
the larger shift type. This avoids toggling the upper bits. The
hardware shift instruction only uses the lower bits of the shift
amount. I assume the code used NOT because the hardware doesn't
use the upper bits, but that isn't compatible with the LLVM poison
semantics.

Fixes llvm#71142.
@RKSimon
Copy link
Collaborator

RKSimon commented Nov 3, 2023

FTR #71154 does something similar but this looks more complete

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (with a couple of minors)

@@ -2593,12 +2593,13 @@ SDValue MipsTargetLowering::lowerShiftLeftParts(SDValue Op,
SDValue Shamt = Op.getOperand(2);
// if shamt < (VT.bits):
// lo = (shl lo, shamt)
// hi = (or (shl hi, shamt) (srl (srl lo, 1), ~shamt))
// hi = (or (shl hi, shamt) (srl (srl lo, 1), (VT.bits-1) ^ shamt))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistent style: (xor (VT.bits-1), shamt)

@@ -2623,7 +2624,7 @@ SDValue MipsTargetLowering::lowerShiftRightParts(SDValue Op, SelectionDAG &DAG,
MVT VT = Subtarget.isGP64bit() ? MVT::i64 : MVT::i32;

// if shamt < (VT.bits):
// lo = (or (shl (shl hi, 1), ~shamt) (srl lo, shamt))
// lo = (or (shl (shl hi, 1), (VT.bits-1) ^ shamt) (srl lo, shamt))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistent style: (xor (VT.bits-1), shamt)

@topperc
Copy link
Collaborator Author

topperc commented Nov 3, 2023

There's another bug in this code that neither patch addresses.

I believe this violates the boolean rules. ISD::SELECT should only receive 0/1 as a condition. There needs to be a setcc here.

  SDValue Cond = DAG.getNode(ISD::AND, DL, MVT::i32, Shamt,                      
                             DAG.getConstant(VT.getSizeInBits(), DL, MVT::i32)); 
  Lo = DAG.getNode(ISD::SELECT, DL, VT, Cond,                                    
                   DAG.getConstant(0, DL, VT), ShiftLeftLo); 

@topperc topperc merged commit 8d24d39 into llvm:main Nov 3, 2023
@topperc topperc deleted the pr/mips-shift branch November 3, 2023 17:08
martn3 pushed a commit to martn3/llvm-project that referenced this pull request Nov 7, 2023
If we start with an i128 shift, the initial shift amount would usually
have zeros in bit 8 and above. xoring the shift amount with -1 will set
those upper bits to 1. If DAGCombiner is able to prove those bits are
now 1, then the shift that uses the xor will be replaced with undef.
Which we don't want.

Reduce the xor constant to VT.bits-1 where VT is half the size of the
larger shift type. This avoids toggling the upper bits. The hardware
shift instruction only uses the lower bits of the shift amount. I assume
the code used NOT because the hardware doesn't use the upper bits, but
that isn't compatible with the LLVM poison semantics.

Fixes llvm#71142.

(cherry picked from commit
llvm@8d24d39)
@yingopq
Copy link
Contributor

yingopq commented Nov 8, 2023

There's another bug in this code that neither patch addresses.

I believe this violates the boolean rules. ISD::SELECT should only receive 0/1 as a condition. There needs to be a setcc here.

  SDValue Cond = DAG.getNode(ISD::AND, DL, MVT::i32, Shamt,                      
                             DAG.getConstant(VT.getSizeInBits(), DL, MVT::i32)); 
  Lo = DAG.getNode(ISD::SELECT, DL, VT, Cond,                                    
                   DAG.getConstant(0, DL, VT), ShiftLeftLo); 

@topperc There has a question about shift number 129, now we deal with it as shift number 1 and the result of shl i128 %a, %b was ffffffffffffffff fffffffffffffffe.
If we add a setcc like the following diff, the above result was fffffffffffffffe 0000000000000000 like RISCV RISCVTargetLowering::lowerShiftLeftParts. Whether we're going to change the behavior of the shift number 129?

+//  SDValue Cond = DAG.getNode(ISD::AND, DL, MVT::i32, Shamt,
+//                             DAG.getConstant(VT.getSizeInBits(), DL, MVT::i32));
+
+  SDValue MinusXLen = DAG.getConstant(-(int)VT.getSizeInBits(), DL, MVT::i32);
+  SDValue ShamtMinusXLen = DAG.getNode(ISD::ADD, DL, MVT::i32, Shamt, MinusXLen);
+  SDValue Zero = DAG.getConstant(0, DL, MVT::i32);
+  SDValue CC = DAG.getSetCC(DL, MVT::i32, ShamtMinusXLen, Zero, ISD::SETLT);
+  Lo = DAG.getNode(ISD::SELECT, DL, VT, CC, ShiftLeftLo, DAG.getConstant(0, DL, VT));
+  Hi = DAG.getNode(ISD::SELECT, DL, VT, CC, Or, ShiftLeftLo);

tru pushed a commit that referenced this pull request Nov 13, 2023
If we start with an i128 shift, the initial shift amount would usually
have zeros in bit 8 and above. xoring the shift amount with -1 will set
those upper bits to 1. If DAGCombiner is able to prove those bits are
now 1, then the shift that uses the xor will be replaced with undef.
Which we don't want.

Reduce the xor constant to VT.bits-1 where VT is half the size of the
larger shift type. This avoids toggling the upper bits. The hardware
shift instruction only uses the lower bits of the shift amount. I assume
the code used NOT because the hardware doesn't use the upper bits, but
that isn't compatible with the LLVM poison semantics.

Fixes #71142.

(cherry picked from commit 8d24d39)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Mips] Incorrect code generated to perform i128 shifting by a truncated amount
3 participants