forked from brson/llvm
-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix android configuration in case of target triple = linux-androideabi #1
Open
aydinkim
wants to merge
1
commit into
luqmana:rust-llvm-2013-12-27
Choose a base branch
from
aydinkim:rust-llvm-2013-12-27
base: rust-llvm-2013-12-27
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
fix android configuration in case of target triple = linux-androideabi #1
aydinkim
wants to merge
1
commit into
luqmana:rust-llvm-2013-12-27
from
aydinkim:rust-llvm-2013-12-27
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
luqmana
pushed a commit
that referenced
this pull request
May 20, 2014
This commit adds the pre-UAL aliases of fconsts and fconstd for vmov.f32 and vmov.f64. They use an InstAlias rather than a MnemonicAlias to properly support the predicate operand. We need to support encoded 8-bit constants in order to implement the pre-UAL fconsts/fconstd aliases for vmov.f32/vmov.f64, so this commit also fixes parsing of encoded floating point constants used in vmov.f32/vmov.f64 instructions. Now we can support assembly code like this: fconsts s0, #0x70 which is equivalent to vmov.f32 s0, #1.0. Most of the code was already in place to support this feature. Previously the code was trying to accept encoded 8-bit float constants for the vmov.f32/vmov.f64 instructions. It looks like the support for parsing encoded floats was lost in a refactoring in commit r148556 and we did not have any tests in place to catch it. The change in this commit is to keep the parsed value as a 32-bit float instead of a 64-bit double because that is what the isFPImm() function expects to find. There is no loss of precision by using a 32-bit float here because we are still limited to an 8-bit encoded value in the end. Additionally, we explicitly reject encoded 8-bit floats for vmovf.32/64. This is the same as the current behavior, but we now do it explicitly rather than accidently. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198697 91177308-0d34-0410-b5e6-96231b3b80d8
luqmana
pushed a commit
that referenced
this pull request
May 20, 2014
1) Fix a specific bug when certain conversion functions are called in a program compiled as mips16 with hard float and the program is linked as c++. There are two libraries that are reversed in the link order with gcc/g++ and clang/clang++ for mips16 in this case and the proper stubs will then not be called. These stubs are normally handled in the Mips16HardFloat pass but in this case we don't know at that time that we need to generate the stubs. This must all be handled later in code generation and we have moved this functionality to MipsAsmPrinter. When linked as C (gcc or clang) the proper stubs are linked in from libc. 2) Set up the infrastructure to handle 90% of what is in the Mips16HardFloat pass in this new area of MipsAsmPrinter. This is a more logical place to handle this and we have known for some time that we needed to move the code later and not implement it using inline asm as we do now but it was not clear exactly where to do this and what mechanism should be used. Now it's clear to us how to do this and this patch contains the infrastructure to move most of this to MipsAsmPrinter but the actual moving will be done in a follow on patch. The same infrastructure is used to fix this current bug as described in #1. This change was requested by the list during the original putback of the Mips16HardFloat pass but was not practical for us do at that time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201426 91177308-0d34-0410-b5e6-96231b3b80d8
luqmana
pushed a commit
that referenced
this pull request
May 20, 2014
…ify-libcall optimize a call to a llvm intrinsic to something that invovles a call to a C library call, make sure it sets the right calling convention on the call. e.g. extern double pow(double, double); double t(double x) { return pow(10, x); } Compiles to something like this for AAPCS-VFP: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x) ret double %0 } declare double @llvm.pow.f64(double, double) #1 Simplify libcall (part of instcombine) will turn the above into: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %__exp10 = call double @__exp10(double %x) #1 ret double %__exp10 } declare double @__exp10(double) The pre-instcombine code works because calls to LLVM builtins are special. Instruction selection will chose the right calling convention for the call. However, the code after instcombine is wrong. The call to __exp10 will use the C calling convention. I can think of 3 options to fix this. 1. Make "C" calling convention just work since the target should know what CC is being used. This doesn't work because each function can use different CC with the "pcs" attribute. 2. Have Clang add the right CC keyword on the calls to LLVM builtin. This will work but it doesn't match the LLVM IR specification which states these are "Standard C Library Intrinsics". 3. Fix simplify libcall so the resulting calls to the C routines will have the proper CC keyword. e.g. %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1 This works and is the solution I implemented here. Both solutions brson#2 and brson#3 would work. After carefully considering the pros and cons, I decided to implement brson#3 for the following reasons. 1. It doesn't change the "spec" of the intrinsics. 2. It's a self-contained fix. There are a couple of potential downsides. 1. There could be other places in the optimizer that is broken in the same way that's not addressed by this. 2. There could be other calling conventions that need to be propagated by simplify-libcall that's not handled. But for now, this is the fix that I'm most comfortable with. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203488 91177308-0d34-0410-b5e6-96231b3b80d8
luqmana
pushed a commit
that referenced
this pull request
May 20, 2014
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx more difficult. For example: Given %shr = lshr i64 %x, 4 %and = and i64 %shr, 15 %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and %0 = load i64* %arrayidx With current shift folding, it takes 3 instrs to compute base address: lsr x8, x0, #1 and x8, x8, #0x78 add x8, x9, x8 If using ubfx, it only needs 2 instrs: ubfx x8, x0, brson#4, brson#4 add x8, x9, x8, lsl brson#3 This fixes bug 19589 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207702 91177308-0d34-0410-b5e6-96231b3b80d8
luqmana
pushed a commit
that referenced
this pull request
Sep 9, 2014
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can move unary operations, in this case [su]int_to_fp through the mask operation and constant fold the operation away. Generally speaking: UNARYOP(AND(VECTOR_CMP(x,y), constant)) --> AND(VECTOR_CMP(x,y), constant2) where constant2 is UNARYOP(constant). This implements the transform where UNARYOP is [su]int_to_fp. For example, consider the simple function: define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind { %cmp = fcmp oeq <4 x float> %val, %test %ext = zext <4 x i1> %cmp to <4 x i32> %result = sitofp <4 x i32> %ext to <4 x float> ret <4 x float> %result } Before this change, the code is generated as: fcmeq.4s v0, v0, v1 movi.4s v1, #0x1 // Integer splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. scvtf.4s v0, v0 // Convert each lane to f32. ret After, the code is improved to: fcmeq.4s v0, v0, v1 fmov.4s v1, #1.00000000 // f32 splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. ret The svvtf.4s has been constant folded away and the floating point 1.0f vector lanes are materialized directly via fmov.4s. Rather than do the folding manually in the target code, teach getNode() in the generic SelectionDAG to handle folding constant operands of vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do additional constant folding there as well, but I don't have test cases for those operations, so leaving them for another time when it becomes appropriate. rdar://17693791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8
luqmana
pushed a commit
that referenced
this pull request
Sep 9, 2014
…dvantage of the isRegSequence property. This is a follow-up of r215394 and r215404, which respectively introduces the isRegSequence property and uses it for ARM. Thanks to the property introduced by the previous commits, this patch is able to optimize the following sequence: vmov d0, r2, r3 vmov d1, r0, r1 vmov r0, s0 vmov r1, s2 udiv r0, r1, r0 vmov r1, s1 vmov r2, s3 udiv r1, r2, r1 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr into: udiv r0, r0, r2 udiv r1, r1, r3 vmov.32 d16[0], r0 vmov.32 d16[1], r1 vmov r0, r1, d16 bx lr This patch refactors how the copy optimizations are done in the peephole optimizer. Prior to this patch, we had one copy-related optimization that replaced a copy or bitcast by a generic, more suitable (in terms of register file), copy. With this patch, the peephole optimizer features two copy-related optimizations: 1. One for rewriting generic copies to generic copies: PeepholeOptimizer::optimizeCoalescableCopy. 2. One for replacing non-generic copies with generic copies: PeepholeOptimizer::optimizeUncoalescableCopy. The goals of these two optimizations are slightly different: one rewrite the operand of the instruction (#1), the other kills off the non-generic instruction and replace it by a (sequence of) generic instruction(s). Both optimizations rely on the ValueTracker introduced in r212100. The ValueTracker has been refactored to use the information from the TargetInstrInfo for non-generic instruction. As part of the refactoring, we switched the tracking from the index of the definition to the actual register (virtual or physical). This one change is to provide better consistency with register related APIs and to ease the use of the TargetInstrInfo. Moreover, this patch introduces a new helper class CopyRewriter used to ease the rewriting of generic copies (i.e., #1). Finally, this patch adds a dead code elimination pass right after the peephole optimizer to get rid of dead code that may appear after rewriting. This is related to <rdar://problem/12702965>. Review: http://reviews.llvm.org/D4874 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216088 91177308-0d34-0410-b5e6-96231b3b80d8
luqmana
pushed a commit
that referenced
this pull request
Oct 4, 2014
This patch makes the ARM backend transform 3 operand instructions such as 'adds/subs' to the 2 operand version of the same instruction if the first two register operands are the same. Example: 'adds r0, r0, #1' will is transformed to 'adds r0, #1'. Currently for some instructions such as 'adds' if you try to assemble 'adds r0, r0, brson#8' for thumb v6m the assembler would throw an error message because the immediate cannot be encoded using 3 bits. The backend should be smart enough to transform the instruction to 'adds r0, brson#8', which allows for larger immediate constants. Patch by Ranjeet Singh. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218521 91177308-0d34-0410-b5e6-96231b3b80d8
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
fix android build error ( servo => arm-linux-androideabi)