Code inefficiencies in loop array indexing #35618
Comments
Related: #34810
With the
As this is an optimization issue, moving to Future.
I started working on this so I'm re-assigning to myself if you don't mind.
With a simple `for` loop around array access of an `int` array, various inefficiencies are seen in x64 and arm64 generated code, when using different loop/array index types.

Related: Induction Variable widening #7312
Consider:
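The snippet that followed here is not preserved in this copy. As a minimal sketch of the kind of loop being described, assuming a plain summation over an `int[]` with a 32-bit `int` induction variable (the class and method names are hypothetical, not taken from the issue):

```csharp
public static class LoopIndexSketch
{
    // Hypothetical example, not the issue's original code:
    // a simple for loop indexing an int[] with an int index.
    public static int SumWithIntIndex(int[] a)
    {
        int sum = 0;
        // The loop bound is a.Length, which lets the JIT relate the
        // index to the array length when considering the bounds check.
        for (int i = 0; i < a.Length; i++)
        {
            sum += a[i];
        }
        return sum;
    }
}
```

Calling `LoopIndexSketch.SumWithIntIndex(new[] { 1, 2, 3 })` exercises the loop shape discussed below.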
For x64, we have a sign extend in the loop because the index type is 32-bit `int` but the register is 64-bit. We should be able to eliminate this sign extension because the max array size is unsigned 31 bits (?), so the sign bits will never be used. We eliminate the bounds check presumably because of a comparison against the array Length, and Length will never have the sign bits set.

For arm64, we have the sign extension, but we also have inefficient array index calculation because it doesn't have `base + scale*index + offset` addressing mode. We currently are careful to only add a fully computed index offset to the base `ref` type, to avoid creating intermediate `byref` types: there have been bugs before with certain (negative) index expressions that led the JIT to create illegal byrefs (pointing out of the object). The index expression here is:

If we could hoist the array base address computation out of the loop (`x0 + #16`), eliminate the sign extend (see above), and use the `LSL` addressing mode, we could have:

We don't even need to eliminate the sign extend, as we can use the base + offset (Extended Register) form:

Hoisting the `x0 + #16` out of the loop isn't required here: simply changing the way the index expression is generated plus enhancing support for addressing modes would be a first step; hoisting the loop-invariant `x0 + #16` would be an additional win.

x64 assembly
arm64 assembly
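To make the address arithmetic under discussion concrete, here is a small sketch of the byte offset of element `i` in an `int[]`, assuming the 16-byte offset to the first element implied by the `x0 + #16` mentioned above. The names `DataOffset`, `ElementShift`, and `ByteOffset` are hypothetical and chosen only for illustration; this models the arithmetic, not the JIT's actual implementation.

```csharp
public static class ElementAddressSketch
{
    // Assumption: offset from the array reference to element 0,
    // taken from the `x0 + #16` quoted in the issue text.
    public const long DataOffset = 16;

    // sizeof(int) is 4 bytes, so scaling the index is a shift left by 2.
    public const int ElementShift = 2;

    public static long ByteOffset(int index)
    {
        // Implicit int -> long conversion sign-extends the 32-bit index;
        // this is the widening step the issue would like to avoid or fold.
        long widened = index;

        // offset + scale * index: the loop-invariant part (DataOffset)
        // is what could be hoisted, leaving only the scaled index per iteration.
        return DataOffset + (widened << ElementShift);
    }
}
```

For example, `ElementAddressSketch.ByteOffset(3)` is 16 + 3 * 4 = 28 bytes past the array reference.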
If the loop / array index variable is changed from `int` to `long`, there are some unexpected effects: (1) the JIT sign-extends the array size. The array size can never be negative, so this seems unnecessary. (2) the JIT doesn't eliminate the bounds check in the loop body.

The sign extend is eliminated, however, as expected.
x64 assembly
arm64 assembly
Changing the index variable to `uint` retains the bounds check and array length sign extend, as for `long`. Oddly, it also adds a sign extend of the index variable before the addressing mode.

x64 assembly
arm64 assembly
Changing the index variable to `ulong` adds an overflow check to the loop as well as a range check. Why is the overflow check necessary?

x64 assembly
arm64 assembly
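For reference, the three non-`int` index variants discussed above might look like the following sketch. The method names are hypothetical, and the original loops may have been written differently. In particular, the explicit `(ulong)` cast on `a.Length` is an assumption made here because C# does not allow comparing `ulong` with `int` directly.

```csharp
public static class IndexTypeVariantsSketch
{
    // long index: a.Length (an int) is widened to long for the comparison.
    public static int SumWithLongIndex(int[] a)
    {
        int sum = 0;
        for (long i = 0; i < a.Length; i++)
        {
            sum += a[i];
        }
        return sum;
    }

    // uint index: C# promotes both operands of `i < a.Length` to long.
    public static int SumWithUIntIndex(int[] a)
    {
        int sum = 0;
        for (uint i = 0; i < a.Length; i++)
        {
            sum += a[i];
        }
        return sum;
    }

    // ulong index: the cast on a.Length is an assumption of this sketch.
    public static int SumWithULongIndex(int[] a)
    {
        int sum = 0;
        for (ulong i = 0; i < (ulong)a.Length; i++)
        {
            sum += a[i];
        }
        return sum;
    }
}
```

All three compute the same result as the `int`-indexed loop; only the type of the induction variable, and therefore the conversions the compiler and JIT must insert, differs.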
category:cq
theme:loop-opt
skill-level:intermediate
cost:medium