Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate escape analysis for array allocations #43107

Closed

Conversation

pchintalapudi
Copy link
Member

@pchintalapudi pchintalapudi commented Nov 16, 2021

Many objects in Julia are allocated on the stack or removed altogether through the use of escape analysis, allowing for further optimizations and dead code elimination. This draft PR proposes extending escape analysis to array allocations for 1D, 2D, 3D, and ND cases. So far, the following optimizations have been implemented for arrays:

  1. Allocation hoisting -- tracked by Hoist array allocations out of loops #43777
    1. Array allocations can be moved out of loops if they are loop invariant and there are no escaping instructions
  2. Allocation removal -- tracked by Remove array allocations that are trivially dead #43548
    1. Array allocations can be altogether removed along with dependent instructions if they never escape and they are never loaded from
  3. Length/dimension propagation -- tracked by Propagate array dimensions from allocations #43487
    1. Array length does not have to be loaded from memory if the length is never clobbered and the array does not escape, and such loads can simply be replaced with the length operand passed into the allocation function
  4. Unsigned array dimensions -- tracked by Propagate array dimensions from allocations #43487
    1. Signed comparisons around -1, 0, and 1 based on unknown array lengths can be either eliminated altogether or reduced to simple equality checks against 0
  5. Array stack-based splitting -- rejected, useful cases are covered by LLVM's store->load forwarding transformations
    1. Some arrays can be split onto the stack and treated as a collection of independent memory locations
    2. Note that arrays > ~10 elements cannot be split due to initialization using not-unrolled iteration
  6. Array stack-based allocation -- tracked by Move small arrays to the stack #43573
    1. Some arrays can be entirely moved onto the stack as a C-style array
  7. Array data load coalescing -- tracked by Hoist data pointer loads from array allocations #43547
    1. Loads of the array data pointer can be hoisted into the basic block that the array was allocated in if the array does not escape
  8. array_owned/merge_own skipping -- tracked by Eliminate write barrier branches from codegen and late-gc-lowering #43384
    1. If the array was freshly allocated we do not need to check if the array has shared data with another array inside codegen itself
    2. Similar logic applies for write barriers in late-gc-lowering -- these write barriers branch off values that are liable to change during GC and cannot be optimized. However, they are inserted late enough in the pass pipeline that they don't affect upstream optimizations.

This depends on #43057 for breaking out escape analysis from alloc-opt.

Performance analysis and compile-time regressions are currently being computed.

@pchintalapudi pchintalapudi force-pushed the pc/array-simple-optimize branch from ae492d0 to e87787d Compare November 24, 2021 21:17
@pchintalapudi pchintalapudi force-pushed the pc/array-simple-optimize branch from 97115e7 to 19cb27e Compare December 1, 2021 01:17
@JeffBezanson JeffBezanson added compiler:codegen Generation of LLVM IR and native code performance Must go faster labels Dec 8, 2021
@pchintalapudi pchintalapudi deleted the pc/array-simple-optimize branch March 6, 2022 22:07
@pchintalapudi pchintalapudi restored the pc/array-simple-optimize branch March 6, 2022 22:07
@pchintalapudi pchintalapudi deleted the pc/array-simple-optimize branch February 22, 2023 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants