Implement some form of escape analysis to allocate values on the stack #776
Comments
A related and older idea is to bump-allocate small objects using a custom allocator (#542). The idea here was that by using our own allocator we can avoid the cost of going through the system allocator.
As for what constitutes an escape: a conservative approach would be to simply consider any move (except for a drop) an escape. This does mean that moving stack value S₁ into stack value S₂ results in S₁ being heap allocated, but it might be good enough as a start.
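This conservative rule could be sketched roughly as follows. This is only an illustration: the mini-MIR encoding and the instruction names are invented, not the compiler's actual representation.

```python
# A minimal sketch of the conservative rule "any move, except a drop,
# is an escape". The instruction encoding here is invented.

def escaping_registers(instructions):
    """Return the set of registers conservatively treated as escaping."""
    escaped = set()
    for op, *operands in instructions:
        if op == "drop":
            continue  # dropping a value doesn't count as an escape
        if op in ("move", "call", "store_field", "return"):
            # any register moved by the instruction escapes
            escaped.update(operands)
    return escaped

instrs = [
    ("allocate", "r1"),
    ("move", "r1"),      # r1 is moved into another value: escapes
    ("allocate", "r2"),
    ("drop", "r2"),      # r2 is only dropped: stays stack allocatable
]
```

With this rule, `escaping_registers(instrs)` returns `{"r1"}`, so only `r2` would be eligible for stack allocation.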
The way we generate MIR poses a challenge, such that we can't just look at […]. For example, […] results in something like this:

[…]

This means that if we look at just […]
Assuming the moves are all in the same scope, it doesn't really matter how many moves there are, since we can copy the data if needed. Whether copying actually happens depends largely on how LLVM decides to compile/optimize the code. Things do get tricky if we borrow in between such moves, because the move would invalidate the borrow.
Given the following instructions:

[…]

R₁ should be stack allocated, but the decision to do that depends on R₂, R₃ and others not escaping the scope R₁ is defined in. So given a move chain of […]. For this to work, for every register R that's assigned the result of an […]
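The chain requirement could be sketched as follows, using an invented representation: `moves` maps each register to the register its value is moved into, and `escapes` is the set of registers known to escape their scope directly.

```python
def can_stack_allocate(reg, moves, escapes):
    """A register may be stack allocated only if no register in its
    move chain escapes the scope the register is defined in."""
    current, seen = reg, set()
    while current is not None and current not in seen:
        if current in escapes:
            return False
        seen.add(current)  # guard against cycles in the move chain
        current = moves.get(current)
    return True
```

If, say, R₃ at the end of the chain R₁ → R₂ → R₃ is returned from the function, R₁ must be heap allocated; if nothing in the chain escapes, R₁ can stay on the stack.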
For #677 I have a WIP parser that uses the commonly used […]. Stack allocating these values would likely greatly improve performance. However, I'm not sure it's enough to just rely on escape analysis. That is, for that to work the allocations need to be inlined, and given the complexity of the parser that's not a guarantee.

An alternative might be to not rely on escape analysis at all, but instead look at the type definition. That is, if the type's size is e.g. less than or equal to 64 bytes, and all its sub values are trivial to copy, we just always stack allocate it and copy it upon borrowing. This does mean we need some sort of […]
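The definition-based check described above could be as simple as the following sketch. The type-description format and the 64 byte limit are assumptions taken from the comment, not an actual compiler API.

```python
def is_stack_promotable(type_name, types, limit=64):
    """Promote a type to a stack value if the summed size of its fields
    is at most `limit` bytes and every field is trivial to copy."""
    size = 0
    for field_type in types[type_name]["fields"]:
        info = types[field_type]
        if not info.get("trivial_copy", False):
            return False  # e.g. heap-owning fields disqualify the type
        size += info["size"]
    return size <= limit

# Invented example type table:
types = {
    "Int":    {"fields": [], "size": 8, "trivial_copy": True},
    "String": {"fields": [], "size": 24, "trivial_copy": False},
    "Point":  {"fields": ["Int", "Int"]},            # 16 bytes, trivial
    "Token":  {"fields": ["String", "Int"]},         # owns a String
}
```

Here `Point` qualifies (two trivially copyable `Int` fields, 16 bytes total) while `Token` doesn't, because it contains a non-trivially-copyable `String`.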
I've spent some time building a bump allocator for small objects based on Immix, as found on the […].

For the work in progress I'm now wondering if we perhaps could take a simpler approach. That is, instead of relying on escape analysis we instead promote types to value types based on their definition rather than their usage. For example, types such as […].

The downside is that "value type promotion" only works for simple types such as […]
Another issue with value type promotion: if we just blindly promote a type to a value type (assuming it stores other value types), then we might break circular types (e.g. a linked list).
For the bump allocation scheme, I think we can make the following changes:

[…]
One option is to flip things around: values go on the stack by default, and a hypothetical […].

To make that safe, we'd have to check the borrow count upon moving a stack value, such that we don't invalidate any borrows. In theory we can elide such checks if we can statically determine no borrows are present at the time of the move, but this won't be able to remove all such checks. Moving […].

Implementation wise this is much simpler compared to escape analysis and value type promotion, and we drastically reduce the pressure on the system allocator. This in turn is likely to improve performance significantly. The downsides are:

[…]
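The runtime side of the borrow-count check could look roughly like this. The representation is invented: in practice the counter would live in the value's header and the check would be emitted by the compiler at each (non-elided) move.

```python
class StackValue:
    """A stack value carrying a runtime borrow counter."""
    def __init__(self, data):
        self.data = data
        self.borrows = 0

def borrow(value):
    value.borrows += 1
    return value

def release(value):
    value.borrows -= 1

def move(value):
    """Moving copies the data to a new slot, so any live borrow into
    the old slot would be left dangling: panic instead."""
    if value.borrows > 0:
        raise RuntimeError("moved while borrowed")
    moved = StackValue(value.data)
    value.data = None  # the old slot is no longer valid
    return moved
```

A move with no live borrows just copies the data; a move while a borrow exists triggers the runtime panic the comment describes.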
The above idea won't work well for resizable types such as arrays: if you have an array of N stack values and you create a borrow of a value, then resizing the array will invalidate that borrow. To make that safe as well, we'd have to check the borrow count for every value, which would be terrible performance wise.
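To illustrate why that's so costly: a safe resize would have to scan every element's borrow count before moving anything, making every resize O(n) in borrow checks on top of the copy itself. A sketch, reusing the invented per-value borrow counter idea:

```python
class Slot:
    """Invented stand-in for a stack value stored inline in an array."""
    def __init__(self, data):
        self.data = data
        self.borrows = 0

def safe_resize(values, new_capacity):
    """Resizing moves every element to new storage, so every live
    borrow into the old storage would dangle: check them all first."""
    for v in values:  # O(n) checks before a single byte is moved
        if v.borrows > 0:
            raise RuntimeError("resize would invalidate a live borrow")
    return values + [None] * (new_capacity - len(values))
```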
Enums could be promoted to stack based value types if the following conditions are met:

[…]
The reason this would be safe is that enums themselves can't be modified in-place, i.e. you can't change the constructor tag or the constructor's arguments. Technically you can modify those in-place, but for value types that would result in a copy being modified.

Essentially the idea here is that in addition to introducing an […]. For types such as […].

A challenge here is specialization: if we specialize for such specific types we can remove the need for object headers, as generics can use static dispatch. This however may result in more copies of methods that operate on different types with the same memory layout. We'd also need a form of auto boxing when such values are cast to a trait, though we could also just straight up disallow casting […].

If we keep the headers this won't be necessary, but then e.g. […]. For […]
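The "a copy is modified" property for value-type enums can be sketched like this. `ValueEnum` is an invented stand-in, not the runtime's actual representation: tag and arguments travel together, and borrowing a value type produces a copy, so mutation can never be observed through the original.

```python
import copy

class ValueEnum:
    """Invented stand-in for an enum promoted to a value type."""
    def __init__(self, tag, args):
        self.tag = tag
        self.args = list(args)

    def borrowed(self):
        # Borrowing a value type copies it, per the scheme above.
        return copy.deepcopy(self)

some = ValueEnum("Some", [42])
alias = some.borrowed()
alias.args[0] = 100  # mutates the copy; the original is untouched
```

This is why the tag and constructor arguments of the original can be treated as immutable: every path that could mutate them only ever sees a copy.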
#778 is merged, so the bulk of the work is done. I do think it's still worth implementing stack allocations on top of this: if a heap allocated type is used but we can trivially determine it should go on the stack, we can do just that automatically.
Description
Types defined using the `class` keyword are heap allocated. While #750 talks about giving users the option to explicitly opt in to stack allocating data, I feel we also need a mechanism to determine if we can do this automatically. That is, we need some form of escape analysis, stack allocating values when we determine this is safe to do so.

An area of the standard library that can really benefit from this is the `std.iter` module. Iterators are used heavily and each incurs a heap allocation. When used in tight loops, it's not uncommon for most of the time to be dominated by the memory management necessary for these iterators.

The basic idea is simple: if an allocated value `V` doesn't escape the scope `S` it's allocated into (meaning `lifetime(V) <= lifetime(S)`), we can allocate it onto the stack instead. The implementation however is a bit more tricky. For example, we can only stack allocate values if they are allocated directly in their scope and not through a method call. This means that good inlining is crucial for escape analysis to have a meaningful impact. In addition, the implementation of escape analysis itself may prove a bit of a challenge.
Note that it's perfectly fine for borrows of stack allocated data to outlive the stack value, as dropping the stack value would still incur a runtime borrow check and produce a panic in such cases. In other words, such cases don't inhibit the ability to stack allocate.
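The basic rule can be sketched with an invented representation where scopes are nesting depths (greater means more deeply nested, i.e. shorter lived):

```python
def stack_candidates(allocations, moved_to):
    """allocations: {value: depth of the scope it's allocated in}.
    moved_to: {value: depth of the outermost scope it's moved to}.
    A value remains a candidate if it's never moved to an outer,
    longer-lived scope, i.e. lifetime(V) <= lifetime(S)."""
    candidates = set()
    for value, depth in allocations.items():
        if moved_to.get(value, depth) >= depth:
            candidates.add(value)
    return candidates

allocs = {"iter": 2, "result": 2}
moved = {"result": 1}  # `result` is moved out to the caller's scope
```

Here `iter` never leaves its defining scope and can go on the stack, while `result` escapes to an outer scope and must stay on the heap.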
Related issues
Related work