Consider implementing multiple ownership and scoped borrows, replacing single ownership with runtime panics #596

yorickpeterse · 2023-07-19T22:18:50Z

Description

We currently panic when dropping a value that still has references to it, and maintain reference counts when creating/dropping references. In general this works well, but the runtime panics can be hard to debug, and we won't be able to detect all dangling references at compile time.

An alternative to this approach is something I thought of a while ago but never really wrote down (I think at least):

Values start out as owned and are moved around as they are now (i.e. no reference count changes). Their initial reference count is 1, not 0.

ref T and mut T still exist, but can't outlive the scope they are created in. This involves tracking lifetimes/scopes internally, though we don't need to expose them to the language. In practise this means you'd mostly use these as arguments, as you won't be able to return them.

Next, we introduce the expression clone x or share x (not sure yet, but I'll use share x for the rest of this comment), which simply increments the reference count and returns a new owned pointer. This expression works on both borrows and owned values. When dropping owned values we decrement the reference count, and drop the value if the new count is 0.

In this setup, we essentially have multiple ownership, and the share x expression is used when returning or storing values (e.g. Array.get would use this). While this incurs a reference count increment, this is currently also true when returning ref T or mut T.

The benefit here is that we can still express everything we can express today, but we don't have to worry about runtime panics. The downside is that we have to run destructors in a few more places, which combined with a future inlining pass may result in slightly more code being generated. The total amount of reference count changes may actually be lower than what we have today, because returning references is likely to happen less often compared to taking a reference as an argument, and those don't incur reference counts in this new model.

I believe Swift takes a somewhat similar approach, though they sort of invert things: moving values around incurs a reference count by default, whereas in our model this would be made explicit using the share x expression.

Related work

No response

The text was updated successfully, but these errors were encountered:

yorickpeterse · 2023-07-19T22:28:07Z

To add to this: changing from our current model to this new one shouldn't be too tricky: the biggest change is enforcing that borrows can't outlive their scopes, but even there we could take a simple/naive approach of simply not allowing them in fields, return values, and by not making them compatible with owned type parameters (you'd still be able to pass them to ref T arguments), as that essentially prohibits them from being stored anywhere.

If we enforce lifetimes per scope, it would probably come down to something like this:

Each scope is given a number as its lifetime, in the range of 0..N. Each value created is given the lifetime of its surrounding scope. When returning a type from a scope or moving it (i.e. passing it as an owned value to an argument), we check if the value is or contains a reference. If so, we require that the lifetime number of that reference is greater than the scope the value is moved to. This allows passing references into sub scopes, but not into parent scopes.

References can always be passed to method arguments (their lifetime is 0 because that's the outer most scope lifetime in a method), but we don't allow returning references from methods. For self we could use some sort of special lifetime, and simply not allow sticking references in it.

This is the rough idea, though I probably overlooked some important cases. Point being: references/borrows are useful if you want to take some data (e.g. ref ToString) and use that to create new data, without needing a reference count increment. The moment you want to store something, you use share x and off you go.

yorickpeterse · 2023-07-19T22:35:42Z

fn move methods would still work, though instead of just dropping self unchecked at the end (unless self is moved), they too need to first decrement and then drop if needed, as multiple owners may exist.

One potentially tricky thing is that an fn move method may assign a new field a value and then move that value somewhere, thinking it only affects the current receiver. But if multiple owners exist, those copies would see the new value. To handle that, when compiling fn move we'd have to disallow assigning the fields new values.

Another issue is that of resource cleanup: if a value stores e.g. a file descriptor, then an fn move method can close it, breaking other existing owners as those would expect the file descriptor to still be valid. This isn't possible in our current model, fn and fn mut methods have fields exposed as references, and fn move methods are only available to the single owned reference.

yorickpeterse · 2024-11-27T17:01:24Z

In light of #778 and #776, it might be worth looking into this deeper. For example, inferring enums and tuples as value types will prove tricky, as e.g. tuples allow assignments of their fields, which won't work for immutable value types (or at the very least produce inconsistent results).

At its core is the following problem: if we put data on the stack, we need a compile-time mechanism to ensure no borrows exist when said data is moved. Making types immutable value types that are copied is one approach, albeit a limited one.

Swift's use of reference counting provides a unique benefit for this situation: they can make tuples and enums value types because when those types are copied, they can increment the reference counts of any heap sub values, essentially resulting in multiple ownership of those values. We only use reference counting for borrows and thus can't apply the same technique.

Of course Swift's approach may result in moving of a value requiring a bunch of reference count increments. For example, for (A, B, C, D) copying the data around incurs at least 4 reference count increments. Again in our case we'd only need it when actually copying and not just when the data is moved around (meaning we'd only need to increment when "borrowing" the data).

Another issue is cache friendliness: being able to shove data on the stack is great, but not if you're thrashing the cache by constantly adjusting reference counts. In our current model this happens when borrowing a lot, but in theory we can optimize a lot of that away. For multiple ownership we'd have to introduce some form of (restricted) compile-time borrowing, otherwise we'd be adjusting reference counts far too much.

yorickpeterse · 2024-11-28T00:45:27Z

To add to the above comment: I'm pretty sure using reference counting may result in a cascade of increments when data is moved. That is, when copying outer data we may need to increment inner data recursively, otherwise if we move data out of a copy we end up with an uneven number of reference counts, which could result in an object being released prematurely.

yorickpeterse added feature New things to add to Inko, such as a new standard library module compiler Changes related to the compiler labels Jul 19, 2023

yorickpeterse mentioned this issue Dec 2, 2023

Consider turning Inko into a functional language #665

Closed

yorickpeterse added this to the Future ideas milestone Dec 24, 2023

yorickpeterse removed this from the Future ideas milestone Aug 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider implementing multiple ownership and scoped borrows, replacing single ownership with runtime panics #596

Consider implementing multiple ownership and scoped borrows, replacing single ownership with runtime panics #596

yorickpeterse commented Jul 19, 2023

yorickpeterse commented Jul 19, 2023

yorickpeterse commented Jul 19, 2023

yorickpeterse commented Nov 27, 2024

yorickpeterse commented Nov 28, 2024

Consider implementing multiple ownership and scoped borrows, replacing single ownership with runtime panics #596

Consider implementing multiple ownership and scoped borrows, replacing single ownership with runtime panics #596

Comments

yorickpeterse commented Jul 19, 2023

Description

Related work

yorickpeterse commented Jul 19, 2023

yorickpeterse commented Jul 19, 2023

yorickpeterse commented Nov 27, 2024

yorickpeterse commented Nov 28, 2024