[proposal] Provided Effect Handlers #3946

Open: wants to merge 1 commit into base branch nightly

owenhilyard
Contributor

This proposal contains an alternative to an effect system which I think is more suitable for abstracting async, raises, and similar function colors in a systems language where the context may not always be suited to running arbitrary code. This is done by inverting the normal effect system, and having libraries provide the handlers based on information passed to them by the caller. Aside from the ability to manipulate function signatures with parameters (async, raises, and the return type), this can be implemented in Mojo today, but has substantial ergonomics issues. As such, I propose a place to put implicit parameters that not all functions will care about but should be propagated, such as division by zero handling, FP error handling, OOM, error handling behavior and async. With this, only functions which actually perform IO need to deal with async beyond awaiting functions they call (which does nothing for sync functions).

Signed-off-by: Owen Hilyard <[email protected]>
@lattner
Collaborator

lattner commented Jan 20, 2025

> Let's say I want to make a higher order function of some sort, how many different implementations do I need in Mojo? Well, you need 4 functions:

Well yeah, but at least for throws, there is a tentative plan. The plan is to add support for enums, which would allow defining a non-constructable type like the Swift `Never` type (btw, I hate this name, let's call it something better, but I'll use it here for clarity). Given that, we can make non-raising functions like `fn foo():` be equivalent to `fn foo() raises Never:`.

Given that, we can now abstract over raising in a simple way:

# overloading approach works, as you mention:
fn hof(f: fn()): ...
fn hof(f: fn() raises) raises: ...

# parametric approach should work too:
fn hof2[T: ErrorType](f: fn() raises T) raises T: ...

This is basically the same approach as how `ref` unified mutable and immutable references into a single model, while still allowing us to keep the "simple syntax" for the simple cases. I find this beautiful and simple, because it is a composition of basic features (e.g. parametric methods) working together in a nice way. By comparison, Swift has `rethrows`, which is a giant mess and is never good enough.
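The "non-raising means `raises Never`" idea can be sketched with an existing analogue. The example below uses Rust rather than Mojo, because the Mojo feature is only a tentative plan; Rust's `std::convert::Infallible` plays the role of `Never`. The names `hof`, `never_fails`, `may_fail`, `unwrap_infallible`, and `ParseFailure` are hypothetical and exist only for illustration.

```rust
use std::convert::Infallible;

/// Hypothetical error type for the fallible callback.
#[derive(Debug, PartialEq)]
struct ParseFailure;

/// One higher-order function, generic over the callback's error type `E`.
/// Roughly mirrors `fn hof2[T: ErrorType](f: fn() raises T) raises T`.
fn hof<T, E>(f: impl Fn() -> Result<T, E>) -> Result<T, E> {
    // Whatever error the callback can produce is propagated unchanged.
    let value = f()?;
    Ok(value)
}

/// A "non-raising" function: its error type has no values,
/// so an `Err` can never be constructed.
fn never_fails() -> Result<i32, Infallible> {
    Ok(42)
}

/// A "raising" function with a concrete error type.
fn may_fail(input: &str) -> Result<i32, ParseFailure> {
    input.parse::<i32>().map_err(|_| ParseFailure)
}

/// Unwrapping an infallible result needs no panic path:
/// the `Err` arm is matched against an empty set of cases.
fn unwrap_infallible<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(v) => v,
        Err(never) => match never {},
    }
}
```

With this shape, `hof(never_fails)` and `hof(|| may_fail("7"))` go through the same single definition; only the error type parameter differs, and the infallible case carries no runtime error path.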

I agree with you that there is no equivalent unification for async though. That said, I feel like the above is simple and bounded: we just need to "get it implemented" and it will make Mojo "obviously better" in a simple step forward.


As to your actual proposal, I don't really understand the full details of how this would work. However, I will ask a more basic question: "is it actually even desirable to unify async and sync functions"?

These sorts of functions typically have far more going on and different behavior that drives a wedge between them for other reasons. For example, in your iterator example, async streams and sync streams are really fundamentally different: one wants things like backpressure, and has an expanded api. The sync streams/iterators are obviously well known and have low level performance concerns at stake.

It is possible we could unify async/sync functions with a similar unification to the above: just make async functions expose an explicit "Future" type of some sort when you want to abstract over them, but I'm not sure that is desirable.

@owenhilyard
Contributor Author

> > Let's say I want to make a higher order function of some sort, how many different implementations do I need in Mojo? Well, you need 4 functions:
>
> Well yeah, but at least for throws, there is a tentative plan. The plan is to add support for enums, which would allow defining a non-constructable type like the Swift `Never` type (btw, I hate this name, let's call it something better, but I'll use it here for clarity). Given that, we can make non-raising functions like `fn foo():` be equivalent to `fn foo() raises Never:`.
>
> Given that, we can now abstract over raising in a simple way:
>
> # overloading approach works, as you mention:
> fn hof(f: fn()): ...
> fn hof(f: fn() raises) raises: ...
>
> # parametric approach should work too:
> fn hof2[T: ErrorType](f: fn() raises T) raises T: ...
>
> This is basically the same approach as how `ref` unified mutable and immutable references into a single model, while still allowing us to keep the "simple syntax" for the simple cases. I find this beautiful and simple, because it is a composition of basic features (e.g. parametric methods) working together in a nice way. By comparison, Swift has `rethrows`, which is a giant mess and is never good enough.

I think this cleans up raises quite a bit as far as abstracting over it.

> I agree with you that there is no equivalent unification for async though. That said, I feel like the above is simple and bounded: we just need to "get it implemented" and it will make Mojo "obviously better" in a simple step forward.

> As to your actual proposal, I don't really understand the full details of how this would work. However, I will ask a more basic question: "is it actually even desirable to unify async and sync functions"?

The reason I think it's desirable to have some level of async and sync unification is that most functions don't care. If you take code built on an async read, swap the async read out for a sync read, and make the compiler delete all of the awaits, most of the time people can use the result as sync code. I think that in the majority of cases only the executor and the functions which yield need to be aware that they are in a coroutine, aside from the need to do the compiler transforms on them. However, if you color sync and async, you force library authors to choose sync, choose async, or write at least the library's interface, and possibly more, twice. This is part of why the Rust ecosystem almost forces you into async as soon as you want some form of IO. The ability to write a function which doesn't care at all about sync vs async, and the ability to write a function which is generic over sync vs async, helps to make that problem go away. This allows users to only jump to async when they need to, and stops library authors from having to choose.

> These sorts of functions typically have far more going on and different behavior that drives a wedge between them for other reasons. For example, in your iterator example, async streams and sync streams are really fundamentally different: one wants things like backpressure, and has an expanded api. The sync streams/iterators are obviously well known and have low level performance concerns at stake.

This is probably more of a "business logic" feature, since I imagine that things like iterators would essentially have one function call and then have a top-level branch on sync or async. For things which are truly different between sync or async, there will still be a split, but this gives a tool to at least try to manage some of the complexity.

> It is possible we could unify async/sync functions with a similar unification to the above: just make async functions expose an explicit "Future" type of some sort when you want to abstract over them, but I'm not sure that is desirable.

I'd need to see more of that to make a judgement. I think it may require reflection in order to look into the type and could make "async generic" code very messy.

The actual goal of this proposal was to help manage some of the things which should likely be user-controlled, such as what the error behavior should be. The main cases I can think of are raise, Never, and abort, and this is why I wanted the ability to pass more information down the call chain. For example, most people will want an OOM to just abort, but for some applications that is unacceptable, and they will want to raise and handle it. A math kernel would likely prefer to ignore div by zero errors, but some people may care and want to either abort or raise if that happens. Async IO is another thing I think should typically be a user decision, since it opts you into a lot of complexity due to the virality of async, and nobody wants to spin up an entire async runtime just to read some config information from a SQLite file on disk.

Given what we've seen with C++, we can't trust most libraries to actually expose things like an allocator template parameter for things a function allocates, and I'm not sure we can trust libraries to be generic over error handling for N different situations, even if we throw out async. That's where the implicit parameters come from. Once I had that, I decided to expand the idea to a place to put parameters which may need to be propagated far down the call stack. This looks a lot like an effect system, but with the part where the user writes the handling code removed, since running arbitrary code in arbitrary contexts is dangerous in a systems language and effect handlers often don't have enough information to actually handle problems deep in a program.

An alternative approach would be to go all-in on effects and allow them to be easily defined by libraries, so that you can have an effect such as `MaxGPUMatmulDivByZero` for which the library needs to provide as much context as it possibly can. Those effects are then propagated up the call stack, with default handlers provided as part of the effect definition. This may have a more sound theoretical backing, but I'm not sure if it can be made linear time in the size of the "post-effect-monomorphization" call tree (with the ability to memoize) as my proposed approach can. The effect system approach also doesn't handle raises and async as well.
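As a rough illustration of "the library provides the handlers, the caller only selects one", here is a minimal sketch in Rust (not Mojo, and not the proposal's actual design). Everything in it is hypothetical: the `DivByZero` trait, the `Raise`, `Ignore`, and `Abort` policies, and the `checked_div` and `mean` functions are names invented for this example.

```rust
use std::convert::Infallible;

/// Hypothetical library-defined policy for division by zero.
/// The caller picks a policy; the library supplies the handling code.
trait DivByZero {
    type Error: std::fmt::Debug;
    fn on_div_by_zero(numerator: f64) -> Result<f64, Self::Error>;
}

#[derive(Debug, PartialEq)]
struct DivByZeroError;

/// "raise": surface the problem to the caller as an error value.
struct Raise;
impl DivByZero for Raise {
    type Error = DivByZeroError;
    fn on_div_by_zero(_numerator: f64) -> Result<f64, DivByZeroError> {
        Err(DivByZeroError)
    }
}

/// "ignore": keep the IEEE 754 result, as a math kernel might prefer.
/// The error type is uninhabited, so no error path exists.
struct Ignore;
impl DivByZero for Ignore {
    type Error = Infallible;
    fn on_div_by_zero(numerator: f64) -> Result<f64, Infallible> {
        Ok(numerator / 0.0)
    }
}

/// "abort": terminate the process immediately.
struct Abort;
impl DivByZero for Abort {
    type Error = Infallible;
    fn on_div_by_zero(_numerator: f64) -> Result<f64, Infallible> {
        std::process::abort()
    }
}

/// Leaf function: the only place that consults the policy.
fn checked_div<P: DivByZero>(a: f64, b: f64) -> Result<f64, P::Error> {
    if b == 0.0 {
        P::on_div_by_zero(a)
    } else {
        Ok(a / b)
    }
}

/// Intermediate function: it does not care about the policy,
/// yet it still has to name `P` and forward it explicitly.
fn mean<P: DivByZero>(values: &[f64]) -> Result<f64, P::Error> {
    let sum: f64 = values.iter().sum();
    checked_div::<P>(sum, values.len() as f64)
}
```

Here `mean::<Raise>(&[])` yields `Err(DivByZeroError)`, while `checked_div::<Ignore>(1.0, 0.0)` yields `Ok(f64::INFINITY)`. The explicit `P` that `mean` must carry is the ergonomic cost that implicit, automatically propagated parameters are meant to remove.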

@lattner
Collaborator

lattner commented Jan 21, 2025

Yeah makes sense, I can see some benefit to making the effect system user-extensible to support new kinds of effects.

To your point about "erasing sync vs async differences" because "most code doesn't care", I think that's right, but async code can suspend. Erasing is totally possible, but it means that the code would have to be written in an async correct way, and then sync becomes a special/optimized case. That absolutely makes sense and is possible to support, and is directly analogous to the "raises turns into Never" example: the code has to be written to assume the code is throwing.

One thing though: this means that you'd still have to write code with `await`s; it wouldn't get you out of that.
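A small sketch of "written in an async-correct way, with sync as the special case", again in Rust as an analogy rather than Mojo. The helper `run_sync`, the `NoopWake` type, and the functions `read_config_blocking` and `thread_count` are hypothetical names for this example. The business logic contains an explicit `.await`, but when the underlying read never suspends, a single poll completes it and no runtime is needed.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; sufficient for futures that never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical "sync mode" driver: polls once and expects completion.
fn run_sync<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => panic!("future suspended, so it needs a real executor"),
    }
}

/// Hypothetical low-level read backed by a blocking implementation.
/// It is declared async but contains no suspension point.
async fn read_config_blocking() -> String {
    String::from("threads=4")
}

/// Business logic written once, in async style, with an explicit await.
/// It is generic over how the read is performed.
async fn thread_count<R, Fut>(read: R) -> usize
where
    R: Fn() -> Fut,
    Fut: Future<Output = String>,
{
    let text = read().await;
    text.strip_prefix("threads=")
        .and_then(|n| n.parse().ok())
        .unwrap_or(1)
}
```

Calling `run_sync(thread_count(read_config_blocking))` returns `4` without any executor. Passing a read that really suspends would require a proper runtime, which is the point made above: the code must be correct for the async case, and the sync case is an optimization of it.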

@owenhilyard
Contributor Author

> Yeah makes sense, I can see some benefit to making the effect system user-extensible to support new kinds of effects.

My only concern with extensibility is how much extra cost that adds to the compiler. It might be a good idea to have the MLIR implementation first under the "change at any time" contract and surface that, so that we can run tests and gauge the performance impact of heavy use before we start adding it to the stdlib. Then we can start with defining effects for raises, async, and possibly a few others like block, allocate, OOM, div0, and fperror, those last two being things I know the kernels team wants. Once we have a better idea of how this will typically be used, API design can happen and it can be opened up more widely.

> To your point about "erasing sync vs async differences" because "most code doesn't care", I think that's right, but async code can suspend. Erasing is totally possible, but it means that the code would have to be written in an async correct way, and then sync becomes a special/optimized case. That absolutely makes sense and is possible to support, and is directly analogous to the "raises turns into Never" example: the code has to be written to assume the code is throwing.
>
> One thing though: this means that you'd still have to write code with `await`s; it wouldn't get you out of that.

I agree that this means library authors need to write code as if it was async. However, what the Rust ecosystem has shown us is that most library authors will write async code anyway, so this doesn't add much of an extra burden on them aside from writing a sync path in low-level IO functions.

@JoeLoser JoeLoser added the needs-discussion Need discussion in order to move forward label Jan 22, 2025