Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can we make forgetting an await be an error? #79

Open
njsmith opened this issue Mar 10, 2017 · 65 comments
Open

Can we make forgetting an await be an error? #79

njsmith opened this issue Mar 10, 2017 · 65 comments

Comments

@njsmith
Copy link
Member

njsmith commented Mar 10, 2017

[This issue was originally an umbrella list of all the potential Python core changes that we might want to advocate for, but a big discussion about await sprouted here so I've moved the umbrella issue to: #103]

Original comment:

Can we do anything about how easy it is to forget an await? In retrospect async functions shouldn't implement __call__ IMO... but probably too late to fix that. Still, it kinda sucks that one of first things in our tutorial is a giant warning about this wart, so worth at least checking if anyone has any ideas...

@brettcannon
Copy link
Contributor

The tricky bit with the "forgetting to call await" is passing an async function around as any old object (including through a return statement) is a totally legitimate thing to do. About the only thing I can think of is if you call an async function and don't store its value then you obviously screwed up since you won't even get side-effects for that and so you could have the interpreter raise a warning/exception at that point instead of at destruction time of the object (which should get you the proper call line in the traceback and will work immediately in e.g. PyPy when they don't use refcounting).

The issue in this case is how to reliably detect that an unstored async function was never started (i.e. is having POP_TOP check whether an async function has started good enough to catch this situation and not have any false-positives?).

@njsmith
Copy link
Member Author

njsmith commented Mar 11, 2017

@brettcannon: in a perfect world IMO we would have made await foo(...) expand to something like yield from foo.__acall__(...).__await__(), so you could get the coroutine object if you wanted, but you'd have to write foo.__acall__(...) instead of just foo(...). Obviously this would break a bunch of asyncio and curio code at this point :-( (though curio is switching to the trio style of never passing coroutine objects around as well). The problem is that there's no situation where you need to pass around coroutine objects unless you're implementing a library like trio, but with the way asyncio started with generator-based-coroutines this wasn't obvious. And aiohttp for example definitely does "clever" things like having a nominally-regular function call an async function and return the coroutine object, but all this does is make the code more confusing and mess up tracebacks.

Nominally async/await is still provisional so I guess technically we could make a switch like this, and we could account for it in @asyncio.coroutine to handle most cases, but the aiohttp-style cases would definitely break, which would be pretty brutal...

Or, I think they would break? It's possible they're still using yield from and would be okay?

@brettcannon
Copy link
Contributor

https://aiohttp.readthedocs.io/ says they are still supporting Python 3.4.

@njsmith
Copy link
Member Author

njsmith commented Mar 12, 2017

Maybe there's some hope then? I dunno :-).

It occurred to me that perhaps I should offer to present this list as a talk at the language summit this year – don't know if that would be useful/of interest, but maybe.

@1st1
Copy link

1st1 commented Mar 12, 2017

Nominally async/await is still provisional

Not anymore :)

can we do anything about how easy it is to forget an await? In retrospect async functions shouldn't implement call IMO...

We discussed this at length on python-dev. Here's a summary: https://www.python.org/dev/peps/pep-0492/#pep-3152. The most important point is this one:

# The following code:

  await fut
  await function_returning_future()
  await asyncio.gather(coro1(arg1, arg2), coro2(arg1, arg2))

# would look like:

  cocall fut()  # or cocall costart(fut)
  cocall (function_returning_future())()
  cocall asyncio.gather(costart(coro1, arg1, arg2),
                        costart(coro2, arg1, arg2))

I'd say changing core async/await semantics at this point is not an option. However, we can make it an opt-it (probably).

Idea

One way to implement this is to add a new attribute to frame objects f_inawait and a new opcode BEFORE_AWAIT to set it. So coroutine functions like the one below

async def foo():
    print('hello')
    await bar()

would produce the following opcode:

  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('hello')
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 BEFORE_AWAIT
             10 LOAD_GLOBAL              1 (bar)
             12 CALL_FUNCTION            0
             14 GET_AWAITABLE
             16 LOAD_CONST               0 (None)
             18 YIELD_FROM
             20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE

Where BEFORE_AWAIT would set f_inawait to 1 for the current frame, and GET_AWAITABLE would reset it back to 0.

Now, we have sys.set_corotuine_wrapper() that allows to intercept creation of coroutines. We can use it to raise an error if a coroutine was constructed outside of an await expression:

def coro_wrapper(coro):
    if not sys._getframe(1).f_inawait:
        raise RuntimeError(
            'creating a coroutine outside of await expression')
    return coro

sys.set_coroutine_wrapper(coro_wrapper)

I have a PoC implementation here: https://github.com/1st1/cpython/tree/coro_nocall. And here's a script illustrating how it works: https://gist.github.com/1st1/1cc67287654cc575ea41e8e623ea8c71

If you and @dabeaz think that this would significantly improve trio/curio usability, I can draft a PEP.

Maybe we can even enable this for asyncio code, but I'll need to think more about it. Perhaps we can have some APIs to asyncio for coroutine creation without await and slowly begin to use it in frameworks like aiohttp.

/cc @ncoghlan @ambv @asvetlov

@brettcannon
Copy link
Contributor

I know I would be interested in a talk on this at the language summit. 😄

@njsmith
Copy link
Member Author

njsmith commented Mar 13, 2017

@1st1

If you and @dabeaz think that this would significantly improve trio/curio usability, I can draft a PEP.

I'm not sure if this exact proposal is ideal, but you've at least convinced me that something useful might be doable and so it's worth having the discussion and hashing out our options :-).

I don't have brain to give a proper critique, but as a mini-critique: the two main concerns that pop to mind are: (1) well, it's pretty ugly isn't it, (2) is the frame introspection thing going to kill pypy. Throwing out another idea: maybe await foo(...) can de-sugar to

try:
    corofn = foo.__acall__
except AttributeError:
    corofn = foo
coro = corofn()
await coro

and then there are a few options for how we could arrange for async functions to provide __acall__ while erroring on __call__, like a __future__. Or asyncio could provide a set_coroutine_wrapper class that adds in a __call__ method that forwards to __acall__.

@1st1
Copy link

1st1 commented Mar 13, 2017

(1) well, it's pretty ugly isn't it

I don't find it ugly to be honest. Maybe a bit unusual, but we have many other "unusual" fields on frames and generator objects. Maybe you can find a less ugly variation of this idea?

(2) is the frame introspection thing going to kill pypy.

I envisioned this as a debug-mode (or call it "development mode") only check. In production, turn the debug mode off and buckle up.

Or asyncio could provide a set_coroutine_wrapper class that adds in a call method that forwards to acall.

Now this is pretty ugly and will make asyncio code slower under PyPy ;) Frankly, I don't think we can implement any revision of the__acall__ idea at this point.

FWIW I don't think that my idea with f_inawait is beautiful, but at least it allows to implement the check you want with minimal changes/complications to the interpreter/existing programs. It is simple, so it's very likely we could add it (or a variation of it) to 3.7.

@njsmith
Copy link
Member Author

njsmith commented Mar 13, 2017

I envisioned this as a debug-mode (or call it "development mode") only check. In production, turn the debug mode off and buckle up.

A debug mode would be better than nothing, but if possible I'd really prefer something that we can leave on in general. Imagine if functions in "production mode" responded to getting the wrong number of arguments by silently returning some weird object instead of raising an error – it's kind of the same thing. And debug modes impose a substantial cognitive overhead – you have to document them, every beginner has to remember to turn them on 100% of the time, some of them will fail or forget and we're back to getting confusing errors, etc.

Now this is pretty ugly and will make asyncio code slower under PyPy ;) Frankly, I don't think we can implement any revision of the __acall__ idea at this point.

Conceptually the f_inawait and __acall__ proposals are very similar: in both cases we're passing an extra bit of information down to the coroutine object ("is this call in await context, true/false"), and then it can respond to that bit however. The __acall__ method of passing this information seems more explicit and consistent with how we pass information around in Python normally, that's all...

Re: PyPy, we're talking about a simple wrapper object that has no state and one method that unconditionally calls another method. This is the sort of thing that their JIT eats for breakfast :-). On CPython it's worse because you can't optimize out the allocation of the wrapper object. But if this is a serious problem then instead of a wrapper it could be a flag that we set on the coroutine object (sorta similar to how @types.coroutine works – it could return a wrapper but for efficiency it doesn't). Or we could have a global flag instead, and have async functions do something like:

def __call__(self, *args, **kwargs):
    if not sys._check_special_thread_local_flag():
        raise TypeError("async function called without await")
    return self.__acall__(*args, **kwargs)

I just thought it was a bit nicer to re-use set_coroutine_wrapper rather than introduce a new global flag :-). (Also, for all of these approaches we have flexibility about the default – we could just as easily make it trio/curio that have to set some special flag or whatever we go with.) But in general my point is that the "how do we configure the policy async functions use to handle being called outside await context" part is orthogonal to the "how do async functions tell whether they are in await context or not" part.

@1st1
Copy link

1st1 commented Mar 13, 2017

Now this is pretty ugly and will make asyncio code slower under PyPy ;) Frankly, I don't think we can implement any revision of the__acall__ idea at this point.

To elaborate: any change to PEP 492 semantics will have to be fully backwards compatible for asyncio programs. Any significant drop in performance wouldn't be acceptable either.

What you are proposing is to make parenthesis a part of the await expression. For that you'll need new opcodes. To make your proposal backwards compatible, those new opcodes could execute different versions of async protocol depending on a special flag in the interpreter state. This could really work for code like await a(), or even await a.b(). But what about await a().b()? You'll have to use CALL_FUNCTION at some point, or you will have to reinvent all CALL_* opcodes.

Maybe it's possible to come up with some clever idea here, but the patch (and the PEP) will be bigger than the one for PEP 492.

@njsmith
Copy link
Member Author

njsmith commented Mar 13, 2017

We should probably move this discussion to async-sig or similar?

@1st1
Copy link

1st1 commented Mar 13, 2017

I envisioned this as a debug-mode (or call it "development mode") only check. In production, turn the debug mode off and buckle up.
A debug mode would be better than nothing, but if possible I'd really prefer something that we can leave on in general. Imagine if functions in "production mode" responded to getting the wrong number of arguments by silently returning some weird object instead of raising an error – it's kind of the same thing.

Yeah, I understand it's not an ideal situation. But debug modes isn't something new, a lot of frameworks have them. Generally, people learn that they exist and that they are helpful.

And debug modes impose a substantial cognitive overhead – you have to document them, every beginner has to remember to turn them on 100% of the time, some of them will fail or forget and we're back to getting confusing errors, etc.

Just make the debug mode default. And print to stdout that it's ON, and that it should be turned off in production.

The acall method of passing this information seems more explicit and consistent with how we pass information around in Python normally, that's all...

I think you don't see the full picture of how exactly this can be implemented in CPython. Maybe I'm wrong here, but I don't think this is possible. It would help if you could come up with a more detailed proposal (proto PEP) where you describe:

  1. changes to grammar/AST.
  2. changes to how await expression is compiled.
  3. changes to the __await__ protocol.

I'd be happy to help you with coding the PoC if you can show that your idea is possible to implement.

We should probably move this discussion to async-sig or similar?

I actually like the GH UI much more than my email client :) You can format your code here etc. Maybe you can just send a short email to async-sig inviting interested people to join the discussion here?

@Carreau
Copy link
Contributor

Carreau commented Mar 13, 2017

I've hit the issue a few time in IPython while live coding, I'll be happy to write an extension that at least warn users on the REPL.

@1st1
Copy link

1st1 commented Mar 13, 2017

await foo(...) can de-sugar to

try:
    corofn = foo.__acall__
except AttributeError:
    corofn = foo
coro = corofn()
await coro

And what will you de-sugar await foo().bar(spam=ham()) to? Would await foo().bar(spam=ham()).fut be acceptable?

@1st1
Copy link

1st1 commented Mar 13, 2017

I've hit the issue a few time in IPython while live coding, I'll be happy to write an extension that at least warn users on the REPL.

Yeah, I think we'll have it in 3.7 one way or another :)

@dabeaz
Copy link

dabeaz commented Mar 13, 2017

Honestly, forgetting to put an await on things is not something that I've lost a lot of sleep thinking about. Python already provides a warning message about unawaited coroutines. And if you forget the await, it is usually readily apparent that something is quite wrong in other ways. There are parts of Curio that instantiate coroutines and pass them around to other locations to be awaited later. So, I certainly wouldn't want to see a requirement that forces the await to be there. I'd also add that with decorators and feature such as functools.partial() it gets pretty complicated.

One feature that'd be really useful from the Curio front would be some way to know whether a callable has been triggered from within the context of a coroutine or not. For example, decorators could use something like that to know more about the context in which an operation is taking place.

@1st1
Copy link

1st1 commented Mar 13, 2017

Honestly, forgetting to put an await on things is not something that I've lost a lot of sleep thinking about. Python already provides a warning message about unawaited coroutines. And if you forget the await, it is usually readily apparent that something is quite wrong in other ways.

Yeah, this has been my experience too. People do forget to put await sometimes when they begin writing async code, but usually understand why the program doesn't work pretty quickly.

If we can find a way to make missing awaits trigger an actual error - why not, but I'd be -1 if it requires a huge new PEP and a change to async/await semantics.

One feature that'd be really useful from the Curio front would be some way to know whether a callable has been triggered from within the context of a coroutine or not.

You can use sys.set_coroutine_wrapper to wrap all coroutines into some awaitable object that enables the tracking. Would only recommend to do that for some debug mode.

@njsmith
Copy link
Member Author

njsmith commented Mar 13, 2017

@1st1: I suspect that this isn't the ideal implementation, but to demonstrate that my idea can be precisely specified: for parsing, we could parse just like we do now, and then do a post-processing pass over the AST where we replace all terms of the form Await(value=Call(...)) with a term of the form AwaitCall(...). I believe that gives exactly the semantics I'd propose. And a possible bytecode implementation would be to add a bytecode RESOLVE_AWAITCALL which does (excuse the horrible pseudocode, hopefully the intent is clear):

tmp = POP()
try:
    PUSH(tmp.__acall__)
except AttributeError:
    PUSH(tmp)

and then AwaitCall(...) compiles into the same bytecode as Call(...) would, except with a RESOLVE_AWAITCALL inserted at the appropriate place.

@dabeaz

Honestly, forgetting to put an await on things is not something that I've lost a lot of sleep thinking about.

Out of curiosity, since I know you do a fair amount of training, have you done any async/await training yet? Part of the trigger for this thread was that when I wrote the trio tutorial aiming at teaching folks who hadn't been exposed to async/await before, I felt obliged to put in a big section up front warning them about this pitfall, since my impression is that it bites basically everyone repeatedly when they're learning. Also it doesn't help that in pypy the warning is delayed for an arbitrarily long time. (In particular, if you're at the REPL you probably just won't get a warning at all).

There are parts of Curio that instantiate coroutines and pass them around to other locations to be awaited later. So, I certainly wouldn't want to see a requirement that forces the await to be there.

From this comment, I was under the impression that curio users would no longer be instantiating coroutines and passing them around. Is that right?

If curio internally wants to do that, then of course that's fine – any proposal is going to have to have some way to get at a coroutine object, or we could never start our coroutine runners at all :-). The idea is just that since this is something that only coroutine runners need to do, it should have a different spelling that users won't hit by accident.

I'd also add that with decorators and feature such as functools.partial() it gets pretty complicated.

Ugh, that's a great point. I guess in any proposal, wrappers like functools.partial() would need some explicit strategy to pass on the calling context to the wrapped function. In the __acall__ approach I can see how this would work in principle, though it's a bit cumbersome: partial objects would need an __acall__ method that defers to the wrapped object's __acall__ method. @1st1, in the frame introspection approach, how would you handle functools.partial?

@1st1
Copy link

1st1 commented Mar 13, 2017

@1st1, in the frame introspection approach, how would you handle functools.partial?

Looks like it's already working in my PoC :)

async def baz():
    print('hi')


async def bar():
    await baz()  # try removing "await"


async def foo():
    coro = functools.partial(bar)
    await coro()

^-- works just fine.

@1st1
Copy link

1st1 commented Mar 13, 2017

AST where we replace all terms of the form Await(value=Call(...)) with a term of the form AwaitCall(...).

But you wouldn't handle cases like this then:

await one_coroutine(another_coroutine())

@njsmith
Copy link
Member Author

njsmith commented Mar 13, 2017

@1st1: I'm guessing that's a side-effect of functools.partial being implemented in C [and thus being invisible to the frame traceback machinery]? I'm not sure if that's a point in favor or against, given that we prefer not to have magical user-visible changes in behavior between C and Python implementations :-). But in any case, ok, how would you implement something like functools.partial in Python?

await one_coroutine(another_coroutine())

Err, that's supposed to be an error, right, and both proposals on the table would treat it as such? What do you mean by "wouldn't handle"?

@njsmith njsmith changed the title Put together a Python 3.7 wishlist Can we make forgetting an await be an error? Mar 13, 2017
@1st1
Copy link

1st1 commented Mar 13, 2017

But in any case, ok, how would you implement something like functools.partial in Python?

Instead of looking at the last frame you can traverse until you see f_inawait. Isn't ideal though.

TBH I didn't think that you want both things at the same time:

  • better reporting about forgotten await (prohibit calling coroutines without await);
  • and a way to call coroutines without await (for partials).

I thought you want something as radical as PEP 3152, so my PoC does exactly that: it tries to enforce instantiating coroutines only in await expressions. functools.partial works by an accident.

What do you mean by "wouldn't handle"?

I mean that __acall__ will be used only by the await expression. The idea was to throw an error when a user simply calls a coroutine but doesn't await on it. But if you want to support partials (and code that creates similar wrappers) both our proposals wouldn't work.

You can only pick one thing:

  • always raise an error if a coroutine is called without an await.
  • allow coroutines to be called, to support wrappers/partials/decorators/etc (status quo).

@njsmith
Copy link
Member Author

njsmith commented Mar 13, 2017

@1st1: I don't care about partial magically working; I just want it to be possible for it to work :-). Basically I think it should be possible for wrappers/partials/decorators to work, but it should require some intentional action on the part of the person writing the wrapper. (And of course functools.partial should take this action.)

Instead of looking at the last frame you can traverse until you see f_inawait. Isn't ideal though.

I don't think this works, because in practically any context where you can potentially forget an await, you'd have some frame in your callstack that has f_inawait set.

@1st1
Copy link

1st1 commented Mar 14, 2017

Basically I think it should be possible for wrappers/partials/decorators to work, but it should require some intentional action on the part of the person writing the wrapper. (And of course functools.partial should take this action.)

At this point this all starts to sound very complicated to me. Given that:

  • asyncio won't use this new protocol in the foreseeable future;
  • we aren't certain that it's crucial to have this feature for curio/trio users; so far everybody's been doing great and I haven't seen any await-related complaints on SO/reddit/HN;
  • I'm not sure it would be a good thing that there will be some subtle differences in how you write/use asyncio coroutines vs curio/trio coroutines;
  • to implement this, besides your idea with AwaitCall(..) we'll need to add a global flag to the interpreter state to be able to dynamically change how async/await protocol works; this is kind of unprecedented in Python;

I'd say that all the work we'll need to do to implement this isn't really worth it (at least in 3.7).

@dabeaz
Copy link

dabeaz commented Mar 14, 2017

Personally, I feel like things are already rather complicated dealing with three different function types (synchronous, generators, coroutines). However, the fact that all of these things are basically "first class" and can be passed around freely opens up a lot of freedom regarding their use. All things equal, I'm not sure placing restrictions on how these functions get used in different contexts would make the situation better. If anything, it might make everything even more complicated.

Honestly, I don't have any deep thoughts regarding this with respect to Curio--I feel like the current behavior is something I can work with. It would be nice to have an easier way to know if you've been called by a coroutine for the purpose of some advanced decorator writing (Curio currently finds out via frame hacking). However, that's really a slightly different issue than this.

Out of curiosity, since I know you do a fair amount of training, have you done any async/await training yet? Part of the trigger for this thread was that when I wrote the trio tutorial aiming at teaching folks

I do not teach people async/await in training classes--although I very briefly mention it near the end of my advanced course. I'm not sure how fully people appreciate just how far beyond the day-to-day experience of most developers this async/await stuff is.

@ncoghlan
Copy link

As an argument against doing any special-casing here, I'll note that a similar problem exists for non-reusable context managers, and the "solution" for that is just a section in the contextlib documentation going over the problems that can arise if you attempt to reuse a context manager that doesn't support it: https://docs.python.org/3/library/contextlib.html#single-use-reusable-and-reentrant-context-managers

If we'd tried to bake any rules about ensuring that defined contexts were actually used into the language itself, then things like contextlib.ExitStack would have been much harder to write.

Heck, for folks coming from languages like MATLAB and Ruby with implicit call support, we even sometimes get complaints about statements like print "silently doing nothing" instead of being an alias for print().

That's just the nature of the beast - by making things first class entities that can be referenced without being invoked, we also make it possible for people to forget to invoke them.

Missing an await in that regard isn't really any different from missing a with or omitting trailing parentheses, it's just much newer than either of the latter.

@dabeaz
Copy link

dabeaz commented Mar 14, 2017

Oh, that's a good on print without parens. Actually, a very common mistake I see in training classes is people forgetting to put parens on methods taking no arguments (e.g., f.close). So, forgetting to include some critical piece of syntax (with, (), etc.) is certainly something that affects many other parts of Python. Nature of the beast maybe.

@njsmith
Copy link
Member Author

njsmith commented May 27, 2017

@Carreau's proposal has the nice property that the errors are prompt and don't depend on the garbage collector. It turns out that this is a real problem: I was trying to figure out how to at least teach trio's test harness to detect these warnings so it can raise a real error, and in many cases it's easy enough to do by intercepting the warning message. And then we can make it more robust by forcing the cycle collector to run at the end of each test to make sure that warnings are emitted. But... consider code like data = sock.recv(100); await sock.sendall(data)TypeError: sendall expected a <bytes>, not a <coroutine>. Here the coroutine is stored in a local variable, so the exception traceback will pin the coroutine object in memory until the test framework is done doing whatever it wants to do with the traceback, and it's tricky to write a test plugin that can change the outcome of a test after the test framework has already processed and discarded the traceback. But that's what we need to do if we want to correctly classify this test. Writing test plugins is already an exercise in delicate monkeypatching and this makes it extra super tricky.

The proposal is possible to implement now using sys.setcoroutinewrapper. The problem is that creating a wrapper object for every coroutine call probably adds some noticeable overhead, so it's hard to justify as an always-on feature... maybe the overhead is ok in a test suite though? I suppose I should measure instead of speculating, but it's certainly going to be something.

@1st1: Based on this I think I should change my request from our discussion last week: instead of having a global counter of how many times the coroutine '...' was never awaited warning was emitted, I think what I actually want is a global count of how many coroutine objects have been created but never awaited. (So __call__ increments, and the first send/throw/__next__ decrements.) At moments when everything is supposed to be finished running (e.g., the end of a test) then this should give the same results as my original idea, except that the new version is robust against the issue with tracebacks pinning coroutine objects in memory. And during normal execution, it works just as well as something for me to check occasionally inside the run loop, while being much more prompt about detecting errors. (E.g., it can actually give prompt and reliable errors when using a REPL under PyPy, which is hopeless when using the garbage collector.)

(I think by "global" I mean "thread local".)

@ncoghlan
Copy link

@Carreau An API worth looking at to potentially improve debuggability when running under tracemalloc: https://docs.python.org/3/library/tracemalloc.html#tracemalloc.get_object_traceback

That will let you report where the un-awaited object was allocated in addition to the location where you noticed that it hadn't been awaited yet.

@njsmith
Copy link
Member Author

njsmith commented May 27, 2017

I just filed a bug w/ CPython requesting the optimized version of this, see: https://bugs.python.org/issue30491

@Carreau
Copy link
Contributor

Carreau commented May 27, 2017

@ncoghlan sweet, I'll look at it.

I gave a shot at an implementation with set_coro_wrapper in #176.

@njsmith
Copy link
Member Author

njsmith commented Jun 6, 2017

Some notes because in a year or whatever I'll probably look back at this and want to remember:

I talked about this at the language summit, and talked to Yury about it afterwards (slides). The more ambitious proposal in those slides is similar but not identical to one in the thread up above:

  • leave await foo alone
  • make await foo(*args, **kwargs) sugar for:
    if hasattr(type(foo), "__acall__"):
        awaitable = type(foo).__acall__(foo, *args, **kwargs)
    else:
        awaitable = foo(*args, **kwargs)
    return await awaitable
  • trio can use a @decorator (or maybe eventually a __future__) to make its async functions provide __awaitcall__ but give an error on __call__
  • asyncio use a @decorator to make its async functions provide an __awaitcall__ that returns the coroutine object (so await foo() is fast), an a __call__ that returns a Task (so people don't need to worry about calling ensure_future and stuff, you can think of async functions as being things that return Futures when called normally, which is just like how traditional APIs worked. And now async functions and traditional Future-returning functions can be used in exactly the same ways, and await foo() and foo().add_callback(...) are maximally efficient regardless of how foo is implemented)
  • we'd also of course need to update functools.partial to accept both __acall__ and __call__ and pass them through

See, it's cleverly designed to have something for everyone!

As I understood it, Yury had two main objections to this:

  1. asyncio users don't really want async functions to return Tasks, they actually just want bare coroutine objects that they can pass to asyncio.gather or whatever. ...on further thought this doesn't make a lot of sense to me though, because asyncio.gather will just wrap any any bare coroutine objects in Tasks anyway, so my proposal doesn't hurt anything. Likely I misunderstood something. (Or maybe he just meant that this would only be a minor improvement for asyncio rather than a major one. I personally definitely find it confusing that asyncio has at least 3 different kinds of async functions – ones that return Futures, ones that return coroutine objects, and ones that return Futures but that are documented as returning coroutine objects because they want to reserve the right to change them in the future – and collapsing these down to just have one type of async function seems like a win to me. But maybe it's only a minor win, I dunno.)

  2. More importantly, he's concerned that implementing this would be really difficult, because (a) modifying the parser is annoying, (b) there are a whole bunch of opcodes for calling functions (with or without * or ** args, an optimized CALL_METHOD, etc. etc.), and this adds a whole new type of calling operation, so we'd need to double the number of call opcodes and duplicate a bunch of code and ugh.

So if/when someone decides to take this up again, I think these are the main things that would need answers.

I guess the obvious thing to try to resolve objection (2) would be the RESOLVE_AWAITCALL opcode suggested up-thread. I'm not sure if Yury has some reason in mind why it wouldn't work.

@ncoghlan
Copy link

ncoghlan commented Jun 6, 2017

There's a more fundamental objection to special casing await foo(): it breaks the notion of "in the absence of scope changes, any subexpression can be replaced with a name referring to that subexpression without altering the semantics of the overall expression"

That is:

await foo()

would no longer mean the same thing as:

extracted = foo()
await extracted

and that's simply not OK, even if you come up with some clever implementation hacks to enable them to be different.

@njsmith
Copy link
Member Author

njsmith commented Jun 6, 2017

@ncoghlan: OK, but... that's a circular argument. My proposal only breaks substitutability of subexpressions if we decide that in await foo(1, 2, 3), the foo(1, 2, 3) part is a subexpression. But that's exactly the point under discussion.

Analogy: In modern Python, this:

foo(1, 2, 3)

does not mean the same thing as:

extracted = 1, 2, 3
foo(extracted)

It didn't have to be this way – in fact, once upon a time these were equivalent! If you search Misc/HISTORY for "sleepless night" you can see Guido agonizing about whether to change it. But the current behavior is fine and doesn't violate your dictum (even though it looks like it should), because we decided to define ...(..., ..., ...) as being a single piece of syntax, rather than being composed out of call parentheses ...(...) plus tuple constructor commas ..., ..., .... In fact it's obvious now that this change made the language better.

My contention is that Python would also be a better language – easier to understand, teach, use, etc. – if we were to decide that in the expression await foo(1, 2, 3), the await ...(..., ..., ...) part was defined as a single piece of syntax, instead of being composed out of await ... plus call parentheses ...(..., ..., ...). It's not a trivial point – the trio tutorial adopts this stance when teaching async/await, and I had multiple people seeking me out at PyCon specifically to thank me for that tutorial, saying it was the only async/await explanation they'd ever understood. And the tutorial's explanation isn't oversimplified at all; it really teaches you everything you need to know to write arbitrary programs using trio. Yet it's still 2x longer than it needs to be, because literally half of the explanation is about how to cope with the problems caused by foo(1, 2, 3) being a subexpression. (See slide 34.)

You can certainly disagree with my claim about how Python should work, or agree with it but feel that it's too late to do anything. But you can't say that that a proposal to change the definition of subexpressions is bad because it violates substitutability of subexpressions :-)

@ncoghlan
Copy link

ncoghlan commented Jun 6, 2017

OK, I think I see your argument now, and given the leading await you should be able to make it work even within the constraints of the LL(1) parsing restriction (it isn't substantially different from def ...(..., ..., ...) in that regard, especially if you restrict the intervening subexpression to being a name lookup).

That said, while it doesn't read as well, you could likely more easily experiment with an await from foo(1, 2, 3) formulation that switched on the from keyword, rather than the await subexpression being a call construct.

@dabeaz
Copy link

dabeaz commented Jun 6, 2017

One the features I most like about Python is its flexibility. I would not want Python to be modified in a way that forces me to put an await on every instantiation of a coroutine. Specifically, I want this to work:

extracted = foo()
...
await extracted

@ncoghlan
Copy link

ncoghlan commented Jun 6, 2017

@dabeaz @njsmith isn't proposing breaking that, he's just proposing to have it mean something different from await foo(), just as extracted = 1, 2, 3; foo(extracted) already means something slightly different from foo(1, 2, 3).

Extending the analogy to other forms:

  • foo(1, 2, 3) can be expanded as extracted = 1, 2, 3; foo(*extracted)
  • extracted = 1, 2, 3; foo(extracted) is actually equivalent to foo((1, 2, 3))

Comparable equivalents for @njsmith's proposal might then look like:

  • await foo(1, 2, 3) could be expanded as extracted = partial(foo, 1, 2, 3); await from extracted
  • extracted = foo(1, 2, 3); await extracted would become equivalent to await (foo(1, 2, 3))

And putting it that way, I do think the right place to start is to figure out what the "arg expansion" equivalent for @njsmith's proposal would actually look like, and I think await from expr with a suitably updated functools.partial implementation is a promising candidate (due to the yield from precedent).

@njsmith
Copy link
Member Author

njsmith commented Jun 6, 2017

I see that my plan of leaving those notes at the bottom of this thread so I can find them again easily when I revisit this next year is not happening :-)

@dabeaz: even in my ideal world, the only thing that code would have to change is that you'd write extracted = foo.__acall__() (or however we ended up spelling it) to explicitly signal that you intentionally left off the await and really do want the coroutine object rather than the result of calling foo. But the proposal described above doesn't even go that far: all it proposes is to let people writing async functions opt in to requiring the explicit .__acall__() if they want, so it wouldn't affect you at all unless you decided to add this to curio as a feature. (And I guess if it's implemented and is wildly successful then we might get consensus to do a long deprecation period and then switch the default to requiring the explicit __acall__, and even then libraries could still opt-out. But I don't think you need to worry about that right now!)

Really, the proposal is just to provide a simple and reliable way to let async functions know whether or not they got called with await, and then do whatever they want with this information. It's actually more flexible than what we have now, and tbh seems like the sort of thing you might be able to find some terrifying use for...

@ncoghlan
Copy link

ncoghlan commented Jun 6, 2017

@njsmith It's probably the opposite of comforting, but my writing PEP 432 was originally motivated by getting __main__ to play nice with the pure Python import system in 3.3, and we've only just begun to move forward with the implementation as a private startup refactoring for 3.7 :)

@njsmith
Copy link
Member Author

njsmith commented Aug 27, 2017

Stopgap measure design notes

I was thinking some more about a possible stopgap measure that we might be able to sneak into 3.7 to make something like #176 fast. Basically the idea would be to get just enough help from the interpreter to make checking for unawaited coroutines fast enough that we can afford to do it at every context switch.

(There was some previous discussion of these ideas in #176 starting here: #176 (comment))

In more detail, the requirements would be:

  • At a "barrier" (= yield point or catch-all exception handler), reliably and quickly detect whether there are any unawaited coroutines created since the last barrier

  • If so, get a list of them so we can give a nice error message

  • Also, we want to be able to disable this checking on a per-task basis, because some tasks might be "hosting" asyncio code (umbrella bug: asyncio interoperability #171). So we need to be able to "consume" the list of unawaited coroutines -- if one task creates some and we think that's OK and leave them be, then we don't want to re-detect those coroutines the next time we check.

There are two fairly natural ways to do this that come to mind. They both involve the same basic strategy: adding two pointers to each coroutine object, which we can use to create a double-linked list holding the coroutines of interest (with the list head stored in the threadstate). The nice thing about an intrusive double-linked list like this is that it's a collection where adding and removing are both O(1) and very cheap (just updating a few pointers).

API option 1

Keep a thread-local list of live, unawaited coroutines. So coroutines always insert themselves into the list when created, and then remove themselves again when they're either iterated or deallocated.

Then we need:

  • A way to get and clear the list of coroutines: sys.get_and_clear_unawaited_coroutines()

  • A way to register a hook that's called whenever an unawaited coroutine is garbage collected: sys.set_unawaited_coroutine_gc_hook, sys.get_unawaited_coroutine_gc_hook

def unawaited_coroutine_gc_hook(coro):
    _gced_unawaited_coros.add(coro)

def barrier():
    live_unawaited_coros = sys.get_and_clear_unawaited_coros()
    if _gced_unawaited_coros or live_unawaited_coros:
        # do expensive stuff here, checking if this task is hosting asyncio, etc.
        ...

(We need to have both hooks to handle the two cases of (a) a coroutine is created and then immediately garbage collected in between barriers, (b) a coroutine is created and then still live when we hit the barrier.)

For speed, we might want to also have a sys.has_unawaited_coros() that returns a bool, so that we don't need to bother calling get_and_clear_unawaited_coros every time and pay the price of allocating an empty list object just to throw it away. %timeit bool([]) says 127 ns and %timeit get_coroutine_wrapper() (used here as a proxy for a trivial function that just returns a value from the threadstate) says 37 ns, so it's faster but they're both fast enough I don't know whether it matters without some kind of testing.

A minor awkward question here is whether the regular unawaited coro warning would be issued at the same time our hook is called, or whether we'd have some way to suppress it, or...

Oh wait, there's also a more worrisome problem: if we see a coroutine is unawaited and live, and decide that it's OK... what happens if it later gets garbage collected and is still unawaited? We'll see it again, but now in some random other context. I guess what we could do is that when we detect live unawaited coroutines and we're in asyncio-friendly mode, we put them into a WeakSet (or even a WeakDict to track which task created them), and then our gc hook checks to see if they're in the WeakSet. Except... I think __del__ might run after weakrefs are cleared? If so then this whole approach is probably sunk. Or maybe we could somehow mark the coroutines that we've seen before? You can't attach arbitrary attributes to coroutines. I suppose we could do some nasty hack involving the coroutine's locals... those should still be accessible from __del__.

API option 2

In the design above, the need to handle GCed unawaited coroutines is the cause of lots of different kinds of awkwardness. What we could do instead is make our tracking list a strong reference, so that coroutines insert themselves when created, and then stay there either until they're iterated or else we remove them manually.

Obviously this can't be enabled by default, because it would mean that instead of warning about unawaited coroutines, the interpreter would just leak them. So the API would be something like:

sys.enable_unawaited_coroutine_tracking(), sys.disable_unawaited_coroutine_tracking(), sys.get_and_clear_unawaited_coroutines()

And then everything else is pretty straightforward and obvious. (Though we still have the question about whether we'd want a sys.has_unawaited_coroutines() for speed.)

PyPy's opinion

I asked @arigo about how these options look from the point of view of PyPy. He said that if sys.get_and_clear_unawaited_coros() was written carefully, then it would probably be possible for the JIT to inline and optimize out the unnecessary empty list, so that's a useful data point. (Though I guess it might still be better to save it the trouble of needing to, esp. since you're not always in JIT mode.)

He also had a strong preference for the second API on the general grounds that doing anything from __del__ methods is just asking for trouble and creates all kinds of problems for alternative interpreter implementations. Obviously they support it, but it's not cheap. In this particular case it's not a huge practical difference currently, b/c coroutines already have a __del__ method (the one that prints the coroutine '...' was never awaited message, or potentially throws GeneratorExit – though PyPy is clever enough to avoid generating a __del__ method just for the latter if there are no yield points with exception handlers). But he felt that option 2 was just simple from an abstract "let's try to make it easier rather than harder to write a python implementation" perspective. (And who knows, maybe there's some chance we could someday get rid of coroutine '...' was never awaited.)

Other possible consumers

Nick points out that this kind of list structure might also be useful for more general introspection, similar to threading.enumerate(): #176 (comment)

Curio, possibly. Not sure, would have to check with Dave.

I think pytest-asyncio and similar libraries might find this useful to easily and reliably catch unawaited coroutines and attribute them to the right test. [Edit: asked pytest-asyncio here: https://github.com/pytest-dev/pytest-asyncio/issues/67 ]

@dabeaz
Copy link

dabeaz commented Aug 27, 2017

Forgetting to put await on coroutines is not a problem that I'm concerned about. It is unlikely that I would use this in Curio.

@ncoghlan
Copy link

As far as the __del__ comments go, coroutine objects are hashable, so WeakSet seems like it would be a better fit for this purpose than doing custom state manipulation in __del__.

I'll also note that everything except the "already GC'ed before the next check for unawaited coroutines" case can already be handled by calling sys.set_coroutine_wrapper with something like (not tested, but should be in the right general direction):

_coro_tracking = threading.local()
_coro_tracking.created = weakref.WeakSet()
_coro_tracking.started = weakref.WeakSet()

def track_coroutines(coro_def):
    @functools.wraps(coro_def)
    def make_tracked_coro(*args, **kwds):
        inner_coro = coro_def(*args, **kwds)
        _coro_tracking.created.add(inner_coro)
        @types.coroutine
        def tracked_coro():
            _coro_tracking.started.add(inner_coro)
            _coro_tracking.created.remove(inner_coro)
            yield from inner_coro
        return tracked_coro(*args, **kwds)
    return make_tracked_coro

That way, the interaction with the GC would just be normal WeakSet interaction, while the interaction with coroutine definitions would use the existing wrapper machinery (which trio is presumably hooking into already).

If you instead want something more like your second case, then switch the created set to being a regular set such that created coroutines can't be GC'ed before they're seen by the event loop. Whether or not to configure that by default would be up to the author of the wrapper function.

@njsmith
Copy link
Member Author

njsmith commented Aug 29, 2017

@ncoghlan Yeah, but my concern is that we won't be able to make the speed workable like that. Since this adds per-call and per-context-switch overhead it's probably worthwhile trying to get it as low as possible. But one reason for working through the thoughts here before taking something to bpo is to figure out exactly what semantics we'd want so we can prototype it with set_coroutine_wrapper and see what happens :-).

@ncoghlan
Copy link

yield from should get your regular context switch overhead back to near-zero, so you're mainly looking at a bit of extra function call overhead per coroutine created. You also have the option of putting the enablement of the wrapper behind an if __debug__: guard.

Anyway, probably the most useful feedback is that your suggested API option 2 is likely the most viable approach, where the default behaviour is to use a WeakSet (so only non-GC'ed references are checked, which is sufficient for the result = unawaited_coro() case), and you have an opt-in toggle to switch to using a regular set instead (allowing you to also check for the unawaited_coro() case).

@njsmith
Copy link
Member Author

njsmith commented Sep 3, 2017

Meanwhile back in the subthread about possibly making await ...(...) a single piece of syntax:

OK, I think I see your argument now, and given the leading await you should be able to make it work even within the constraints of the LL(1) parsing restriction

I was nervous about this b/c I don't know a lot about parsing, but actually it looks trivial: the grammar rule that currently handles await expressions is atom_expr: [AWAIT] atom trailer*, where AWAIT is the await token and trailer* is the function call parentheses (among other things). So in fact await ...(...) already is a single piece of syntax as far as the parser is concerned; it's only split into await ... and ...(...) during the concrete syntax → AST transformation.

@njsmith
Copy link
Member Author

njsmith commented Sep 16, 2017

Another reason it would be useful to allow foo to tell whether it was called synchronously foo(...) or asynchronously await foo(...): it would make it possible to transition a function from synchronous to asynchronous or vice-versa with an intermediate period where both work and one emits a DeprecationWarning. Right now this is mostly impossible outside of some special cases. (I think you can do it iff you have a function that has no return value and is going async→sync.)

(I ran into this with trio's bind method, see #241, but I expect it will be much much more common in downstream libraries.)

@njsmith
Copy link
Member Author

njsmith commented Jan 29, 2018

Note to self: think through a version of the __awaitcall__ idea where there's something like a context-local set_coroutine_wrapper, but it's only invoked by __call__, not __awaitcall__, so await asyncfn(...) stays just as fast, but the coroutine runner can make async def functions called normally return Futures on asyncio and Deferreds on twisted – thus bringing their async/await usability in line with that of C#/Javascript – and trio can make async def functions called normally raise an error.

Some things to watch out for: wrapper functions, functools.partial; preserving await fut + await sync_fn_returning_future(); preserving asyncio.run(asyncfn()); fast access to __awaitcall__ from C and Python (the latter requiring the preservation of funcall optimizations) – maybe __awaitcall__ has an unusual type slot like PyObject* (*awaitcall)(void)?

@vlad0337187
Copy link

vlad0337187 commented Oct 17, 2018

I think, that in more than 90% of code, we just write await before launching async functions.
Think, it would be better to just omit await and wait for result from async functions automatically.

If task needs to be spawned - nursery can be used.

I think, that it could be implemented somehow. If call to async function returns just coroutine, maybe, there could be created self-executed coroutines to launch just after they were created.

@maxfischer2781
Copy link

maxfischer2781 commented Nov 5, 2019

I think it is important to note that a major reason why most async is just a regular await coro() may be that the ecosystem for async is seriously underdeveloped at this point. In other words, the bare await coro() may not stay as dominant as it is now.

For functions, methods and generators, it is accepted practice to treat them as first-class objects -- passing them around, wrapping them, storing them. People generally learn pretty quickly that func, obj.meth and gen() are "things" that fit well into partial, sorted, enumerate, producing yet again "things". This is helped by the standard library shipping with all of these wrappers and things that make it natural to experience functions, methods and generators as "things" en par with other "things".

Right now, everything async is very clearly a different breed of "thing". The most obvious case is that enumerate, map, filter do not work on async iterables, which makes common sync patterns very painful to do async. There is the subtle case that coroutines do not work as map/reduce/... operations, or property setters, they don't have literals (lambda), and so on -- which means there is simply not much else to do with coroutines than await them.

The only case where async and sync are on equal footing is () -- which trio also relies on for every case where partial is recommended. Further special-casing await ...() versus await ... and ...() will likely make it even more difficult to treat coroutines like things equal to other constructs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants