Can we make forgetting an await be an error? #79
The tricky bit with the "forgetting to call The issue in this case is how to reliably detect that an unstored async function was never started (i.e. is having |
@brettcannon: in a perfect world IMO we would have made Nominally async/await is still provisional so I guess technically we could make a switch like this, and we could account for it in Or, I think they would break? It's possible they're still using |
https://aiohttp.readthedocs.io/ says they are still supporting Python 3.4. |
Maybe there's some hope then? I dunno :-). It occurred to me that perhaps I should offer to present this list as a talk at the language summit this year – don't know if that would be useful/of interest, but maybe. |
Not anymore :)
We discussed this at length on python-dev. Here's a summary: https://www.python.org/dev/peps/pep-0492/#pep-3152. The most important point is this one:

    # The following code:
    await fut
    await function_returning_future()
    await asyncio.gather(coro1(arg1, arg2), coro2(arg1, arg2))

    # would look like:
    cocall fut()  # or cocall costart(fut)
    cocall (function_returning_future())()
    cocall asyncio.gather(costart(coro1, arg1, arg2),
                          costart(coro2, arg1, arg2))

I'd say changing core async/await semantics at this point is not an option. However, we can make it an opt-in (probably).

Idea

One way to implement this is to add a new attribute to frame objects

    async def foo():
        print('hello')
        await bar()

would produce the following opcode:

Where Now, we have

    def coro_wrapper(coro):
        if not sys._getframe(1).f_inawait:
            raise RuntimeError(
                'creating a coroutine outside of await expression')
        return coro

    sys.set_coroutine_wrapper(coro_wrapper)

I have a PoC implementation here: https://github.com/1st1/cpython/tree/coro_nocall. And here's a script illustrating how it works: https://gist.github.com/1st1/1cc67287654cc575ea41e8e623ea8c71

If you and @dabeaz think that this would significantly improve trio/curio usability, I can draft a PEP. Maybe we can even enable this for asyncio code, but I'll need to think more about it. Perhaps we can add some APIs to asyncio for coroutine creation without await and slowly begin to use them in frameworks like aiohttp.
I know I would be interested in a talk on this at the language summit. 😄 |
I'm not sure if this exact proposal is ideal, but you've at least convinced me that something useful might be doable and so it's worth having the discussion and hashing out our options :-). I don't have brain to give a proper critique, but as a mini-critique: the two main concerns that pop to mind are: (1) well, it's pretty ugly isn't it, (2) is the frame introspection thing going to kill pypy.

Throwing out another idea: maybe

    try:
        corofn = foo.__acall__
    except AttributeError:
        corofn = foo
    coro = corofn()
    await coro

and then there are a few options for how we could arrange for async functions to provide
I don't find it ugly to be honest. Maybe a bit unusual, but we have many other "unusual" fields on frames and generator objects. Maybe you can find a less ugly variation of this idea?
I envisioned this as a debug-mode (or call it "development mode") only check. In production, turn the debug mode off and buckle up.
Now this is pretty ugly and will make asyncio code slower under PyPy ;) Frankly, I don't think we can implement any revision of the FWIW I don't think that my idea with |
A debug mode would be better than nothing, but if possible I'd really prefer something that we can leave on in general. Imagine if functions in "production mode" responded to getting the wrong number of arguments by silently returning some weird object instead of raising an error – it's kind of the same thing. And debug modes impose a substantial cognitive overhead – you have to document them, every beginner has to remember to turn them on 100% of the time, some of them will fail or forget and we're back to getting confusing errors, etc.
Conceptually the

Re: PyPy, we're talking about a simple wrapper object that has no state and one method that unconditionally calls another method. This is the sort of thing that their JIT eats for breakfast :-). On CPython it's worse because you can't optimize out the allocation of the wrapper object. But if this is a serious problem then instead of a wrapper it could be a flag that we set on the coroutine object (sorta similar to how

    def __call__(self, *args, **kwargs):
        if not sys._check_special_thread_local_flag():
            raise TypeError("async function called without await")
        return self.__acall__(*args, **kwargs)

I just thought it was a bit nicer to re-use
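To make the `__acall__` idea from the comments above concrete, here is a minimal pure-Python sketch. It is only an emulation under stated assumptions: `strict_async` and `acall` are hypothetical names invented for illustration, nothing in CPython looks up `__acall__`, and the helper `acall(fn, ...)` stands in for what the proposal would have `await fn(...)` do automatically.

```python
import asyncio


class strict_async:
    """Hypothetical decorator: a plain call fails, __acall__ builds the coroutine."""

    def __init__(self, fn):
        self._fn = fn

    def __acall__(self, *args, **kwargs):
        # The "blessed" path that an await expression would take under the proposal.
        return self._fn(*args, **kwargs)

    def __call__(self, *args, **kwargs):
        # The path taken when someone forgets the await.
        raise TypeError("async function called without await")


def acall(fn, *args, **kwargs):
    """Emulates the proposed desugaring of `await fn(...)` (illustration only)."""
    try:
        corofn = fn.__acall__
    except AttributeError:
        corofn = fn  # ordinary callables keep working unchanged
    return corofn(*args, **kwargs)


@strict_async
async def greet(name):
    return f"hello {name}"


async def main():
    return await acall(greet, "world")


result = asyncio.run(main())
```

With this emulation, `greet("world")` on its own raises `TypeError` immediately instead of silently producing a coroutine object, which is the prompt error the thread is asking for.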
To elaborate: any change to PEP 492 semantics will have to be fully backwards compatible for asyncio programs. Any significant drop in performance wouldn't be acceptable either. What you are proposing is to make parenthesis a part of the Maybe it's possible to come up with some clever idea here, but the patch (and the PEP) will be bigger than the one for PEP 492. |
We should probably move this discussion to async-sig or similar? |
Yeah, I understand it's not an ideal situation. But debug modes aren't something new; a lot of frameworks have them. Generally, people learn that they exist and that they are helpful.
Just make the debug mode default. And print to stdout that it's ON, and that it should be turned off in production.
I think you don't see the full picture of how exactly this can be implemented in CPython. Maybe I'm wrong here, but I don't think this is possible. It would help if you could come up with a more detailed proposal (proto PEP) where you describe:
I'd be happy to help you with coding the PoC if you can show that your idea is possible to implement.
I actually like the GH UI much more than my email client :) You can format your code here etc. Maybe you can just send a short email to async-sig inviting interested people to join the discussion here? |
I've hit the issue a few times in IPython while live coding; I'll be happy to write an extension that at least warns users on the REPL.
    try:
        corofn = foo.__acall__
    except AttributeError:
        corofn = foo
    coro = corofn()
    await coro

And what will you de-sugar
Yeah, I think we'll have it in 3.7 one way or another :) |
Honestly, forgetting to put an One feature that'd be really useful from the Curio front would be some way to know whether a callable has been triggered from within the context of a coroutine or not. For example, decorators could use something like that to know more about the context in which an operation is taking place. |
Yeah, this has been my experience too. People do forget to put If we can find a way to make missing awaits trigger an actual error - why not, but I'd be -1 if it requires a huge new PEP and a change to async/await semantics.
You can use |
@1st1: I suspect that this isn't the ideal implementation, but to demonstrate that my idea can be precisely specified: for parsing, we could parse just like we do now, and then do a post-processing pass over the AST where we replace all terms of the form

    tmp = POP()
    try:
        PUSH(tmp.__acall__)
    except AttributeError:
        PUSH(tmp)

and then
Out of curiosity, since I know you do a fair amount of training, have you done any async/await training yet? Part of the trigger for this thread was that when I wrote the trio tutorial aiming at teaching folks who hadn't been exposed to async/await before, I felt obliged to put in a big section up front warning them about this pitfall, since my impression is that it bites basically everyone repeatedly when they're learning. Also it doesn't help that in pypy the warning is delayed for an arbitrarily long time. (In particular, if you're at the REPL you probably just won't get a warning at all).
From this comment, I was under the impression that curio users would no longer be instantiating coroutines and passing them around. Is that right? If curio internally wants to do that, then of course that's fine – any proposal is going to have to have some way to get at a coroutine object, or we could never start our coroutine runners at all :-). The idea is just that since this is something that only coroutine runners need to do, it should have a different spelling that users won't hit by accident.
Ugh, that's a great point. I guess in any proposal, wrappers like |
Looks like it's already working in my PoC :)

    import functools

    async def baz():
        print('hi')

    async def bar():
        await baz()  # try removing "await"

    async def foo():
        coro = functools.partial(bar)
        await coro()

^-- works just fine.
But you wouldn't handle cases like this then:

    await one_coroutine(another_coroutine())
@1st1: I'm guessing that's a side-effect of
Err, that's supposed to be an error, right, and both proposals on the table would treat it as such? What do you mean by "wouldn't handle"? |
Instead of looking at the last frame you can traverse until you see TBH I didn't think that you want both things at the same time:
I thought you want something as radical as PEP 3152, so my PoC does exactly that: it tries to enforce instantiating coroutines only in await expressions.
I mean that You can only pick one thing:
@1st1: I don't care about
I don't think this works, because in practically any context where you can potentially forget an |
At this point this all starts to sound very complicated to me. Given that:
I'd say that all the work we'll need to do to implement this isn't really worth it (at least in 3.7). |
Personally, I feel like things are already rather complicated dealing with three different function types (synchronous, generators, coroutines). However, the fact that all of these things are basically "first class" and can be passed around freely opens up a lot of freedom regarding their use. All things equal, I'm not sure placing restrictions on how these functions get used in different contexts would make the situation better. If anything, it might make everything even more complicated. Honestly, I don't have any deep thoughts regarding this with respect to Curio--I feel like the current behavior is something I can work with. It would be nice to have an easier way to know if you've been called by a coroutine for the purpose of some advanced decorator writing (Curio currently finds out via frame hacking). However, that's really a slightly different issue than this.
I do not teach people async/await in training classes--although I very briefly mention it near the end of my advanced course. I'm not sure how fully people appreciate just how far beyond the day-to-day experience of most developers this async/await stuff is. |
As an argument against doing any special-casing here, I'll note that a similar problem exists for non-reusable context managers, and the "solution" for that is just a section in the contextlib documentation going over the problems that can arise if you attempt to reuse a context manager that doesn't support it: https://docs.python.org/3/library/contextlib.html#single-use-reusable-and-reentrant-context-managers If we'd tried to bake any rules about ensuring that defined contexts were actually used into the language itself, then things like Heck, for folks coming from languages like MATLAB and Ruby with implicit call support, we even sometimes get complaints about statements like That's just the nature of the beast - by making things first class entities that can be referenced without being invoked, we also make it possible for people to forget to invoke them. Missing an |
Oh, that's a good on |
@Carreau's proposal has the nice property that the errors are prompt and don't depend on the garbage collector. It turns out that this is a real problem: I was trying to figure out how to at least teach trio's test harness to detect these warnings so it can raise a real error, and in many cases it's easy enough to do by intercepting the warning message. And then we can make it more robust by forcing the cycle collector to run at the end of each test to make sure that warnings are emitted. But... consider code like The proposal is possible to implement now using @1st1: Based on this I think I should change my request from our discussion last week: instead of having a global counter of how many times the (I think by "global" I mean "thread local".) |
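The test-harness approach described above (intercept the warning, then force a collection at the end of each test) can be sketched roughly as follows. This is an assumption-laden illustration, not trio's actual harness: `fail_on_unawaited` is a hypothetical helper, and it relies on the interpreter's "was never awaited" `RuntimeWarning` text.

```python
import gc
import warnings
from contextlib import contextmanager


@contextmanager
def fail_on_unawaited():
    """Hypothetical helper: turn 'never awaited' warnings into a hard error."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        yield
        # Force a collection so coroutines kept alive only by reference
        # cycles (or by a non-refcounting GC) emit their warning now.
        gc.collect()
    unawaited = [
        str(w.message)
        for w in caught
        if issubclass(w.category, RuntimeWarning)
        and "was never awaited" in str(w.message)
    ]
    if unawaited:
        raise RuntimeError("unawaited coroutines: " + "; ".join(unawaited))


async def sleepy():
    pass


def buggy_test():
    sleepy()  # coroutine created and dropped without await


try:
    with fail_on_unawaited():
        buggy_test()
except RuntimeError as exc:
    error = exc
else:
    error = None
```

As the comment notes, this style of check still depends on when the garbage collector runs, so it can only attribute a leak to a whole test rather than to the exact call site.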
@Carreau An API worth looking at to potentially improve debuggability when running under That will let you report where the un-awaited object was allocated in addition to the location where you noticed that it hadn't been awaited yet. |
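For reporting where an un-awaited coroutine was created, one option on Python 3.7 and later is coroutine origin tracking. This is not necessarily the API the comment above had in mind; the sketch only shows that `sys.set_coroutine_origin_tracking_depth` makes each new coroutine record its creation stack in `cr_origin`.

```python
import sys


async def work():
    return 1


def make():
    return work()  # the coroutine object is created here


# Record up to two frames of creation context for every new coroutine.
sys.set_coroutine_origin_tracking_depth(2)
coro = make()
origin = coro.cr_origin  # tuple of (filename, lineno, function name) entries
sys.set_coroutine_origin_tracking_depth(0)  # switch tracking back off

creators = [entry[2] for entry in origin]
coro.close()  # tidy up so no "never awaited" warning is emitted
```

When tracking is enabled, the interpreter's own "never awaited" warning also includes this creation traceback.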
I just filed a bug w/ CPython requesting the optimized version of this, see: https://bugs.python.org/issue30491 |
Some notes because in a year or whatever I'll probably look back at this and want to remember: I talked about this at the language summit, and talked to Yury about it afterwards (slides). The more ambitious proposal in those slides is similar but not identical to one in the thread up above:
See, it's cleverly designed to have something for everyone! As I understood it, Yury had two main objections to this:
So if/when someone decides to take this up again, I think these are the main things that would need answers. I guess the obvious thing to try to resolve objection (2) would be the |
There's a more fundamental objection to special casing That is:
would no longer mean the same thing as:
and that's simply not OK, even if you come up with some clever implementation hacks to enable them to be different. |
@ncoghlan: OK, but... that's a circular argument. My proposal only breaks substitutability of subexpressions if we decide that in

Analogy: In modern Python, this:

    foo(1, 2, 3)

does not mean the same thing as:

    extracted = 1, 2, 3
    foo(extracted)

It didn't have to be this way – in fact, once upon a time these were equivalent! If you search Misc/HISTORY for "sleepless night" you can see Guido agonizing about whether to change it. But the current behavior is fine and doesn't violate your dictum (even though it looks like it should), because we decided to define

My contention is that Python would also be a better language – easier to understand, teach, use, etc. – if we were to decide that in the expression

You can certainly disagree with my claim about how Python should work, or agree with it but feel that it's too late to do anything. But you can't say that a proposal to change the definition of subexpressions is bad because it violates substitutability of subexpressions :-)
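The call-syntax analogy in the comment above can be checked directly. The function `foo` here is just a stand-in that counts its positional arguments.

```python
def foo(*args):
    # Report how many positional arguments were received.
    return len(args)


extracted = 1, 2, 3

direct = foo(1, 2, 3)       # three separate arguments
as_tuple = foo(extracted)   # one argument that happens to be a tuple
expanded = foo(*extracted)  # explicit expansion restores the first meaning
```

The argument list is part of the call syntax rather than a tuple subexpression, and `*extracted` is the explicit spelling for getting the old equivalence back.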
OK, I think I see your argument now, and given the leading That said, while it doesn't read as well, you could likely more easily experiment with an |
One of the features I most like about Python is its flexibility. I would not want Python to be modified in a way that forces me to put an await on every instantiation of a coroutine. Specifically, I want this to work:
@dabeaz @njsmith isn't proposing breaking that, he's just proposing to have it mean something different from Extending the analogy to other forms:
Comparable equivalents for @njsmith's proposal might then look like:
And putting it that way, I do think the right place to start is to figure out what the "arg expansion" equivalent for @njsmith's proposal would actually look like, and I think |
I see that my plan of leaving those notes at the bottom of this thread so I can find them again easily when I revisit this next year is not happening :-) @dabeaz: even in my ideal world, the only thing that code would have to change is that you'd write Really, the proposal is just to provide a simple and reliable way to let async functions know whether or not they got called with |
@njsmith It's probably the opposite of comforting, but my writing PEP 432 was originally motivated by getting |
Stopgap measure design notes

I was thinking some more about a possible stopgap measure that we might be able to sneak into 3.7 to make something like #176 fast. Basically the idea would be to get just enough help from the interpreter to make checking for unawaited coroutines fast enough that we can afford to do it at every context switch. (There was some previous discussion of these ideas in #176 starting here: #176 (comment))

In more detail, the requirements would be:

There are two fairly natural ways to do this that come to mind. They both involve the same basic strategy: adding two pointers to each coroutine object, which we can use to create a double-linked list holding the coroutines of interest (with the list head stored in the threadstate). The nice thing about an intrusive double-linked list like this is that it's a collection where adding and removing are both O(1) and very cheap (just updating a few pointers).

API option 1

Keep a thread-local list of live, unawaited coroutines. So coroutines always insert themselves into the list when created, and then remove themselves again when they're either iterated or deallocated. Then we need:
    def unawaited_coroutine_gc_hook(coro):
        _gced_unawaited_coros.add(coro)

    def barrier():
        live_unawaited_coros = sys.get_and_clear_unawaited_coros()
        if _gced_unawaited_coros or live_unawaited_coros:
            # do expensive stuff here, checking if this task is hosting asyncio, etc.
            ...

(We need to have both hooks to handle the two cases of (a) a coroutine is created and then immediately garbage collected in between barriers, (b) a coroutine is created and then still live when we hit the barrier.) For speed, we might want to also have a

A minor awkward question here is whether the regular unawaited coro warning would be issued at the same time our hook is called, or whether we'd have some way to suppress it, or...

Oh wait, there's also a more worrisome problem: if we see a coroutine is unawaited and live, and decide that it's OK... what happens if it later gets garbage collected and is still unawaited? We'll see it again, but now in some random other context. I guess what we could do is that when we detect live unawaited coroutines and we're in asyncio-friendly mode, we put them into a

API option 2

In the design above, the need to handle GCed unawaited coroutines is the cause of lots of different kinds of awkwardness. What we could do instead is make our tracking list a strong reference, so that coroutines insert themselves when created, and then stay there either until they're iterated or else we remove them manually. Obviously this can't be enabled by default, because it would mean that instead of warning about unawaited coroutines, the interpreter would just leak them. So the API would be something like:

And then everything else is pretty straightforward and obvious. (Though we still have the question about whether we'd want a

PyPy's opinion

I asked @arigo about how these options look from the point of view of PyPy. He said that if He also had a strong preference for the second API on the general grounds that doing anything from

Other possible consumers

Nick points out that this kind of list structure might also be useful for more general introspection, similar to Curio, possibly. Not sure, would have to check with Dave. I think pytest-asyncio and similar libraries might find this useful to easily and reliably catch unawaited coroutines and attribute them to the right test. [Edit: asked pytest-asyncio here: https://github.com/pytest-dev/pytest-asyncio/issues/67 ]
Forgetting to put await on coroutines is not a problem that I'm concerned about. It is unlikely that I would use this in Curio. |
As far as the I'll also note that everything except the "already GC'ed before the next check for unawaited coroutines" case can already be handled by calling
That way, the interaction with the GC would just be normal If you instead want something more like your second case, then switch the |
@ncoghlan Yeah, but my concern is that we won't be able to make the speed workable like that. Since this adds per-call and per-context-switch overhead it's probably worthwhile trying to get it as low as possible. But one reason for working through the thoughts here before taking something to bpo is to figure out exactly what semantics we'd want so we can prototype it with |
Anyway, probably the most useful feedback is that your suggested API option 2 is likely the most viable approach, where the default behaviour is to use a WeakSet (so only non-GC'ed references are checked, which is sufficient for the |
Meanwhile back in the subthread about possibly making
I was nervous about this b/c I don't know a lot about parsing, but actually it looks trivial: the grammar rule that currently handles |
Another reason it would be useful to allow (I ran into this with trio's |
Note to self: think through a version of the Some things to watch out for: wrapper functions, |
I think that in more than 90% of code, we just write If a task needs to be spawned, a nursery can be used. I think that it could be implemented somehow. If a call to an async function just returns a coroutine, maybe there could be self-executing coroutines that launch right after they are created.
I think it is important to note that a major reason why most For functions, methods and generators, it is accepted practice to treat them as first-class objects -- passing them around, wrapping them, storing them. People generally learn pretty quickly that Right now, everything The only case where |
[This issue was originally an umbrella list of all the potential Python core changes that we might want to advocate for, but a big discussion about await sprouted here so I've moved the umbrella issue to: #103]

Original comment:

Can we do anything about how easy it is to forget an await? In retrospect async functions shouldn't implement __call__ IMO... but probably too late to fix that. Still, it kinda sucks that one of the first things in our tutorial is a giant warning about this wart, so worth at least checking if anyone has any ideas...
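For readers who have not hit this pitfall yet, a minimal illustration of what "forgetting an await" means in current Python (the function names are made up for the example):

```python
import asyncio
import inspect


async def fetch():
    return 42


async def main():
    forgotten = fetch()  # missing await: a coroutine object, fetch's body never runs
    is_coro = inspect.iscoroutine(forgotten)
    forgotten.close()    # tidy up so the interpreter does not warn later
    value = await fetch()  # with await, the body runs and yields the result
    return is_coro, value


outcome = asyncio.run(main())
```

No error is raised at the call site; without the explicit `close()`, the only signal would be a `RuntimeWarning` emitted whenever the coroutine object is eventually garbage collected.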