Add collection of worked examples to tutorial #472
Comments
An |
Oh yeah, good idea! (And we should use the terms |
Oh, see also #421, which is a partial duplicate and has some more discussion of the nursery-based examples. |
Oh duh, here's another one: an example of implementing a custom protocol, by combining a sansio protocol with the stream interface. (Probably some simple line-oriented or netstring-oriented thing. This is one of the motivations for why I started working on |
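As a rough illustration of the sans-io half of such an example, here is a minimal netstring parser that never touches the network. The class name `NetstringReceiver`, the helper `encode_netstring`, and the `max_length` limit are all made up for this sketch; they are not part of Trio or any existing library.

```python
class NetstringReceiver:
    """Sans-io netstring parser: feed it bytes, get back complete payloads."""

    def __init__(self, max_length=65536):
        self._buffer = bytearray()
        self._max_length = max_length  # hypothetical safety limit

    def feed(self, data):
        """Add received bytes and return a list of fully parsed payloads."""
        self._buffer += data
        messages = []
        while True:
            colon = self._buffer.find(b":")
            if colon == -1:
                # No length header yet; refuse absurdly long headers.
                if len(self._buffer) > len(str(self._max_length)):
                    raise ValueError("netstring length header too long")
                break
            header = bytes(self._buffer[:colon])
            if not header.isdigit():
                raise ValueError("netstring length is not a decimal number")
            length = int(header)
            if length > self._max_length:
                raise ValueError("netstring payload too large")
            end = colon + 1 + length
            if len(self._buffer) < end + 1:
                break  # payload (or trailing comma) not fully received yet
            if self._buffer[end:end + 1] != b",":
                raise ValueError("netstring missing trailing comma")
            messages.append(bytes(self._buffer[colon + 1:end]))
            del self._buffer[:end + 1]
        return messages


def encode_netstring(payload):
    """Encode one bytes payload as a netstring."""
    return b"%d:%s," % (len(payload), payload)
```

To combine this with the stream interface, a wrapper would loop on `await stream.receive_some()`, pass each chunk to `feed()`, and hand out the returned payloads one at a time.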
A walkthrough for converting a sync protocol to Trio might also make sense. WRT trio-asyncio: maybe simply refer to it. I do need to add some example that shows how to convert from asyncio to trio-asyncio to trio, and how that improves the code. ;-) |
Would love to see that. 😁 |
Something like @jab's HTTP CONNECT proxy from #489 (comment) might be interesting too. (Possibly rewritten to use h11 ;-).) |
@oremanj's Maybe examples of testing and debugging would be good too. (Testing might just refer to the pytest-trio docs. Which we still need to write...) |
There was some more discussion of this on Gitter today, which resulted in rough drafts of reasonable implementations for gather and as_completed both: https://gitter.im/python-trio/general?at=5ae22ef11130fe3d361e4e25 @N-Coder pointed out that there are a number of useful "asyncio cookbook" type articles floating around, and probably trio would benefit from something that serves a similar role. I think that's the same idea in this thread, but the examples are potentially helpful: |
#527 is a common question; we should have something for it. |
As mentioned in #537, a UDP example would be good. This would also be a good way to demonstrate using The example in that comment thread is kind of boring. Doing a dns query or ntp query would be more interesting. [edit: see notes-to-self/ntp-example.py for an ntp query example.] |
blocking-read-hack.py: This demonstrates a really weird approach to solving python-triogh-174. See: python-trio#174 (comment)
ntp-example.py: A fully-worked example of using UDP from Trio, inspired by python-trio#472 (comment)
This should move into the tutorial eventually.
It would be good to have an example that discusses the subtleties of Maybe this would fit in with an example that wraps a stream in a higher-level protocol object, so we have to write our own |
Interaction between |
Channel examples – we might move the ones that are currently in It would also be good to have an example of

    async def send_all(value, send_channels):
        async with trio.open_nursery() as nursery:
            for send_channel in send_channels:
                nursery.start_soon(send_channel.send, value)

But then there are complications to consider around cancellation, and error-handling, and back-pressure... |
Using a buffered memory channel to implement a fixed-size database connection pool |
Re previous message: @ziirish wrote a first draft: https://gist.github.com/ziirish/ab022e440a31a35e8847a1f4c1a3af1d |
Zero-downtime upgrade (via socket activation, socket passing, unix-domain socket + atomic rename?) |
example of how to "hide" a nursery inside a context manager, using [Edit: And also, what to do in case you need to support |
Maybe some information about how to do [Note: I think this means |
As requested by @thedrow (e.g. #931), it would be great to have a simple worked example of wrapping a callback/fd-based C library and adapting it to Trio style, demonstrating I'm not sure what the best way to do this would be. Callback-based C libraries tend to be complicated and have idiosyncratic APIs. Which to some extent is useful for an example, because we want to show people how to handle their own complicated and idiosyncratic API, but it can also be problematic, because we don't want to force people to go spend a bunch of time learning about details of some random library they don't care about. We could write our own toy library just for the example, in C or Rust or whatever. We could pick an existing library that we think would be good pedagogically. Which one? Ideally: fairly straightforward interface, accomplishes a familiar task, already has a thin Python wrapper or it's trivial to make one through cffi. Some possibilities:
|
@thedrow we already have an issue for |
Some examples of commonly-desired "custom supervisors" would be useful, e.g. the dual-nurseries trick in #569. |
As mentioned here in gitter earlier today, when I was working the |
Here's a sketch for a web spider, including a tricky solution to figuring out when a circular channel flow is finished: https://gist.github.com/njsmith/432663a79266ece1ec9461df0062098d |
Hey @njsmith I just tested your spider, and it seems there is an issue regarding the closing of the send channel clones. Here is a mwe with a proposed fix:
If I remove the |
You're not closing the original |
Anyway, why is your code so complicated? Simplified:
Note that you don't need (and don't want) an infinite queue; limiting it to the number of workers is more than sufficient. |
@smurfix Thanks for the explanation. Could you please elaborate on why it is a bad idea to first fill the queue and then start the workers? I'm building a concurrent scraper that should roughly speaking be given a batch of urls and then scrape them concurrently. Because of retries some links may get added back to the queue. |
OK, yeah, if you're scraping then your original code makes more sense. ;-) The point is that you should never use an infinite queue. Infinite queues tend to fill memory. Also, the point of a queue is to supply back-pressure to the writers (i.e. your scraper) to slow down because the workers can't keep up. This, incidentally, significantly improves your chances of not getting blocked by the scraped sites. OK, now you have 103 sites in your initial list, 10 workers, and a 20-or-however-many job queue. If you fill the queue first, the job doing that will stall, and since there's no worker running yet you get a deadlock. |
Thank you for pointing out the issue with the infinite queue. Edit: My use case involves a few thousand links at most, so I can afford to keep the entire queue in memory, i.e. I don't need a separate queue for workers to feed on and another to fill the first one. |
@smurfix you do need something like an infinite queue for a traditional recursive web scraper, since each consumer task may produce an arbitrary number of new work items. So any finite queue limit could produce a deadlock, when all the consumer tasks are blocked waiting for another consumer task to pull from the queue... It sounds like I misunderstood @snedeljkovic's problem, and I was assuming a single starting URL, while they actually have the full list of urls up front. So in my gist, I took a kind of shortcut that works for the single URL case, of sending in the original send channel without cloning it, but I didn't point out the tricky bit there, so it wasn't obvious how to correctly generalize to a case with multiple starting urls. That's useful to know for future docs – we need to cover the multiple URLs case and explain the logic behind starting it up correctly. @snedeljkovic It sounds like you figured out what you need to know? If you still have questions, please let us know – but let's move to a new issue or a thread on trio.discourse.group, so we can focus on your questions properly and keep this thread for tutorial update ideas. |
@njsmith Yeah, I realized that but got confused. (It's been a long day.) In any case, boiling this down to the essentials, you need
Doing this with a lot of cloned senders is perhaps not the most efficient way, thus I have encapsulated the idea in a (somewhat minimal) class, and packaged this version of @snedeljkovic's code in a gist: https://gist.github.com/smurfix/c8efac838e6b39bedc744a6ff8ca4405 Might be a good starting point for a tutorial chapter. |
It's good to keep a basic tenet in mind: don't code to the tools you have – that leads to bad design. Code to the tools you need, and build them with the tools you have. Recurse if necessary. ;-) |
"How do I"...
|
Quick note about getting started with writing a protocol and wanting to have a way of having pluggable transports (from the Gitter chat just now). For some reason, I couldn't bend my head around this for a while and this might help others:
|
Here's an example that could be turned into a good cookbook tutorial: why TCP flow control can cause deadlocks, and a way to avoid deadlocks when pipelining: https://gist.github.com/njsmith/05c61f6e06ca6a23bef732fbf5e832e6 |
#1141 really helped me a lot and I think it would be great to include an example of thinking about providing a |
From chat
|
I'd like to see a table mapping as many Go idioms to Trio idioms as possible, for those like me who have worked with Go.
|
@merlinz01 It might make sense indeed to have a table with a few different languages for comparison. I'm not too familiar with Go, but on the Trio side:
|
Thanks for the corrections! It's only a couple days since I learned about Trio, so I don't have all the Trio idioms mastered. For the

    num_workers := 5
    ch := make(chan MyFunctionResult, num_workers)
    // Start the workers
    for range num_workers {
        go send_the_result_over_chan(ch)
    }
    // Wait for the results
    for range num_workers {
        res := <-ch
        if res.err != nil {
            // handle error
        }
    }
    close(ch)

vs.

    num_workers = 5
    try:
        # Start the workers
        async with trio.open_nursery() as nursery:
            for x in range(num_workers):
                nursery.start_soon(return_the_result)
        # Implicitly wait for the results
    except* Exception:
        ...  # handle error

For the select example, I think this is more accurate:

    select {
    case <-ch:
        XXX
    default:
        YYY
    }

vs.

    try:
        recv.receive_nowait()
    except trio.WouldBlock:
        YYY
    else:
        XXX

Go is super for networking and concurrency, but as it is a compiled language with monolithic binaries it is not quite as flexible. For my project I needed Python's extensibility and dynamic code compilation, so I'm using Trio's async functionality. I suppose there would be other comparisons besides this that would also be helpful to programmers coming from other languages. |
We should refactor the tutorial into an initial part that's similar to what we have now, or the middle part of my talk, and then a collection of examples that also serve as excuses to go into more depth on particular topics.
I expect the list will grow over time, but here are some ideas. (Actually the main reason I'm filing this is to have a place to collect these so I don't lose them.)
The current tracing demo should move here. It's a good intro to trio introspection and to co-op concurrency, but having it in the main tutorial like it is now is a big blob of text for folks to wade through if they already know this stuff. (We can/should link to it from the async/await intro though.)
Happy eyeballs (for people who saw the talk but want a text version; as a demo of passing nursery objects around; ...)
Multiplexing rpc (Add "one obvious way" for implementing the common multiplexed request/response pattern #467)
Catch-all exception handler
Custom nursery, like a race function or ignore-errors nursery (maybe both)
Some standard stuff like echo server, proxy, fast web spider, ... Whatever doesn't end up in the main tutorial. (We could have echo server and TCP proxy as two examples, then show how to run them both within a single process as an example of implementing multi-protocol servers... and also to show off how proxy_one_way can be re-used for both! Maybe the proxy should demonstrate mirroring localhost:12345 to httpbin:80, so people can try it out with their web browsers?)
trio-asyncio example?
nursery.start
Possibly some of these could be combined or form sequences, e.g. echo server -> catch all handler -> nursery.start