From 1ff22a174296a9eeafc0d10bee8abbc2210b4d83 Mon Sep 17 00:00:00 2001 From: ksamuel Date: Sat, 9 Nov 2019 00:02:25 +0100 Subject: [PATCH 1/5] Simplify client hello world, explain verbosity Simplify the hello world example in the README and doc index Add a new page to the doc (http_request_lifecycle) to explain why aiohttp's client API requires several steps for performing a request. Compare it with the "requests" library. Gives some information about the use of the ClientSession object. --- CHANGES/4315.doc | 1 + CONTRIBUTORS.txt | 1 + README.rst | 29 ++++++--- docs/client.rst | 1 + docs/glossary.rst | 7 ++ docs/http_request_lifecycle.rst | 111 ++++++++++++++++++++++++++++++++ docs/index.rst | 46 ++++++++----- 7 files changed, 171 insertions(+), 25 deletions(-) create mode 100644 CHANGES/4315.doc create mode 100644 docs/http_request_lifecycle.rst diff --git a/CHANGES/4315.doc b/CHANGES/4315.doc new file mode 100644 index 00000000000..3f5efd95e0a --- /dev/null +++ b/CHANGES/4315.doc @@ -0,0 +1 @@ +Simplify README hello world example and add a documentation page for people coming from requests. 
diff --git a/CONTRIBUTORS.txt b/CONTRIBUTORS.txt index 6432a006a1e..d1acc0177ac 100644 --- a/CONTRIBUTORS.txt +++ b/CONTRIBUTORS.txt @@ -125,6 +125,7 @@ Igor Pavlov Ilya Chichak Ingmar Steen Ivan Larin +Kevin Samuel Jacob Champion Jaesung Lee Jake Davis diff --git a/README.rst b/README.rst index 4da04cc1739..2999ac80074 100644 --- a/README.rst +++ b/README.rst @@ -35,7 +35,6 @@ Async http client/server framework :alt: Chat on Gitter - Key Features ============ @@ -58,19 +57,29 @@ To get something from the web: import aiohttp import asyncio - async def fetch(session, url): - async with session.get(url) as response: - return await response.text() - async def main(): + async with aiohttp.ClientSession() as session: - html = await fetch(session, 'http://python.org') - print(html) + async with session.get('http://python.org') as response: + + print("Status:", response.status) + print("Content-type:", response.headers['content-type']) + + html = await response.text() + print("Body:", html[:15], "...") + + loop = asyncio.get_event_loop() + loop.run_until_complete(main()) + +This prints: + +.. code-block:: - if __name__ == '__main__': - loop = asyncio.get_event_loop() - loop.run_until_complete(main()) + Status: 200 + Content-type: text/html; charset=utf-8 + Body: ... +Comming from `requests`_ ? Read :ref:`why we need so many lines `. Server ------ diff --git a/docs/client.rst b/docs/client.rst index 588addc93d6..0c57de57472 100644 --- a/docs/client.rst +++ b/docs/client.rst @@ -15,3 +15,4 @@ The page contains all information about aiohttp Client API: Advanced Usage Reference Tracing Reference + The aiohttp Request Lifecycle diff --git a/docs/glossary.rst b/docs/glossary.rst index ea0f22950d0..bc5e1169c33 100644 --- a/docs/glossary.rst +++ b/docs/glossary.rst @@ -85,6 +85,13 @@ A mechanism for encoding information in a Uniform Resource Locator (URL) if URL parts don't fit in safe characters space. 
+ requests + + Currently the most popular synchronous library to make + HTTP requests in Python. + + https://requests.readthedocs.io + requoting Applying :term:`percent-encoding` to non-safe symbols and decode diff --git a/docs/http_request_lifecycle.rst b/docs/http_request_lifecycle.rst new file mode 100644 index 00000000000..766923df508 --- /dev/null +++ b/docs/http_request_lifecycle.rst @@ -0,0 +1,111 @@ + + +.. _aiohttp-request-lifecycle: + + +The aiohttp Request Lifecycle +============================= + + +Why is aiohttp client API that way? +-------------------------------------- + + +The first time you use aiohttp, you'll notice that a simple HTTP request is performed not with one, but with up to three steps: + + +.. code-block:: python + + + async with aiohttp.ClientSession() as session: + async with session.get('http://python.org') as response: + print(await response.text()) + + +It's especially unexpected when coming from other libraries such as the very popular :term:`requests`, where the "hello world" looks like this: + + +.. code-block:: python + + + response = requests.get('http://python.org') + print(response.text) + + +So why is the aiohttp snippet so verbose? + + +Because aiohttp is asynchronous, its API is designed to make the most out of non-blocking network operations. In a code like this, requests will block three times, and does it transparently, while aiohttp gives the event loop three opportunities to switch context: + + +- When doing the `.get()`, both libraries send a GET request to the remote server. For aiohttp, this means asynchronous I/O, which is here marked with an `async with` that gives you the guaranty that not only it doesn't block, but that it's cleanly finalized. +- When doing `response.text` in requests, you just read an attribute. The call to `.get()` already preloaded and decoded the entire response payload, in a blocking manner. 
aiohttp loads only the headers when `.get()` is executed, letting you decide to pay the cost of loading the body afterward, in a second asynchronous operation. Hence the `await response.text()`. +- `async with aiohttp.ClientSession()` does not perform I/O when entering the block, but at the end of it, it will ensure all remaining resources are closed correctly. Again, this is done asynchronously and must be marked as such. The session is also a performance tool, as it manages a pool of connections for you, allowing you to reuse them instead of opening and closing a new one at each request. You can even `manage the pool size by passing a connector object `_. + +Using a session as a best practice +----------------------------------- + +The requests library does in fact also provide a session system. Indeed, it lets you do: + +.. code-block:: python + + with requests.session() as session: + response = session.get('http://python.org') + print(response.text) + +It's just not the default behavior, nor is it advertised early in the documentation. Because of this, most users take a hit in performance, but can quickly start hacking. And for requests, it's an understandable trade-off, since its goal is to be "HTTP for humans" and simplicity has always been more important than performance in this context. + +However, if one uses aiohttp, one chooses asynchronous programming, a paradigm that makes the opposite trade-off: more verbosity for better performance. And so the library's default behavior reflects this, encouraging you to use performant best practices from the start. + +How to use the ClientSession ? +------------------------------- + +By default the :class:`aiohttp.ClientSession` object will hold a connector with a maximum of 100 connections, putting the rest in a queue. This is quite a big number, this means you must be connected to a hundred different servers (not pages!) concurrently before even having to consider if your task needs resource adjustment. 
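The "maximum number of connections, the rest in a queue" behavior described above can be pictured with a small standard-library sketch. This is not aiohttp's implementation: ``run_with_limit`` and ``fake_request`` are hypothetical names, an ``asyncio.Semaphore`` stands in for the connector's limit, and ``asyncio.sleep`` stands in for network I/O. In aiohttp itself the limit is configured on the connector object (for instance through the ``limit`` argument of ``aiohttp.TCPConnector``).

```python
import asyncio


async def run_with_limit(total_requests, limit):
    """Run `total_requests` fake requests, at most `limit` at a time.

    Returns the highest number of requests that were in flight together.
    """
    slots = asyncio.Semaphore(limit)  # stands in for the connector's limit
    in_flight = 0
    peak = 0

    async def fake_request():
        nonlocal in_flight, peak
        async with slots:  # requests beyond the limit wait here, in a queue
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for network I/O
            in_flight -= 1

    # Start every request at once; the semaphore decides who runs first.
    await asyncio.gather(*(fake_request() for _ in range(total_requests)))
    return peak


peak = asyncio.run(run_with_limit(total_requests=20, limit=5))
print("peak concurrency:", peak)
```

With 20 requests and a limit of 5, no more than 5 are ever in flight together; the other 15 simply wait their turn, which is roughly what happens to requests that exceed a session's connection limit.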
+ + +In fact, you can picture the session object as a user starting and closing a browser: it wouldn't make sense to do that every time you want to load a new tab. + +So you are expected to reuse a session object and make many requests from it. For most scripts and average-sized softwares, this means you can create a single session, and reuse it for the entire execution of the program. You can even pass the session around as a parameter in functions. E.G, the typical "hello world": + +.. code-block:: python + + import aiohttp + import asyncio + + async def main(): + async with aiohttp.ClientSession() as session: + async with session.get('http://python.org') as response: + html = await response.text() + print(html) + + loop = asyncio.get_event_loop() + loop.run_until_complete(main()) + + +Can become this: + + +.. code-block:: python + + import aiohttp + import asyncio + + async def fetch(session, url): + async with session.get(url) as response: + return await response.text() + + async def main(): + async with aiohttp.ClientSession() as session: + html = await fetch(session, 'http://python.org') + print(html) + + loop = asyncio.get_event_loop() + loop.run_until_complete(main()) + +On more complex code bases, you can even create a central registry to hold the session object from anywhere in the code, or a higher level `Client` class that holds a reference to it. + +When to create more than one session object then? It arises when you want more granularity with your resources management: + +- you want to group connections by a common configuration. E.G: sessions can set cookies, headers, timeout values, etc. that are shared for all connections they hold. +- you need several threads and want to avoid sharing a mutable object between them. +- you want several connection pools to benefit from different queues and assign priorities. 
E.G: one session never uses the queue and is for high priority requests, the other one has a small concurrency limit and a very long queue, for non-important requests. diff --git a/docs/index.rst b/docs/index.rst index ccafada32f1..00f84682da9 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -62,25 +62,42 @@ separate commands anymore! Getting Started =============== -Client example:: +Client example +-------------- - import aiohttp - import asyncio +.. code-block:: python - async def fetch(session, url): - async with session.get(url) as response: - return await response.text() + import aiohttp + import asyncio - async def main(): - async with aiohttp.ClientSession() as session: - html = await fetch(session, 'http://python.org') - print(html) + async def main(): - if __name__ == '__main__': - loop = asyncio.get_event_loop() - loop.run_until_complete(main()) + async with aiohttp.ClientSession() as session: + async with session.get('http://python.org') as response: + + print("Status:", response.status) + print("Content-type:", response.headers['content-type']) + + html = await response.text() + print("Body:", html[:15], "...") + + loop = asyncio.get_event_loop() + loop.run_until_complete(main()) + +This prints: -Server example:: +.. code-block:: text + + Status: 200 + Content-type: text/html; charset=utf-8 + Body: ... + +Coming from :term:`requests` ? Read :ref:`why we need so many lines `. + +Server example: +---------------- + +.. code-block:: python from aiohttp import web @@ -100,7 +117,6 @@ Server example:: For more information please visit :ref:`aiohttp-client` and :ref:`aiohttp-web` pages. - What's new in aiohttp 3? 
======================== From 03427f794c4637700837457f4c38ac8082a8f3f0 Mon Sep 17 00:00:00 2001 From: ksamuel Date: Sat, 9 Nov 2019 00:12:02 +0100 Subject: [PATCH 2/5] Properly name news fragment with issue number --- CHANGES/{4315.doc => 4272.doc} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename CHANGES/{4315.doc => 4272.doc} (100%) diff --git a/CHANGES/4315.doc b/CHANGES/4272.doc similarity index 100% rename from CHANGES/4315.doc rename to CHANGES/4272.doc From 6a7ec3189fadbb699d72e5f4bb7a81ebccb52928 Mon Sep 17 00:00:00 2001 From: ksamuel Date: Sat, 9 Nov 2019 10:45:12 +0100 Subject: [PATCH 3/5] Fix README links and CONTRIBUTORS order --- CONTRIBUTORS.txt | 2 +- README.rst | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/CONTRIBUTORS.txt b/CONTRIBUTORS.txt index d1acc0177ac..14a3f58fad6 100644 --- a/CONTRIBUTORS.txt +++ b/CONTRIBUTORS.txt @@ -125,7 +125,6 @@ Igor Pavlov Ilya Chichak Ingmar Steen Ivan Larin -Kevin Samuel Jacob Champion Jaesung Lee Jake Davis @@ -152,6 +151,7 @@ Justas Trimailovas Justin Foo Justin Turner Arthur Kay Zheng +Kevin Samuel Kimmo Parviainen-Jalanko Kirill Klenov Kirill Malovitsa diff --git a/README.rst b/README.rst index 2999ac80074..a852e737d69 100644 --- a/README.rst +++ b/README.rst @@ -79,7 +79,7 @@ This prints: Content-type: text/html; charset=utf-8 Body: ... -Comming from `requests`_ ? Read :ref:`why we need so many lines `. +Coming from `requests `_ ? Read `why we need so many lines `_. 
Server ------ From 57bc684f9bc45f93a7602b90df0ba1c03d808f1b Mon Sep 17 00:00:00 2001 From: ksamuel Date: Mon, 11 Nov 2019 20:45:09 +0100 Subject: [PATCH 4/5] Use for inline code --- docs/http_request_lifecycle.rst | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/http_request_lifecycle.rst b/docs/http_request_lifecycle.rst index 766923df508..96e7c47cbb1 100644 --- a/docs/http_request_lifecycle.rst +++ b/docs/http_request_lifecycle.rst @@ -38,9 +38,9 @@ So why is the aiohttp snippet so verbose? Because aiohttp is asynchronous, its API is designed to make the most out of non-blocking network operations. In a code like this, requests will block three times, and does it transparently, while aiohttp gives the event loop three opportunities to switch context: -- When doing the `.get()`, both libraries send a GET request to the remote server. For aiohttp, this means asynchronous I/O, which is here marked with an `async with` that gives you the guaranty that not only it doesn't block, but that it's cleanly finalized. -- When doing `response.text` in requests, you just read an attribute. The call to `.get()` already preloaded and decoded the entire response payload, in a blocking manner. aiohttp loads only the headers when `.get()` is executed, letting you decide to pay the cost of loading the body afterward, in a second asynchronous operation. Hence the `await response.text()`. -- `async with aiohttp.ClientSession()` does not perform I/O when entering the block, but at the end of it, it will ensure all remaining resources are closed correctly. Again, this is done asynchronously and must be marked as such. The session is also a performance tool, as it manages a pool of connections for you, allowing you to reuse them instead of opening and closing a new one at each request. You can even `manage the pool size by passing a connector object `_. +- When doing the ``.get()``, both libraries send a GET request to the remote server. 
For aiohttp, this means asynchronous I/O, which is here marked with an ``async with`` that gives you the guaranty that not only it doesn't block, but that it's cleanly finalized. +- When doing ``response.text`` in requests, you just read an attribute. The call to ``.get()`` already preloaded and decoded the entire response payload, in a blocking manner. aiohttp loads only the headers when ``.get()`` is executed, letting you decide to pay the cost of loading the body afterward, in a second asynchronous operation. Hence the ``await response.text()``. +- ``async with aiohttp.ClientSession()`` does not perform I/O when entering the block, but at the end of it, it will ensure all remaining resources are closed correctly. Again, this is done asynchronously and must be marked as such. The session is also a performance tool, as it manages a pool of connections for you, allowing you to reuse them instead of opening and closing a new one at each request. You can even `manage the pool size by passing a connector object `_. Using a session as a best practice ----------------------------------- @@ -62,7 +62,6 @@ How to use the ClientSession ? By default the :class:`aiohttp.ClientSession` object will hold a connector with a maximum of 100 connections, putting the rest in a queue. This is quite a big number, this means you must be connected to a hundred different servers (not pages!) concurrently before even having to consider if your task needs resource adjustment. - In fact, you can picture the session object as a user starting and closing a browser: it wouldn't make sense to do that every time you want to load a new tab. So you are expected to reuse a session object and make many requests from it. For most scripts and average-sized softwares, this means you can create a single session, and reuse it for the entire execution of the program. You can even pass the session around as a parameter in functions. 
E.G, the typical "hello world": @@ -102,7 +101,7 @@ Can become this: loop = asyncio.get_event_loop() loop.run_until_complete(main()) -On more complex code bases, you can even create a central registry to hold the session object from anywhere in the code, or a higher level `Client` class that holds a reference to it. +On more complex code bases, you can even create a central registry to hold the session object from anywhere in the code, or a higher level ``Client`` class that holds a reference to it. When to create more than one session object then? It arises when you want more granularity with your resources management: From 11eaf4a490998865a11e934a26f25c23a795f486 Mon Sep 17 00:00:00 2001 From: ksamuel Date: Mon, 11 Nov 2019 21:18:05 +0100 Subject: [PATCH 5/5] Improve the 'contributing' docs --- CONTRIBUTING.rst | 23 +++++++++++++---------- docs/contributing.rst | 32 +++++++++++++++++--------------- 2 files changed, 30 insertions(+), 25 deletions(-) diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst index 919a570bd41..cb5ce3431b5 100644 --- a/CONTRIBUTING.rst +++ b/CONTRIBUTING.rst @@ -12,22 +12,25 @@ I hope everybody knows how to work with git and github nowadays :) Workflow is pretty straightforward: - 1. Clone the GitHub_ repo using ``--recurse-submodules`` argument + 1. Clone the GitHub_ repo using the ``--recurse-submodules`` argument - 2. Make a change + 2. Setup your machine with the required dev environment - 3. Make sure all tests passed + 3. Make a change - 4. Add a file into ``CHANGES`` folder. + 4. Make sure all tests passed - 5. Commit changes to own aiohttp clone + 5. Add a file into the ``CHANGES`` folder, named after the ticket or PR number - 6. Make pull request from github page for your clone against master branch + 6. Commit changes to your own aiohttp clone - 7. Optionally make backport Pull Request(s) for landing a bug fix - into released aiohttp versions. + 7. 
Make a pull request from the github page of your clone against the master branch - 7. Optionally make backport Pull Request(s) for landing a bug fix - into released aiohttp versions. + 8. Optionally make backport Pull Request(s) for landing a bug fix into released aiohttp versions. -Please open https://docs.aiohttp.org/en/stable/contributing.html -documentation page for getting detailed information about all steps. +.. important:: + + Please open the "`contributing `_" + documentation page to get detailed information about all steps. .. _GitHub: https://github.com/aio-libs/aiohttp diff --git a/docs/contributing.rst b/docs/contributing.rst index 3b9ae511856..b1920173ebd 100644 --- a/docs/contributing.rst +++ b/docs/contributing.rst @@ -6,28 +6,27 @@ Contributing Instructions for contributors ----------------------------- - -In order to make a clone of the GitHub_ repo: open the link and press the -"Fork" button on the upper-right menu of the web page. +In order to make a clone of the GitHub_ repo: open the link and press the "Fork" button on the upper-right menu of the web page. I hope everybody knows how to work with git and github nowadays :) Workflow is pretty straightforward: - 1. Clone the GitHub_ repo using ``--recurse-submodules`` argument + 1. Clone the GitHub_ repo using the ``--recurse-submodules`` argument + + 2. Set up your machine with the required dev environment - 2. Make a change + 3. Make a change - 3. Make sure all tests passed + 4. Make sure all tests pass - 4. Add a file into ``CHANGES`` folder (`Changelog update`_). + 5. Add a file into the ``CHANGES`` folder (see `Changelog update`_ for how). - 5. Commit changes to own aiohttp clone + 6. Commit changes to your own aiohttp clone - 6. Make pull request from github page for your clone against master branch + 7. Make a pull request from the github page of your clone against the master branch - 7. Optionally make backport Pull Request(s) for landing a bug fix - into released aiohttp versions. + 8. Optionally make backport Pull Request(s) for landing a bug fix into released aiohttp versions. .. 
note:: @@ -68,8 +67,7 @@ For *virtualenvwrapper*: $ cd aiohttp $ mkvirtualenv --python=`which python3` aiohttp -There are other tools like *pyvenv* but you know the rule of thumb -now: create a python3 virtual environment and activate it. +There are other tools like *pyvenv* but you know the rule of thumb now: create a python3 virtual environment and activate it. After that please install libraries required for development: @@ -79,13 +77,17 @@ After that please install libraries required for development: .. note:: - If you plan to use ``pdb`` or ``ipdb`` within the test suite, execute: + For now, the development tooling depends on ``make`` and assumes a Unix OS. If you wish to contribute to aiohttp from a Windows machine, the easiest way is probably to `configure the WSL `_ so you can use the same instructions. If it's not possible for you or if it doesn't work, please contact us so we can find a solution together. + +.. warning:: + + If you plan to use temporary ``print()``, ``pdb`` or ``ipdb`` within the test suite, execute it with ``-s``: .. code-block:: shell $ py.test tests -s - command to run the tests with disabled output capturing. + in order to run the tests without output capturing. Congratulations, you are ready to run the test suite!