Fix a problem with connection waiters that are never awaited #4562

Merged
5 commits merged into aio-libs:master from the fix-hanging-problem branch on Oct 17, 2020

Conversation

illia-v
Contributor

@illia-v illia-v commented Feb 9, 2020

What do these changes do?

When a connector waits for multiple available connections with the same key, the race condition described below can occur.

Let's say that we want to create 3 connections, but there are no available ones due to a limit. Then:

  1. The first future is added: BaseConnector.connect stores deque 1, which contains the future, in a local waiters variable, awaits the future, and the context is switched.
  2. The second future is added: BaseConnector.connect stores deque 1, which now contains the first and second futures, in its own local waiters variable, awaits the second future, and the context is switched.
  3. One of the previous connections is closed: the first future is popped from deque 1, its result is set, and the context is switched.
  4. Another previous connection is closed: the second future is popped from deque 1 (the deque is now empty), its result is set, and the context is switched.
  5. We return to the context from the first step: deque 1 is deleted from BaseConnector._waiters, ..., and the context is switched.
  6. The third future is added: BaseConnector.connect stores a new deque 2, which contains the third future, in waiters, awaits the third future, and the context is switched.
  7. Asyncio returns to the context from the second step. It checks that the local waiters variable, which still refers to deque 1, is empty and deletes self._waiters[key]. But self._waiters[key] is deque 2 at that moment. Therefore, the waiter represented by the third future will never be released, and the program will hang if no connect timeout is set.
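
The seven steps above can be replayed with a toy model. This is only a sketch under stated assumptions, not aiohttp's actual code: `ToyConnector`, its `release` method, and `scenario` are hypothetical names, the connection limit is left out, and only the waiter bookkeeping is modelled. The `fixed` flag switches between the stale local check and a fresh lookup of `self._waiters[key]`.

```python
import asyncio
from collections import defaultdict, deque


class ToyConnector:
    """Hypothetical stand-in that models only the waiter bookkeeping."""

    def __init__(self, fixed):
        self._waiters = defaultdict(deque)
        self._fixed = fixed

    async def connect(self, key):
        fut = asyncio.get_running_loop().create_future()
        waiters = self._waiters[key]  # local reference to the current deque
        waiters.append(fut)
        await fut
        if self._fixed:
            # Look the deque up again instead of trusting the local reference.
            if key in self._waiters and not self._waiters[key]:
                del self._waiters[key]
        elif not waiters:
            # Stale check: `waiters` may no longer be `self._waiters[key]`.
            self._waiters.pop(key, None)

    def release(self, key):
        """Wake the oldest waiter for `key`; return False if none is registered."""
        waiters = self._waiters.get(key)
        if not waiters:
            return False
        waiters.popleft().set_result(None)
        return True


async def scenario(fixed):
    conn = ToyConnector(fixed)
    first = asyncio.ensure_future(conn.connect("k"))
    second = asyncio.ensure_future(conn.connect("k"))
    await asyncio.sleep(0)  # steps 1-2: both coroutines park on deque 1
    conn.release("k")  # step 3: schedules the wake-up of `first`
    third = asyncio.ensure_future(conn.connect("k"))  # runs between the wake-ups
    conn.release("k")  # step 4: schedules the wake-up of `second`
    await asyncio.sleep(0)  # steps 5-7 run in order: first, third, second
    conn.release("k")  # try to wake the third waiter
    await asyncio.sleep(0)
    finished = third.done()
    if not finished:
        third.cancel()  # avoid leaving a pending task behind
    await asyncio.gather(first, second, third, return_exceptions=True)
    return finished


print("stale check, third waiter finished:", asyncio.run(scenario(fixed=False)))
print("fresh lookup, third waiter finished:", asyncio.run(scenario(fixed=True)))
```

In this model the third waiter stays pending with the stale check, because its deque has been removed from the mapping and `release` can no longer find it. With the fresh lookup it completes.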

Are there changes in behavior for the user?

No

Related issue number

Maybe #4258 and aio-libs/aiobotocore#738

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Feb 9, 2020
@codecov-io

codecov-io commented Feb 9, 2020

Codecov Report

Merging #4562 into master will increase coverage by 0.94%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4562      +/-   ##
==========================================
+ Coverage   96.59%   97.54%   +0.94%     
==========================================
  Files          43       43              
  Lines        8907     8907              
  Branches     1404     1404              
==========================================
+ Hits         8604     8688      +84     
+ Misses        175      100      -75     
+ Partials      128      119       -9
Impacted Files Coverage Δ
aiohttp/connector.py 96.28% <100%> (+2.26%) ⬆️
aiohttp/worker.py 96.63% <0%> (+2.52%) ⬆️
aiohttp/helpers.py 96.61% <0%> (+2.9%) ⬆️
aiohttp/web_fileresponse.py 97.82% <0%> (+19.02%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ea8f35e...79f370d. Read the comment docs.

@illia-v illia-v force-pushed the fix-hanging-problem branch from 6805b87 to 051a53c Compare February 9, 2020 18:48
@illia-v
Contributor Author

illia-v commented Feb 9, 2020

In [1]: from collections import defaultdict, deque

In [2]: dictionary = defaultdict(deque)

In [3]: dq = dictionary['key']

In [4]: dq.append('future')

In [5]: dq is dictionary['key']
Out[5]: True

In [6]: del dictionary['key']

In [7]: dq
Out[7]: deque(['future'])

In [8]: dictionary
Out[8]: defaultdict(collections.deque, {})

In [9]: dictionary['key'].append('future')

In [10]: dictionary
Out[10]: defaultdict(collections.deque, {'key': deque(['future'])})

In [11]: dq is dictionary['key']
Out[11]: False
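
One possible guard against the stale reference shown in the session is to compare identities before deleting. The helper below is a hypothetical illustration (the name `discard_if_same` is not part of aiohttp), and it is not necessarily what this PR ends up doing.

```python
from collections import defaultdict, deque


def discard_if_same(mapping, key, expected):
    """Delete mapping[key] only if it is still the object `expected`."""
    # .get() does not trigger defaultdict's factory, so no entry is created.
    if mapping.get(key) is expected:
        del mapping[key]
        return True
    return False


dictionary = defaultdict(deque)
dq = dictionary["key"]  # old deque, kept in a local variable
del dictionary["key"]
dictionary["key"].append("future")  # a fresh deque now lives under the key

# The stale local deque must not evict the fresh one.
stale_removed = discard_if_same(dictionary, "key", dq)
# The deque that is actually stored under the key can still be removed.
current_removed = discard_if_same(dictionary, "key", dictionary["key"])
```

The identity test (`is`) matters here: two empty deques compare equal with `==`, so an equality check would not tell the old deque from the new one.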

@thehesiod
Contributor

This logic is getting impossible to reason about. How about something like this: thehesiod-forks#4? The issue I have right now is that _release_waiter is not async and doesn't appear to be easy to make async.

@thehesiod
Contributor

actually I think I came up with a fix in my PR

@illia-v
Contributor Author

illia-v commented Feb 17, 2020

@thehesiod the race condition described in the first comment caused the problem reported in aio-libs/aiobotocore#738 (comment). When I was debugging it, it was pretty clear that a different deque with the same key was deleted from the dictionary. I checked twice, and therefore I don't agree that it is impossible :)

I'll take a look at your PR when I can

@illia-v
Contributor Author

illia-v commented Mar 7, 2020

@thehesiod I looked at your pull request and added two minor comments.

As for me, your solution looks better than the current implementation. It also seems to fix the problem mentioned in this PR, although I haven't checked that.

If your changes are merged before the next release, I'm OK with closing this PR.
Otherwise, I'd like this one to be merged before the next release to fix the problem, because it changes only one line and should not create any regressions.

@thehesiod
Contributor

@illia-v I didn't open my PR against aiohttp, up to maintainers of aiohttp to see if they want that or not ( @asvetlov )

Contributor

@pfreixes pfreixes left a comment

LGTM! @asvetlov would you mind giving your blessing for that fix?

if not waiters:
    # `waiters` may be deleted from `self._waiters` by another
    # coroutine, and `self._waiters[key]` may contain another
    # deque; the latter should not be deleted.
Contributor

I guess the issue is that we are caching self._waiters[key] in the local variable waiters. What about simply checking the non-cached version instead, like:

if key in self._waiters and not self._waiters[key]:
    ...

Maybe it's more clear.

WDYT?
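
A small sketch of why the `key in ...` test comes first in the suggested condition, assuming `self._waiters` is a `defaultdict(deque)` as in the IPython session earlier in this thread. The names `drop_if_empty` and `waiters_by_key` are hypothetical and used only for illustration.

```python
from collections import defaultdict, deque


def drop_if_empty(waiters_by_key, key):
    """Remove the entry for `key` when its deque exists and is empty."""
    # The membership test short-circuits, so an unknown key is never indexed
    # and defaultdict's factory is not triggered.
    if key in waiters_by_key and not waiters_by_key[key]:
        del waiters_by_key[key]


waiters_by_key = defaultdict(deque)

drop_if_empty(waiters_by_key, "a")  # unknown key: nothing is created
created_by_guarded_check = "a" in waiters_by_key

_ = not waiters_by_key["b"]  # bare indexing inserts an empty deque
created_by_bare_lookup = "b" in waiters_by_key

waiters_by_key["c"].append("future")
drop_if_empty(waiters_by_key, "c")  # non-empty deque is kept
drop_if_empty(waiters_by_key, "b")  # empty deque is removed
```

Without the membership test, merely evaluating the condition would leave an empty deque behind for every key that had already been evicted.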

Contributor Author

It's a good idea. Thanks!

@codecov-commenter

codecov-commenter commented Aug 4, 2020

Codecov Report

Merging #4562 into master will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4562      +/-   ##
==========================================
+ Coverage   97.58%   97.62%   +0.04%     
==========================================
  Files          43       43              
  Lines        8932     8929       -3     
  Branches     1406     1406              
==========================================
+ Hits         8716     8717       +1     
+ Misses         96       95       -1     
+ Partials      120      117       -3     
Impacted Files Coverage Δ
aiohttp/connector.py 96.59% <100.00%> (+0.30%) ⬆️
aiohttp/web_fileresponse.py 97.82% <0.00%> (-0.55%) ⬇️
aiohttp/pytest_plugin.py 97.51% <0.00%> (+1.86%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d296d5...24238b5. Read the comment docs.

@codecov-io

codecov-io commented Oct 17, 2020

Codecov Report

Merging #4562 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4562      +/-   ##
==========================================
+ Coverage   97.63%   97.64%   +0.01%     
==========================================
  Files          43       43              
  Lines        8981     8978       -3     
  Branches     1411     1411              
==========================================
- Hits         8769     8767       -2     
  Misses         98       98              
+ Partials      114      113       -1     
Flag Coverage Δ
#unit 97.64% <100.00%> (+0.01%) ⬆️

Impacted Files Coverage Δ
aiohttp/connector.py 96.80% <100.00%> (+0.14%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b7d003e...15ce1ae. Read the comment docs.

@asvetlov
Member

LGTM, thanks!

@thehesiod
Contributor

@asvetlov welcome back :)

@asvetlov asvetlov merged commit 3020edc into aio-libs:master Oct 17, 2020
github-actions bot pushed a commit that referenced this pull request Oct 17, 2020
* Fix a problem with connection waiters that are never awaited

* Add a test

* Refactor `aiohttp.connector.BaseConnector.connect` a little bit

Thanks Pau Freixes for a suggestion.

Co-authored-by: Andrew Svetlov <[email protected]>
@github-actions
Contributor

💚 Backport successful

The PR was backported to the following branches:

asvetlov added a commit that referenced this pull request Oct 17, 2020
@asvetlov
Member

Thanks, @thehesiod

netbsd-srcmastr referenced this pull request in NetBSD/pkgsrc Oct 24, 2020
This fixes py-yarl in pkgsrc being too new for py-aiohttp.


3.7.0 (2020-10-24)
==================

Features
--------

- Response headers are now prepared prior to running ``on_response_prepare`` hooks, directly before headers are sent to the client.
  `#1958 <https://github.com/aio-libs/aiohttp/issues/1958>`_
- Add a ``quote_cookie`` option to ``CookieJar``, a way to skip quotation wrapping of cookies containing special characters.
  `#2571 <https://github.com/aio-libs/aiohttp/issues/2571>`_
- Call ``AccessLogger.log`` with the current exception available from ``sys.exc_info()``.
  `#3557 <https://github.com/aio-libs/aiohttp/issues/3557>`_
- `web.UrlDispatcher.add_routes` and `web.Application.add_routes` return a list
  of registered `AbstractRoute` instances. `AbstractRouteDef.register` (and all
  subclasses) return a list of registered resources.
  `#3866 <https://github.com/aio-libs/aiohttp/issues/3866>`_
- Added properties of default ClientSession params to ClientSession class so it is available for introspection
  `#3882 <https://github.com/aio-libs/aiohttp/issues/3882>`_
- Don't cancel web handler on peer disconnection, raise `OSError` on reading/writing instead.
  `#4080 <https://github.com/aio-libs/aiohttp/issues/4080>`_
- Implement BaseRequest.get_extra_info() to access a protocol transport's extra info.
  `#4189 <https://github.com/aio-libs/aiohttp/issues/4189>`_
- Added `ClientSession.timeout` property.
  `#4191 <https://github.com/aio-libs/aiohttp/issues/4191>`_
- Allow use of SameSite in cookies.
  `#4224 <https://github.com/aio-libs/aiohttp/issues/4224>`_
- Use ``loop.sendfile()`` instead of custom implementation if available.
  `#4269 <https://github.com/aio-libs/aiohttp/issues/4269>`_
- Apply SO_REUSEADDR to test server's socket.
  `#4393 <https://github.com/aio-libs/aiohttp/issues/4393>`_
- Use .raw_host instead of slower .host in client API
  `#4402 <https://github.com/aio-libs/aiohttp/issues/4402>`_
- Allow configuring the buffer size of input stream by passing ``read_bufsize`` argument.
  `#4453 <https://github.com/aio-libs/aiohttp/issues/4453>`_
- Pass tests on Python 3.8 for Windows.
  `#4513 <https://github.com/aio-libs/aiohttp/issues/4513>`_
- Add `method` and `url` attributes to `TraceRequestChunkSentParams` and `TraceResponseChunkReceivedParams`.
  `#4674 <https://github.com/aio-libs/aiohttp/issues/4674>`_
- Add ClientResponse.ok property for checking status code under 400.
  `#4711 <https://github.com/aio-libs/aiohttp/issues/4711>`_
- Don't ceil timeouts that are smaller than 5 seconds.
  `#4850 <https://github.com/aio-libs/aiohttp/issues/4850>`_
- TCPSite now listens by default on all interfaces instead of just IPv4 when `None` is passed in as the host.
  `#4894 <https://github.com/aio-libs/aiohttp/issues/4894>`_
- Bump ``http_parser`` to 2.9.4
  `#5070 <https://github.com/aio-libs/aiohttp/issues/5070>`_


Bugfixes
--------

- Fix keepalive connections not being closed in time
  `#3296 <https://github.com/aio-libs/aiohttp/issues/3296>`_
- Fix failed websocket handshake leaving connection hanging.
  `#3380 <https://github.com/aio-libs/aiohttp/issues/3380>`_
- Fix tasks cancellation order on exit. The run_app task needs to be cancelled first for cleanup hooks to run with all tasks intact.
  `#3805 <https://github.com/aio-libs/aiohttp/issues/3805>`_
- Don't start heartbeat until _writer is set
  `#4062 <https://github.com/aio-libs/aiohttp/issues/4062>`_
- Fix handling of multipart file uploads without a content type.
  `#4089 <https://github.com/aio-libs/aiohttp/issues/4089>`_
- Preserve view handler function attributes across middlewares
  `#4174 <https://github.com/aio-libs/aiohttp/issues/4174>`_
- Fix the string representation of ``ServerDisconnectedError``.
  `#4175 <https://github.com/aio-libs/aiohttp/issues/4175>`_
- Raise RuntimeError when trying to get the encoding from a body that has not been read
  `#4214 <https://github.com/aio-libs/aiohttp/issues/4214>`_
- Remove warning messages from noop.
  `#4282 <https://github.com/aio-libs/aiohttp/issues/4282>`_
- Raise ClientPayloadError if FormData re-processed.
  `#4345 <https://github.com/aio-libs/aiohttp/issues/4345>`_
- Fix a warning about unfinished task in ``web_protocol.py``
  `#4408 <https://github.com/aio-libs/aiohttp/issues/4408>`_
- Fixed 'deflate' compression. According to RFC 2616 now.
  `#4506 <https://github.com/aio-libs/aiohttp/issues/4506>`_
- Fixed OverflowError on platforms with 32-bit time_t
  `#4515 <https://github.com/aio-libs/aiohttp/issues/4515>`_
- Fixed request.body_exists returning a wrong value for methods without a body.
  `#4528 <https://github.com/aio-libs/aiohttp/issues/4528>`_
- Fix connecting to link-local IPv6 addresses.
  `#4554 <https://github.com/aio-libs/aiohttp/issues/4554>`_
- Fix a problem with connection waiters that are never awaited.
  `#4562 <https://github.com/aio-libs/aiohttp/issues/4562>`_
- Always make sure the transport is not closing before reusing a connection.

  Reusing a protocol based on keepalive in headers is unreliable.
  For example, uWSGI will not support keepalive even when it serves an
  HTTP 1.1 request, unless uWSGI is explicitly configured with the
  ``--http-keepalive`` option.

  Servers designed like uWSGI could cause aiohttp to intermittently
  raise a ConnectionResetException when the protocol pool runs
  out and some protocol is reused.
  `#4587 <https://github.com/aio-libs/aiohttp/issues/4587>`_
- Handle the last CRLF correctly even if it is received via separate TCP segment.
  `#4630 <https://github.com/aio-libs/aiohttp/issues/4630>`_
- Fix the register_resource function to validate route name before splitting it so that route name can include python keywords.
  `#4691 <https://github.com/aio-libs/aiohttp/issues/4691>`_
- Improve typing annotations for ``web.Request``, ``aiohttp.ClientResponse`` and
  ``multipart`` module.
  `#4736 <https://github.com/aio-libs/aiohttp/issues/4736>`_
- Fix resolver task not being awaited when the connector is cancelled
  `#4795 <https://github.com/aio-libs/aiohttp/issues/4795>`_
- Fix a bug "Aiohttp doesn't return any error on invalid request methods"
  `#4798 <https://github.com/aio-libs/aiohttp/issues/4798>`_
- Fix HEAD requests for static content.
  `#4809 <https://github.com/aio-libs/aiohttp/issues/4809>`_
- Fix incorrect size calculation for memoryview
  `#4890 <https://github.com/aio-libs/aiohttp/issues/4890>`_
- Add HTTPMove to ``__all__``.
  `#4897 <https://github.com/aio-libs/aiohttp/issues/4897>`_
- Fixed the type annotations in the ``tracing`` module.
  `#4912 <https://github.com/aio-libs/aiohttp/issues/4912>`_
- Fix typing for multipart ``__aiter__``.
  `#4931 <https://github.com/aio-libs/aiohttp/issues/4931>`_
- Fix for race condition on connections in BaseConnector that leads to exceeding the connection limit.
  `#4936 <https://github.com/aio-libs/aiohttp/issues/4936>`_
- Add forced UTF-8 encoding for ``application/rdap+json`` responses.
  `#4938 <https://github.com/aio-libs/aiohttp/issues/4938>`_
- Fix inconsistency between Python and C http request parsers in parsing pct-encoded URL.
  `#4972 <https://github.com/aio-libs/aiohttp/issues/4972>`_
- Fix connection closing issue in HEAD request.
  `#5012 <https://github.com/aio-libs/aiohttp/issues/5012>`_
- Fix type hint on BaseRunner.addresses (from ``List[str]`` to ``List[Any]``)
  `#5086 <https://github.com/aio-libs/aiohttp/issues/5086>`_
- Make `web.run_app()` more responsive to Ctrl+C on Windows for Python < 3.8. It slightly
  increases CPU load as a side effect.
  `#5098 <https://github.com/aio-libs/aiohttp/issues/5098>`_


Improved Documentation
----------------------

- Fix example code in client quick-start
  `#3376 <https://github.com/aio-libs/aiohttp/issues/3376>`_
- Updated the docs so there is no contradiction in ``ttl_dns_cache`` default value
  `#3512 <https://github.com/aio-libs/aiohttp/issues/3512>`_
- Add 'Deploy with SSL' to docs.
  `#4201 <https://github.com/aio-libs/aiohttp/issues/4201>`_
- Change typing of the secure argument on StreamResponse.set_cookie from ``Optional[str]`` to ``Optional[bool]``
  `#4204 <https://github.com/aio-libs/aiohttp/issues/4204>`_
- Changes ``ttl_dns_cache`` type from int to Optional[int].
  `#4270 <https://github.com/aio-libs/aiohttp/issues/4270>`_
- Simplify README hello world example and add a documentation page for people coming from requests.
  `#4272 <https://github.com/aio-libs/aiohttp/issues/4272>`_
- Improve some code examples in the documentation involving websockets and starting a simple HTTP site with an AppRunner.
  `#4285 <https://github.com/aio-libs/aiohttp/issues/4285>`_
- Fix typo in code example in Multipart docs
  `#4312 <https://github.com/aio-libs/aiohttp/issues/4312>`_
- Fix code example in Multipart section.
  `#4314 <https://github.com/aio-libs/aiohttp/issues/4314>`_
- Update contributing guide so new contributors read the most recent version of that guide. Update command used to create test coverage reporting.
  `#4810 <https://github.com/aio-libs/aiohttp/issues/4810>`_
- Spelling: Change "canonize" to "canonicalize".
  `#4986 <https://github.com/aio-libs/aiohttp/issues/4986>`_
- Add ``aiohttp-sse-client`` library to third party usage list.
  `#5084 <https://github.com/aio-libs/aiohttp/issues/5084>`_


Misc
----

- `#2856 <https://github.com/aio-libs/aiohttp/issues/2856>`_, `#4218 <https://github.com/aio-libs/aiohttp/issues/4218>`_, `#4250 <https://github.com/aio-libs/aiohttp/issues/4250>`_
Labels
bot:chronographer:provided There is a change note present in this PR
6 participants