Improve middleware performance #9200

Merged: 3 commits into master from middleware_performance, Sep 22, 2024

Conversation

@bdraco (Member) commented Sep 19, 2024

What do these changes do?

The contextmanager introduced in cff60eb added quite a bit of overhead and was only used for fixing up the current app when using middleware. I did a GitHub code search and did not find any usage of set_current_app outside of aiohttp, so I think it's safe to remove as it's unlikely to be used outside of aiohttp.

closes #9196

Are there changes in behavior for the user?

The set_current_app function has been removed.

Is it a substantial burden for the maintainers to support this?

no
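
In rough terms, the change swaps a per-request generator-based context manager for a plain attribute assignment. A minimal sketch of that pattern (simplified, with illustrative class names; not the exact aiohttp diff):

from contextlib import contextmanager
from typing import Any, Generator, Optional

class MatchInfoBefore:
    """Before: every middleware hop builds a _GeneratorContextManager."""

    def __init__(self) -> None:
        self._current_app: Optional[Any] = None

    @contextmanager
    def set_current_app(self, app: Any) -> Generator[None, None, None]:
        prev = self._current_app
        self._current_app = app
        try:
            yield
        finally:
            self._current_app = prev

class MatchInfoAfter:
    """After: the app is assigned directly; no per-call CM object is created."""

    def __init__(self) -> None:
        self._current_app: Optional[Any] = None

    @property
    def current_app(self) -> Optional[Any]:
        return self._current_app

    @current_app.setter
    def current_app(self, app: Any) -> None:
        self._current_app = app

The call site in the middleware loop can then go from a with block to a plain assignment before the handler is invoked.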

@bdraco added the backport-3.10 and backport-3.11 (Trigger automatic backporting to the 3.11 release branch by Patchback robot) labels on Sep 19, 2024
@psf-chronographer bot added the bot:chronographer:provided (There is a change note present in this PR) label on Sep 19, 2024

codecov bot commented Sep 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.31%. Comparing base (bf022b3) to head (9804f7a).
Report is 1076 commits behind head on master.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #9200   +/-   ##
=======================================
  Coverage   98.31%   98.31%           
=======================================
  Files         107      107           
  Lines       34483    34507   +24     
  Branches     4093     4100    +7     
=======================================
+ Hits        33901    33927   +26     
+ Misses        411      410    -1     
+ Partials      171      170    -1     
Flag Coverage Δ
CI-GHA 98.21% <100.00%> (+<0.01%) ⬆️
OS-Linux 97.87% <100.00%> (+<0.01%) ⬆️
OS-Windows 96.29% <100.00%> (+<0.01%) ⬆️
OS-macOS 97.55% <100.00%> (+0.01%) ⬆️
Py-3.10.11 97.65% <100.00%> (+<0.01%) ⬆️
Py-3.10.14 ?
Py-3.10.15 97.58% <100.00%> (+0.05%) ⬆️
Py-3.11.10 97.47% <100.00%> (?)
Py-3.11.9 97.54% <100.00%> (-0.27%) ⬇️
Py-3.12.5 ?
Py-3.12.6 97.93% <100.00%> (+0.28%) ⬆️
Py-3.9.13 97.54% <100.00%> (+0.01%) ⬆️
Py-3.9.19 ?
Py-3.9.20 97.48% <100.00%> (+<0.01%) ⬆️
Py-pypy7.3.16 97.09% <100.00%> (+<0.01%) ⬆️
VM-macos 97.55% <100.00%> (+0.01%) ⬆️
VM-ubuntu 97.87% <100.00%> (+<0.01%) ⬆️
VM-windows 96.29% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@Dreamsorcerer (Member)

Is it possibly caused by the exception handling or something?
https://github.com/python/cpython/blob/4420cf4dc9ef7bd3c1c9b5465fa9397304bf0110/Lib/contextlib.py#L149

@bdraco (Member, Author) commented Sep 19, 2024

Is it possibly caused by the exception handling or something? python/cpython@4420cf4/Lib/contextlib.py#L149

Yeah, it looks like the __enter__ and especially the __exit__ implementation is overkill for this use case.

@Dreamsorcerer (Member)

Is it possibly caused by the exception handling or something? python/cpython@4420cf4/Lib/contextlib.py#L149

Yeah, it looks like the __enter__ and especially the __exit__ implementation is overkill for this use case.

I don't think __enter__ is an issue. I believe try is now a zero-cost implementation in Python 3.13.

But __exit__ handles an exception in the expected case. I can't think of any way to make that more efficient, though, other than not using the decorator...
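
For reference, even the success path of _GeneratorContextManager.__exit__ relies on raising and catching StopIteration; a simplified paraphrase of CPython's Lib/contextlib.py (not a verbatim copy):

def __exit__(self, typ, value, traceback):
    if typ is None:
        # Normal exit: drive the generator past its single yield.
        try:
            next(self.gen)
        except StopIteration:
            return False  # expected: the generator finished cleanly
        raise RuntimeError("generator didn't stop")
    # ... the exception path (gen.throw and StopIteration bookkeeping) is omitted ...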

@bdraco (Member, Author) commented Sep 19, 2024

The profile shows the most expensive part is __init__ on line 104 with CPython 3.12.4, followed by __exit__, so it looks like the context manager object has to be created every time from the helper function.

Line 108 on CPython main: https://github.com/python/cpython/blob/4420cf4dc9ef7bd3c1c9b5465fa9397304bf0110/Lib/contextlib.py#L108

@bdraco (Member, Author) commented Sep 19, 2024

I recall we had a similar problem with rendering templates in Home Assistant (we can render many per second), and the solution was to create the context manager once and use it over and over. That worked because there was no await and everything finished synchronously, but we can't use that solution here.
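
A minimal sketch of that create-once-and-reuse pattern (illustrative, not Home Assistant's actual code), with a note on why it doesn't transfer to the middleware case:

class RenderContext:
    """Reusable context manager: one instance, entered and exited many times."""

    def __enter__(self) -> "RenderContext":
        # set up per-render state
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # tear down; nothing is allocated per call
        return None

_RENDER_CM = RenderContext()  # created once for the whole process

def render(template: str) -> str:
    # Safe only because nothing awaits inside the block. With an await,
    # two concurrent tasks could be inside the same shared instance at
    # once, which is exactly the middleware situation, so the trick
    # doesn't apply there.
    with _RENDER_CM:
        return template.upper()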

@bdraco bdraco marked this pull request as ready for review September 19, 2024 13:16

@Dreamsorcerer (Member)

The profile shows the most expensive part is __init__ on line 104 with cpython 3.12.4, followed by __exit__ so it looks like the context manager object has to be created every time from the helper function.

Line 108 on cpython main https://github.com/python/cpython/blob/4420cf4dc9ef7bd3c1c9b5465fa9397304bf0110/Lib/contextlib.py#L108

Looks to me like there's a ContextDecorator which uses _recreate_cm() to avoid redoing the __init__ every time. It seems to me like the decorators could be improved a little to do that same thing instead of instantiating a new class directly on every call. Not sure how much you'd get from that, but at least it would save fetching the __doc__ attribute etc. in the __init__.
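
For reference, this is roughly what ContextDecorator does (paraphrased from CPython's Lib/contextlib.py, not a verbatim copy): the wrapper calls _recreate_cm() on every invocation, and _GeneratorContextManagerBase overrides it to rebuild the CM from the stored (func, args, kwds):

from functools import wraps

class ContextDecorator:
    def _recreate_cm(self):
        # Overridden by _GeneratorContextManagerBase to return a fresh,
        # single-use context manager for each decorated call.
        return self

    def __call__(self, func):
        @wraps(func)
        def inner(*args, **kwds):
            with self._recreate_cm():
                return func(*args, **kwds)
        return inner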

@bdraco (Member, Author) commented Sep 21, 2024

The profile shows the most expensive part is __init__ on line 104 with cpython 3.12.4, followed by __exit__ so it looks like the context manager object has to be created every time from the helper function.
Line 108 on cpython main python/cpython@4420cf4/Lib/contextlib.py#L108

Looks to me like there's a ContextDecorator which uses _recreate_cm() to avoid redoing the __init__ every time. It seems to me like the decorators could be improved a little to do that same thing instead of instantiating a new class directly on every call. Not sure how much you'd get from that, but at least it would save fetching the __doc__ attribute etc. in the __init__.

Looks like that's already supposed to be happening, but __init__ is still running every time since it's bound to a UrlMappingMatchInfo, which gets recreated every time.

https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L128
https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L301

@bdraco (Member, Author) commented Sep 21, 2024

I played around with it for a bit and came to the conclusion that making the context manager performant is likely not possible without writing it all out, which got large quickly. I think what we have now is probably the simplest solution.

@Dreamsorcerer (Member)

Looks to me like there's a ContextDecorator which uses _recreate_cm() to avoid redoing the __init__ every time. It seems to me like the decorators could be improved a little to do that same thing instead of instantiating a new class directly on every call. Not sure how much you'd get from that, but at least it would save fetching the __doc__ attribute etc. in the __init__.

Looks like that's already supposed to be happening, but __init__ is still running every time since it's bound to a UrlMappingMatchInfo, which gets recreated every time.

https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L128 https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L301

Maybe I'm misreading that, but doesn't the decorator transform the original function into a new function which returns a new _GeneratorContextManager on every call? Seems to me like the _GeneratorContextManager would need to be created in the outer function and then called in the inner function in order for _recreate_cm() to ever be called.

@Dreamsorcerer (Member)

since it's bound to a UrlMappingMatchInfo which gets recreated every time

The decorator would run on the function (unbound method), right? So, the decorator is not being run every time.

@bdraco (Member, Author) commented Sep 21, 2024

The decorator would run on the function (unbound method), right? So, the decorator is not being run every time.

Decorator runs once, but context manager is created every time.

Example:

from contextlib import contextmanager
import cProfile

class X:

    @contextmanager
    def my_context(self):
        """My context."""
        yield


pr = cProfile.Profile()
pr.enable()
for _ in range(1000):
    with X().my_context():
        pass
pr.disable()
pr.create_stats()
pr.print_stats()
pr.dump_stats("cm.cprof")
         9001 function calls in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.000    0.000    0.000    0.000 contextlib.py:104(__init__)
     1000    0.000    0.000    0.000    0.000 contextlib.py:132(__enter__)
     1000    0.000    0.000    0.000    0.000 contextlib.py:141(__exit__)
     1000    0.000    0.000    0.000    0.000 contextlib.py:299(helper)
     2000    0.000    0.000    0.000    0.000 test_context.py:7(my_context)
     1000    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
     2000    0.000    0.000    0.000    0.000 {built-in method builtins.next}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


@Dreamsorcerer (Member)

Decorator runs once, but context manager is created every time.

Well, yes, that's my point above.

Seems to me like the function should be written something like:

def contextmanager(func):
    return _GeneratorContextManager(func, args, kwds)(func)

At which point, each call would only be running:

            with self._recreate_cm():
                return func(*args, **kwds)


@Dreamsorcerer (Member) commented Sep 21, 2024

Hacky proof-of-concept:

from contextlib import _GeneratorContextManager, wraps
import cProfile

class CM(_GeneratorContextManager):
    def __init__(self, func):
        self.func = func
        doc = getattr(func, "__doc__", None)
        if doc is None:
            doc = type(self).__doc__
        self.__doc__ = doc

    def _recreate_cm(self, args, kwargs):
        self.gen = self.func(*args, **kwargs)
        return self

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

def contextmanager(func):
    cm = CM(func)

    @wraps(func)
    def inner(*args, **kwds):
        return cm._recreate_cm(args, kwds)
    return inner

class X:

    @contextmanager
    def my_context(self):
        """My context."""
        yield


pr = cProfile.Profile()
pr.enable()
for _ in range(1000):
    with X().my_context():
        pass
pr.disable()
pr.create_stats()
pr.print_stats()
pr.dump_stats("cm.cprof")
         8001 function calls in 0.002 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.000    0.000    0.001    0.000 contextlib.py:117(__exit__)
     1000    0.000    0.000    0.000    0.000 test.py:12(_recreate_cm)
     1000    0.000    0.000    0.000    0.000 test.py:16(__enter__)
     1000    0.000    0.000    0.000    0.000 test.py:25(inner)
     2000    0.000    0.000    0.000    0.000 test.py:32(my_context)
     2000    0.000    0.000    0.000    0.000 {built-in method builtins.next}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Note that I get 1 ms tottime in __init__ on the original, and nothing with 1 ms tottime on this version.

@Dreamsorcerer (Member)

from contextlib import _GeneratorContextManager, wraps, contextmanager
import cProfile

class CM(_GeneratorContextManager):
    def __init__(self, func):
        self.func = func
        doc = getattr(func, "__doc__", None)
        if doc is None:
            doc = type(self).__doc__
        self.__doc__ = doc

    def _recreate_cm(self, args, kwargs):
        self.gen = self.func(*args, **kwargs)
        return self

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

def contextmanager2(func):
    cm = CM(func)

    @wraps(func)
    def inner(*args, **kwds):
        return cm._recreate_cm(args, kwds)
    return inner

class X:

    @contextmanager2
    def my_context(self):
        """My context."""
        yield

Switching between contextmanager and contextmanager2 and timing with:

python3 -m timeit -s 'from test import X' 'with X().my_context(): pass'

I get around 1.4 - 2.3 µs with the original and 0.9 - 1.3 µs with the hack. So, the original is about 50-80% slower. Though just setting self.gen is probably not correct (it won't be thread-safe, for example), so we would need to make some change for that.
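
To illustrate the thread-safety point (assuming the contextmanager2/X definitions from the snippet above): inner() always hands back the same CM instance and overwrites self.gen, so overlapping uses interfere:

x = X()
a = x.my_context()   # stores generator A on the shared CM and returns it
b = x.my_context()   # overwrites .gen with generator B; `a` and `b` are the same object
assert a is b
with a:              # actually drives generator B; generator A is never exited
    pass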

@Dreamsorcerer (Member) commented Sep 21, 2024

Taking the instantiation out:

python3 -m timeit -s 'from test import X; x = X()' 'with x.my_context(): pass'

I get as low as 0.8 µs with the hack and no lower than 1.7 µs with the original, so the original is actually over 100% slower.

@Dreamsorcerer (Member) commented Sep 21, 2024

Hmm, I can't actually see how that __doc__ attribute is useful... I don't think it's relevant for contextmanager().

So, something like this should be safe and I can still get <1 µs from it.

from contextlib import _GeneratorContextManager, wraps, contextmanager
import cProfile

class _CM(_GeneratorContextManager):
    __slots__ = ("gen",)

    def __init__(self, gen):
        self.gen = gen

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

class CM:
    def __init__(self, func):
        self.func = func

    def _recreate_cm(self, args, kwargs):
        return _CM(self.func(*args, **kwargs))

def contextmanager2(func):
    cm = CM(func)

    @wraps(func)
    def inner(*args, **kwds):
        return cm._recreate_cm(args, kwds)
    return inner

class X:

    @contextmanager2
    def my_context(self):
        """My context."""
        yield

@bdraco bdraco merged commit 42930b0 into master Sep 22, 2024
34 of 35 checks passed
@bdraco bdraco deleted the middleware_performance branch September 22, 2024 15:53

patchback bot (Contributor) commented Sep 22, 2024

Backport to 3.10: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 42930b0 on top of patchback/backports/3.10/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200

Backporting merged PR #9200 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.10/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200 upstream/3.10
  4. Now, cherry-pick PR Improve middleware performance #9200 contents into that branch:
    $ git cherry-pick -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
    If it'll yell at you with something like fatal: Commit 42930b0c023af52782ec026d0e850f5f0c0bcdcd is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in order to preserve the patch from PR Improve middleware performance #9200 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.10/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot (Contributor) commented Sep 22, 2024

Backport to 3.11: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 42930b0 on top of patchback/backports/3.11/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200

Backporting merged PR #9200 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.11/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200 upstream/3.11
  4. Now, cherry-pick PR Improve middleware performance #9200 contents into that branch:
    $ git cherry-pick -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
    If it'll yell at you with something like fatal: Commit 42930b0c023af52782ec026d0e850f5f0c0bcdcd is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in order to preserve the patch from PR Improve middleware performance #9200 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.11/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

bdraco added a commit that referenced this pull request Sep 22, 2024
bdraco added a commit that referenced this pull request Sep 22, 2024
meyerj added a commit to Intermodalics/aiohttp-asgi that referenced this pull request Nov 6, 2024
Labels
backport-3.11 (Trigger automatic backporting to the 3.11 release branch by Patchback robot), bot:chronographer:provided (There is a change note present in this PR)

Successfully merging this pull request may close these issues.

Middleware fix up is unexpectedly expensive