Improve middleware performance #9200

Merged: 3 commits into master from middleware_performance, Sep 22, 2024

Conversation

@bdraco (Member) commented Sep 19, 2024

What do these changes do?

The contextmanager introduced in cff60eb added quite a bit of overhead and was only used for fixing up the current app when using middleware. I did a GitHub code search and did not find any usage of set_current_app outside of aiohttp, so I think it's safe to remove as it's unlikely to be used outside of aiohttp.

closes #9196

Are there changes in behavior for the user?

The set_current_app function has been removed.

Is it a substantial burden for the maintainers to support this?

no
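
In rough terms, the change swaps a per-request generator-based context manager for a plain attribute assignment. A minimal sketch of that pattern (simplified, with illustrative class names; not the exact aiohttp diff):

from contextlib import contextmanager
from typing import Any, Generator, Optional

class MatchInfoBefore:
    """Before: every middleware hop builds a _GeneratorContextManager."""

    def __init__(self) -> None:
        self._current_app: Optional[Any] = None

    @contextmanager
    def set_current_app(self, app: Any) -> Generator[None, None, None]:
        prev = self._current_app
        self._current_app = app
        try:
            yield
        finally:
            self._current_app = prev

class MatchInfoAfter:
    """After: the app is assigned directly; no per-call CM object is created."""

    def __init__(self) -> None:
        self._current_app: Optional[Any] = None

    @property
    def current_app(self) -> Optional[Any]:
        return self._current_app

    @current_app.setter
    def current_app(self, app: Any) -> None:
        self._current_app = app

The call site in the middleware loop can then go from a with block to a plain assignment before the handler is invoked.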

@bdraco added the backport-3.10 and backport-3.11 (Trigger automatic backporting to the 3.11 release branch by Patchback robot) labels on Sep 19, 2024
@psf-chronographer bot added the bot:chronographer:provided (There is a change note present in this PR) label on Sep 19, 2024

codecov bot commented Sep 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.31%. Comparing base (bf022b3) to head (9804f7a).
Report is 1076 commits behind head on master.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #9200   +/-   ##
=======================================
  Coverage   98.31%   98.31%           
=======================================
  Files         107      107           
  Lines       34483    34507   +24     
  Branches     4093     4100    +7     
=======================================
+ Hits        33901    33927   +26     
+ Misses        411      410    -1     
+ Partials      171      170    -1     
Flag Coverage Δ
CI-GHA 98.21% <100.00%> (+<0.01%) ⬆️
OS-Linux 97.87% <100.00%> (+<0.01%) ⬆️
OS-Windows 96.29% <100.00%> (+<0.01%) ⬆️
OS-macOS 97.55% <100.00%> (+0.01%) ⬆️
Py-3.10.11 97.65% <100.00%> (+<0.01%) ⬆️
Py-3.10.14 ?
Py-3.10.15 97.58% <100.00%> (+0.05%) ⬆️
Py-3.11.10 97.47% <100.00%> (?)
Py-3.11.9 97.54% <100.00%> (-0.27%) ⬇️
Py-3.12.5 ?
Py-3.12.6 97.93% <100.00%> (+0.28%) ⬆️
Py-3.9.13 97.54% <100.00%> (+0.01%) ⬆️
Py-3.9.19 ?
Py-3.9.20 97.48% <100.00%> (+<0.01%) ⬆️
Py-pypy7.3.16 97.09% <100.00%> (+<0.01%) ⬆️
VM-macos 97.55% <100.00%> (+0.01%) ⬆️
VM-ubuntu 97.87% <100.00%> (+<0.01%) ⬆️
VM-windows 96.29% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@Dreamsorcerer (Member)

Is it possibly caused by the exception handling or something?
https://github.com/python/cpython/blob/4420cf4dc9ef7bd3c1c9b5465fa9397304bf0110/Lib/contextlib.py#L149

@bdraco (Member, Author) commented Sep 19, 2024

Is it possibly caused by the exception handling or something? python/cpython@4420cf4/Lib/contextlib.py#L149

Yeah, it looks like the __enter__ and especially the __exit__ implementation is overkill for this use case.

@Dreamsorcerer (Member)

Is it possibly caused by the exception handling or something? python/cpython@4420cf4/Lib/contextlib.py#L149

Yeah, it looks like the __enter__ and especially the __exit__ implementation is overkill for this use case.

I don't think __enter__ is an issue. I believe try is now a zero-cost implementation in Python 3.13.

But __exit__ handles an exception in the expected case. I can't think of any way to make that more efficient, though, other than not using the decorator...
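
For reference, even the success path of _GeneratorContextManager.__exit__ relies on raising and catching StopIteration; a simplified paraphrase of CPython's Lib/contextlib.py (not a verbatim copy):

def __exit__(self, typ, value, traceback):
    if typ is None:
        # Normal exit: drive the generator past its single yield.
        try:
            next(self.gen)
        except StopIteration:
            return False  # expected: the generator finished cleanly
        raise RuntimeError("generator didn't stop")
    # ... the exception path (gen.throw and StopIteration bookkeeping) is omitted ...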

@bdraco (Member, Author) commented Sep 19, 2024

The profile shows the most expensive part is __init__ on line 104 with CPython 3.12.4, followed by __exit__, so it looks like the context manager object has to be created every time from the helper function.

Line 108 on CPython main: https://github.com/python/cpython/blob/4420cf4dc9ef7bd3c1c9b5465fa9397304bf0110/Lib/contextlib.py#L108

@bdraco (Member, Author) commented Sep 19, 2024

I recall we had a similar problem with rendering templates in Home Assistant (we can render many per second), and the solution was to create the context manager once and use it over and over. That worked because there was no await and everything finished synchronously, but we can't use that solution here.
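
A minimal sketch of that create-once-and-reuse pattern (illustrative, not Home Assistant's actual code), with a note on why it doesn't transfer to the middleware case:

class RenderContext:
    """Reusable context manager: one instance, entered and exited many times."""

    def __enter__(self) -> "RenderContext":
        # set up per-render state
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # tear down; nothing is allocated per call
        return None

_RENDER_CM = RenderContext()  # created once for the whole process

def render(template: str) -> str:
    # Safe only because nothing awaits inside the block. With an await,
    # two concurrent tasks could be inside the same shared instance at
    # once, which is exactly the middleware situation, so the trick
    # doesn't apply there.
    with _RENDER_CM:
        return template.upper()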

@bdraco bdraco marked this pull request as ready for review September 19, 2024 13:16

@Dreamsorcerer (Member)

The profile shows the most expensive part is __init__ on line 104 with cpython 3.12.4, followed by __exit__ so it looks like the context manager object has to be created every time from the helper function.

Line 108 on cpython main https://github.com/python/cpython/blob/4420cf4dc9ef7bd3c1c9b5465fa9397304bf0110/Lib/contextlib.py#L108

Looks to me like there's a ContextDecorator which uses _recreate_cm() to avoid redoing the __init__ every time. It seems to me like the decorators could be improved a little to do that same thing instead of instantiating a new class directly on every call. Not sure how much you'd get from that, but at least it would save fetching the __doc__ attribute etc. in the __init__.
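
For reference, this is roughly what ContextDecorator does (paraphrased from CPython's Lib/contextlib.py, not a verbatim copy): the wrapper calls _recreate_cm() on every invocation, and _GeneratorContextManagerBase overrides it to rebuild the CM from the stored (func, args, kwds):

from functools import wraps

class ContextDecorator:
    def _recreate_cm(self):
        # Overridden by _GeneratorContextManagerBase to return a fresh,
        # single-use context manager for each decorated call.
        return self

    def __call__(self, func):
        @wraps(func)
        def inner(*args, **kwds):
            with self._recreate_cm():
                return func(*args, **kwds)
        return inner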

@bdraco (Member, Author) commented Sep 21, 2024

The profile shows the most expensive part is __init__ on line 104 with cpython 3.12.4, followed by __exit__ so it looks like the context manager object has to be created every time from the helper function.
Line 108 on cpython main python/cpython@4420cf4/Lib/contextlib.py#L108

Looks to me like there's a ContextDecorator which uses _recreate_cm() to avoid redoing the __init__ every time. It seems to me like the decorators could be improved a little to do that same thing instead of instantiating a new class directly on every call. Not sure how much you'd get from that, but at least it would save fetching the __doc__ attribute etc. in the __init__.

Looks like that's already supposed to be happening, but __init__ is still running every time since it's bound to a UrlMappingMatchInfo, which gets recreated every time.

https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L128
https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L301

@bdraco (Member, Author) commented Sep 21, 2024

I played around with it for a bit and came to the conclusion that making the context manager performant is likely not possible without writing it all out, which got large quickly. I think what we have now is probably the simplest solution.

@Dreamsorcerer (Member)

Looks to me like there's a ContextDecorator which uses _recreate_cm() to avoid redoing the __init__ every time. It seems to me like the decorators could be improved a little to do that same thing instead of instantiating a new class directly on every call. Not sure how much you'd get from that, but at least it would save fetching the __doc__ attribute etc. in the __init__.

Looks like that's already supposed to be happening, but __init__ is still running every time since it's bound to a UrlMappingMatchInfo, which gets recreated every time.

https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L128 https://github.com/python/cpython/blob/fcfe78664bf740a7fb059f9817e584a612e09f54/Lib/contextlib.py#L301

Maybe I'm misreading that, but doesn't the decorator transform the original function into a new function which returns a new _GeneratorContextManager on every call? Seems to me like the _GeneratorContextManager would need to be created in the outer function and then called in the inner function in order for _recreate_cm() to ever be called.

@Dreamsorcerer (Member)

since it's bound to a UrlMappingMatchInfo which gets recreated every time

The decorator would run on the function (unbound method), right? So, the decorator is not being run every time.

@bdraco (Member, Author) commented Sep 21, 2024

The decorator would run on the function (unbound method), right? So, the decorator is not being run every time.

Decorator runs once, but context manager is created every time.

Example:

from contextlib import contextmanager
import cProfile

class X:

    @contextmanager
    def my_context(self):
        """My context."""
        yield


pr = cProfile.Profile()
pr.enable()
for _ in range(1000):
    with X().my_context():
        pass
pr.disable()
pr.create_stats()
pr.print_stats()
pr.dump_stats("cm.cprof")
         9001 function calls in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.000    0.000    0.000    0.000 contextlib.py:104(__init__)
     1000    0.000    0.000    0.000    0.000 contextlib.py:132(__enter__)
     1000    0.000    0.000    0.000    0.000 contextlib.py:141(__exit__)
     1000    0.000    0.000    0.000    0.000 contextlib.py:299(helper)
     2000    0.000    0.000    0.000    0.000 test_context.py:7(my_context)
     1000    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
     2000    0.000    0.000    0.000    0.000 {built-in method builtins.next}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


@Dreamsorcerer (Member)

Decorator runs once, but context manager is created every time.

Well, yes, that's my point above.

Seems to me like the function should be written something like:

def contextmanager(func):
    return _GeneratorContextManager(func, args, kwds)(func)

At which point, each call would only be running:

            with self._recreate_cm():
                return func(*args, **kwds)


@Dreamsorcerer (Member) commented Sep 21, 2024

Hacky proof-of-concept:

from contextlib import _GeneratorContextManager, wraps
import cProfile

class CM(_GeneratorContextManager):
    def __init__(self, func):
        self.func = func
        doc = getattr(func, "__doc__", None)
        if doc is None:
            doc = type(self).__doc__
        self.__doc__ = doc

    def _recreate_cm(self, args, kwargs):
        self.gen = self.func(*args, **kwargs)
        return self

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

def contextmanager(func):
    cm = CM(func)

    @wraps(func)
    def inner(*args, **kwds):
        return cm._recreate_cm(args, kwds)
    return inner

class X:

    @contextmanager
    def my_context(self):
        """My context."""
        yield


pr = cProfile.Profile()
pr.enable()
for _ in range(1000):
    with X().my_context():
        pass
pr.disable()
pr.create_stats()
pr.print_stats()
pr.dump_stats("cm.cprof")
         8001 function calls in 0.002 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.000    0.000    0.001    0.000 contextlib.py:117(__exit__)
     1000    0.000    0.000    0.000    0.000 test.py:12(_recreate_cm)
     1000    0.000    0.000    0.000    0.000 test.py:16(__enter__)
     1000    0.000    0.000    0.000    0.000 test.py:25(inner)
     2000    0.000    0.000    0.000    0.000 test.py:32(my_context)
     2000    0.000    0.000    0.000    0.000 {built-in method builtins.next}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Note that I get 1 ms tottime in __init__ on the original, and nothing with 1 ms tottime on this version.

@Dreamsorcerer (Member)

from contextlib import _GeneratorContextManager, wraps, contextmanager
import cProfile

class CM(_GeneratorContextManager):
    def __init__(self, func):
        self.func = func
        doc = getattr(func, "__doc__", None)
        if doc is None:
            doc = type(self).__doc__
        self.__doc__ = doc

    def _recreate_cm(self, args, kwargs):
        self.gen = self.func(*args, **kwargs)
        return self

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

def contextmanager2(func):
    cm = CM(func)

    @wraps(func)
    def inner(*args, **kwds):
        return cm._recreate_cm(args, kwds)
    return inner

class X:

    @contextmanager2
    def my_context(self):
        """My context."""
        yield

Switching between contextmanager and contextmanager2 and timing with:

python3 -m timeit -s 'from test import X' 'with X().my_context(): pass'

I get around 1.4 - 2.3 µs with the original and 0.9 - 1.3 µs with the hack. So, the original is about 50-80% slower. Though just setting self.gen is probably not correct (it won't be thread-safe, for example), so we would need to make some change for that.
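
To illustrate the thread-safety point (assuming the contextmanager2/X definitions from the snippet above): inner() always hands back the same CM instance and overwrites self.gen, so overlapping uses interfere:

x = X()
a = x.my_context()   # stores generator A on the shared CM and returns it
b = x.my_context()   # overwrites .gen with generator B; `a` and `b` are the same object
assert a is b
with a:              # actually drives generator B; generator A is never exited
    pass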

@Dreamsorcerer (Member) commented Sep 21, 2024

Taking the instantiation out:

python3 -m timeit -s 'from test import X; x = X()' 'with x.my_context(): pass'

I get as low as 0.8 µs with the hack and no lower than 1.7 µs with the original, so the original is actually over 100% slower.

@Dreamsorcerer (Member) commented Sep 21, 2024

Hmm, I can't actually see how that __doc__ attribute is useful... I don't think it's relevant for contextmanager().

So, something like this should be safe and I can still get <1 µs from it.

from contextlib import _GeneratorContextManager, wraps, contextmanager
import cProfile

class _CM(_GeneratorContextManager):
    __slots__ = ("gen",)

    def __init__(self, gen):
        self.gen = gen

    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError("generator didn't yield") from None

class CM:
    def __init__(self, func):
        self.func = func

    def _recreate_cm(self, args, kwargs):
        return _CM(self.func(*args, **kwargs))

def contextmanager2(func):
    cm = CM(func)

    @wraps(func)
    def inner(*args, **kwds):
        return cm._recreate_cm(args, kwds)
    return inner

class X:

    @contextmanager2
    def my_context(self):
        """My context."""
        yield

@bdraco bdraco merged commit 42930b0 into master Sep 22, 2024
34 of 35 checks passed
@bdraco bdraco deleted the middleware_performance branch September 22, 2024 15:53

patchback bot (Contributor) commented Sep 22, 2024

Backport to 3.10: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 42930b0 on top of patchback/backports/3.10/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200

Backporting merged PR #9200 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.10/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200 upstream/3.10
  4. Now, cherry-pick PR Improve middleware performance #9200 contents into that branch:
    $ git cherry-pick -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
    If it'll yell at you with something like fatal: Commit 42930b0c023af52782ec026d0e850f5f0c0bcdcd is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in order to preserve the patch from PR Improve middleware performance #9200 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.10/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot (Contributor) commented Sep 22, 2024

Backport to 3.11: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 42930b0 on top of patchback/backports/3.11/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200

Backporting merged PR #9200 into master

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/aio-libs/aiohttp.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/3.11/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200 upstream/3.11
  4. Now, cherry-pick PR Improve middleware performance #9200 contents into that branch:
    $ git cherry-pick -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
    If it'll yell at you with something like fatal: Commit 42930b0c023af52782ec026d0e850f5f0c0bcdcd is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 42930b0c023af52782ec026d0e850f5f0c0bcdcd
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in order to preserve the patch from PR Improve middleware performance #9200 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/3.11/42930b0c023af52782ec026d0e850f5f0c0bcdcd/pr-9200
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

bdraco added a commit that referenced this pull request Sep 22, 2024
bdraco added a commit that referenced this pull request Sep 22, 2024
meyerj added a commit to Intermodalics/aiohttp-asgi that referenced this pull request Nov 6, 2024
Labels
backport-3.11 (Trigger automatic backporting to the 3.11 release branch by Patchback robot), bot:chronographer:provided (There is a change note present in this PR)

Successfully merging this pull request may close these issues.

Middleware fix up is unexpectedly expensive