Memory leak? #1943

Closed
severinsimmler opened this issue Feb 8, 2022 · 4 comments

Comments

@severinsimmler

I'm using marshmallow with Falcon and have observed a rather strange memory leak. This minimal application does nothing but provide a GET endpoint that serializes a dictionary, with memory usage profiled via memory_profiler:

from falcon import App, Request, Response
from marshmallow import fields, Schema
from memory_profiler import profile


class ArtistSchema(Schema):
    name = fields.Str()


class ArtistResource:
    @profile
    def on_get(self, request: Request, response: Response):
        response.text = ArtistSchema().dumps({"name": "David Bowie" * 10000})


application = App()
application.add_route('/artist', ArtistResource())

Now sending 100,000 requests like this:

import requests


for _ in range(100000):
    requests.get("http://127.0.0.1:8000/artist")

will slowly increase the memory footprint from 53.0 MiB to 54.5 MiB (and would apparently continue to increase like this).

I'm constructing a new ArtistSchema object for every request. But if I construct it only once and reuse it for every request:

ARTIST_SCHEMA = ArtistSchema()

class ArtistResource:
    @profile
    def on_get(self, request: Request, response: Response):
        response.text = ARTIST_SCHEMA.dumps({"name": "David Bowie" * 10000})

the memory usage does not increase. How is this possible? Or is this even a marshmallow issue?

My setup:

  • Python 3.9.6
  • marshmallow 3.14.1
  • falcon 3.0.1
  • gunicorn 20.1.0
  • memory-profiler 0.60.0
@lafrech
Member

lafrech commented Feb 8, 2022

See #660 (and #732).

@deckar01
Member

deckar01 commented Feb 8, 2022

class_registry does not grow, because the new schemas all have the same module path.
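For reference, here is a quick way to check that (a minimal sketch; _registry is a private dict inside marshmallow.class_registry, so treat this as illustration only):

from marshmallow import class_registry, fields, Schema

class ArtistSchema(Schema):
    name = fields.Str()

size_before = len(class_registry._registry)

for _ in range(100000):
    ArtistSchema().dumps({"name": "David Bowie"})

# instantiating schemas does not add new registry entries
print(size_before, len(class_registry._registry))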

Here is a minimal repro:

#! mprof run
from marshmallow import fields, Schema

class ArtistSchema(Schema):
    name = fields.Str()

for _ in range(100000):
    ArtistSchema().dumps({"name": "David Bowie"})

[memory profile graph: test1]

If you remove the field, the leak goes away.

#! mprof run
from marshmallow import fields, Schema

class ArtistSchema(Schema):
    pass

for _ in range(100000):
    ArtistSchema().dumps({"name": "David Bowie"})

[memory profile graph: test2]

I suspect this is caused by circular references between the schema and its fields: the schema holds a list of fields, and each field holds a parent attribute pointing back at the schema. We would need to use a weakref proxy for the schema reference in the field to ensure that the tree unravels when the schema is ready to be released.

https://docs.python.org/3/library/weakref.html
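Roughly the shape of the fix I have in mind, sketched with generic stand-in classes rather than marshmallow's actual internals (Parent/Child are placeholders for Schema/Field):

import weakref

class Child:
    def bind(self, parent):
        # keep only a weak proxy, so the child does not keep the parent alive
        self.parent = parent

class Parent:
    def __init__(self):
        self.child = Child()
        self.child.bind(weakref.proxy(self))

p = Parent()
c = p.child
del p  # the Parent is freed immediately by reference counting

try:
    c.parent.child  # the proxy is now dead
except ReferenceError:
    print("parent released without waiting for the cyclic collector")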

@deckar01
Member

deckar01 commented Feb 8, 2022

I confirmed that field_obj._bind_to_schema(field_name, proxy(self)) fixes the memory leak. Unfortunately, weakrefs break a couple of tests that use lambdas, and I believe that is fundamentally unfixable.

The garbage collector is supposed to be able to detect circular references, but we are probably doing something dynamic that it can't infer. Clearing the circular references in Schema.__del__ and Field.__del__ causes the garbage collector to behave correctly.
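A rough sketch of that idea (not the actual patch, just the shape of it; the attribute names are simplified):

class Field:
    def __del__(self):
        # drop the back-reference to the owning schema
        self.parent = None

class Schema:
    def __del__(self):
        # drop the forward references to the bound fields
        self.fields = {}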

[memory profile graph: test1]

[memory profile graph: test2]

@deckar01
Member

deckar01 commented Feb 8, 2022

I just ran the longer profile test on the dev branch and realized that the garbage collector does eventually clean up these circular references on its own. I should have noticed that the memory usage leveled off around the 3 s mark in my very first test. I suspect the garbage collector has a strategy based on the percentage of currently used memory. Everything appears to be working as intended.
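For reference, a quick way to see this in the repro: forcing a collection after each dump keeps the footprint flat, confirming the schema/field cycles are collectable (CPython's cyclic collector normally only runs once the allocation thresholds reported by gc.get_threshold() are exceeded):

#! mprof run
import gc

from marshmallow import fields, Schema

class ArtistSchema(Schema):
    name = fields.Str()

for _ in range(100000):
    ArtistSchema().dumps({"name": "David Bowie"})
    gc.collect()  # collect the schema/field reference cycles right away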
